Fluorescent in situ sequencing

FISSEQ in fibroblast cells. Each spot is an RNA molecule that has been reverse transcribed and amplified. The spot color indicates the base. The image represents one cycle in a sequencing run; 30 cycles would yield 30 bases of the RNA sequence. Scale bar: 20 µm.

Fluorescent in situ sequencing (FISSEQ) is a method that uses next-generation sequencing chemistry to sequence a cell's RNA while the cell remains in tissue or culture.

Introduction

FISSEQ combines the spatial context of RNA-FISH with the global transcriptome profiling of RNA-seq. [1] Because FISSEQ preserves the tissue, it allows single-molecule RNA localization in situ. The foundation of the method is a nucleic acid sequencing library construction method that stably cross-links cDNA amplicons within biological samples. [2] Sequencing data are then generated through an intensive protocol that interleaves microscopy and biochemistry, followed by image processing and bioinformatics. FISSEQ is compatible with diverse sample types, including cell culture, tissue sections, and whole-mount embryos. FISSEQ is an example of an extremely dense form of in situ nucleic acid readout: every letter along the RNA chain is read. Barcodes for FISSEQ can therefore be packed into a short string of DNA, as short as 15–20 nucleotides for the mouse brain or 5 nucleotides for targeted cancer gene panels.
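The arithmetic behind these barcode lengths can be sketched as follows. This is a minimal illustration that assumes a four-letter alphabet and error-free reads at every position; the function names and the target count of 10^8 (a rough, assumed order of magnitude for the number of cells to label) are hypothetical and do not come from the cited papers.

```python
def barcode_capacity(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct barcodes of a given length (alphabet_size ** length)."""
    return alphabet_size ** length


def min_barcode_length(targets: int, alphabet_size: int = 4) -> int:
    """Smallest length whose capacity is at least `targets` (integer arithmetic only)."""
    length = 0
    capacity = 1
    while capacity < targets:
        capacity *= alphabet_size
        length += 1
    return length


# Capacities for the barcode lengths mentioned in the text.
for n in (5, 15, 20):
    print(n, barcode_capacity(n))

# Assumed, illustrative target count: about 10**8 distinct labels.
print(min_barcode_length(10**8))
```

Under these assumptions, 5 nucleotides already distinguish 1,024 targets, and 14 nucleotides would cover 10^8 labels; lengths of 15–20 leave room for redundancy or error tolerance, which this sketch does not model.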

Methods

FISSEQ schematic: A tagged random hexamer primer is used to prime M-MuLV reverse transcriptase, which generates aminoallyl dUTP-modified cDNA fragments from RNA in fixed cells or tissues. BS(PEG)9 permanently cross-links the modified cDNA to the cellular protein matrix. After cDNA circularization, Phi29 DNA polymerase generates cDNA amplicons. The amplicons are then sequenced in situ, in 3D, within the cell.

In FISSEQ, a series of biochemical processing steps, such as DNA ligations or single-base DNA polymerase extensions, is performed on a block of fixed tissue, interleaved with fluorescent imaging steps. The process is conceptually identical to the mechanism of fluorescent sequencing by synthesis in a commercial bulk DNA sequencing machine, except that it is performed in fixed tissue. Each DNA or RNA molecule in the sample is first “amplified” (i.e., copied) in situ via rolling-circle amplification to create a localized “rolling circle colony” (rolony) consisting of identical copies of the parent molecule. A series of biochemical steps is then carried out. In the kth cycle, a fluorescent tag is introduced, the color of which corresponds to the identity of the kth base along the rolony’s parent DNA strand. The system is then “paused” in this state for imaging. The entire sample can be imaged in each cycle. The fluorescent tags are then cleaved and washed away, and the next cycle is initiated. Each rolony, corresponding to a single “parent” DNA or RNA molecule in the tissue, thus appears across the series of fluorescent images as a localized “spot” with a sequence of colors corresponding to the nucleotide sequence of the parent molecule. The nucleotide sequence of each DNA or RNA molecule is thus read out in situ via fluorescent microscopy.
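The per-cycle readout described above can be sketched as a simple base-calling step. This is an illustrative simplification, not the published FISSEQ pipeline: it assumes one fluorescence channel per base, takes the brightest channel in each cycle as the call, and uses a hypothetical channel-to-base mapping and made-up intensities. Ligation-based chemistries may encode colors differently, and real pipelines also handle image registration, spot detection, and quality filtering.

```python
# Hypothetical mapping from imaging channel index to base (an assumption for illustration).
CHANNEL_TO_BASE = ("A", "C", "G", "T")


def call_bases(intensities):
    """Call one base per cycle for a single rolony.

    `intensities` is a list of cycles; each cycle is a list of four
    per-channel intensities measured at the rolony's position.
    """
    read = []
    for cycle in intensities:
        # The brightest channel in cycle k gives the k-th base.
        brightest = max(range(len(cycle)), key=lambda ch: cycle[ch])
        read.append(CHANNEL_TO_BASE[brightest])
    return "".join(read)


# Made-up intensities for one spot imaged over three cycles.
spot = [
    [0.9, 0.1, 0.0, 0.1],  # cycle 1: channel 0 brightest
    [0.1, 0.0, 0.8, 0.2],  # cycle 2: channel 2 brightest
    [0.0, 0.1, 0.1, 0.7],  # cycle 3: channel 3 brightest
]
print(call_bases(spot))  # prints "AGT"
```

Running the same function over every detected spot would give one short read per rolony, each tied to that spot's image coordinates.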

History

The development of FISSEQ began in the early 2000s, with fluorescent in situ sequencing on polymerase colonies reported in 2003. [1] The underlying principles are similar to those that ushered in the sequencing revolution. [3] [4] [5] In both bulk high-throughput sequencing and FISSEQ, short sequences are locally amplified and then imaged one nucleotide at a time. However, the requirement to sequence RNA in intact tissue, rather than isolated and purified DNA as in conventional bulk sequencing, posed additional challenges. These challenges were overcome in work published in 2014, [2] and FISSEQ now allows the joint, high-throughput readout of sequence and spatial information.

References

  1. Mitra RD, Shendure J, Olejnik J, Krzymanska-Olejnik E, Church GM (2003). "Fluorescent in situ sequencing on polymerase colonies". Analytical Biochemistry. 320 (1): 55–65. CiteSeerX 10.1.1.69.5582. doi:10.1016/S0003-2697(03)00291-4. PMID 12895469.
  2. Lee JH, Daugharthy ER, Scheiman J, Kalhor R, Yang JL, Ferrante TC, Terry R, Jeanty SS, Li C, Amamoto R, Peters DT, Turczyk BM, Marblestone AH, Inverso SA, Bernard A, Mali P, Rios X, Aach J, Church GM (2014). "Highly multiplexed subcellular RNA sequencing in situ". Science. 343 (6177): 1360–1363. Bibcode:2014Sci...343.1360L. doi:10.1126/science.1250212. PMC 4140943. PMID 24578530.
  3. Church, G. M.; Kieffer-Higgins, S. (1988-04-08). "Multiplex DNA sequencing". Science. 240 (4849): 185–188. Bibcode:1988Sci...240..185C. doi:10.1126/science.3353714. ISSN 0036-8075. PMID 3353714.
  4. Bentley, David R.; Balasubramanian, Shankar; Swerdlow, Harold P.; Smith, Geoffrey P.; Milton, John; Brown, Clive G.; Hall, Kevin P.; Evers, Dirk J.; Barnes, Colin L. (2008-11-06). "Accurate whole human genome sequencing using reversible terminator chemistry". Nature. 456 (7218): 53–59. Bibcode:2008Natur.456...53B. doi:10.1038/nature07517. ISSN 1476-4687. PMC 2581791. PMID 18987734.
  5. Peters, Brock A.; Kermani, Bahram G.; Sparks, Andrew B.; Alferov, Oleg; Hong, Peter; Alexeev, Andrei; Jiang, Yuan; Dahl, Fredrik; Tang, Y. Tom (2012-07-12). "Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells". Nature. 487 (7406): 190–195. Bibcode:2012Natur.487..190P. doi:10.1038/nature11236. ISSN 1476-4687. PMC 3397394. PMID 22785314.