RNA spike-in

Three-dimensional structure of an RNA molecule. RNA spike-ins are short synthetic RNA polymers.

An RNA spike-in is an RNA transcript of known sequence and quantity used to calibrate measurements in RNA hybridization assays, such as DNA microarray experiments, RT-qPCR, and RNA-Seq. [1]

A spike-in is designed to bind to a DNA molecule with a matching sequence, known as a control probe. [2] [3] [4] This process of specific binding is called hybridization. A known quantity of RNA spike-in is mixed with the experiment sample during preparation. [2] The degree of hybridization between the spike-ins and the control probes is used to normalize the hybridization measurements of the sample RNA. [2]

History

Nucleic acid hybridization assays have been used for decades to detect specific sequences of DNA or RNA, [5] with a DNA microarray precursor used as early as 1965. [6] In such assays, positive control oligonucleotides are necessary to provide a standard for comparison of target sequence concentration, and to check and correct for nonspecific binding; that is, incidental binding of the RNA to non-complementary DNA sequences. [7] These controls became known as "spike-ins". [1] With the advent of DNA microarray chips in the 1990s [8] and the commercialization of high-throughput methods for sequencing and RNA detection assays, manufacturers of hybridization assay "kits" started to provide pre-developed spike-ins. [1] In the case of gene expression assay microarrays or RNA sequencing (RNA-seq), RNA spike-ins are used.

Manufacturing

RNA spike-ins can be produced synthetically, or by using cells to transcribe DNA to RNA in vivo. [1] RNA can be produced in vitro (cell-free) using RNA polymerase and a DNA template with the desired sequence. [1] Large-scale biotech manufacturers produce RNA synthetically via high-throughput techniques and provide solutions of RNA spike-ins at predetermined concentrations. [1] Bacteria carrying DNA (usually on plasmids) that is transcribed into spike-ins are also commercially available. [1] The purified RNA can be stored long-term in a buffered solution at low temperature. [1]

Applications

Example of DNA microarray data. The bright spots show locations where hybridization has occurred, indicating that RNA of the corresponding sequence was present in the sample.

DNA microarrays

A DNA microarray is a solid surface, usually a small chip, to which short DNA polymers of known sequence are covalently bound. [6] When a sample of unknown RNA is flowed over the array, the RNA base-pairs with complementary DNA and binds to it. [6] Bound transcripts can be detected, indicating the presence of RNA with the corresponding sequence. [6] DNA microarray assays are useful in studies of gene expression, because many of the mRNA transcripts present in a cell can be detected at the same time. [6] RNA spike-ins of known quantity provide a baseline signal for comparison with the signal from transcripts of unknown quantity, so that the data can be normalized within an array and between different arrays. [2]
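
The between-array normalization described above can be illustrated with a minimal Python sketch. It assumes that the same spike-ins were added at the same quantity to every array, and it rescales each array so that its spike-in signal matches a common reference level. The function names, the use of a geometric mean, and the intensity values are illustrative assumptions, not a method taken from the cited sources.

```python
from statistics import geometric_mean


def spike_in_scale_factors(spike_signals):
    """Return one scale factor per array so that spike-in signals agree.

    spike_signals maps an array name to the intensities measured at its
    control probes (hypothetical data; same spike-ins on every array).
    """
    # Summarize each array by the geometric mean of its spike-in intensities.
    per_array = {name: geometric_mean(vals) for name, vals in spike_signals.items()}
    # Use the geometric mean across arrays as the common reference level.
    reference = geometric_mean(list(per_array.values()))
    return {name: reference / level for name, level in per_array.items()}


def normalize_arrays(sample_signals, factors):
    """Multiply every sample intensity by the scale factor of its array."""
    return {
        name: [value * factors[name] for value in vals]
        for name, vals in sample_signals.items()
    }


# Hypothetical example: array B reads twice as bright as array A overall.
spikes = {"A": [100.0, 400.0], "B": [200.0, 800.0]}
samples = {"A": [50.0], "B": [100.0]}
factors = spike_in_scale_factors(spikes)
normalized = normalize_arrays(samples, factors)
```

After scaling, the two sample intensities in this example coincide, because the twofold difference between the arrays is attributed to the assay rather than to the samples. Practical normalization schemes are usually more elaborate, for example by allowing the correction to depend on intensity.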

Sequencing

RNA sequencing (RNA-Seq) is performed by reverse transcribing RNA to complementary DNA (cDNA) and then sequencing the cDNA with high-throughput methods. [9] Such high-throughput methods can be error-prone, and known controls are necessary to detect and correct for errors. [9] RNA spike-in controls can provide a measure of the sensitivity and specificity of an RNA-Seq experiment. [9]
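
One simple way to examine sensitivity with spike-ins is to compare the known input concentration of each spike-in with the read count observed for it. The Python sketch below is an illustration only: it fits a straight line to log2(observed) against log2(known) for the detected spike-ins and reports the lowest concentration that was detected. The function names, the `min_reads` threshold, and the numbers are assumptions and do not reproduce the analysis of the cited study; specificity is not addressed here.

```python
import math


def fit_dose_response(known_conc, observed):
    """Least-squares fit of log2(observed) against log2(known).

    Spike-ins with zero observed reads are skipped because their
    logarithm is undefined. Returns (slope, intercept, r_squared).
    """
    pts = [(math.log2(c), math.log2(o)) for c, o in zip(known_conc, observed) if o > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


def detection_limit(known_conc, observed, min_reads=1):
    """Lowest known concentration observed with at least min_reads reads."""
    detected = [c for c, o in zip(known_conc, observed) if o >= min_reads]
    return min(detected) if detected else None


# Hypothetical spike-in mix: concentrations in arbitrary units, read counts.
known = [1, 2, 4, 8, 16]
reads = [0, 20, 40, 80, 160]
slope, intercept, r_squared = fit_dose_response(known, reads)
limit = detection_limit(known, reads)
```

A slope close to 1 with a high coefficient of determination suggests that read counts scale proportionally with input quantity over the tested range, while the detection limit gives a rough indication of sensitivity.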


References

  1. Yang IV (2006). "Use of External Controls in Microarray Experiments". DNA Microarrays, Part B: Databases and Statistics. Methods in Enzymology. Vol. 411. pp. 50–63. doi:10.1016/S0076-6879(06)11004-6. ISBN 9780121828165. PMID 16939785.
  2. Fardin P, Moretti S, Biasotti B, Ricciardi A, Bonassi S, Varesio L (2007). "Normalization of low-density microarray using external spike-in controls: analysis of macrophage cell lines expression profile". BMC Genomics. 8: 17. doi:10.1186/1471-2164-8-17. PMC 1797020. PMID 17229315.
  3. Wilkes T, Laux H, Foy CA (2007). "Microarray data quality - review of current developments". OMICS. 11 (1): 1–13. doi:10.1089/omi.2006.0001. PMID 17411392.
  4. Schuster EF, Blanc E, Partridge L, Thornton JM (2007). "Estimation and correction of non-specific binding in a large-scale spike-in experiment". Genome Biol. 8 (6): R126. doi:10.1186/gb-2007-8-6-r126. PMC 2394775. PMID 17594493.
  5. Southern, Edwin M. (2001). "DNA Microarrays: History and Overview". DNA Arrays. Methods in Molecular Biology. Vol. 170. Humana Press. pp. 1–15. doi:10.1385/1-59259-234-1:1. ISBN 9780896038226. PMID 11357674.
  6. Gillespie, D.; Spiegelman, S. (July 1965). "A quantitative assay for DNA-RNA hybrids with DNA immobilized on a membrane". Journal of Molecular Biology. 12 (3): 829–842. doi:10.1016/s0022-2836(65)80331-x. ISSN 0022-2836. PMID 4955314.
  7. Yang, Ivana V. (2006-01-01). "[4] Use of External Controls in Microarray Experiments". DNA Microarrays, Part B: Databases and Statistics. Methods in Enzymology. Vol. 411. Academic Press. pp. 50–63. doi:10.1016/S0076-6879(06)11004-6. ISBN 9780121828165. PMID 16939785.
  8. Schena, Mark; Shalon, Dari; Davis, Ronald W.; Brown, Patrick O. (1995-10-20). "Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray". Science. 270 (5235): 467–470. Bibcode:1995Sci...270..467S. doi:10.1126/science.270.5235.467. ISSN 0036-8075. PMID 7569999.
  9. Jiang, Lichun; Schlesinger, Felix; Davis, Carrie A.; Zhang, Yu; Li, Renhua; Salit, Marc; Gingeras, Thomas R.; Oliver, Brian (2011-09-01). "Synthetic spike-in standards for RNA-seq experiments". Genome Research. 21 (9): 1543–1551. doi:10.1101/gr.121095.111. ISSN 1088-9051. PMC 3166838. PMID 21816910.