Degradome sequencing (Degradome-Seq), [1] [2] also referred to as parallel analysis of RNA ends (PARE), is a modified version of 5'-Rapid Amplification of cDNA Ends (RACE) that uses high-throughput, deep sequencing methods such as Illumina's SBS technology. The degradome encompasses the entire set of proteases that are expressed at a specific time in a given biological material, including tissues, cells, organisms, and biofluids. [3] Sequencing the degradome thus offers a way to study RNA degradation. The method is used to identify and quantify the RNA degradation products, or fragments, present in a biological sample. [4] It allows for the systematic identification of targets of RNA decay and provides insight into the dynamics of transcriptional and post-transcriptional gene regulation. [4]
Degradome sequencing involves multiple steps, including isolating the RNA fragments in a sample, ligating adapters to them, and reverse transcribing them into complementary DNA (cDNA). [4] The cDNA is then sequenced, and the reads are compared with a reference transcriptome or genome in order to determine and characterize the abundance of the RNA fragments identified. [4]
In general, the basic steps necessary for degradome sequencing include:
When analyzing the raw data derived from degradome sequencing, software tools like CleaveLand, PAREsnip, and miRferno are beneficial resources for researchers. [6]
First, degradome sequences with exact matches to structural RNAs are removed from the data set. The remaining degradome sequences are then mapped to the sequences in a cDNA database, and the abundance of any degradome sequence with multiple transcriptome hits is normalized. Next, mRNA query sequences are generated around each matching degradome sequence, and a complementarity search is performed to match these query sequences to small RNAs. A noise analysis is then carried out to distinguish spurious results from real targets. Lastly, the output of the data analysis is a list of all mRNA targets with the associated alignments for the small RNA-mRNA pairs. [7]
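The filtering, mapping, and normalization steps described above can be illustrated with a minimal Python sketch. It is not the implementation used by CleaveLand or any other tool: the function name `map_degradome_reads`, the use of exact substring matching in place of a real aligner, and the rule of dividing each read's count by its number of hits are all assumptions made for illustration.

```python
from collections import defaultdict


def map_degradome_reads(reads, structural_rnas, transcripts):
    """Map degradome reads to transcripts (hypothetical helper).

    reads           -- list of read sequences (strings)
    structural_rnas -- list of structural RNA sequences (rRNA, tRNA, ...)
    transcripts     -- dict mapping transcript id -> cDNA sequence
    Returns {(transcript_id, position): normalized abundance}, where
    position is the 0-based index of the read's 5' end on the transcript.
    """
    # Step 1: discard reads that exactly match part of a structural RNA.
    kept = [r for r in reads if not any(r in s for s in structural_rnas)]

    signal = defaultdict(float)
    for read in kept:
        # Step 2: collect every exact occurrence of the read in the cDNA set.
        hits = []
        for tid, seq in transcripts.items():
            start = seq.find(read)
            while start != -1:
                hits.append((tid, start))
                start = seq.find(read, start + 1)
        # Step 3: normalize reads with several hits (assumed rule: 1 / hits).
        for hit in hits:
            signal[hit] += 1.0 / len(hits)
    return dict(signal)


example = map_degradome_reads(
    reads=["ACGT", "GGGG", "ACGT"],
    structural_rnas=["TTGGGGTT"],
    transcripts={"t1": "AAACGTAA", "t2": "CCACGTCC"},
)
```

In this toy input, the read matching the structural RNA is dropped, and each remaining read contributes half of its count to each of the two transcripts it matches. The complementarity search and noise analysis are left out of the sketch.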
The applications of degradome sequencing include identifying microRNA (miRNA) targets, characterizing mechanisms of mRNA decay, and finding novel non-coding RNA fragments. In particular, this tool has been used to determine miRNA targets in numerous organisms, such as plants [8] and mammals. [9] Degradome sequencing has also been used to study the role of RNA decay pathways in cancer [9] and to identify new types of non-coding RNAs. [8]
Ultimately, degradome sequencing is a powerful tool for the comprehensive analysis of RNA degradation with a variety of applications in biological research as well as medicine.
MicroRNAs are a class of small noncoding RNA generated by processing stem-loop precursors. [10] MiRNAs play a role in controlling gene expression post-transcriptionally, as well as during transcription, via RNA silencing. [11] In order to accomplish this, the RNA-induced silencing complex (RISC) processes pre-microRNAs into mature microRNAs. [12] Mature miRNAs target specific mRNA species for regulation, often via RISC disassembling specific mRNA sequences to inhibit translation. [12]
MiRNAs are highly conserved across a variety of species, so degradome sequencing is used in research to identify mRNA targets in many species. [13] Degradome sequencing has been used to identify miRNA cleavage sites, [13] because miRNAs can cause endonucleolytic cleavage of mRNAs to which they have extensive and often perfect complementarity. [1] [2] Degradome sequencing has revealed many known and novel plant miRNA and small interfering RNA (siRNA) targets. [1] [2] [14] [15] [16] [17] More recently, degradome sequencing has also been applied to identify animal (human and mouse) miRNA-derived cleavages. [18] [19] [20]
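The link between complementarity and cleavage sites can be sketched in Python as follows. This is an illustrative simplification under stated assumptions, not the method of any cited study: the helper names are hypothetical, only Watson-Crick pairs are counted (G:U wobble pairs are treated as mismatches), and the cleavage position relies on the commonly reported convention that a guided cut falls opposite positions 10 and 11 of the small RNA, counted from its 5' end. That convention is not stated in the text above.

```python
# Watson-Crick complement table for RNA (G:U wobble pairing is ignored).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def predicted_cleavage_sites(small_rna, mrna, max_mismatches=0):
    """List 0-based mRNA positions where a degradome 5' end is expected.

    Assumption: cleavage occurs between the mRNA bases paired with
    small RNA positions 10 and 11, so the downstream fragment begins at
    the base paired with position 10.
    """
    site_len = len(small_rna)
    target = reverse_complement(small_rna)
    sites = []
    for start in range(len(mrna) - site_len + 1):
        window = mrna[start:start + site_len]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            # mRNA index start + i pairs with small RNA position site_len - i.
            sites.append(start + site_len - 10)
    return sites


# Hypothetical 22-nt small RNA and an mRNA carrying one perfect target site.
small_rna = "UGAGGUAGUAGGUUGUAUAGUU"
mrna = "AAAAA" + reverse_complement(small_rna) + "CCCCC"
sites = predicted_cleavage_sites(small_rna, mrna)
```

A degradome read whose 5' end maps to one of the returned positions would be consistent with small RNA-guided cleavage at that site; reads mapping elsewhere are more likely to reflect other decay processes.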
In one study, researchers tracked and reported miRNA processing intermediates. Degradome signals on miRNA precursors were extracted and processed for 15 different species. The degradome sequencing data supported the processing of many miRNA precursors, with support found for a greater proportion of the miRNAs annotated as high-confidence in miRBase, an miRNA database, than of those considered low-confidence. Additionally, this study highlighted the importance of degradome sequencing as a technique for miRNA annotation. In particular, the distribution of processing signals in the degradome sequencing data allowed the researchers to propose a new model for how miRNAs are diced and to determine how frequently the loop-to-base mode of processing occurred. Ultimately, the results of this study show that degradome sequencing data can be used to track miRNA processing signals, providing novel insights into miRNA processing and function. [21]
In another study, researchers developed a model in which biologists could use data derived from degradome sequencing to determine the effect of transcriptional and/or post-transcriptional regulation on patterns of gene expression in plants. In particular, this model applies degradome sequencing data to establish how small RNAs (sRNAs) mature and guide targeted gene regulation. The results of this study demonstrate the potential applications of degradome sequencing analysis in future research on RNA biology in eukaryotes. In particular, degradome sequencing data can be used to track non-coding RNA (ncRNA) processing signals, which would be a valuable tool if expanded to include animal-based research. [8]
Degradome sequencing can be used to identify cleavage sites of RNAs by sequencing the 5' end of the cleaved RNA fragments. [4] This technique has been widely used in cancer research to identify potential targets of RNA-degrading enzymes involved in cancer progression. As such, degradome sequencing has provided a new method of discovering markers for earlier diagnosis and prognosis determination in cancer patients. Given the established role of extracellular proteases in promoting tumor development and growth across different tissues, degradome sequencing also holds important implications for discovering novel therapeutic targets for cancer treatments. [22]
In one study, researchers used degradome sequencing to analyze all genome-encoded proteases involved in cell growth associated with breast cancer. These genetic screens were performed in two phenotypically distinct mouse breast cancer cell lines. One of these was a stem-cell-like breast cancer cell line that altered its behavior under varied environmental conditions, such as the availability of oxygen and nutrients. Degradome sequencing, followed by a multistep selection process, revealed 100 protease genes that played a role in the growth of breast cancer cells. While the role of many of these protease genes in breast cancer growth was supported by previous research, this study also found some proteases previously unknown to be involved in cancer growth. Additionally, this study revealed that environmental factors, such as nutrient and oxygen abundance, affect the extent to which breast cancer cells rely on specific proteases identified via degradome sequencing. [9]
The results of this study were validated by using individual knockdown constructs in mice which functionally diminished the proteases of interest and affected the expression of breast cancer cells. These results indicate the high degree of reliability of degradome sequencing in identifying proteases involved in the growth of breast cancer cell lines in mouse models. Ultimately, this study concluded that degradome sequencing is a beneficial research tool for discovering and analyzing the functions of proteases in the proliferation of breast cancer. This holds many important implications for the potential degradome sequencing possesses as a diagnostic tool in early breast cancer detection and treatment. [9]