isomiRs (from iso- + miR) are miRNA sequences that vary with respect to the reference sequence. The term was coined by Morin et al. in 2008. [1] IsomiR expression profiles have been found to depend on race, population, and sex. [2]
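There are four main variation types: trimming at the 5' end, trimming at the 3' end (in both cases the cleavage site is shifted upstream or downstream of the reference position), nucleotide addition at the 3' end, and nucleotide substitution within the sequence.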
miRBase is considered to be the gold-standard miRNA database; it stores miRNA sequences detected by thousands of experiments. In this database each miRNA is associated with a miRNA precursor and with one or two mature miRNAs (-5p and -3p). It was long assumed that a given miRNA precursor always generates the same mature miRNA sequences. However, deep sequencing has allowed researchers to detect substantial variability in miRNA biogenesis: many different sequences can be generated from the same miRNA precursor, and these can potentially have different targets, [3] [4] [5] or even lead to opposite changes in mRNA expression. [4]
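One commonly cited reason that sequences from the same precursor can have different targets is that a shifted 5' end moves the seed, the stretch of nucleotides near the 5' end that is usually taken to drive target recognition. The following is a minimal sketch, assuming the seed is defined as nucleotides 2 to 8 (other definitions, such as 2 to 7, are also in use). The function names and the precursor fragment are hypothetical and synthetic, not taken from miRBase.

```python
def seed(mirna, start=2, end=8):
    """Return nucleotides start..end (1-based, inclusive) of a 5'->3' sequence.

    The 2..8 window is an assumption; adjust it to the seed definition you use.
    """
    return mirna[start - 1:end]


def seed_changed(reference, isomir):
    """True if the isomiR's seed differs from the reference seed."""
    return seed(reference) != seed(isomir)


# Synthetic precursor fragment, for illustration only (not a real miRNA).
fragment = "ACGGUCAUUGCAGCUAGGAUCCGAUACG"
reference = fragment[3:25]        # annotated mature sequence in this toy example
five_prime_iso = fragment[4:25]   # 5' end shifted one nucleotide downstream
three_prime_iso = fragment[3:24]  # 3' end shortened by one nucleotide

print(seed(reference), seed(five_prime_iso), seed_changed(reference, five_prime_iso))
print(seed(reference), seed(three_prime_iso), seed_changed(reference, three_prime_iso))
```

In this toy case the 5' variant has a different seed while the 3' variant keeps the reference seed, which is why 5' isomiRs are the ones most often expected to retarget.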
The advent of sequencing has permitted scientists to elucidate a huge landscape of new miRNAs, to increase our knowledge of the biogenesis involved, and to discover putative post-transcriptional editing processes in miRNAs that had previously been overlooked. These processes mostly generate variants of the miRNAs currently annotated in miRBase that differ at the 3' and 5' termini and, at lower frequencies, by nucleotide substitutions along the miRNA length. [6] [7] [8] [9] The variations are mainly generated by a shift in the Drosha and Dicer cleavage sites, but also by nucleotide additions at the 3' end, [10] resulting in sequences that differ from the annotated miRNA. These were named "isomiRs" by Morin et al. in 2008. IsomiRs have been well established across metazoan species [11] [12] [13] [14] [15] and were first described in depth in human stem cells and human brain samples. [8] [9] Moreover, it has been shown that isomiRs are not caused by RNA degradation during sample preparation for next-generation sequencing. [16] Some studies have tried to explain miRNA diversity through the structural features of precursors, but without clear results. [17] Adenylation or uridylation at the 3' end (3' addition isomiRs) has been related to alterations in miRNA-3'-UTR stability. [18] Furthermore, differential expression of isomiRs has been detected during development in D. melanogaster and Hippoglossus hippoglossus L., suggesting a biological function. [15] [19]
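The variation classes described above can be illustrated with a small classifier that compares a sequenced read with the annotated mature miRNA inside its precursor. This is a minimal sketch under stated assumptions, not the method of any published tool: the function name, the `max_shift`, `max_subs` and `max_tail` thresholds, and the synthetic precursor are all hypothetical. A trailing run of bases that does not match the precursor is treated as a non-templated 3' addition, so an added base that happens to match the precursor is indistinguishable from a shifted cleavage site here.

```python
def classify_isomir(precursor, ref_start, ref_end, read,
                    max_shift=3, max_subs=1, max_tail=3):
    """Label a read relative to the mature miRNA precursor[ref_start:ref_end].

    Coordinates are 0-based, end-exclusive. All thresholds are illustrative.
    """
    if read == precursor[ref_start:ref_end]:
        return ["canonical"]

    best = None
    first = max(0, ref_start - max_shift)
    for start in range(first, ref_start + max_shift + 1):
        # Read positions that disagree with the precursor (or run past its end).
        mismatches = [
            i for i, base in enumerate(read)
            if start + i >= len(precursor) or precursor[start + i] != base
        ]
        # A run of mismatches touching the 3' end is a candidate added tail.
        tail = 0
        while (tail < len(mismatches)
               and mismatches[-1 - tail] == len(read) - 1 - tail):
            tail += 1
        subs = len(mismatches) - tail  # remaining internal mismatches
        if subs > max_subs or tail > max_tail:
            continue
        # Prefer fewer edits, then the smallest 5' shift.
        key = (subs + tail, abs(start - ref_start))
        if best is None or key < best[0]:
            best = (key, start, tail, subs)

    if best is None:
        return ["unassigned"]

    _, start, tail, subs = best
    labels = []
    if start != ref_start:
        labels.append("5' trimming")
    if start + len(read) - tail != ref_end:
        labels.append("3' trimming")
    if tail:
        labels.append("3' addition")
    if subs:
        labels.append("substitution")
    return labels


# Synthetic precursor, for illustration only (not a real miRNA).
PRECURSOR = "ACGGUCAUUGCAGCUAGGAUCCGAUACGUUAGCCAUGGAC"
REF_START, REF_END = 5, 27
mature = PRECURSOR[REF_START:REF_END]

print(classify_isomir(PRECURSOR, REF_START, REF_END, PRECURSOR[6:27]))
print(classify_isomir(PRECURSOR, REF_START, REF_END, mature + "U"))
```

Real isomiR annotation pipelines work from genome or precursor alignments and handle ambiguous cases (multi-mapping reads, sequencing errors) that this sketch ignores.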