The internal transcribed spacer (ITS) is the spacer DNA situated between the small-subunit ribosomal RNA (rRNA) and large-subunit rRNA genes in the chromosome, or the corresponding transcribed region in the polycistronic rRNA precursor transcript.
In bacteria and archaea, there is a single ITS, located between the 16S and 23S rRNA genes. In contrast, eukaryotes have two ITSs: ITS1 lies between the 18S and 5.8S rRNA genes, while ITS2 lies between the 5.8S and 28S (in opisthokonts) or 25S (in plants) rRNA genes. ITS1 corresponds to the ITS in bacteria and archaea, while ITS2 originated as an insertion that interrupted the ancestral 23S rRNA gene. [1] [2]
In bacteria and archaea, the ITS occurs in one to several copies, as do the flanking 16S and 23S genes. When there are multiple copies, they are not adjacent to one another but occur at discrete locations in the circular chromosome. In bacteria, it is not uncommon for the ITS to carry tRNA genes. [3] [4]
In eukaryotes, genes encoding ribosomal RNA and spacers occur in tandem repeats that are thousands of copies long, each separated by regions of non-transcribed DNA termed intergenic spacer (IGS) or non-transcribed spacer (NTS).
Each eukaryotic ribosomal cluster contains the 5' external transcribed spacer (5' ETS), the 18S rRNA gene, the ITS1, the 5.8S rRNA gene, the ITS2, the 26S or 28S rRNA gene, and finally the 3' ETS. [5]
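The order of regions described above can be sketched as a small data structure. This is only an illustration: the region lengths are hypothetical placeholders, not measurements from any organism, and `Region`, `layout`, and `spacers` are names invented for this sketch.

```python
from dataclasses import dataclass

# Order of regions in one eukaryotic rDNA transcription unit, 5' to 3',
# as listed in the text. "LSU" stands for the 26S or 28S rRNA gene.
REGION_ORDER = ["5' ETS", "18S", "ITS1", "5.8S", "ITS2", "LSU", "3' ETS"]


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def layout(lengths: dict[str, int]) -> list[Region]:
    """Place the regions end to end in cluster order and return their coordinates."""
    regions, pos = [], 0
    for name in REGION_ORDER:
        regions.append(Region(name, pos, pos + lengths[name]))
        pos += lengths[name]
    return regions


def spacers(regions: list[Region]) -> list[Region]:
    """Return the transcribed spacers (ETS and ITS regions), i.e. the non-gene parts."""
    return [r for r in regions if "ETS" in r.name or "ITS" in r.name]


# Hypothetical lengths in base pairs, chosen for illustration only.
example_lengths = {"5' ETS": 700, "18S": 1800, "ITS1": 250, "5.8S": 160,
                   "ITS2": 230, "LSU": 3400, "3' ETS": 300}

cluster = layout(example_lengths)
for region in spacers(cluster):
    print(f"{region.name}: {region.start}-{region.end}")
```

With real annotation coordinates in place of the placeholder lengths, the same layout could be used to slice ITS1 and ITS2 out of a cluster sequence.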
During rRNA maturation, ETS and ITS pieces are excised. As non-functional by-products of this maturation, they are rapidly degraded. [6]
Sequence comparison of the eukaryotic ITS regions is widely used in taxonomy and molecular phylogeny because of several favorable properties. [7]
For example, ITS markers have proven especially useful for elucidating phylogenetic relationships among the following taxa.
Taxonomic group | Taxonomic level | Year | Authors with references |
---|---|---|---|
Asteraceae: Compositae | Species (congeneric) | 1992 | Baldwin et al. [9] |
Viscaceae: Arceuthobium | Species (congeneric) | 1994 | Nickrent et al. [10] |
Poaceae: Zea | Species (congeneric) | 1996 | Buckler & Holtsford [11] |
Leguminosae: Medicago | Species (congeneric) | 1998 | Bena et al. [5] |
Orchidaceae: Diseae | Genera (within tribes) | 1999 | Douzery et al. [12] |
Odonata: Calopteryx | Species (congeneric) | 2001 | Weekers et al. [13] |
Yeasts of clinical importance | Genera | 2001 | Chen et al. [14] |
Poaceae: Saccharinae | Genera (within tribes) | 2002 | Hodkinson et al. [15] |
Plantaginaceae: Plantago | Species (congeneric) | 2002 | Rønsted et al. [16] |
Jungermanniopsida: Herbertus | Species (congeneric) | 2004 | Feldberg et al. [17] |
Pinaceae: Tsuga | Species (congeneric) | 2008 | Havill et al. [18] |
Chrysomelidae: Altica | Species (congeneric) | 2009 | Ruhl et al. [19] |
Symbiodinium | Clade | 2009 | Stat et al. [20] |
Brassicaceae | Tribes (within a family) | 2010 | Warwick et al. [21] |
Ericaceae: Erica | Species (congeneric) | 2011 | Pirie et al. [22] |
Diptera: Bactrocera | Species (congeneric) | 2014 | Boykin et al. [23] |
Scrophulariaceae: Scrophularia | Species (congeneric) | 2014 | Scheunert & Heubl [24] |
Potamogetonaceae: Potamogeton | Species (congeneric) | 2016 | Yang et al. [25] |
ITS2 is more conserved than ITS1. All ITS2 sequences share a common core of secondary structure, [26] whereas ITS1 structures are conserved only within much smaller taxonomic units. Regardless of the scope of conservation, structure-assisted comparison can provide higher resolution and robustness. [27]
The ITS region is the most widely sequenced DNA region in molecular ecology of fungi [28] and has been recommended as the universal fungal barcode sequence. [29] It has typically been most useful for molecular systematics at the species to genus level, and even within species (e.g., to identify geographic races). Because it varies more than other genic regions of rDNA (for example, the small- and large-subunit rRNA genes), variation among individual rDNA repeats can sometimes be observed within both the ITS and IGS regions. In addition to the universal ITS1+ITS4 primers [30] [31] used by many labs, several taxon-specific primers have been described that allow selective amplification of fungal sequences (e.g., Gardes & Bruns 1993, which describes amplification of basidiomycete ITS sequences from mycorrhiza samples). [32] Although shotgun sequencing is increasingly used in microbial sequencing, the low biomass of fungi in clinical samples makes ITS region amplification an area of ongoing research. [33] [34]
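Amplification with the ITS1+ITS4 primer pair can be illustrated with a minimal in silico sketch. It makes several assumptions: the primer sequences are those commonly cited for ITS1 and ITS4 (they are not given in the text above and should be checked against the primary reference before use), matching is exact, and the template is an invented toy sequence rather than real rDNA. The function name `in_silico_amplicon` is hypothetical.

```python
# Primer sequences as commonly cited for ITS1 and ITS4, written 5'->3'.
# Assumption: verify against the primary reference before relying on them.
ITS1_FWD = "TCCGTAGGTGAACCTGCGG"
ITS4_REV = "TCCTCCGCTTATTGATATGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, fwd: str = ITS1_FWD, rev: str = ITS4_REV):
    """Return the top-strand region spanning both primer sites, or None.

    Only exact matches are considered; real primers tolerate some mismatches.
    """
    template = template.upper()
    start = template.find(fwd)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Toy template: invented flanks and an invented 15-bp "spacer" between the primer sites.
toy_template = ("GGAAG" + ITS1_FWD + "ACGTACGTTTGACCA"
                + reverse_complement(ITS4_REV) + "TTAGC")
amplicon = in_silico_amplicon(toy_template)
print(len(amplicon))
```

Exact string matching keeps the sketch short. A practical tool would also allow mismatches and degenerate bases, and would search both strands.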
In genetics, complementary DNA (cDNA) is DNA synthesized from a single-stranded RNA template in a reaction catalyzed by the enzyme reverse transcriptase. cDNA is often used to express a specific protein in a cell that does not normally express that protein, or to sequence or quantify mRNA molecules using DNA-based methods. cDNA that codes for a specific protein can be transferred to a recipient cell for expression, often in bacterial or yeast expression systems. cDNA is also generated to analyze transcriptomic profiles in bulk tissue, single cells, or single nuclei in assays such as microarrays, qPCR, and RNA-seq.
The nucleolus is the largest structure in the nucleus of eukaryotic cells. It is best known as the site of ribosome biogenesis, which is the synthesis of ribosomes. The nucleolus also participates in the formation of signal recognition particles and plays a role in the cell's response to stress. Nucleoli are made of proteins, DNA and RNA, and form around specific chromosomal regions called nucleolar organizing regions. Malfunction of nucleoli can be the cause of several human conditions called "nucleolopathies" and the nucleolus is being investigated as a target for cancer chemotherapy.
The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.
Non-coding DNA (ncDNA) sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules. Other functional regions of the non-coding DNA fraction include regulatory sequences that control gene expression; scaffold attachment regions; origins of DNA replication; centromeres; and telomeres. Some non-coding regions appear to be mostly nonfunctional such as introns, pseudogenes, intergenic DNA, and fragments of transposons and viruses.
Ribosomal DNA (rDNA) is a DNA sequence that codes for ribosomal RNA. These sequences regulate transcription initiation and amplification, and contain both transcribed and non-transcribed spacer segments.
Ribosomal ribonucleic acid (rRNA) is a type of non-coding RNA which is the primary component of ribosomes, essential to all cells. rRNA is a ribozyme which carries out protein synthesis in ribosomes. Ribosomal RNA is transcribed from ribosomal DNA (rDNA) and then bound to ribosomal proteins to form small and large ribosome subunits. rRNA is the physical and mechanical factor of the ribosome that forces transfer RNA (tRNA) and messenger RNA (mRNA) to process and translate the latter into proteins. Ribosomal RNA is the predominant form of RNA found in most cells; it makes up about 80% of cellular RNA despite never being translated into proteins itself. Ribosomes are composed of approximately 60% rRNA and 40% ribosomal proteins by mass.
RNA polymerase I is, in higher eukaryotes, the polymerase that transcribes only ribosomal RNA, a type of RNA that accounts for over 50% of the total RNA synthesized in a cell.
In evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids or proteins across species, or within a genome, or between donor and receptor taxa. Conservation indicates that a sequence has been maintained by natural selection.
Semantides are biological macromolecules that carry genetic information or a transcript thereof. Three categories of semantides are distinguished: primary, secondary and tertiary. Primary semantides are genes, which consist of DNA. Secondary semantides are chains of messenger RNA, which are transcribed from DNA. Tertiary semantides are polypeptides, which are translated from messenger RNA. In eukaryotic organisms, primary semantides may consist of nuclear, mitochondrial or plastid DNA. Not all primary semantides ultimately form tertiary semantides: some primary semantides are not transcribed into mRNA, and some secondary semantides are not translated into polypeptides. The complexity of semantides varies greatly. Among tertiary semantides, large globular polypeptide chains are the most complex, while structural proteins, which consist of repeating simple sequences, are the least complex. The term semantide and related terms were coined by Linus Pauling and Emile Zuckerkandl. Although semantides are the major type of data used in modern phylogenetics, the term itself is not commonly used.
A structural gene is a gene that codes for any RNA or protein product other than a regulatory factor. A term derived from the lac operon, structural genes are typically viewed as those containing sequences of DNA corresponding to the amino acids of a protein that will be produced, as long as said protein does not function to regulate gene expression. Structural gene products include enzymes and structural proteins. Also encoded by structural genes are non-coding RNAs, such as rRNAs and tRNAs.
A nuclear gene is a gene that has its DNA nucleotide sequence physically situated within the cell nucleus of a eukaryotic organism. This term is employed to differentiate nuclear genes, which are located in the cell nucleus, from genes that are found in mitochondria or chloroplasts. The vast majority of genes in eukaryotes are nuclear.
In molecular biology, the 5.8S ribosomal RNA is a non-coding RNA component of the large subunit of the eukaryotic ribosome and so plays an important role in protein translation. It is transcribed by RNA polymerase I as part of the 45S precursor that also contains 18S and 28S rRNA. Its function is thought to be in ribosome translocation. It is also known to form a covalent linkage with the p53 tumour suppressor protein. 5.8S rRNA can be used as a reference gene for miRNA detection, and it is studied to better understand other rRNA processes and pathways in the cell.
16S ribosomal RNA is the RNA component of the 30S subunit of a prokaryotic ribosome. It binds to the Shine-Dalgarno sequence and provides most of the SSU structure.
60S ribosomal protein L7a is a protein that in humans is encoded by the RPL7A gene.
28S ribosomal RNA is the structural ribosomal RNA (rRNA) for the large subunit (LSU) of eukaryotic cytoplasmic ribosomes, and thus one of the basic components of all eukaryotic cells. It has a size of 25S in plants and 28S in mammals, hence the alias of 25S–28S rRNA.
DNA barcoding is a method of species identification using a short section of DNA from a specific gene or genes. The premise of DNA barcoding is that by comparison with a reference library of such DNA sections, an individual sequence can be used to uniquely identify an organism to species, just as a supermarket scanner uses the familiar black stripes of the UPC barcode to identify an item in its stock against its reference database. These "barcodes" are sometimes used in an effort to identify unknown species or parts of an organism, simply to catalog as many taxa as possible, or to compare with traditional taxonomy in an effort to determine species boundaries.
Microbial phylogenetics is the study of the manner in which various groups of microorganisms are genetically related. This helps to trace their evolution. To study these relationships biologists rely on comparative genomics, as physiology and comparative anatomy are not possible methods.
Microbial DNA barcoding is the use of DNA metabarcoding to characterize a mixture of microorganisms. DNA metabarcoding is a method of DNA barcoding that uses universal genetic markers to identify DNA of a mixture of organisms.
Fungal DNA barcoding is the process of identifying species of the biological kingdom Fungi through the amplification and sequencing of specific DNA sequences and their comparison with sequences deposited in a DNA barcode database such as the ISHAM reference database, or the Barcode of Life Data System (BOLD). In this attempt, DNA barcoding relies on universal genes that are ideally present in all fungi with the same degree of sequence variation. The interspecific variation, i.e., the variation between species, in the chosen DNA barcode gene should exceed the intraspecific (within-species) variation.
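The requirement that interspecific variation exceed intraspecific variation is often called the barcode gap, and it can be sketched as simple arithmetic on pairwise distances. The sketch below assumes pre-aligned sequences of equal length and uses the uncorrected p-distance; the sequences and species labels are invented toy data, and `barcode_gap` is a name chosen for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(seqs_by_species: dict[str, list[str]]) -> float:
    """Smallest between-species distance minus largest within-species distance.

    A positive value means every interspecific distance exceeds every
    intraspecific one. Requires at least two species and at least one
    species with two or more sequences.
    """
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return min(inter) - max(intra)


# Invented toy alignment, for illustration only.
toy_alignment = {
    "species_A": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_B": ["ACCTTCGAAC", "ACCTTCGAAT"],
}
gap = barcode_gap(toy_alignment)
print(f"barcode gap: {gap:.2f}")
```

In the toy data the largest within-species distance is 0.1 and the smallest between-species distance is 0.3, so the gap is positive. Real analyses typically use model-corrected distances and many more sequences per species.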
Genome skimming is a sequencing approach that uses low-pass, shallow sequencing of a genome to generate fragments of DNA known as genome skims. These genome skims contain information about the high-copy fraction of the genome, which consists of the ribosomal DNA, plastid genome (plastome), mitochondrial genome (mitogenome), and nuclear repeats such as microsatellites and transposable elements. Genome skimming employs high-throughput, next-generation sequencing technology to generate these skims. Although the skims are merely 'the tip of the genomic iceberg', phylogenomic analysis of them can still provide insights into evolutionary history and biodiversity at a lower cost and larger scale than traditional methods. Because genome skimming requires only a small amount of DNA, its methodology can be applied in fields other than genomics, for example in determining the traceability of products in the food industry, enforcing international regulations regarding biodiversity and biological resources, and forensics.