Gut-1 RNA | |
---|---|
Identifiers | |
Symbol | Gut-1 |
Alt. Symbols | gt-1 |
Rfam | RF01706 |
Other data | |
RNA type | sRNA |
Domain(s) | Metagenomic samples |
PDB structures | PDBe |
The Gut-1 RNA motif (also called gt-1) is a conserved RNA structure identified by bioinformatics. [1] [2] These RNAs are present in environmental sequences, and as of 2010 are not known to be present in any species that has been grown under laboratory conditions. Gut-1 RNA is exclusively found in DNA from uncultivated bacteria present in samples from the human gut.
Giant, Ornate, Lake- and Lactobacillales-Derived (GOLLD) RNA is a conserved RNA structure present in bacteria. GOLLD RNAs were originally detected based on metagenome sequences of DNA isolated from Lake Gatun in Panama. However, they are known to be present in at least eight strains of cultivated bacteria. GOLLD RNAs are extraordinarily large compared to other RNAs with a conserved, complex secondary structure, and average roughly 800 nucleotides. Such large, complex RNAs are often ribozymes, although the biochemical function of GOLLD RNAs remains unknown. The discovery of large RNAs like GOLLD RNAs among bacteria that are mostly uncultivated under laboratory conditions suggests that many other unusually large RNAs might be found in bacteria that have not yet been studied.
HNH Endonuclease-Associated RNA and ORF (HEARO) RNAs conform to a conserved RNA structure that was identified in bacteria by bioinformatics. HEARO RNAs average roughly 300 nucleotides, comparable in size to many ribozymes, which are RNAs that catalyze chemical reactions.
A wide variety of non-coding RNAs have been identified in various species of organisms known to science. However, RNAs have also been identified in "metagenomics" sequences derived from samples of DNA or RNA extracted from the environment, which contain unknown species. Initial work in this area detected homologs of known bacterial RNAs in such metagenome samples. Many of these RNA sequences were distinct from sequences within cultivated bacteria, and provide the potential for additional information on the RNA classes to which they belong.
The IMES-1 RNA motif is a conserved RNA structure that was identified in marine environmental sequences by two studies based on metagenomics and bioinformatics, the first analyzing metatranscriptome (RNA) data and the second using metagenome (DNA) data. These RNAs are present in environmental sequences, and as of 2009 are not known to be present in any cultivated species. However, the species that use these RNAs are most closely related to known alphaproteobacteria and gammaproteobacteria. IMES-1 RNAs make up a significant portion of marine RNA transcripts and are exceptionally abundant: over five times as many IMES-1 RNAs as ribosomes were found in RNAs sampled from the Pacific Ocean. Only two bacterial RNAs are known to be more highly transcribed than ribosomes. IMES-1 RNAs were also detected in abundance in Block Island Sound in the Atlantic Ocean.
The IMES-2 RNA motif is a conserved RNA structure that was identified by a study based on metagenomics and bioinformatics, and the underlying RNA sequences were identified independently by a similar earlier study. These RNAs are present in environmental sequences, and when discovered were not known to be present in any cultivated species. However, an IMES-2 RNA has been detected in alphaproteobacterium HIMB114, which is classified in the SAR11 clade of marine bacteria. This finding fits with earlier predictions that species that use IMES-2 RNAs are most closely related to alphaproteobacteria. IMES-2 RNAs are exceptionally abundant, as twice as many IMES-2 RNAs were found as ribosomes in RNAs sampled from the Pacific Ocean. Only two bacterial RNAs are known to be more highly transcribed than ribosomes.
The IMES-3 RNA motif is a conserved RNA structure that was identified based on metagenomics and bioinformatics, and the underlying RNA sequences were identified independently by an earlier study. These RNAs are present in environmental sequences, and as of 2009 are not known to be present in any cultivated species. IMES-3 RNAs are abundant in comparison to ribosomes in RNAs sampled from the Pacific Ocean.
The IMES-4 RNA motif is a conserved RNA structure that was identified in marine environmental sequences by metagenomics and bioinformatics. These RNAs are present in environmental sequences, and as of 2009 are not known to be present in any cultivated species. IMES-4 RNAs are fairly abundant in comparison to ribosomes in RNAs sampled from the Pacific Ocean.
The Bacteroidales-1 RNA motif is a conserved RNA structure identified by bioinformatics. It has been identified only in bacteria within the order Bacteroidales. Its presumed length is marked by a promoter on one end that conforms to an alternate consensus sequence that is common in the phylum Bacteroidota, and its 3′ end is indicated by predicted transcription terminators. It is often located downstream of a gene that encodes the L20 ribosomal subunit, although it is unclear whether there is a functional reason underlying this apparent association.
The Chlorobi-1 RNA motif is a conserved RNA secondary structure identified by bioinformatics. It is predicted to be used only by Chlorobiota, a phylum of bacteria. The motif consists of two stem-loops that are followed by an apparent rho-independent transcription terminator. The motif is presumed to function as an independently transcribed non-coding RNA.
The Clostridiales-1 RNA motif is a conserved RNA structure identified by bioinformatics. It is a four-stem structure common in bacteria that inhabit the human gut and is also found in a variety of bacteria classified within the order Clostridiales. Its function is unknown.
The Collinsella-1 RNA motif denotes a particular conserved RNA structure discovered by bioinformatics. Of the six sequences belonging to this motif that were originally identified, five are from uncultivated bacteria residing in the human gut, while only the sixth is in a cultivated species, Collinsella aerofaciens. The evidence supporting the stem-loops designated as "P1" and "P2" is ambiguous.
The Dictyoglomi-1 RNA motif is a conserved RNA structure that was discovered via bioinformatics. Only four instances of the RNA were detected, and all are in the bacterial phylum Dictyoglomota, whose members have not been extensively studied. The RNA might have a cis-regulatory role, but the evidence is ambiguous. Because of the few instances of Dictyoglomi-1 RNAs known, it is also unknown whether the RNA structure might extend further in the 5′ or 3′ direction, or in both directions.
The manA RNA motif refers to a conserved RNA structure that was identified by bioinformatics. Instances of the manA RNA motif were detected in bacteria in the genus Photobacterium and phages that infect certain kinds of cyanobacteria. However, most predicted manA RNA sequences are derived from DNA collected from uncultivated marine bacteria. Almost all manA RNAs are positioned such that they might be in the 5' untranslated regions of protein-coding genes, and therefore it was hypothesized that manA RNAs function as cis-regulatory elements. Given the relative complexity of their secondary structure, and their hypothesized cis-regulatory role, they might be riboswitches.
The wcaG RNA motif is an RNA structure conserved in some bacteria that was detected by bioinformatics. wcaG RNAs are found in certain phages that infect cyanobacteria. Most known wcaG RNAs were found in sequences of DNA extracted from uncultivated marine bacteria. wcaG RNAs might function as cis-regulatory elements, in view of their consistent location in the possible 5' untranslated regions of genes. It was suggested that wcaG RNAs might further function as riboswitches.
The Termite-flg RNA motif is a conserved RNA structure identified by bioinformatics. Genomic sequences corresponding to Termite-flg RNAs have been identified only in uncultivated bacteria present in the termite hindgut. As of 2010, Termite-flg RNAs have not been identified in the DNA of any cultivated species, and are thus an example of RNAs present in environmental samples.
The TwoAYGGAY RNA motif is a conserved RNA structure identified by bioinformatics. Its name refers to the conserved AYGGAY nucleotide sequence found in the motif's two terminal loops. The RNAs are found in sequences derived from DNA extracted from uncultivated bacteria present in the human gut, as well as some bacteria in the classes Clostridia and Gammaproteobacteria.
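In the name AYGGAY, Y is the standard IUPAC nucleotide code for a pyrimidine (C or U in RNA, C or T in DNA), so the conserved sequence covers four concrete hexamers. As a minimal sketch of how such a degenerate pattern could be located in a sequence, assuming a simple regular-expression scan rather than the structure-aware covariance-model searches normally used for RNA motifs, the function name `find_ayggay` and the toy sequence below are hypothetical:

```python
import re

# IUPAC code Y is a pyrimidine: C or U in RNA (T is accepted so DNA also works).
# The lookahead makes the scan report overlapping occurrences as well.
AYGGAY = re.compile(r"(?=A[CUT]GGA[CUT])")


def find_ayggay(seq: str) -> list[int]:
    """Return the 0-based start position of every AYGGAY occurrence in seq."""
    return [m.start() for m in AYGGAY.finditer(seq.upper())]


# Hypothetical toy sequence for illustration, not a real TwoAYGGAY RNA.
example = "GGCACGGAUCCGAAAGGAUGGACCC"
print(find_ayggay(example))
```

A sequence match alone does not show that an RNA belongs to the motif, because the motif is also defined by the two stem-loops whose terminal loops carry the conserved sequence.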
The Whalefall-1 RNA motif refers to a conserved RNA structure that was discovered using bioinformatics. Structurally, the motif consists of two stem-loops, the second of which is often terminated by a CUUG tetraloop, which is an energetically favorable RNA sequence. Whalefall-1 RNAs are found only in DNA extracted from uncultivated bacteria found on whale fall, i.e., a whale carcass. As of 2010, Whalefall-1 RNAs have not been detected in any known, cultivated species of bacteria, and are thus one of several RNAs present in environmental samples.
Tetrahydrofolate riboswitches are a class of homologous RNAs in certain bacteria that bind tetrahydrofolate (THF). They are almost exclusively located in the probable 5' untranslated regions of protein-coding genes, and most of these genes are known to encode either folate transporters or enzymes involved in folate metabolism. For these reasons it was inferred that the RNAs function as riboswitches. THF riboswitches are found in a variety of Bacillota, specifically the orders Clostridiales and Lactobacillales, and more rarely in other lineages of bacteria. The THF riboswitch was one of many conserved RNA structures found in a project based on comparative genomics. The three-dimensional structure of the tetrahydrofolate riboswitch has been solved by separate groups using X-ray crystallography. These structures were deposited into the Protein Data Bank under accessions 3SD1 and 3SUX, with other entries containing variants.
The Rumen-Originating, Ornate, Large (ROOL) RNA motif was originally discovered through bioinformatic analysis of metagenomic sequences from cow rumen. ROOL RNAs are found in a variety of bacterial species and apparently do not code for proteins. The RNA has a complex secondary structure, and its average size of 581 nucleotides is unusually large for bacterial non-coding RNAs. This large size and structural complexity are consistent with the properties of large ribozymes.
The Ocean-VII RNA motif is a conserved RNA structure that was discovered by bioinformatics. Ocean-VII motifs are found in metagenomic sequences isolated from various marine environments, and are not yet known in any classified organism. This environmental context is similar to other marine RNAs that were found previously by predominantly bioinformatic or experimental methods.