RNA motif

An RNA motif is a description of a group of RNAs that have a related structure. RNA motifs consist of a pattern of features within the primary sequence and secondary structure of related RNAs. The concept thus extends that of a sequence motif to include RNA secondary structure. The term "RNA motif" can refer both to the pattern and to the RNA sequences that match it.

Descriptions of RNA motifs

RNA motifs can be described in two main forms: a multiple sequence alignment or an explicit search pattern. An alignment is usually augmented with a consensus secondary structure, i.e. the structure that is common to all or most RNAs. The sequences in the alignment then implicitly define a pattern of conservation that can, for example, be used to find additional examples of the RNA. This search strategy is implemented by, among others, the Infernal software package. [1]
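As a rough illustration of how an alignment plus a consensus secondary structure implies a pattern of conservation, the following Python sketch reads a dot-bracket consensus structure and reports, for each paired column, the fraction of aligned sequences that can form a canonical or G-U pair there. The alignment, the structure string, and the function names are hypothetical and chosen only for this example. This is a toy calculation, not the covariance-model approach that Infernal implements.

```python
# Toy illustration only: the sequences and structure below are made up,
# and this is not how Infernal scores candidate RNAs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket consensus structure."""
    stack, pairs = [], []
    for col, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def pairing_support(alignment, structure):
    """Map each consensus pair to the fraction of sequences able to pair there."""
    support = {}
    for i, j in paired_columns(structure):
        ok = sum((seq[i], seq[j]) in CANONICAL for seq in alignment)
        support[(i, j)] = ok / len(alignment)
    return support


# Hypothetical gap-free alignment of four sequences sharing one hairpin.
alignment = [
    "GGCAGAAAUGCC",
    "GACAGCAAUGUC",
    "GGUAGAAAUACC",
    "GGCAGAAAUACC",  # columns 2 and 9 (C, A) cannot pair in this sequence
]
structure = "((((....))))"

for pair, fraction in pairing_support(alignment, structure).items():
    print(pair, fraction)
```

Columns in which the nucleotides differ between sequences but pairing is preserved (columns 1 and 10 in the second sequence, for instance) are the kind of signal that supports a conserved helix.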

The Rfam database is a collection of multiple sequence alignments that define a large subset of reliably known RNA motifs and associated information. Its data can be used with the Infernal software to find examples of such RNAs in sequence databases, e.g. genome sequences.

Alternatively, RNA motifs can be described using explicit search patterns, which define specific primary sequence patterns combined with constraints on where helices should form. Such patterns can be used to find matching subsequences in a large sequence database. Several software packages implement such a search, e.g. RNArobo [2] and RNAmotif. [3]
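The idea of an explicit search pattern can be sketched in a few lines of Python. The example below assumes a very simple, hypothetical pattern: a helix of fixed length enclosing a loop whose sequence must match an IUPAC-style string such as GNRA. It does not use the descriptor syntax of RNArobo or RNAmotif, and the function names and test sequence are invented for illustration.

```python
import re

# Watson-Crick and G-U wobble pairs accepted in the helix.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
# Small subset of IUPAC codes, translated to regular-expression fragments.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "N": "[ACGU]", "R": "[AG]", "Y": "[CU]"}


def loop_regex(loop_pattern):
    """Compile an IUPAC-style loop pattern into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in loop_pattern))


def find_hairpins(sequence, stem_len, loop_pattern):
    """Return (start, end) spans where a stem of stem_len pairs encloses a matching loop."""
    loop_re = loop_regex(loop_pattern)
    loop_len = len(loop_pattern)
    total = 2 * stem_len + loop_len
    hits = []
    for start in range(len(sequence) - total + 1):
        window = sequence[start:start + total]
        left = window[:stem_len]
        loop = window[stem_len:stem_len + loop_len]
        right = window[stem_len + loop_len:]
        # Sequence constraint: the loop must match the pattern.
        if not loop_re.fullmatch(loop):
            continue
        # Structure constraint: the arms must pair, outermost bases first.
        if all((a, b) in PAIRS for a, b in zip(left, reversed(right))):
            hits.append((start, start + total))
    return hits


# Hypothetical target sequence containing one GNRA-capped hairpin.
print(find_hairpins("AUGGCAGAAAUGCCCA", 4, "GNRA"))
```

Real descriptor languages are considerably richer than this sketch; for example, they may allow variable helix and loop lengths and a limited number of mismatches.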

Discovery of novel RNA motifs

Main article: Bioinformatics discovery of non-coding RNAs

Many methods to discover novel RNAs use a comparative approach, in which different sequences are analyzed together in order to detect characteristic signals of a conserved RNA. When such methods are successful, the resulting novel conserved RNA can be viewed as an RNA motif, expressed using an alignment or a pattern. An early example is the RNA motif based around the T-box, which in 1993 was determined to be associated with aminoacyl-tRNA synthetase genes. [4] The mechanism by which this RNA motif regulates genes was later demonstrated, thus establishing the functional importance of the RNA motif. Later, in 1997, a conserved RNA motif called the B12-box was detected upstream of genes related to B12 metabolism. [5] This RNA motif was later found to correspond to a part of a riboswitch that binds the co-factor adenosylcobalamin, which is often called the cobalamin riboswitch. (Later variants were shown to bind other cobalamin derivatives.) Many other examples of RNA motifs whose functions were later determined are known, especially in the context of riboswitches. [6] However, other types of RNA motifs have been functionally characterized, such as bacterial sRNAs like the 6C RNA, which was discovered as a motif in 2007 [7] and functionally characterized in 2016, [8] or ribozymes like the twister ribozyme, which was detected as an RNA motif and functionally characterized in the same publication. [9]

Related Research Articles

Riboswitch

In molecular biology, a riboswitch is a regulatory segment of a messenger RNA molecule that binds a small molecule, resulting in a change in production of the proteins encoded by the mRNA. Thus, an mRNA that contains a riboswitch is directly involved in regulating its own activity, in response to the concentrations of its effector molecule. The discovery that modern organisms use RNA to bind small molecules, and discriminate against closely related analogs, expanded the known natural capabilities of RNA beyond its ability to code for proteins, catalyze reactions, or to bind other RNA or protein macromolecules.

Cobalamin riboswitch

The cobalamin riboswitch is a cis-regulatory element that is widely distributed in the 5' untranslated regions of vitamin B12 (cobalamin)-related genes in bacteria.

GlmS glucosamine-6-phosphate activated ribozyme

The glucosamine-6-phosphate riboswitch ribozyme is an RNA structure that resides in the 5' untranslated region (UTR) of the mRNA transcript of the glmS gene. This RNA regulates the glmS gene by responding to concentrations of a specific metabolite, glucosamine-6-phosphate (GlcN6P), in addition to catalyzing a self-cleaving chemical reaction upon activation. This cleavage leads to the degradation of the mRNA that contains the ribozyme, and lowers production of GlcN6P. The glmS gene encodes the enzyme glutamine-fructose-6-phosphate amidotransferase, which catalyzes the formation of GlcN6P, a compound essential for cell wall biosynthesis, from fructose-6-phosphate and glutamine. Thus, when GlcN6P levels are high, the glmS ribozyme is activated and the mRNA transcript is degraded, but in the absence of GlcN6P the mRNA continues to be translated into glutamine-fructose-6-phosphate amidotransferase and GlcN6P is produced. GlcN6P is a cofactor for this cleavage reaction, as it directly participates as an acid-base catalyst. This RNA is the first riboswitch also found to be a self-cleaving ribozyme and, like many others, was discovered using a bioinformatics approach.

YdaO/yuaA leader RNA structure in bacteria

The YdaO/YuaA leader is a conserved RNA structure found upstream of the ydaO and yuaA genes in Bacillus subtilis and related genes in other bacteria. Its secondary structure and gene associations were predicted by bioinformatics.

ykkC-yxkD leader

The ykkC/yxkD leader is a conserved RNA structure found upstream of the ykkC and yxkD genes in Bacillus subtilis and related genes in other bacteria. The function of this family was unclear for many years, although it had been suggested that it may function to switch on efflux pumps and detoxification systems in response to harmful environmental molecules. The Thermoanaerobacter tengcongensis sequence AE013027 overlaps with that of a purine riboswitch, suggesting that the two riboswitches may work in conjunction to regulate the upstream gene, which codes for TTE0584 (Q8RC62), a member of the permease family.

PreQ1-II riboswitch

PreQ1-II riboswitches form a class of riboswitches that specifically bind pre-queuosine1 (PreQ1), a precursor of the modified nucleoside queuosine. They are found in certain species of Streptococcus and Lactococcus, and were originally identified as a conserved RNA secondary structure called the "COG4708 motif". All known members of this riboswitch class appear to control members of COG4708 genes. These genes are predicted to encode membrane-bound proteins, which have been proposed to be transporters of preQ1, or a related metabolite, based on their association with preQ1-binding riboswitches. PreQ1-II riboswitches have no apparent similarities in sequence or structure to preQ1-I riboswitches, a previously discovered class of preQ1-binding riboswitches. PreQ1 thus joins S-adenosylmethionine as the second metabolite found to be the ligand of more than one riboswitch class.

Cyclic di-GMP-I riboswitch

Cyclic di-GMP-I riboswitches are a class of riboswitch that specifically bind cyclic di-GMP, which is a second messenger that is used in a variety of microbial processes including virulence, motility and biofilm formation. Cyclic di-GMP-I riboswitches were originally identified by bioinformatics as a conserved RNA-like structure called the "GEMM motif". These riboswitches are present in a wide variety of bacteria, and are most common in Clostridia and certain varieties of Proteobacteria. The riboswitches are present in pathogens such as Clostridium difficile, Vibrio cholerae and Bacillus anthracis. Geobacter uraniumreducens is predicted to have 30 instances of this riboswitch in its genome. A bacteriophage that infects C. difficile is predicted to carry a cyclic di-GMP-I riboswitch, which it might use to detect and exploit the physiological state of bacteria that it infects.

Downstream-peptide motif

The Downstream-peptide motif refers to a conserved RNA structure identified by bioinformatics in the cyanobacterial genera Synechococcus and Prochlorococcus and one phage that infects such bacteria. It was also detected in marine samples of DNA from uncultivated bacteria, which are presumably other species of cyanobacteria.

Glutamine riboswitch

The glutamine riboswitch is a conserved RNA structure that was predicted by bioinformatics. It is present in a variety of lineages of cyanobacteria, as well as some phages that infect cyanobacteria. It is also found in DNA extracted from uncultivated bacteria living in the ocean that are presumably species of cyanobacteria.

Pfl RNA motif

The pfl RNA motif refers to a conserved RNA structure present in some bacteria and originally discovered using bioinformatics. pfl RNAs are consistently present in genomic locations that likely correspond to the 5' untranslated regions of protein-coding genes. This arrangement in bacteria is commonly associated with cis-regulatory elements. Moreover, they are in presumed 5' UTRs of multiple non-homologous genes, suggesting that they function only in these locations. Additional evidence of cis-regulatory function came from the observation that predicted rho-independent transcription terminators overlap pfl RNAs. This overlap suggests that the alternate secondary structures of pfl RNA and the transcription terminator stem-loops compete with each other, and this is a common mechanism for cis gene control in bacteria.

YjdF RNA motif

The yjdF RNA motif is a conserved RNA structure identified using bioinformatics. Most yjdF RNAs are located in bacteria classified within the phylum Firmicutes. A yjdF RNA is found in the presumed 5' untranslated region of the yjdF gene in Bacillus subtilis, and almost all yjdF RNAs are found in the 5' UTRs of homologs of this gene. The function of the yjdF gene is unknown, but the protein that it is predicted to encode is classified by the Pfam Database as DUF2992.

Cyclic di-GMP-II riboswitch

Cyclic di-GMP-II riboswitches form a class of riboswitches that specifically bind cyclic di-GMP, a second messenger used in multiple bacterial processes such as virulence, motility and biofilm formation. Cyclic di-GMP-II riboswitches are structurally unrelated to cyclic di-GMP-I riboswitches, though they have the same function.

SAM-V riboswitch

The SAM-V riboswitch is the fifth known riboswitch to bind S-adenosyl methionine (SAM). It was first discovered in the marine bacterium Candidatus Pelagibacter ubique and can also be found in marine metagenomes. SAM-V features a consensus sequence and secondary structure similar to those of the binding site of the SAM-II riboswitch, but bioinformatics scans cluster the two aptamers independently. These similar binding pockets suggest that the two riboswitches have undergone convergent evolution.

RAGATH RNA motifs

RNAs Associated with Genes Associated with Twister and Hammerhead ribozymes (RAGATH) refers to a bioinformatics strategy that was devised to find self-cleaving ribozymes in bacteria. It also refers to candidate RNAs, or RAGATH RNA motifs, discovered using this strategy.

Bioinformatics discovery of non-coding RNAs

Non-coding RNAs have been discovered using both experimental and bioinformatic approaches. Bioinformatic approaches can be divided into three main categories. The first involves homology search, although these techniques are by definition unable to find new classes of ncRNAs. The second category includes algorithms designed to discover specific types of ncRNAs that have similar properties. Finally, some discovery methods are based on very general properties of RNA, and are thus able to discover entirely new kinds of ncRNAs.

folE RNA motif

The folE RNA motif, now known as the THF-II riboswitch, is a conserved RNA structure that was discovered by bioinformatics. folE motifs are found in Alphaproteobacteria.

FTHFS RNA motif

The FTHFS RNA motif is a conserved RNA structure that was discovered by bioinformatics. FTHFS motifs are found in metagenomic sequences derived from samples of the human gut.

queA RNA motif

The queA RNA motif is a conserved RNA structure that was discovered by bioinformatics. queA motif RNAs have not yet been found in any classified organism; they are known from metagenomic sequences.

terC RNA motif

The terC RNA motif is a conserved RNA structure that was discovered by bioinformatics. terC motif RNAs are found in Proteobacteria, within the sub-lineages Alphaproteobacteria and Pseudomonadales.

uup RNA motif

The uup RNA motif is a conserved RNA structure that was discovered by bioinformatics. uup motif RNAs are found in Firmicutes and Gammaproteobacteria.

References

  1. Nawrocki EP, Eddy SR (November 2013). "Infernal 1.1: 100-fold faster RNA homology searches". Bioinformatics. 29 (22): 2933–5. doi:10.1093/bioinformatics/btt509. PMC 3810854. PMID 24008419.
  2. Rampášek L, Jimenez RM, Lupták A, Vinař T, Brejová B (May 2016). "RNA motif search with data-driven element ordering". BMC Bioinformatics. 17 (1): 216. doi:10.1186/s12859-016-1074-x. PMC 4870747. PMID 27188396.
  3. Macke TJ, Ecker DJ, Gutell RR, Gautheret D, Case DA, Sampath R (November 2001). "RNAMotif, an RNA secondary structure definition and search algorithm". Nucleic Acids Res. 29 (22): 4724–35. doi:10.1093/nar/29.22.4724. PMC 92549. PMID 11713323.
  4. Grundy FJ, Henkin TM (August 1993). "tRNA as a positive regulator of transcription antitermination in B. subtilis". Cell. 74 (3): 475–82. doi:10.1016/0092-8674(93)80049-k. PMID 8348614.
  5. Franklund CV, Kadner RJ (June 1997). "Multiple transcribed elements control expression of the Escherichia coli btuB gene". J Bacteriol. 179 (12): 4039–42. doi:10.1128/jb.179.12.4039-4042.1997. PMC 179215. PMID 9190822.
  6. Sherlock ME, Breaker RR (June 2020). "Former orphan riboswitches reveal unexplored areas of bacterial metabolism, signaling, and gene control processes". RNA. 26 (6): 675–693. doi:10.1261/rna.074997.120. PMC 7266159. PMID 32165489.
  7. Weinberg Z, Barrick JE, Yao Z, Roth A, Kim JN, Gore J, Wang JX, Lee ER, Block KF, Sudarsan N, Neph S, Tompa M, Ruzzo WL, Breaker RR (2007). "Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline". Nucleic Acids Res. 35 (14): 4809–19. doi:10.1093/nar/gkm487. PMC 1950547. PMID 17621584.
  8. Pahlke J, Dostálová H, Holátko J, Degner U, Bott M, Pátek M, Polen T (September 2016). "The small 6C RNA of Corynebacterium glutamicum is involved in the SOS response". RNA Biol. 13 (9): 848–60. doi:10.1080/15476286.2016.1205776. PMC 5014011. PMID 27362471.
  9. Roth A, Weinberg Z, Chen AG, Kim PB, Ames TD, Breaker RR (January 2014). "A widespread self-cleaving ribozyme class is revealed by bioinformatics". Nat Chem Biol. 10 (1): 56–60. doi:10.1038/nchembio.1386. PMC 3867598. PMID 24240507.