This list of RNA structure prediction software is a compilation of software tools and web portals used for RNA structure prediction.
Name | Description | Knots [Note 1] | Links | References |
---|---|---|---|---|
SQUARNA | Secondary structure prediction based on a greedy stem formation model | Yes | sourcecode | [1] |
CentroidFold | Secondary structure prediction based on generalized centroid estimator | No | sourcecode webserver | [2] |
CentroidHomfold | Secondary structure prediction by using homologous sequence information | No | sourcecode webserver | [3] |
Context Fold | An RNA secondary structure prediction software based on feature-rich trained scoring models. | No | sourcecode webserver | [4] |
CONTRAfold | Secondary structure prediction method based on conditional log-linear models (CLLMs), a flexible class of probabilistic models which generalize upon SCFGs by using discriminative training and feature-rich scoring. | No | sourcecode webserver | [5] |
Crumple | Simple, cleanly written software to produce the full set of possible secondary structures for one sequence, given optional constraints. | No | sourcecode | [6] |
CyloFold | Secondary structure prediction method based on placement of helices allowing complex pseudoknots. | Yes | webserver | [7] |
E2Efold | A deep learning based method for efficiently predicting secondary structure by differentiating through a constrained optimization solver, without using dynamic programming. | Yes | sourcecode | [8] [9] |
EternaFold | A multitask-learning-based model trained on data from the Eterna project. | No | sourcecode webserver | [10] |
GTFold | Fast and scalable multicore code for predicting RNA secondary structure. | No | link sourcecode | [11] |
INTERPIN | Algorithm and database for prediction of transcription termination sites in bacteria. Uses Mfold for RNA secondary structure prediction. | No | webserver | [12] [13] |
IPknot | Fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming. | Yes | sourcecode webserver | [14] |
KineFold | Simulates the folding kinetics of RNA sequences, including pseudoknots, using an implementation of the partition function for knots. | Yes | linuxbinary, webserver | [15] [16] |
Mfold | MFE (Minimum Free Energy) RNA structure prediction algorithm. | No | sourcecode, webserver | [17] |
pKiss | A dynamic programming algorithm for the prediction of a restricted class (H-type and kissing hairpins) of RNA pseudoknots. | Yes | sourcecode, webserver | [18] |
Pknots | A dynamic programming algorithm for optimal RNA pseudoknot prediction using the nearest neighbour energy model. | Yes | sourcecode | [19] |
PknotsRG | A dynamic programming algorithm for the prediction of a restricted class (H-type) of RNA pseudoknots. | Yes | sourcecode, webserver | [20] |
RNA123 | Secondary structure prediction via thermodynamic-based folding algorithms and novel structure-based sequence alignment specific for RNA. | Yes | webserver | |
RNAfold | MFE RNA structure prediction algorithm. Includes an implementation of the partition function for computing basepair probabilities and circular RNA folding. | No | sourcecode, webserver | |
RNAshapes | MFE RNA structure prediction based on abstract shapes. Shape abstraction retains adjacency and nesting of structural features, but disregards helix lengths, thus reduces the number of suboptimal solutions without losing significant information. Furthermore, shapes represent classes of structures for which probabilities based on Boltzmann-weighted energies can be computed. | No | source & binaries, webserver | [25] [26] |
RNAstructure | A program to predict lowest free energy structures and base pair probabilities for RNA or DNA sequences. Programs are also available to predict maximum expected accuracy structures and these can include pseudoknots. Structure prediction can be constrained using experimental data, including SHAPE, enzymatic cleavage, and chemical modification accessibility. Graphical user interfaces are available for Windows, Mac OS X, Linux. Programs are also available for use with Unix-style text interfaces. Also, a C++ class library is available. | Yes | source & binaries, webserver | |
SARNA-Predict | RNA Secondary structure prediction method based on simulated annealing. It can also predict structure with pseudoknots. | Yes | link | [29] |
seqfold | Predict the minimum free energy structure of nucleic acids. seqfold is an implementation of the Zuker, 1981 dynamic programming algorithm, the basis for UNAFold/mfold, with energy functions from SantaLucia, 2004 (DNA) and Turner, 2009 (RNA). MIT license. Python CLI or module. | No | link & source | [30] |
Sfold | Statistical sampling of all possible structures. The sampling is weighted by partition function probabilities. | No | Github_Repository | [31] [32] [33] [34] |
Sliding Windows & Assembly | Sliding windows and assembly is a tool chain for folding long series of similar hairpins. | No | sourcecode | [6] |
SPOT-RNA | SPOT-RNA is the first RNA secondary structure predictor that can predict all kinds of base pairs (canonical, noncanonical, pseudoknots, and base triplets). | Yes | sourcecode | [35] |
SwiSpot | Command-line utility for predicting alternative (secondary) configurations of riboswitches. It is based on the prediction of the so-called switching sequence, to subsequently constrain the folding of the two functional structures. | No | sourcecode | [36] |
UFold | Fast and accurate RNA secondary structure prediction with deep learning. | Yes | sourcecode, webserver | [37] |
UNAFold | An integrated collection of programs that simulate folding, hybridization, and melting pathways for one or two single-stranded nucleic acid sequences. | No | sourcecode | [38] |
vsfold/vs subopt | Folds and predicts RNA secondary structure and pseudoknots using an entropy model derived from polymer physics. The program vs_subopt computes suboptimal structures based on the free energy landscape derived from vsfold5. | Yes | webserver | [39] [40] |
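Several of the single-sequence tools above (for example Mfold, RNAfold and seqfold) rely on dynamic programming over subsequences. The sketch below illustrates that idea with Nussinov-style base-pair maximization, which is a deliberately simplified stand-in: it counts nested pairs rather than minimizing a nearest-neighbour free energy, and it is not the algorithm of any listed tool. The function name `max_base_pairs` and the `min_loop` parameter are illustrative assumptions.

```python
# Simplified illustration only: maximizes the number of nested base pairs
# (Nussinov-style), not the free energy that MFE tools actually minimize.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Return (pair_count, dot_bracket) for a pseudoknot-free structure.

    min_loop is the assumed minimum number of unpaired bases in a hairpin.
    """
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs within seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAACCC"))
```

Because the recursion only joins or nests intervals, it cannot produce pseudoknots, which is why tools marked "Yes" in the Knots column need different formulations (for example integer programming or restricted pseudoknot classes).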
Name | Description | Knots [Note 1] | Links | References |
---|---|---|---|---|
trRosettaRNA | trRosettaRNA is an algorithm for automated prediction of RNA 3D structure. It builds the RNA structure by Rosetta energy minimization, with deep learning restraints from a transformer network (RNAformer). trRosettaRNA has been validated in blind tests, including CASP15 and RNA-Puzzles, which suggest that its automated predictions are competitive with those of the top human groups on natural RNAs. | Yes | webserver sourcecode | [41] |
BARNACLE | A Python library for the probabilistic sampling of RNA structures that are compatible with a given nucleotide sequence and that are RNA-like on a local length scale. | Yes | sourcecode | [42] |
FARFAR2 | Automated de novo prediction of native-like RNA tertiary structures. | Yes | webserver | [43] |
iFoldRNA | Three-dimensional RNA structure prediction and folding. | Yes | webserver | [44] |
MC-Fold MC-Sym Pipeline | RNA structure prediction algorithm based on thermodynamics and nucleotide cyclic motifs. Predicts 2D and 3D structures. | Yes | sourcecode, webserver | [45] |
NAST | Coarse-grained modeling of large RNA molecules with knowledge-based potentials and structural filters | Unknown | executables | [46] |
MMB | Turning limited experimental information into 3D models of RNA | Unknown | sourcecode | [47] |
RNA123 | Integrated platform for de novo and homology modeling of RNA 3D structures, where coordinate file input, sequence editing, sequence alignment, structure prediction and analysis features are all accessed from one intuitive graphical user interface. | Yes | | |
RNAComposer | Fully automated prediction of large RNA 3D structures. | Yes | webserver webserver | [48] |
The single-sequence methods above face the difficult task of picking out a small sample of reasonable secondary structures from a very large space of possible structures. One effective way to shrink that space is to use evolutionary information: structures that have been conserved by evolution are far more likely to be the functional form. The methods below use this approach.
Name | Description | Number of sequences [Note 1] | Alignment [Note 2] | Structure [Note 3] | Knots [Note 4] | Link | References |
---|---|---|---|---|---|---|---|
SQUARNA | Common secondary structure prediction based on a greedy stem formation model | any | No | Yes | Yes | sourcecode | [1] |
Carnac | Comparative analysis combined with MFE folding. | any | No | Yes | No | sourcecode, webserver | [49] [50] |
CentroidAlifold | Common secondary structure prediction based on generalized centroid estimator | any | No | Yes | No | sourcecode | [51] |
CentroidAlign | Fast and accurate multiple aligner for RNA sequences | any | Yes | No | No | sourcecode | [52] |
CMfinder | An expectation maximization algorithm using covariance models for motif description. Uses heuristics for effective motif search, and a Bayesian framework for structure prediction combining folding energy and sequence covariation. | | Yes | Yes | No | sourcecode, webserver, website | [53] |
CONSAN | implements a pinned Sankoff algorithm for simultaneous pairwise RNA alignment and consensus structure prediction. | 2 | Yes | Yes | No | sourcecode | [54] |
DAFS | Simultaneous aligning and folding of RNA sequences via dual decomposition. | any | Yes | Yes | Yes | sourcecode | [55] |
Dynalign | an algorithm that improves the accuracy of structure prediction by combining free energy minimization and comparative sequence analysis to find a low free energy structure common to two sequences without requiring any sequence identity. | 2 | Yes | Yes | No | sourcecode | [56] [57] [58] |
Foldalign | An algorithm capable of making both local and global pairwise structural alignments of RNAs. Based on a combination of energy minimization of the conserved structure and sequence similarity using ribosum-like scoring matrices. For local alignments more than one alignment can be returned. | 2 | Yes | Yes | No | sourcecode, webserver , website | [59] |
FoldalignM | A multiple RNA structural alignment method, to a large extent based on the PMcomp program. | any | Yes | Yes | No | sourcecode | [60] |
FRUUT | A pairwise RNA structural alignment tool based on the comparison of RNA trees. Considers alignments in which the compared trees can be rooted differently (with respect to the standard "external loop" corresponding roots), and/or permuted with respect to branching order. | any | Yes | input | No | sourcecode, webserver | [61] [62] |
GraphClust | Fast RNA structural clustering method of local RNA secondary structures. Predicted clusters are refined using LocARNA and CMsearch. Due to the linear time complexity for clustering it is possible to analyse large RNA datasets. | any | Yes | Yes | No | sourcecode | [63] |
KNetFold | Computes a consensus RNA secondary structure from an RNA sequence alignment based on machine learning. | any | input | Yes | Yes | linuxbinary, webserver | [64] |
LARA | Produce a global fold and alignment of ncRNA families using integer linear programming and Lagrangian relaxation. | any | Yes | Yes | No | sourcecode | [65] |
LocaRNA | LocaRNA is the successor of PMcomp with an improved time complexity. It is a variant of Sankoff's algorithm for simultaneous folding and alignment, which takes as input pre-computed base pair probability matrices from McCaskill's algorithm as produced by RNAfold -p. Thus the method can also be viewed as a way to compare base pair probability matrices. | any | Yes | Yes | No | sourcecode, webserver | [66] |
MASTR | A sampling approach using Markov chain Monte Carlo in a simulated annealing framework, where both structure and alignment are optimized by making small local changes. The score combines the log-likelihood of the alignment, a covariation term and the basepair probabilities. | any | Yes | Yes | No | sourcecode | [67] [68] |
Multilign | This method uses multiple Dynalign calculations to find a low free energy structure common to any number of sequences. It does not require any sequence identity. | any | Yes | Yes | No | sourcecode | [69] |
Murlet | a multiple alignment tool for RNA sequences using iterative alignment based on Sankoff's algorithm with sharply reduced computational time and memory. | any | Yes | Yes | No | webserver | [70] |
MXSCARNA | a multiple alignment tool for RNA sequences using progressive alignment based on pairwise structural alignment algorithm of SCARNA. | any | Yes | Yes | No | webserver sourcecode | [71] |
pAliKiss | pAliKiss predicts RNA secondary structures for fixed RNA multiple sequence alignments, with special attention for pseudoknotted structures. This program is an offspring of the hybridization of RNAalishapes and pKiss. | any | input | Yes | Yes | webserver sourcecode | [18] |
PARTS | A method for joint prediction of alignment and common secondary structures of two RNA sequences using a probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities. | 2 | Yes | Yes | No | sourcecode | [72] |
Pfold | Folds alignments using a SCFG trained on rRNA alignments. | | input | Yes | No | webserver | [73] [74] |
PETfold | Formally integrates both the energy-based and evolution-based approaches in one model to predict the folding of multiple aligned RNA sequences by a maximum expected accuracy scoring. The structural probabilities are calculated by RNAfold and Pfold. | any | input | Yes | No | sourcecode | [75] |
PhyloQFold | Method that takes advantage of the evolutionary history of a group of aligned RNA sequences for sampling consensus secondary structures, including pseudoknots, according to their approximate posterior probability. | any | input | Yes | Yes | sourcecode | [76] |
PMcomp/PMmulti | PMcomp is a variant of Sankoff's algorithm for simultaneous folding and alignment, which takes as input pre-computed base pair probability matrices from McCaskill's algorithm as produced by RNAfold -p. Thus the method can also be viewed as a way to compare base pair probability matrices. PMmulti is a wrapper program that does progressive multiple alignments by repeatedly calling pmcomp. | | Yes | Yes | No | sourcecode, webserver | [77] |
RNAG | A Gibbs sampling method to determine a conserved structure and the structural alignment. | any | Yes | Yes | No | sourcecode | [78] |
R-COFFEE | uses RNAlpfold to compute the secondary structure of the provided sequences. A modified version of T-Coffee is then used to compute the multiple sequence alignment having the best agreement with the sequences and the structures. R-Coffee can be combined with any existing sequence alignment method. | any | Yes | Yes | No | sourcecode, webserver | [79] [80] |
TurboFold | This algorithm predicts conserved structures in any number of sequences. It uses probabilistic alignment and partition functions to map conserved pairs between sequences, and then iterates the partition functions to improve structure prediction accuracy | any | No | Yes | Yes | sourcecode | [81] [82] |
R-scape | Verify conserved secondary structure by measuring covarying basepairs and their statistical significance compared to pure phylogeny. Will propose a most conserved ("optimized") one if no secondary structure is given. | any | input | Yes | Yes | home page | [83] |
RNA123 | Included structure based sequence alignment (SBSA) algorithm uses a novel suboptimal version of the Needleman-Wunsch global sequence alignment method that fully accounts for secondary structure in the template and query. It also uses two separate substitution matrices optimized for RNA helices and single stranded regions. The SBSA algorithm provides >90% accurate sequence alignments even for structures as large as bacterial 23S rRNA: ~2,800 nts. | any | Yes | Yes | Yes | webserver | |
RNAalifold | Folds precomputed alignments using a mix of free-energy and covariation measures. Ships with the ViennaRNA Package. | any | input | Yes | No | homepage | [21] [84] |
RNAalishapes | Tool for secondary structure prediction for precomputed alignments using a mix of free-energy and covariation measures. Output can be sifted by the abstract shapes concept to focus on major differences in suboptimal results. | any | input | Yes | No | sourcecode, webserver | [85] |
RNAcast | enumerates the near-optimal abstract shape space, and predicts as the consensus an abstract shape common to all sequences, and for each sequence, the thermodynamically best structure which has this abstract shape. | any | No | Yes | No | sourcecode, webserver | [86] |
RNAforester | Compare and align RNA secondary structures via a "forest alignment" approach. | any | Yes | input | No | sourcecode, webserver | [87] [88] |
RNAmine | Frequent stem pattern miner from unaligned RNA sequences is a software tool to extract the structural motifs from a set of RNA sequences. | any | No | Yes | No | webserver | [89] |
RNASampler | A probabilistic sampling approach that combines intrasequence base pairing probabilities with intersequence base alignment probabilities. This is used to sample possible stems for each sequence and compare these stems between all pairs of sequences to predict a consensus structure for two sequences. The method is extended to predict the common structure conserved among multiple sequences by using a consistency-based score that incorporates information from all the pairwise structural alignments. | any | Yes | Yes | Yes | sourcecode | [90] |
SCARNA | Stem Candidate Aligner for RNA (Scarna) is a fast, convenient tool for structural alignment of a pair of RNA sequences. It aligns two RNA sequences and calculates the similarities of them, based on the estimated common secondary structures. It works even for pseudoknotted secondary structures. | 2 | Yes | Yes | No | webserver | [91] |
SimulFold | simultaneously inferring RNA structures including pseudoknots, alignments, and trees using a Bayesian MCMC framework. | any | Yes | Yes | Yes | sourcecode | [92] |
Stemloc | a program for pairwise RNA structural alignment based on probabilistic models of RNA structure known as Pair stochastic context-free grammars. | any | Yes | Yes | No | sourcecode | [93] |
StrAl | An alignment tool designed to provide multiple alignments of non-coding RNAs following a fast progressive strategy. It combines the thermodynamic base pairing information derived from RNAfold calculations in the form of base pairing probability vectors with the information of the primary sequence. | | Yes | No | No | sourcecode, webserver | [94] |
TFold | A tool for predicting non-coding RNA secondary structures including pseudoknots. It takes an alignment of RNA sequences as input and returns the predicted secondary structure(s). It combines criteria of stability, conservation and covariation in order to search for stems and pseudoknots. Users can change parameter values, fix known stems (if any) so that the system takes them into account, choose to get several possible structures or only one, search for pseudoknots or not, etc. | any | Yes | Yes | Yes | webserver | [95] |
WAR | A webserver that makes it possible to simultaneously use a number of state of the art methods for performing multiple alignment and secondary structure prediction for noncoding RNA sequences. | | Yes | Yes | No | webserver | [96] |
Xrate | a program for analysis of multiple sequence alignments using phylogenetic grammars, that may be viewed as a flexible generalization of the "Pfold" program. | any | Yes | Yes | No | sourcecode | [97] |
Alifreefold/AlifreefoldMulti | an alignment-free approach to predict secondary structure from homologous RNA sequences. It computes a representative structure from a set of homologous RNA sequences using sub-optimal secondary structures generated for each sequence. It is based on a vector representation of sub-optimal structures capturing structure conservation signals by weighting structural motifs according to their conservation across the sub-optimal structures. | >5 | No | Yes | No | sourcecode sourcecode | [98] [99] |
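Many of the comparative methods above exploit covariation: when two alignment columns change in a coordinated way that preserves base pairing, this is evidence for a conserved pair. The sketch below measures that signal with plain mutual information between two columns. This is only one simple statistic, chosen here for illustration; tools such as RNAalifold or R-scape use their own, more elaborate scores. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. Gap characters
    are treated like any other symbol (a simplifying assumption).
    """
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 4 covary while always remaining paired.
    toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
    print(column_mutual_information(toy, 0, 4))  # covarying columns
    print(column_mutual_information(toy, 0, 1))  # column 1 is invariant
```

A fully conserved column carries no covariation signal under this measure, which is one reason comparative tools typically combine covariation with thermodynamic or phylogenetic terms.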
Name (Year) | Description | Link | References |
---|---|---|---|
RNAsnap2 (2020) | RNAsnap2 uses a dilated convolutional neural network with evolutionary features generated from BLAST + INFERNAL (same as RNAsol) and predicted base-pairing probabilities from LinearPartition as an input for the prediction of RNA solvent accessibility. Also, the single-sequence version of RNAsnap2 can predict the solvent accessibility of a given input RNA sequence without using evolutionary information. | sourcecode | [100] |
RNAsol (2019) | RNAsol predictor uses a unidirectional LSTM deep learning algorithm with evolutionary information generated from BLASTN + INFERNAL and predicted secondary structure from RNAfold as an input for the prediction of RNA solvent accessibility. | sourcecode | [101] |
RNAsnap (2017) | RNAsnap predictor uses an SVM machine learning algorithm and evolutionary information generated from BLASTN as an input for the prediction of RNA solvent accessibility. | sourcecode | [102] |
Many ncRNAs function by binding to other RNAs. For example, miRNAs regulate protein-coding gene expression by binding to 3' UTRs; small nucleolar RNAs guide post-transcriptional modifications by binding to rRNA; U4 spliceosomal RNA and U6 spliceosomal RNA bind to each other, forming part of the spliceosome; and many small bacterial RNAs, e.g. GcvB, OxyS and RyhB, regulate gene expression by antisense interactions.
Name | Description | Intra-molecular structure | Comparative | Link | References |
---|---|---|---|---|---|
SQUARNA | SQUARNA predicts RNA secondary structure formed by several RNA sequences using a greedy stem formation model | Yes | Yes | sourcecode | [1] |
RNApredator | RNApredator uses a dynamic programming approach to compute RNA-RNA interaction sites. | Yes | No | webserver | [103] |
GUUGle | A utility for fast determination of RNA-RNA matches with perfect hybridization via A-U, C-G, and G-U base pairing. | No | No | webserver | [104] |
IntaRNA | Efficient target prediction incorporating the accessibility of target sites. | Yes | No | sourcecode webserver | [105] [106] [107] [108] [109] |
CopraRNA | Tool for sRNA target prediction. It computes whole-genome predictions by combining distinct whole-genome IntaRNA predictions. | Yes | Yes | sourcecode webserver | [110] [106] |
MINT | Automatic tool to analyze three-dimensional structures of RNA and DNA molecules, their full-atom molecular dynamics trajectories or other conformation sets (e.g. X-ray or NMR-derived structures). For each RNA or DNA conformation MINT determines the hydrogen bonding network resolving the base pairing patterns, identifies secondary structure motifs (helices, junctions, loops, etc.) and pseudoknots. Also estimates the energy of stacking and phosphate anion-base interactions. | Yes | No | sourcecode webserver | [111] |
NUPACK | Computes the full unpseudoknotted partition function of interacting strands in dilute solution. Calculates the concentrations, mfes, and base-pairing probabilities of the ordered complexes below a certain complexity. Also computes the partition function and basepairing of single strands including a class of pseudoknotted structures. Also enables design of ordered complexes. | Yes | No | NUPACK | [112] |
OligoWalk/RNAstructure | Predicts bimolecular secondary structures with and without intramolecular structure. Also predicts the hybridization affinity of a short nucleic acid to an RNA target. | Yes | No | | [113] |
piRNA | Calculates the partition function and thermodynamics of RNA-RNA interactions. It considers all possible joint secondary structure of two interacting nucleic acids that do not contain pseudoknots, interaction pseudoknots, or zigzags. | Yes | No | linuxbinary | [114] |
piRNAPred | An integrated framework for piRNA prediction employing hybrid features like k-mer nucleotide composition, secondary structure, thermodynamic and physicochemical properties. | Yes | No | | [115] |
RNAripalign | Calculates the partition function and thermodynamics of RNA-RNA interactions based on structural alignments. Also supports RNA-RNA interaction prediction for single sequences. It outputs suboptimal structures based on Boltzmann distribution. It considers all possible joint secondary structure of two interacting nucleic acids that do not contain pseudoknots, interaction pseudoknots, or zigzags. | Yes | No | | [116] |
RactIP | Fast and accurate prediction of RNA-RNA interaction using integer programming. | Yes | No | sourcecode webserver | [117] |
RNAaliduplex | Based on RNAduplex with bonuses for covarying sites | No | Yes | sourcecode | [21] |
RNAcofold | Works much like RNAfold, but allows specifying two RNA sequences which are then allowed to form a dimer structure. | Yes | No | sourcecode | [21] [118] |
RNAduplex | Computes optimal and suboptimal secondary structures for hybridization. The calculation is simplified by allowing only inter-molecular base pairs. | No | No | sourcecode | [21] |
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | No | No | sourcecode, webserver | [119] [120] |
RNAup | Calculates the thermodynamics of RNA-RNA interactions. RNA-RNA binding is decomposed into two stages. (1) First the probability that a sequence interval (e.g. a binding site) remains unpaired is computed. (2) Then the binding energy given that the binding site is unpaired is calculated as the optimum over all possible types of bindings. | Yes | No | sourcecode | [21] [121] |
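The simplest interaction model in the table above is perfect hybridization through A-U, C-G and G-U pairs, as described for GUUGle. The sketch below finds such antiparallel stretches by brute force. It is a naive quadratic scan written for illustration and makes no claim to reproduce GUUGle's actual algorithm or output format; `perfect_duplexes` and `min_len` are hypothetical names.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def perfect_duplexes(target, query, min_len=4):
    """Find maximal perfectly paired antiparallel stretches.

    Returns a list of (target_start, query_start, length) tuples, with both
    start positions given as 0-based 5'->3' indices.
    """
    # Reverse the query so antiparallel pairing runs along matrix diagonals.
    rev_query = query[::-1]
    n, m = len(target), len(rev_query)
    hits = []
    for i in range(n):
        for j in range(m):
            if (target[i], rev_query[j]) not in PAIRS:
                continue
            # Skip cells that merely continue a run started further up-left.
            if i > 0 and j > 0 and (target[i - 1], rev_query[j - 1]) in PAIRS:
                continue
            length = 0
            while (i + length < n and j + length < m
                   and (target[i + length], rev_query[j + length]) in PAIRS):
                length += 1
            if length >= min_len:
                # Map the reversed-query index back to the original query.
                query_start = m - (j + length)
                hits.append((i, query_start, length))
    return hits


if __name__ == "__main__":
    print(perfect_duplexes("AAGGCUAA", "AGCC"))
```

Unlike the accessibility-aware tools in the table (for example IntaRNA or RNAup), this kind of scan ignores intramolecular structure and energies entirely, so it is best read as a fast pre-filter idea rather than a predictor.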
The table below includes interactions that are not limited to UTRs.
Name | Description | Cross-species | Intra-molecular structure | Comparative | Link | References |
---|---|---|---|---|---|---|
comTAR | A web tool for the prediction of miRNA targets that is mainly based on the conservation of the potential regulation in plant species. | Yes | No | No | Web tool | [122] |
RNA22 | The first link (precomputed predictions) provides RNA22 predictions for all protein coding transcripts in human, mouse, roundworm, and fruit fly. It allows visualizing the predictions within a cDNA map and finding transcripts that are targeted by multiple miRNAs of interest. The second link (interactive/custom sequences) first finds putative microRNA binding sites in the sequence of interest, then identifies the targeting microRNA. Both tools are provided by the Computational Medicine Center at Thomas Jefferson University. | Yes | No | No | precomputed predictions interactive/custom sequences | [123] |
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | Yes | No | No | sourcecode, webserver | [119] [120] |
miRBooking | Simulates the stoichiometric mode of action of microRNAs using a derivative of the Gale-Shapley algorithm for finding a stable set of duplexes. It uses quantifications for traversing the set of mRNA and microRNA pairs and seed complementarity for ranking and assigning sites. | Yes | No | No | sourcecode, webserver | [124] |
MicroRNAs regulate protein-coding gene expression by binding to 3' UTRs, and there are tools specifically designed for predicting these interactions. For an evaluation of target prediction methods on high-throughput experimental data, see Baek et al., Nature 2008 [125], Alexiou et al., Bioinformatics 2009 [126], or Ritchie et al., Nature Methods 2009 [127].
Name | Description | Cross-species | Intra-molecular structure | Comparative | Link | References |
---|---|---|---|---|---|---|
Cupid | Method for simultaneous prediction of miRNA-target interactions and their mediated competing endogenous RNA (ceRNA) interactions. It is an integrative approach that significantly improves miRNA-target prediction accuracy, as assessed by both mRNA and protein level measurements in breast cancer cell lines. Cupid is implemented in 3 steps. Step 1: re-evaluate candidate miRNA binding sites in 3' UTRs. Step 2: predict interactions by integrating information about selected sites and the statistical dependency between the expression profiles of miRNA and putative targets. Step 3: assess whether inferred targets compete for predicted miRNA regulators. | human | No | Yes | software (MATLAB) | [128] |
Diana-microT | Version 3.0 is an algorithm based on several parameters calculated individually for each microRNA and it combines conserved and non-conserved microRNA recognition elements into a final prediction score. | human, mouse | No | Yes | webserver | [129] |
MicroTar | An animal miRNA target prediction tool based on miRNA-target complementarity and thermodynamic data. | Yes | No | No | sourcecode | [130] |
miTarget | microRNA target gene prediction using a support vector machine. | Yes | No | No | webserver | [131] |
miRror | Based on the notion of a combinatorial regulation by an ensemble of miRNAs or genes. miRror integrates predictions from a dozen miRNA resources that are based on complementary algorithms into a unified statistical framework. | Yes | No | No | webserver | [132] [133] |
PicTar | Combinatorial microRNA target predictions. | 8 vertebrates | No | Yes | predictions | [134] |
PITA | Incorporates the role of target-site accessibility, as determined by base-pairing interactions within the mRNA, in microRNA target recognition. | Yes | Yes | No | executable, webserver, predictions | [135] |
RNA22 | The first link (precomputed predictions) provides RNA22 predictions for all protein coding transcripts in human, mouse, roundworm, and fruit fly. It allows visualizing the predictions within a cDNA map and finding transcripts that are targeted by multiple miRNAs of interest. The second link (interactive/custom sequences) first finds putative microRNA binding sites in the sequence of interest, then identifies the targeting microRNA. Both tools are provided by the Computational Medicine Center at Thomas Jefferson University. | Yes | No | No | precomputed predictions interactive/custom sequences | [123] |
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | Yes | No | No | sourcecode, webserver | [119] [120] |
Sylamer | Method to find significantly over or under-represented words in sequences according to a sorted gene list. Usually used to find significant enrichment or depletion of microRNA or siRNA seed sequences from microarray expression data. | Yes | No | No | sourcecode webserver | [136] [137] |
TAREF | TARget REFiner (TAREF) predicts microRNA targets on the basis of multiple feature information derived from the flanking regions of the predicted target sites, where traditional structure prediction approaches may not be successful in assessing openness. It also provides an option to use encoded patterns to refine filtering. | Yes | No | No | server/sourcecode | [138] |
p-TAREF | plant TARget REFiner (p-TAREF) identifies plant microRNA targets on the basis of multiple feature information derived from the flanking regions of the predicted target sites, where traditional structure prediction approaches may not be successful in assessing openness. It also provides an option to use encoded patterns to refine filtering. It was the first to employ a machine learning approach with a scoring scheme based on support vector regression (SVR) while considering structural and alignment aspects of targeting in plants with plant-specific models. p-TAREF is implemented in a concurrent architecture in server and standalone form, which allows it to run on ordinary desktops while performing transcriptome-level analysis. It also provides an option to check the predicted targets against integrated expression data, adding confidence to a prediction alongside its SVR score. In benchmarking against other plant miRNA target identification tools, p-TAREF was found to perform better. | Yes | No | No | server/standalone | |
TargetScan | Predicts biological targets of miRNAs by searching for the presence of sites that match the seed region of each miRNA. In flies and nematodes, predictions are ranked based on the probability of their evolutionary conservation. In zebrafish, predictions are ranked based on site number, site type, and site context, which includes factors that influence target-site accessibility. In mammals, the user can choose whether the predictions should be ranked based on the probability of their conservation or on site number, type, and context. In mammals and nematodes, the user can choose to extend predictions beyond conserved sites and consider all sites. | vertebrates, flies, nematodes | evaluated indirectly | Yes | sourcecode, webserver | [139] [140] [141] [142] [143] [144] |
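The seed-matching step mentioned for TargetScan can be illustrated with a minimal sketch. It assumes the seed is nucleotides 2 to 8 of the miRNA and that a candidate site is an exact reverse-complement match in the target sequence; the function names are hypothetical, and none of the ranking by conservation or site context described above is modeled.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def find_seed_sites(mirna: str, utr: str, seed_start: int = 1, seed_len: int = 7) -> list[int]:
    """Return 0-based positions in `utr` that perfectly match the miRNA seed.

    Assumption: the seed is miRNA nucleotides 2-8 (0-based slice 1..7).
    """
    seed = mirna[seed_start:seed_start + seed_len]
    site = reverse_complement(seed)  # sequence expected on the target strand
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # allow overlapping matches
    return positions


# Example with the let-7a sequence and a made-up target fragment.
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
print(find_seed_sites(let7a, "AAACUACCUCAAACUACCUCA"))  # [3, 13]
```

Exact string matching is only a first filter; the tools in this table add further evidence such as conservation, accessibility, or hybridisation energy.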
Name | Description | Number of sequences [Note 1] | Alignment [Note 2] | Structure [Note 3] | Link | References |
---|---|---|---|---|---|---|
Alifoldz | Assessing a multiple sequence alignment for the existence of an unusual stable and conserved RNA secondary structure. | any | input | Yes | sourcecode | [145] |
EvoFold | A comparative method for identifying functional RNA structures in multiple-sequence alignments. It is based on a probabilistic model construction called a phylo-SCFG and exploits the characteristic differences of the substitution process in stem-pairing and unpaired regions to make its predictions. | any | input | Yes | linuxbinary | [146] |
GraphClust | Fast RNA structural clustering method to identify common (local) RNA secondary structures. Predicted structural clusters are presented as alignment. Due to the linear time complexity for clustering it is possible to analyse large RNA datasets. | any | Yes | Yes | sourcecode | [63] |
MSARi | Heuristic search for statistically significant conservation of RNA secondary structure in deep multiple sequence alignments. | any | input | Yes | sourcecode | [147] |
QRNA | This is the code from Elena Rivas that accompanies a submitted manuscript "Noncoding RNA gene detection using comparative sequence analysis". QRNA uses comparative genome sequence analysis to detect conserved RNA secondary structures, including both ncRNA genes and cis-regulatory RNA structures. | 2 | input | Yes | sourcecode | [148] [149] |
RNAz | Program for predicting structurally conserved and thermodynamically stable RNA secondary structures in multiple sequence alignments. It can be used in genome-wide screens to detect functional RNA structures, as found in noncoding RNAs and cis-acting regulatory elements of mRNAs. | any | input | Yes | sourcecode, webserver RNAz 2 | [150] [151] [152] |
ScanFold | A program for predicting unique local RNA structures in large sequences with unusually stable folding. | 1 | None | Yes | sourcecode webserver | [153] |
Xrate | A program for analysis of multiple sequence alignments using phylogenetic grammars, which may be viewed as a flexible generalization of the "Evofold" program. | any | Yes | Yes | sourcecode | [97] |
|
Name | Description | Family | Link | References |
---|---|---|---|---|
ARAGORN | ARAGORN detects tRNA and tmRNA in nucleotide sequences. | tRNA tmRNA | webserver source | [154] |
miReader | miReader is described as the first tool of its type to detect mature miRNAs without depending on genomic or reference sequences. Previously, miRNA discovery was possible only for species with available genomic or reference sequences, because most discovery tools relied on identifying pre-miRNA candidates; as a result, miRNA biology was largely limited to model organisms. miReader discerns mature miRNAs directly from small RNA sequencing data, with no need for genomic reference sequences. It has been developed for many phyla and species, from vertebrate to plant models. Its accuracy was reported to be consistently above 90% in extensive validation testing. | mature miRNA | webserver/source webserver/source | [155] |
miRNAminer | Given a search query, candidate homologs are identified using BLAST search and then tested for their known miRNA properties, such as secondary structure, energy, alignment and conservation, in order to assess their fidelity. | MicroRNA | webserver | [156] |
RISCbinder | Prediction of guide strand of microRNAs. | Mature miRNA | webserver | [157] |
RNAmicro | A SVM-based approach that, in conjunction with a non-stringent filter for consensus secondary structures, is capable of recognizing microRNA precursors in multiple sequence alignments. | MicroRNA | homepage | [158] |
RNAmmer | RNAmmer uses HMMER to annotate rRNA genes in genome sequences. Profiles were built using alignments from the European ribosomal RNA database [159] and the 5S Ribosomal RNA Database. [160] | rRNA | webserver source | [161] |
SnoReport | Uses a mix of RNA secondary structure prediction and machine learning that is designed to recognize the two major classes of snoRNAs, box C/D and box H/ACA snoRNAs, among ncRNA candidate sequences. | snoRNA | sourcecode | [162] |
SnoScan | Search for C/D box methylation guide snoRNA genes in a genomic sequence. | C/D box snoRNA | sourcecode, webserver | [163] [164] |
tRNAscan-SE | A program for the detection of transfer RNA genes in genomic sequence. | tRNA | sourcecode, webserver | [164] [165] |
miRNAFold | A fast ab initio software for searching for microRNA precursors in genomes. | microRNA | webserver | [166] |
Name | Description | Link | References |
---|---|---|---|
DECIPHER (software) | FindNonCoding takes a pattern mining approach to capture the essential sequence motifs and hairpin loops representing a non-coding RNA family and quickly identify matches in genomes. FindNonCoding was designed for ease of use and accurately finds non-coding RNAs with a low false discovery rate. | sourcecode | [167] |
ERPIN | "Easy RNA Profile IdentificatioN" is an RNA motif search program that reads a sequence alignment and secondary structure and automatically infers a statistical "secondary structure profile" (SSP). An original dynamic programming algorithm then matches this SSP onto any target database, finding solutions and their associated scores. | sourcecode webserver | [168] [169] [170] |
Infernal | "INFERence of RNA ALignment" is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). | sourcecode | [171] [172] [173] |
GraphClust | Fast RNA structural clustering method to identify common (local) RNA secondary structures. Predicted structural clusters are presented as alignment. Due to the linear time complexity for clustering it is possible to analyse large RNA datasets. | sourcecode | [63] |
PHMMTS | "pair hidden Markov models on tree structures" is an extension of pair hidden Markov models defined on alignments of trees. | sourcecode, webserver | [174] |
RaveNnA | A slow and rigorous or fast and heuristic sequence-based filter for covariance models. | sourcecode | [175] [176] |
RSEARCH | Takes one RNA sequence with its secondary structure and uses a local alignment algorithm to search a database for homologous RNAs. | sourcecode | [177] |
Structator | Ultra fast software for searching for RNA structural motifs employing an innovative index-based bidirectional matching algorithm combined with a new fast fragment chaining strategy. | sourcecode | [178] |
RaligNAtor | Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns | sourcecode | [179] |
Name | Description | Structure [Note 1] | Alignment [Note 2] | Phylogeny | Links | References |
---|---|---|---|---|---|---|
BRalibase I | A comprehensive comparison of comparative RNA structure prediction approaches | Yes | No | No | data | [180] |
BRalibase II | A benchmark of multiple sequence alignment programs upon structural RNAs | No | Yes | No | data | [181] |
BRalibase 2.1 | A benchmark of multiple sequence alignment programs upon structural RNAs | No | Yes | No | data | [182] |
BRalibase III | A critical assessment of the performance of homology search methods on noncoding RNA | No | Yes | No | data | [183] |
CompaRNA | An independent comparison of single-sequence and comparative methods for RNA secondary structure prediction | Yes | No | No | AMU mirror or IIMCB mirror | [184] |
EternaBench | Database comprising the diverse high-throughput structural data gathered through the crowdsourced RNA design project Eterna | Yes | No | No | data | |
RNAconTest | A test of RNA multiple sequence alignments based entirely on known three dimensional RNA structures | Yes | Yes | No | data | [185] |
|
Name | Description | Alignment [Note 1] | Structure [Note 2] | Link | References |
---|---|---|---|---|---|
4sale | A tool for Synchronous RNA Sequence and Secondary Structure Alignment and Editing | Yes | Yes | sourcecode | [186] |
Colorstock, SScolor, Raton | Colorstock, a command-line script using ANSI terminal color; SScolor, a Perl script that generates static HTML pages; and Raton, an Ajax web application generating dynamic HTML. Each tool can be used to color RNA alignments by secondary structure and to visually highlight compensatory mutations in stems. | Yes | Yes | sourcecode | [187] |
Integrated Genome Browser (IGB) | Multiple alignment viewer written in Java. | Yes | No | sourcecode | [188] |
Jalview | Multiple alignment editor written in Java. | Yes | No | sourcecode | [189] [190] |
RALEE | A major mode for the Emacs text editor. It provides functionality to aid the viewing and editing of multiple sequence alignments of structured RNAs. | Yes | Yes | sourcecode | [191] |
SARSE | A graphical sequence editor for working with structural alignments of RNA. | Yes | Yes | sourcecode | [192] |
|
Name | Description | Link | References |
---|---|---|---|
Single state design | |||
EteRNA/EteRNABot | An RNA folding game that challenges players to make sequences that fold into a target RNA structure. The best sequences for a given puzzle are synthesized and their structures are probed through chemical mapping. The sequences are then scored by the data's agreement to the target structure and feedback is provided to the players. EteRNABot is a software implementation based on design rules submitted by EteRNA players. | EteRNA Game EteRNABot web server | [193] |
RNAinverse | The ViennaRNA Package provides RNAinverse, an algorithm for designing sequences with desired structure. | Web Server | [21] |
RNAiFold | A complete RNA inverse folding approach based on constraint programming and implemented using OR Tools which allows for the specification of a wide range of design constraints. The RNAiFold software provides two algorithms to solve the inverse folding problem: i) RNA-CPdesign explores the complete search space and ii) RNA-LNSdesign based on the large neighborhood search metaheuristic is suitable to design large structures. The software can also design interacting RNA molecules using RNAcofold of the ViennaRNA Package. A fully functional, earlier implementation using COMET is available. | Web Server Source Code | [194] [195] [196] |
RNA-SSD/RNA Designer | The RNA-SSD (RNA Secondary Structure Designer) approach first assigns bases probabilistically to each position based on probabilistic models. Subsequently, a stochastic local search is used to optimize this sequence. RNA-SSD is publicly available under the name RNA Designer at the RNASoft web page. | Web Server | [197] |
INFO-RNA | INFO-RNA uses a dynamic programming approach to generate an energy optimized starting sequence that is subsequently further improved by a stochastic local search that uses an effective neighbor selection method. | Web Server Source Code | [198] [199] |
RNAexinv | RNAexinv is an extension of RNAinverse that generates sequences which not only fold into a desired structure but also exhibit selected attributes such as thermodynamic stability and mutational robustness. The approach does not necessarily output a sequence that perfectly fits the input structure, but rather one that fits a shape abstraction of it: it keeps the adjacency and nesting of structural elements but disregards helix lengths and the exact number of unpaired positions. | Source Code | [200] |
RNA-ensign | This approach applies an efficient global sampling algorithm to examine the mutational landscape under structural and thermodynamical constraints. The authors show that the global sampling approach is more robust, succeeds more often and generates more thermodynamically stable sequences than local approaches do. | Source Code | [201] |
IncaRNAtion | Successor of RNA-ensign that can specifically design sequences with a specified GC content using a GC-weighted Boltzmann ensemble and stochastic backtracking | Source Code | [202] |
DSS-Opt | Dynamics in Sequence Space Optimization (DSS-Opt) uses Newtonian dynamics in the sequence space, with a negative design term and simulated annealing to optimize a sequence such that it folds into the desired secondary structure. | Source Code | [203] |
MODENA | This approach interprets RNA inverse folding as a multi-objective optimization problem and solves it using a genetic algorithm. In its extended version MODENA is able to design pseudoknotted RNA structures with the aid of IPknot. | Source Code | [204] [205] |
ERD | Evolutionary RNA Design (ERD) can be used to design RNA sequences that fold into a given target structure. Any RNA secondary structure contains different structural components, each having a different length. Therefore, in the first step, the RNA subsequences (pools) corresponding to different components with different lengths are reconstructed. Using these pools, ERD reconstructs an initial RNA sequence which is compatible with the given target structure. Then ERD uses an evolutionary algorithm to improve the quality of the subsequences corresponding to the components. The major contributions of ERD are using the natural RNA sequences, a different method to evaluate the sequences in each population, and a different hierarchical decomposition of the target structure into smaller substructures. | Web Server Source Code | [206] |
antaRNA | Uses an underlying ant colony foraging heuristic with terrain modeling to solve the inverse folding problem. The designed RNA sequences comply closely with the input structural and sequence constraints. Notably, the GC value of the designed sequence can be regulated with high precision. GC value distributions can be sampled across solution sets, and multiple GC values can be defined for specific sequence domains within one entity. Because intermediate sequences are evaluated flexibly with underlying programs such as RNAfold, pKiss, HotKnots, or IPknot, both nested RNA secondary structures and pseudoknot structures of H- and K-type can be solved with this approach. | Web Server Source Code | [207] [208] |
Dual state design | |||
switch.pl | The ViennaRNA Package provides a Perl script to design RNA sequences that can adopt two states. For instance, RNA thermometers, which change their structural state depending on the environmental temperature, have been successfully designed using this program. | Man Page Source Code | [209] |
RiboMaker | Intended to design small RNAs (sRNA) and their target mRNA's 5'UTR. The sRNA is designed to activate or repress protein expression of the mRNA. It is also possible to design just one of the two RNA components provided the other sequence is fixed. | Web Server Source Code | [210] |
Multi state design | |||
RNAblueprint | This C++ library is based on the RNAdesign multiple target sampling algorithm. It brings a SWIG interface for Perl and Python which allows for an effortless integration into various tools. Therefore, multiple target sequence sampling can be combined with many optimization techniques and objective functions. | Source Code | [211] |
RNAdesign | The underlying algorithm is based on a mix of graph coloring and heuristic local optimization to find sequences that can adopt multiple prescribed conformations. The software can also use RNAcofold to design interacting RNA sequence pairs. | Source Code [ permanent dead link ] | [212] |
Frnakenstein | Frnakenstein applies a genetic algorithm to solve the inverse RNA folding problem. | Source Code | [213] |
ARDesigner | The Allosteric RNA Designer (ARDesigner) is a web-based tool that solves the inverse folding problem by incorporating mutational robustness. Besides a local search, the software has been equipped with a simulated annealing approach to search effectively for good solutions. The tool has been used to design RNA thermometers. | [ dead link ] | [214] |
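The inverse folding tools above all search for sequences that fold into a target structure. A minimal sketch of the simplest precondition is shown here: checking that a candidate sequence is compatible with a target given in dot-bracket notation (every paired position holds a Watson-Crick or G-U pair), plus the GC fraction that tools such as IncaRNAtion and antaRNA control. The function names are hypothetical, pseudoknot brackets are not handled, and compatibility is only a necessary condition: the listed programs additionally evaluate candidates with a folding or energy model.

```python
# Canonical pairs plus the G-U wobble pair, in both orientations.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure: str) -> dict[int, int]:
    """Map each paired position to its partner for a nested dot-bracket string."""
    stack: list[int] = []
    pairs: dict[int, int] = {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif symbol != ".":
            raise ValueError("unsupported symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def is_compatible(sequence: str, structure: str) -> bool:
    """True if every base pair required by `structure` can form in `sequence`."""
    if len(sequence) != len(structure):
        return False
    pairs = pair_table(structure)
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS for i, j in pairs.items())


def gc_content(sequence: str) -> float:
    """Fraction of G and C nucleotides in a non-empty sequence."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


print(is_compatible("GGGAAACCC", "(((...)))"))  # True
print(is_compatible("GGGAAAAAA", "(((...)))"))  # False: G-A cannot pair
```

A design loop would typically mutate a compatible sequence and re-score it with a folding program; that scoring step is deliberately left out of this sketch.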
Name | Description | Link | References |
---|---|---|---|
PseudoViewer | Automatically visualizing RNA pseudoknot structures as planar graphs. | webapp/binary | [215] [216] [217] [218] |
RNA Movies | Browse sequential paths through RNA secondary structure landscapes. | sourcecode | [219] [220] |
RNA-DV | RNA-DV aims at providing an easy-to-use GUI for visualizing and designing RNA secondary structures. It allows users to interact directly with the RNA structure and perform operations such as changing primary sequence content and connecting or disconnecting nucleotide bonds. It also integrates thermodynamic energy calculations including four major energy models. RNA-DV recognizes three input formats: CT, RNAML and dot bracket (dp). | sourcecode | [221] |
RNA2D3D | Program to generate, view, and compare 3-dimensional models of RNA | binary | [222] |
RNAstructure | RNAstructure has a viewer for structures in ct files. It can also compare predicted structures using the circleplot program. Structures can be output as postscript files. | sourcecode | [223] |
RNAView/RnamlView | Use RNAView to automatically identify and classify the types of base pairs that are formed in nucleic acid structures. Use RnamlView to arrange RNA structures. | sourcecode | [224] |
RILogo | Visualizes the intra-/intermolecular base pairing of two interacting RNAs with sequence logos in a planar graph. | web server / sourcecode | [225] |
VARNA | A tool for the automated drawing, visualization and annotation of the secondary structure of RNA, initially designed as a companion software for web servers and databases | webapp/sourcecode | [226] |
forna | A web based viewer for displaying RNA secondary structures using the force-directed graph layout provided by the d3.js visualization library. It is based on fornac, a javascript container for simply drawing a secondary structure on a web page. | webapp fornac source forna source | [227] |
R2R | Program for drawing aesthetic RNA consensus diagrams with automated pair covariance recognition. Rfam uses this program both for drawing the human-annotated SS and the R-scape covariance-optimized structure. | source | [228] |
RNAcanvas | A web app for drawing and exploring nucleic acid structures. | webapp | [229] |
RNAscape | Geometric mapping algorithm for RNA 3D structure to 2D diagram production, which attempts to preserve tertiary interaction topology, provided through an interactive webserver with various customizability options. | webserver | [230] |