Partition function for Interacting RNAs

Partition function for Interacting RNAs (piRNA) is a parallel C++ package to compute joint and individual partition functions for two RNA sequences. From the partition functions, piRNA computes equilibrium concentrations of single and double species, ensemble energy, melting temperatures, and base pair probabilities. piRNA is part of the taveRNA RNA software suite.
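
As a rough illustration of what a partition function over secondary structures is, the sketch below sums Boltzmann weights over all nested structures of a single strand. It is not piRNA's algorithm or energy model: the constant per-pair energy `PAIR_ENERGY`, the minimum hairpin size `MIN_LOOP`, and the function name are assumptions chosen for brevity, and the interaction between two strands is not modelled.

```python
import math

# Toy assumptions (not piRNA's energy model): every allowed base pair
# contributes the same energy, and a hairpin needs at least 3 unpaired bases.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
PAIR_ENERGY = -1.0  # kcal/mol per base pair, hypothetical value
RT = 0.616          # kcal/mol, roughly 37 degrees Celsius
MIN_LOOP = 3


def partition_function(seq):
    """Sum of Boltzmann weights over all nested structures of seq."""
    n = len(seq)
    weight = math.exp(-PAIR_ENERGY / RT)
    # Z[i][j] covers the half-open interval seq[i:j]; empty intervals count as 1.
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            last = j - 1
            # Case 1: the last base of the interval is unpaired.
            total = Z[i][j - 1]
            # Case 2: the last base pairs with position k.
            for k in range(i, last):
                if last - k > MIN_LOOP and (seq[k], seq[last]) in PAIRS:
                    total += Z[i][k] * weight * Z[k + 1][last]
            Z[i][j] = total
    return Z[0][n]
```

For a sequence that admits only one base pair under these assumptions, such as GAAAC, the result is 1 + exp(-PAIR_ENERGY/RT): the unfolded structure plus the single paired one.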

taveRNA is a software suite for RNA/DNA secondary structure analysis. It is developed in the laboratories for computational biology of the School of Computing Science at Simon Fraser University. The suite is composed of alteRNA, for RNA density fold computing; inteRNA, for RNA-RNA interaction prediction; piRNA, for predicting the joint partition function, equilibrium concentration, ensemble energy, and melting temperature for two RNA sequences; pRuNA, a sequence-based pruning RNA interaction search engine; and smyRNA, a platform-independent ab initio ncRNA finder written in C.

The piRNA algorithm was published in Bioinformatics 25(12):i365-i373; doi:10.1093/bioinformatics/btp212, PMID 19478011. [1]

Related Research Articles

Bioinformatics Software tools for understanding biological data

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines biology, computer science, information engineering, mathematics and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques.

In mathematics, the factorial of a positive integer n, denoted by n!, is the product of all positive integers less than or equal to n. For example, 5! = 5 × 4 × 3 × 2 × 1 = 120.
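
The definition can be sketched as a short loop. The function name is illustrative, and the value 0! = 1 follows the usual empty-product convention rather than anything stated above.

```python
def factorial(n):
    """Product of all positive integers up to n (empty product for n = 0)."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result
```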

Reinforcement learning field of machine learning

Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Reinforcement learning is considered one of three machine learning paradigms, alongside supervised learning and unsupervised learning.

Grammar theory to model symbol strings originated from work in computational linguistics aiming to understand the structure of natural languages. Probabilistic context free grammars (PCFGs) have been applied in probabilistic modeling of RNA structures almost 40 years after they were introduced in computational linguistics.

Pseudoknot

A pseudoknot is a nucleic acid secondary structure containing at least two stem-loop structures in which half of one stem is intercalated between the two halves of another stem. The pseudoknot was first recognized in the turnip yellow mosaic virus in 1982. Pseudoknots fold into knot-shaped three-dimensional conformations but are not true topological knots.
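
In terms of base-pair indices, the intercalated stems described above show up as two pairs that cross rather than nest. The following is a minimal sketch of that check, assuming a structure is given as a list of (i, j) index pairs with i < j; the function name and input format are illustrative.

```python
from itertools import combinations


def has_crossing_pairs(pairs):
    """Return True if any two base pairs (i, j) and (k, l) satisfy i < k < j < l."""
    ordered = sorted(tuple(sorted(p)) for p in pairs)
    for (i, j), (k, l) in combinations(ordered, 2):
        # After sorting, i <= k; the pairs cross when k falls inside (i, j)
        # while l falls outside it.
        if i < k < j < l:
            return True
    return False
```

Nested pairs such as (0, 10) and (2, 8), or side-by-side pairs such as (0, 5) and (6, 10), do not trigger the check; (0, 10) together with (5, 15) does.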

Protein–protein interaction prediction is a field combining bioinformatics and structural biology in an attempt to identify and catalog physical interactions between pairs or groups of proteins. Understanding protein–protein interactions is important for the investigation of intracellular signaling pathways, modelling of protein complex structures and for gaining insights into various biochemical processes.

Nucleic acid structure prediction is a computational method to determine secondary and tertiary nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure can be predicted from the sequence, or by comparative modeling.

Piwi-interacting RNA (piRNA) is the largest class of small non-coding RNA molecules expressed in animal cells. piRNAs form RNA-protein complexes through interactions with piwi proteins. These piRNA complexes are mostly involved in the epigenetic and post-transcriptional silencing of transposons, but can also be involved in the regulation of other genetic elements in germ line cells. piRNAs are mostly created from loci that function as transposon traps and provide an RNA-mediated adaptive immunity against transposon expansions and invasions. They are distinct from microRNA (miRNA) in size, lack of sequence conservation, and increased complexity.

Piwi

Piwi genes were identified as encoding regulatory proteins responsible for stem cell and germ cell differentiation. Piwi is an abbreviation of P-element Induced WImpy testis in Drosophila. Piwi proteins are highly conserved RNA-binding proteins and are present in both plants and animals. Piwi proteins belong to the Argonaute/Piwi family and have been classified as nuclear proteins. Studies on Drosophila have also indicated that Piwi proteins have slicer activity conferred by the presence of the Piwi domain. In addition, Piwi associates with Heterochromatin protein 1, an epigenetic modifier, and piRNA-complementary sequences. These are indications of the role Piwi plays in epigenetic regulation. Piwi proteins are also thought to control the biogenesis of piRNA, as many Piwi-like proteins have slicer activity, which would allow them to process precursor piRNA into mature piRNA.

Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. It is an annotated, open access database originally developed at the Wellcome Trust Sanger Institute in collaboration with Janelia Farm, and currently hosted at the European Bioinformatics Institute. Rfam is designed to be similar to the Pfam database for annotating protein families.

The Robinson–Foulds metric is a way to measure the distance between unrooted phylogenetic trees. It is defined as (A + B), where A is the number of partitions of data implied by the first tree but not the second tree and B is the number of partitions of data implied by the second tree but not the first tree. The partitions are calculated for each tree by removing each branch. Thus, the number of eligible partitions for each tree is equal to the number of branches in that tree. The Robinson–Foulds metric is also known as the symmetric difference metric.
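
A minimal sketch of this computation follows, assuming each tree is written as nested tuples of leaf names over the same leaf set (a hypothetical input format). Partitions created by branches leading to a single leaf are skipped, since they are shared by every tree on the same leaves and cancel out of both A and B.

```python
def leaves(tree):
    """Set of leaf names below a node given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial leaf partitions obtained by removing each internal branch."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)  # fixed reference leaf used to name each partition
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        # Record the side of the partition that does not contain ref.
        side = all_leaves - below if ref in below else below
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def robinson_foulds(tree1, tree2):
    s1, s2 = splits(tree1), splits(tree2)
    a = len(s1 - s2)  # partitions implied only by the first tree
    b = len(s2 - s1)  # partitions implied only by the second tree
    return a + b
```

Naming each partition by the side that excludes a fixed reference leaf means the two branches at the top of the nested tuples, which describe the same branch of the unrooted tree, are counted once.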

Strength of a graph

In the branch of mathematics called graph theory, the strength of an undirected graph corresponds to the minimum ratio edges removed/components created in a decomposition of the graph in question. It is a method to compute partitions of the set of vertices and detect zones of high concentration of edges, and is analogous to graph toughness which is defined similarly for vertex removal.
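
For very small graphs the ratio can be minimised by brute force. The sketch below assumes the usual formalisation, the minimum over all partitions of the vertex set into at least two blocks of (edges between blocks) / (number of blocks - 1); the function names are illustrative, and the enumeration grows too quickly to be practical beyond a handful of vertices.

```python
from fractions import Fraction


def block_labelings(n):
    """Yield every set partition of n items as a list of block labels."""
    def extend(labels, used):
        if len(labels) == n:
            yield list(labels)
            return
        # A new item joins an existing block or opens the next new one.
        for b in range(used + 1):
            labels.append(b)
            yield from extend(labels, max(used, b + 1))
            labels.pop()
    yield from extend([], 0)


def strength(vertices, edges):
    """Minimum of crossing edges / (blocks - 1) over all vertex partitions."""
    index = {v: i for i, v in enumerate(vertices)}
    best = None
    for labels in block_labelings(len(vertices)):
        blocks = max(labels) + 1
        if blocks < 2:
            continue  # the single-block partition removes no edges
        crossing = sum(1 for u, v in edges if labels[index[u]] != labels[index[v]])
        ratio = Fraction(crossing, blocks - 1)
        if best is None or ratio < best:
            best = ratio
    return best
```

On a triangle the minimum is reached by separating all three vertices (3 edges removed, 2 extra components), giving 3/2; on a path of three vertices every partition gives a ratio of at least 1.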

NUPACK

The Nucleic Acid Package (NUPACK) is a growing software suite for the analysis and design of nucleic acid systems. Jobs can be run online on the NUPACK webserver or NUPACK source code can be downloaded and compiled locally for non-commercial academic use. NUPACK algorithms are formulated in terms of nucleic acid secondary structure. In most cases, pseudoknots are excluded from the structural ensemble.

Anne Elizabeth Condon is an Irish-Canadian computer scientist, professor, and former head of the Computer Science Department of the University of British Columbia. Her research focuses on computational complexity theory, DNA computing, and bioinformatics. She held the NSERC/General Motors Canada Chair for Women in Science and Engineering from 2004 to 2009, and has worked to improve the success of women in the sciences and engineering.

Rna22 is a pattern-based algorithm for the discovery of microRNA target sites and the corresponding heteroduplexes.

Ron Shamir Israeli bioinformatician

Ron Shamir is an Israeli professor of computer science known for his work in graph theory and in computational biology. He holds the Raymond and Beverly Sackler Chair in Bioinformatics, and is the founder and head of the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.

Non-coding RNAs have been discovered using both experimental and bioinformatic approaches. Bioinformatic approaches can be divided into three main categories. The first involves homology search, although these techniques are by definition unable to find new classes of ncRNAs. The second category includes algorithms designed to discover specific types of ncRNAs that have similar properties. Finally, some discovery methods are based on very general properties of RNA, and are thus able to discover entirely new kinds of ncRNAs.

References

  1. H. Chitsaz et al., "A partition function algorithm for interacting nucleic acid strands", Bioinformatics 25(12):i365-i373, http://bioinformatics.oxfordjournals.org/cgi/content/full/25/12/i365