BioPAX


BioPAX (Biological Pathway Exchange) is an RDF/OWL-based standard language for representing biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data. Pathway data captures our understanding of biological processes, but its rapid growth necessitates the development of databases and computational tools to aid interpretation. However, the fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX addresses this problem by making pathway data substantially easier to collect, index, interpret and share. It can represent metabolic and signaling pathways, molecular and genetic interactions, and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions, organized into thousands of pathways across many organisms and drawn from a growing number of sources, are available in a computable form to support visualization, analysis and biological discovery.


BioPAX is supported by a variety of online databases (e.g. Reactome) and tools. The latest released version is BioPAX Level 3. There is also an effort to create a version of BioPAX as part of OBO.
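Because BioPAX is RDF/OWL-based, a Level 3 document can be read with ordinary XML or RDF tooling. The following minimal sketch parses a small hand-written RDF/XML fragment with Python's standard library. The fragment, its identifiers and the helper `summarize_biopax` are illustrative assumptions rather than data or code from any real database; only the Level 3 namespace and the class and property names (`Pathway`, `BiochemicalReaction`, `SmallMolecule`, `displayName`, `pathwayComponent`, `left`, `right`) are taken from the BioPAX vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespaces in ElementTree's "{uri}local" notation.
BP = "{http://www.biopax.org/release/biopax-level3.owl#}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Hand-written toy document (an assumption for illustration only).
TOY_BIOPAX = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#">
  <bp:Pathway rdf:ID="pathway1">
    <bp:displayName>Toy glycolysis step</bp:displayName>
    <bp:pathwayComponent rdf:resource="#reaction1"/>
  </bp:Pathway>
  <bp:BiochemicalReaction rdf:ID="reaction1">
    <bp:displayName>Glucose phosphorylation</bp:displayName>
    <bp:left rdf:resource="#glucose"/>
    <bp:right rdf:resource="#g6p"/>
  </bp:BiochemicalReaction>
  <bp:SmallMolecule rdf:ID="glucose">
    <bp:displayName>glucose</bp:displayName>
  </bp:SmallMolecule>
  <bp:SmallMolecule rdf:ID="g6p">
    <bp:displayName>glucose 6-phosphate</bp:displayName>
  </bp:SmallMolecule>
</rdf:RDF>
"""


def summarize_biopax(rdf_xml):
    """Return (class counts, {pathway name: [component names]})."""
    root = ET.fromstring(rdf_xml.strip())
    # Count top-level individuals by their BioPAX class (local tag name).
    class_counts = Counter(el.tag.split("}", 1)[1] for el in root)
    # Map each rdf:ID to its displayName so references can be resolved.
    names = {el.get(RDF + "ID"): el.findtext(BP + "displayName") for el in root}
    components = {}
    for pathway in root.findall(BP + "Pathway"):
        refs = [c.get(RDF + "resource").lstrip("#")
                for c in pathway.findall(BP + "pathwayComponent")]
        components[names[pathway.get(RDF + "ID")]] = [names[r] for r in refs]
    return class_counts, components


counts, components = summarize_biopax(TOY_BIOPAX)
print(dict(counts))
print(components)
```

This sketch only handles top-level elements that carry `rdf:ID`. Real exports may use `rdf:about`, nested elements or other RDF serializations, so a dedicated RDF or BioPAX library is likely a better fit for production use.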

Governance and development

The next version of BioPAX, Level 4, is being developed by a community of researchers. Development is coordinated by a board of editors and facilitated by various BioPAX work groups.

Systems Biology Pathway Exchange (SBPAX) is an extension for Level 3 and a proposal for Level 4 that adds quantitative data and systems biology terms (such as those of the Systems Biology Ontology). SBPAX export has been implemented by the pathway databases Signaling Gateway Molecule Pages [1] and the SABIO-Reaction Kinetics Database. SBPAX import has been implemented by the cellular modeling framework Virtual Cell.

Other proposals for Level 4 include improved support for the Semantic Web, validation and visualization.

Databases with BioPAX Export

Online databases offering BioPAX export include:

Software

Software supporting BioPAX includes:

See also

Related Research Articles

Bioinformatics: computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. It combines biology, computer science, information engineering, mathematics and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques.

The Systems Biology Markup Language (SBML) is a representation format, based on XML, for communicating and storing computational models of biological processes. It is a free and open standard with widespread software support and a community of users and developers. SBML can represent many different classes of biological phenomena, including metabolic networks, cell signaling pathways, regulatory networks, infectious diseases, and many others. It has been proposed as a standard for representing computational models in systems biology.

Reactome is a free online database of biological pathways. There are several Reactomes that concentrate on specific organisms; the largest of these is focused on human biology, and the following description concentrates on the human Reactome. It is authored by expert biologists in collaboration with Reactome editorial staff, who are all PhD-level biologists. Content is cross-referenced to many bioinformatics databases. The rationale behind Reactome is to visually represent biological pathways in full mechanistic detail, while making the source data available in a computationally accessible format.

Systems Biology Ontology

The Systems Biology Ontology (SBO) is a set of controlled, relational vocabularies of terms commonly used in systems biology, and in particular in computational modeling. SBO is part of the BioModels.net effort.

NetPath

NetPath is a manually curated resource of human signal transduction pathways. It is a joint effort between the Pandey Lab at the Johns Hopkins University and the Institute of Bioinformatics (IOB), Bangalore, India, and is also worked on by other parties.

Systems immunology is a research field under systems biology that uses mathematical approaches and computational methods to examine the interactions within cellular and molecular networks of the immune system. The immune system has been thoroughly analyzed with regard to its components and function using a "reductionist" approach, but its overall function cannot be easily predicted by studying the characteristics of its isolated components, because that function strongly relies on the interactions among these numerous constituents. Systems immunology focuses on in silico experiments rather than in vivo ones.

GenMAPP

GenMAPP is a free, open-source bioinformatics software tool designed to visualize and analyze genomic data in the context of pathways, connecting gene-level datasets to biological processes and disease. First created in 2000, GenMAPP is developed by an open-source team based in an academic research laboratory. GenMAPP maintains databases of gene identifiers and collections of pathway maps in addition to visualization and analysis tools. Together with other public resources, GenMAPP aims to provide the research community with tools to gain insight into biology through the integration of data types ranging from genes to proteins to pathways to disease.

Cytoscape

Cytoscape is an open source bioinformatics software platform for visualizing molecular interaction networks and integrating with gene expression profiles and other state data. Additional features are available as plugins. Plugins are available for network and molecular profiling analyses, new layouts, additional file format support and connection with databases and searching in large networks. Plugins may be developed using the Cytoscape open Java software architecture by anyone and plugin community development is encouraged. Cytoscape also has a JavaScript-centric sister project named Cytoscape.js that can be used to analyse and visualise graphs in JavaScript environments, like a browser.

The Pathway Interaction Database (PID) is a free biomedical database of human cellular signaling pathways. The database contains information about the molecular interactions and reactions that take place in cells, with a particular focus on processes that might be relevant to cancer research and treatment. The database was established as a collaboration between the U.S. National Cancer Institute, NIH and Nature Publishing Group in 2005 and was launched in November 2006. In September 2012, active curation was stopped, and the PID data are now available in the Network Data Exchange, NDEx.

The ConsensusPathDB is a molecular functional interaction database, integrating information on protein interactions, genetic interactions, signaling, metabolism, gene regulation, and drug-target interactions in humans. ConsensusPathDB currently includes such interactions from 32 databases. ConsensusPathDB is freely available for academic use at http://ConsensusPathDB.org.

A biological pathway is a series of interactions among molecules in a cell that leads to a certain product or a change in a cell. Such a pathway can trigger the assembly of new molecules, such as a fat or protein. Pathways can also turn genes on and off, or spur a cell to move. Some of the most common biological pathways are involved in metabolism, the regulation of gene expression and the transmission of signals. Pathways play a key role in advanced studies of genomics.

WikiPathways

WikiPathways is a community resource for contributing and maintaining content dedicated to biological pathways. Any registered WikiPathways user can contribute, and anybody can become a registered user. Contributions are monitored by a group of admins, but the bulk of peer review, editorial curation, and maintenance is the responsibility of the user community. WikiPathways is built using MediaWiki software, a custom graphical pathway editing tool (PathVisio) and integrated BridgeDb databases covering major gene, protein, and metabolite systems.

Virtual Cell (VCell) is an open-source software platform for modeling and simulation of living organisms, primarily cells. It has been designed to be a tool for a wide range of scientists, from experimental cell biologists to theoretical biophysicists.

Signaling Gateway is a web portal dedicated to signaling pathways powered by the San Diego Supercomputer Center at the University of California, San Diego. It was initiated by a collaboration between the Alliance for Cellular Signaling and Nature. A primary feature is the Molecule Pages database.

geWorkbench is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architecture. As of 2016, there are more than 70 plug-ins available, providing for the visualization and analysis of gene expression, sequence, and structure data.

PathVisio

PathVisio is free, open-source software for pathway analysis and drawing. It allows drawing, editing, and analyzing biological pathways. Users can visualize their own experimental data on the pathways and find relevant pathways that are over-represented in their data set.

Identifiers.org is a project providing stable and perennial identifiers for data records used in the life sciences. The identifiers are provided in the form of Uniform Resource Identifiers (URIs). Identifiers.org is also a resolving system that relies on collections listed in the MIRIAM Registry to provide direct access to different instances of the identified records.

A pathway, in molecular biology, is a curated schematic representation of a well-characterized segment of the molecular physiological machinery, such as a metabolic pathway describing an enzymatic process within a cell or tissue, or a signaling pathway model representing a regulatory process that might, in turn, enable a metabolic or another regulatory process downstream. A typical pathway model starts with an extracellular signaling molecule that activates a specific receptor, thus triggering a chain of molecular interactions. A pathway is most often represented as a relatively small graph with gene, protein, and/or small molecule nodes connected by edges of known functional relations. While a simpler pathway might appear as a chain, complex pathway topologies with loops and alternative routes are much more common. Computational analyses employ special formats of pathway representation. In the simplest form, however, a pathway might be represented as a list of member molecules with order and relations unspecified. Such a representation, generally called a functional gene set (FGS), can also refer to other functionally characterized groups such as protein families, Gene Ontology (GO) terms and Disease Ontology (DO) terms.

In bioinformatics, methods of pathway analysis might be used to identify key genes or proteins within a previously known pathway in relation to a particular experiment or pathological condition, or to build a pathway de novo from proteins that have been identified as key affected elements. By examining changes in, for example, gene expression in a pathway, its biological activity can be explored. Most frequently, however, pathway analysis refers to a method of initial characterization and interpretation of an experimental condition that was studied with omics tools or GWAS. Such studies might identify long lists of altered genes. Visual inspection is then challenging and the information is hard to summarize, since the altered genes map to a broad range of pathways, processes, and molecular functions. In such situations, the most productive way of exploring the list is to identify enrichment of specific FGSs in it. The general approach of enrichment analyses is to identify FGSs whose members were most frequently or most strongly altered in the given condition, in comparison to a gene set sampled by chance. In other words, enrichment can map canonical prior knowledge structured in the form of FGSs to the condition represented by altered genes.
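The comparison "to a gene set sampled by chance" can be made concrete with an over-representation test. The sketch below uses a one-sided hypergeometric test, which is one common choice and is assumed here for illustration; the text above does not prescribe a particular statistic. The function names and the toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_tail(universe_size, set_size, altered_size, overlap):
    """P(X >= overlap) when altered_size genes are drawn at random,
    without replacement, from a universe containing set_size FGS members."""
    max_overlap = min(set_size, altered_size)
    total = comb(universe_size, altered_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, altered_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrichment(altered_genes, gene_set, universe):
    """Return (overlap, p-value) for one functional gene set (FGS)."""
    universe = set(universe)
    altered = set(altered_genes) & universe   # ignore genes outside the universe
    members = set(gene_set) & universe
    overlap = len(altered & members)
    p = hypergeom_tail(len(universe), len(members), len(altered), overlap)
    return overlap, p


# Toy example with made-up gene names.
universe = {f"g{i}" for i in range(10)}
fgs = {"g0", "g1", "g2"}
altered = {"g0", "g1", "g2"}
print(enrichment(altered, fgs, universe))
```

In practice this test would be repeated for every FGS and followed by a multiple-testing correction, which is omitted here to keep the sketch short.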

Metascape is a free gene annotation and analysis resource that helps biologists make sense of one or multiple gene lists. Metascape provides automated meta-analysis tools to understand either common or unique pathways and protein networks within a group of orthogonal target-discovery studies.

References

  1. Dinasarapu A.R; Saunders B; Ozerlat I; Azam K; Subramaniam S (2011). "Signaling Gateway Molecule Pages – a data model perspective". Bioinformatics. 27 (12): 1736–1738. doi:10.1093/bioinformatics/btr190. PMC 3106186. PMID 21505029.
  2. Babur O; Dogrusoz U; Demir E; Sander C (2010). "ChiBE: interactive visualization and manipulation of BioPAX pathway models". Bioinformatics. 26 (3): 429–431.