ConsensusPathDB

Content
Description: human functional interaction networks
Organisms: Homo sapiens, Saccharomyces cerevisiae, Mus musculus
Contact
Research center: Max Planck Institute for Molecular Genetics
Authors: Atanas Kamburov
Primary citation: Kamburov et al. (2011) [1]
Release date: 2008
Access
Data format: BioPAX, PSI-MI, SBML
Website: consensuspathdb.org
Download URL: yes
Web service URL: yes
Miscellaneous
Version: 30; 9 January 2015

The ConsensusPathDB is a molecular functional interaction database that integrates information on protein interactions, genetic interactions, signaling, metabolism, gene regulation, and drug-target interactions in humans. As of release 30, ConsensusPathDB includes such interactions from 32 databases. [1] ConsensusPathDB is freely available for academic use at http://ConsensusPathDB.org.

Integrated Databases

Functionalities

The ConsensusPathDB is accessible via a web interface providing a variety of functions.

Search and visualization

Using the web interface, users can search for physical entities (e.g. proteins or metabolites) or pathways by common name or accession number (e.g. UniProt identifiers). Selected interactions can be visualized in an interactive environment as expandable networks. ConsensusPathDB currently allows users to export their models in BioPAX format or as images in several formats.

Shortest path

Users can search for shortest paths of functional interactions between physical entities, based on all interactions in the database. The path search can be constrained by excluding certain physical entities, so that no returned path passes through them.
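
The constrained search described above can be illustrated with a minimal sketch. It assumes a simplification that is not taken from ConsensusPathDB itself: interactions are treated as unweighted, undirected pairs of entities, and a breadth-first search is used to find one shortest path. The function name `shortest_path` and the toy entity names are hypothetical.

```python
from collections import deque

def shortest_path(edges, source, target, forbidden=()):
    """Return one shortest path from source to target as a list, or None.

    Assumption: each interaction is an unweighted, undirected pair (a, b).
    Entities listed in `forbidden` may not appear anywhere on the path.
    """
    blocked = set(forbidden)
    if source in blocked or target in blocked:
        return None
    # Build an adjacency map, skipping every edge that touches a forbidden entity.
    neighbours = {}
    for a, b in edges:
        if a in blocked or b in blocked:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Breadth-first search; parent pointers are used to rebuild the path.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Toy network with hypothetical entity names.
toy_edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
print(shortest_path(toy_edges, "A", "D"))                   # ['A', 'B', 'D']
print(shortest_path(toy_edges, "A", "D", forbidden={"B"}))  # ['A', 'C', 'E', 'D']
```

Forbidding "B" forces the search onto the longer route through "C" and "E". Real interactions in the database may involve more than two participants, which this pairwise sketch does not model.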

Data upload

Users can upload their own interaction networks as BioPAX, PSI-MI or SBML files in order to validate or extend those networks in the context of the interactions in ConsensusPathDB.

Over-representation analysis

Using the web interface of the database, users can perform over-representation analysis based on biochemical pathways or on neighbourhood-based entity sets (NESTs). A NEST is a sub-network of the overall interaction network that contains all physical entities around a central one within a given "radius" (number of interactions from the center). For each predefined set (pathway or NEST), a P-value is computed based on the hypergeometric distribution. It reflects the significance of the observed overlap between the user-specified input gene list and the members of the predefined set.
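
As a rough illustration of the "radius" idea, the sketch below collects every entity reachable from a central entity within a given number of interactions. It is not ConsensusPathDB's implementation: it assumes pairwise, undirected interactions and includes the central entity in the resulting set, and the name `neighbourhood_set` is hypothetical.

```python
from collections import deque

def neighbourhood_set(edges, center, radius):
    """Return the set of entities within `radius` interactions of `center`.

    Assumptions: interactions are undirected pairs, and the center itself
    is counted as a member of the set.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Breadth-first search that stops expanding once the radius is reached.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return set(distance)

# Toy network with hypothetical protein names.
toy_network = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]
print(sorted(neighbourhood_set(toy_network, "P1", 1)))  # ['P1', 'P2', 'P5']
print(sorted(neighbourhood_set(toy_network, "P1", 2)))  # ['P1', 'P2', 'P3', 'P5']
```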

Over-representation analyses can be performed with user-specified genes or metabolites.
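
A hypergeometric P-value of the kind mentioned above can be sketched as follows. The sketch assumes the common one-sided convention, which the text does not spell out: the P-value is the probability of seeing at least the observed overlap when the input list is drawn at random from a background population. The function names and the choice of background are assumptions for illustration.

```python
from math import comb

def hypergeom_pvalue(overlap, set_size, list_size, background_size):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    overlap:         entities shared by the input list and the predefined set
    set_size:        members of the predefined set (pathway or NEST)
    list_size:       entities in the user's input list
    background_size: entities in the background population (an assumption
                     of this sketch; it must contain both other sets)
    """
    upper = min(set_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(set_size, i) * comb(background_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

def over_representation(input_ids, predefined_set, background):
    """Compute the overlap and its P-value for one predefined set."""
    background = set(background)
    members = set(predefined_set) & background
    query = set(input_ids) & background
    shared = members & query
    p = hypergeom_pvalue(len(shared), len(members), len(query), len(background))
    return shared, p

# Toy example: 20 background genes, a 5-gene pathway, a 4-gene input list.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
shared, p = over_representation(["g0", "g1", "g10", "g11"], pathway, background)
print(sorted(shared), round(p, 4))  # ['g0', 'g1'] 0.2487
```

In practice, one P-value is computed per predefined set, so results across many pathways or NESTs are usually corrected for multiple testing; that step is omitted here.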

References

  1. Kamburov, Atanas; Pentchev, Konstantin; Galicka, Hanna; Wierling, Christoph; Lehrach, Hans; Herwig, Ralf (Jan 2011). "ConsensusPathDB: toward a more complete picture of cell biology". Nucleic Acids Res. 39 (Database issue): D712-7. doi:10.1093/nar/gkq1156. PMC 3013724. PMID 21071422.