Systems Biology Ontology

The Systems Biology Ontology (SBO) is a set of controlled, relational vocabularies of terms commonly used in systems biology, and in particular in computational modeling.

Motivation

The rise of systems biology, which seeks to comprehend biological processes as a whole, highlighted the need not only to develop corresponding quantitative models but also to create standards allowing their exchange and integration. This concern drove the community to design common data formats, such as SBML and CellML. SBML is now largely accepted and used in the field. However, as important as the definition of a common syntax is, it is also necessary to make clear the semantics of models. SBO aims to provide a means of annotating models with terms that indicate their intended use within the large set of models commonly used in computational systems biology. [1] [2]

The development of SBO was first discussed at the 9th SBML Forum Meeting in Heidelberg on October 14–15, 2004. During the forum, Pedro Mendes mentioned that modellers possessed a lot of knowledge that was necessary to understand a model and, more importantly, to simulate it, but that this knowledge was not encoded in SBML. Nicolas Le Novère proposed creating a controlled vocabulary to store the content of Pedro Mendes' mind before he wandered out of the community. [3] The development of the ontology was announced more officially in a message from Le Novère to Michael Hucka and Andrew Finney on October 19.

Structure

SBO is currently made up of seven different vocabularies:

Resources

To curate and maintain SBO, a dedicated resource has been developed; the public interface of the SBO browser can be accessed at http://www.ebi.ac.uk/sbo. A relational database management system (MySQL) at the back end is accessed through a web interface based on JavaServer Pages (JSP) and JavaBeans. Its content is encoded in UTF-8 and therefore supports a large set of characters in the definitions of terms. Distributed curation is made possible by a custom-tailored locking system that allows concurrent access. This system allows continuous updating of the ontology with immediate availability and avoids merging problems.

Several export formats (OBO flat file, SBO-XML and OWL) are generated daily or on request and can be downloaded from the web interface.
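As a rough illustration of how the OBO flat-file export might be consumed, the following Python sketch reads `[Term]` stanzas into a dictionary. The sample text is an assumption made for this example: its stanzas, and in particular the `is_a` link, are illustrative and not copied from the actual SBO export, and `parse_obo_terms` is a hypothetical helper, not part of any SBO library.

```python
from collections import defaultdict

# Illustrative OBO-style stanzas; the identifiers, names, and the is_a link
# are placeholders for this sketch, not copied from the real SBO export.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: SBO:0000009
name: kinetic constant

[Term]
id: SBO:0000027
name: Michaelis constant
is_a: SBO:0000009 ! kinetic constant
"""


def parse_obo_terms(text):
    """Return {term id: {tag: [values]}} for every [Term] stanza."""
    terms = {}
    stanza = None  # tags of the [Term] stanza being read, or None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza begins; only [Term] stanzas are collected.
            stanza = defaultdict(list) if line == "[Term]" else None
            continue
        if stanza is None or ":" not in line:
            continue  # header lines, blank lines, or non-term stanzas
        tag, _, value = line.partition(":")
        tag = tag.strip()
        # Drop the trailing "! comment" that OBO allows after a value.
        value = value.split(" ! ", 1)[0].strip()
        stanza[tag].append(value)
        if tag == "id":
            terms[value] = stanza
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
parents_of_km = terms["SBO:0000027"]["is_a"]
```

Each tag maps to a list because OBO allows a tag such as `is_a` to appear more than once in a stanza. A production reader would also need to handle escapes, quoted definitions, and obsolete terms, which this sketch ignores.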

To allow programmatic access to the resource, Web Services have been implemented, based on Apache Axis for the communication layer and Castor for the validation. [4] The libraries, full documentation, samples, and a tutorial are available online.

The SourceForge project can be accessed at http://sourceforge.net/projects/sbo/.

SBO and SBML

Since Level 2 Version 2, SBML has provided a mechanism to annotate model components with SBO terms, thereby increasing the semantics of the model beyond the sole topology of interaction and mathematical expression. Modelling tools such as SBMLsqueezer [5] interpret SBO terms to augment the mathematics in the SBML file. Simulation tools can check the consistency of a rate law, convert reactions from one modelling framework to another (e.g., continuous to discrete), or distinguish between identical mathematical expressions based on different assumptions (e.g., Michaelis–Menten vs. Briggs–Haldane). To add missing SBO terms to models, software such as SBOannotator [6] can be used. Other tools such as semanticSBML [7] can use the SBO annotation to integrate individual models into a larger one. The use of SBO is not restricted to the development of models. Resources providing quantitative experimental information, such as SABIO Reaction Kinetics, will be able to annotate the parameters (what exactly they mean and how they were calculated) and determine relationships between them.
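In SBML files the annotation takes the form of an `sboTerm` attribute whose value is `SBO:` followed by seven digits. The following Python sketch, which uses only the standard library, shows one way a tool might collect these annotations. The model fragment is hypothetical and is not a complete valid model, and `collect_sbo_terms` is a name invented for this example; real tools would more likely rely on a dedicated library such as libSBML.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical model fragment, trimmed for brevity; the element ids and the
# choice of SBO terms are illustrative only.
SAMPLE_SBML = """\
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="example">
    <listOfParameters>
      <parameter id="Km" value="0.5" constant="true" sboTerm="SBO:0000027"/>
      <parameter id="k_unlabelled" value="1.0" constant="true"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="R1" reversible="false" sboTerm="SBO:0000176"/>
    </listOfReactions>
  </model>
</sbml>
"""

# An sboTerm value is the prefix "SBO:" followed by exactly seven digits.
SBO_ID = re.compile(r"SBO:\d{7}")


def collect_sbo_terms(sbml_text):
    """Map each annotated element's id to its sboTerm value."""
    root = ET.fromstring(sbml_text)
    annotated = {}
    for element in root.iter():
        term = element.get("sboTerm")
        if term is None:
            continue  # element carries no SBO annotation
        if not SBO_ID.fullmatch(term):
            raise ValueError(f"malformed sboTerm: {term!r}")
        annotated[element.get("id")] = term
    return annotated


sbo_by_id = collect_sbo_terms(SAMPLE_SBML)
```

Here `sbo_by_id` holds entries for `Km` and `R1` only; the unannotated parameter is exactly the kind of component that an annotation tool would be expected to fill in.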

SBO and SBGN

All the graphical symbols used in the SBGN languages are associated with an SBO term. This makes it possible, for instance, to generate SBGN maps from SBML models.

SBO and BioPAX

The Systems Biology Pathway Exchange (SBPAX) allows SBO terms to be added to Biological Pathway Exchange (BioPAX). This links BioPAX to information useful for modelling, especially by adding the quantitative descriptions defined in SBO.

Organization of SBO development

SBO is developed jointly by the Computational Neurobiology Group (Nicolas Le Novère, EMBL-EBI, United Kingdom) and the SBML Team (Michael Hucka, Caltech, USA).

Funding for SBO

SBO has received funding from the European Molecular Biology Laboratory and the National Institute of General Medical Sciences.

References

  1. Le Novère N (2005). "BioModels.net, tools and resources to support Computational Systems Biology". Proceedings of the 4th Workshop on Computation of Biochemical Pathways and Genetic Networks. Logos, Berlin. pp. 69–74.
  2. Le Novère N, Courtot M, Laibe C (2007). "Adding semantics in kinetics models of biochemical pathways". Proceedings of the 2nd International Symposium on experimental standard conditions of enzyme characterizations. pp. 137–153.
  3. Nicolas Le Novère, personal communication.
  4. Li C, Courtot M, Le Novère N, Laibe C (November 2009). "BioModels.net Web Services, a free and integrated toolkit for computational modelling software". Brief. Bioinformatics. 11 (3): 270–7. doi:10.1093/bib/bbp056. PMC 2913671. PMID 19939940.
  5. Dräger A, Zielinski DC, Keller R, Rall M, Eichner J, Palsson BO, Zell A (2015). "SBMLsqueezer 2: Context-sensitive creation of kinetic equations in biochemical networks". BMC Systems Biology. 9 (1): 68. doi:10.1186/s12918-015-0212-9. PMC 4600286. PMID 26452770.
  6. Leonidou N, Fritze E, Renz A, Dräger A (2023). "SBOannotator: a Python Tool for the Automated Assignment of Systems Biology Ontology Terms". Bioinformatics. 39 (7). doi:10.1093/bioinformatics/btad437. PMC 10371491. PMID 37449910.
  7. Krause F, Uhlendorf J, Lubitz T, Schulz M, Klipp E, Liebermeister W (2010). "Annotation and merging of SBML models with semanticSBML". Bioinformatics. 26 (3): 421–422.