MetaboLights

Content
Description: Metabolomics database
Data types captured: metabolites from different species, metabolite structure, chemical properties, synonyms, experimental protocols, taxonomy, reactions, pathways, NMR spectra, mass spectra
Contact
Research center: European Molecular Biology Laboratory
Laboratory: European Bioinformatics Institute
Primary citation: PMID 23109552
Access
Website: http://www.ebi.ac.uk/metabolights/
Download URL: http://www.ebi.ac.uk/metabolights/download
Tools
Web: MetaboLights Website
Miscellaneous
Data release frequency: live
Curation policy: Manually curated

MetaboLights [1] is a data repository, founded in 2012, for cross-species and cross-platform metabolomic studies. It provides primary research data and metadata for metabolomic studies, as well as a knowledge base of properties of individual metabolites. [2] [3] [4] The database is maintained by the European Bioinformatics Institute (EMBL-EBI), and its development is funded by the Biotechnology and Biological Sciences Research Council (BBSRC). [5] [6] As of July 2018, the MetaboLights browse functionality listed 383 studies across two analytical platforms, NMR spectroscopy and mass spectrometry. [7]

Semantic annotation is based on various ontologies and controlled vocabularies, including the BRENDA tissue ontology and the NCBI taxonomy. The metabolite structure data is linked to chemical databases, including ChemSpider, PubChem, and ChEBI. Links to metabolite databases, however, seem to be missing.

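MetaboLights consists of two components: a repository of metabolomic studies, holding primary research data and metadata, and a knowledge base of the properties of individual metabolites.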

Fig.2 MetaboLights Study Protocol
Fig.3 Metabolite Page

Scope and access

The data stored in MetaboLights is available for download from an FTP site and can be reused by the scientific community, where data sharing is considered an integral part of the scientific method. [8] Copyright and license information, however, is not easily identifiable.

MetaboLights includes user tools for submitting experiments, with the ISA-TAB format used for metadata tagging of all submissions. [9] Submitted studies are automatically assigned a stable, unique accession number (e.g. MTBLS1) that can be used as a publication reference. MetaboLights is one of the repositories recommended by several scientific journals, including EMBO Journal [10] and Nature's Scientific Data. [11] There is also a guided submission process to help meet the Metabolomics Standards Initiative (MSI) recommendations for high-quality data submissions for NMR and MS experiments. [12]
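As an illustration of how such accession numbers might be handled in practice, the following Python sketch validates an accession and extracts accession-like tokens from free text. The pattern (the prefix "MTBLS" followed by a positive integer) is inferred only from the example MTBLS1 above, and the study URL layout (base website address plus accession) is an assumption rather than a documented interface; the function names are hypothetical.

```python
import re

# Assumed pattern: "MTBLS" followed by a positive integer without leading
# zeros, inferred only from the example accession "MTBLS1".
ACCESSION_RE = re.compile(r"MTBLS([1-9][0-9]*)")

# Website address as listed for the database.
BASE_URL = "http://www.ebi.ac.uk/metabolights/"


def parse_accession(text: str) -> int:
    """Return the numeric part of an accession such as 'MTBLS1'."""
    match = ACCESSION_RE.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a MetaboLights-style accession: {text!r}")
    return int(match.group(1))


def study_url(accession: str) -> str:
    """Build a study page URL (hypothetical layout: base URL + accession)."""
    number = parse_accession(accession)
    return f"{BASE_URL}MTBLS{number}"


def find_accessions(text: str) -> list[str]:
    """Extract accession-like tokens from free text, e.g. a manuscript."""
    return [m.group(0) for m in re.finditer(r"\bMTBLS[1-9][0-9]*\b", text)]
```

A call such as `find_accessions(manuscript_text)` could be used to check that every study cited in a paper carries a well-formed accession before submission; whether the resulting URLs resolve depends on the assumed layout above.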

Related Research Articles

Metabolomics: Scientific study of chemical processes involving metabolites

Metabolomics is the scientific study of chemical processes involving metabolites, the small molecule substrates, intermediates, and products of cell metabolism. Specifically, metabolomics is the "systematic study of the unique chemical fingerprints that specific cellular processes leave behind", the study of their small-molecule metabolite profiles. The metabolome represents the complete set of metabolites in a biological cell, tissue, organ, or organism, which are the end products of cellular processes. Messenger RNA (mRNA), gene expression data, and proteomic analyses reveal the set of gene products being produced in the cell, data that represents one aspect of cellular function. Conversely, metabolic profiling can give an instantaneous snapshot of the physiology of that cell, and thus, metabolomics provides a direct "functional readout of the physiological state" of an organism. There are indeed quantifiable correlations between the metabolome and the other cellular ensembles, which can be used to predict metabolite abundances in biological samples from, for example, mRNA abundances. One of the ultimate challenges of systems biology is to integrate metabolomics with all other -omics information to provide a better understanding of cellular biology.

Metabolome

The metabolome refers to the complete set of small-molecule chemicals found within a biological sample. The biological sample can be a cell, a cellular organelle, an organ, a tissue, a tissue extract, a biofluid or an entire organism. The small molecule chemicals found in a given metabolome may include both endogenous metabolites that are naturally produced by an organism as well as exogenous chemicals that are not naturally produced by an organism.

Chemical Entities of Biological Interest, also known as ChEBI, is a chemical database and ontology of molecular entities focused on 'small' chemical compounds, that is part of the Open Biomedical Ontologies (OBO) effort at the European Bioinformatics Institute (EBI). The term "molecular entity" refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity". The molecular entities in question are either products of nature or synthetic products which have potential bioactivity. Molecules directly encoded by the genome, such as nucleic acids, proteins and peptides derived from proteins by proteolytic cleavage, are not as a rule included in ChEBI.

WikiPathways

WikiPathways is a community resource for contributing and maintaining content dedicated to biological pathways. Any registered WikiPathways user can contribute, and anybody can become a registered user. Contributions are monitored by a group of admins, but the bulk of peer review, editorial curation, and maintenance is the responsibility of the user community. WikiPathways was originally built using MediaWiki software, a custom graphical pathway editing tool (PathVisio) and integrated BridgeDb databases covering major gene, protein, and metabolite systems. WikiPathways was founded in 2008 by Thomas Kelder, Alex Pico, Martijn Van Iersel, Kristina Hanspers, Bruce Conklin and Chris Evelo. Current architects are Alex Pico and Martina Summer-Kutmon.

PRIDE is a public data repository of mass spectrometry (MS) based proteomics data, and is maintained by the European Bioinformatics Institute as part of the Proteomics Team.

ChEMBL: Chemical database of bioactive molecules also having drug-like properties

ChEMBL or ChEMBLdb is a manually curated chemical database of bioactive molecules with drug-like properties. It is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK.

The METLIN Metabolite and Chemical Entity Database is the largest repository of experimental tandem mass spectrometry and neutral loss data acquired from standards. The tandem mass spectrometry data on over 870,000 molecular standards is provided to facilitate the identification of chemical entities from tandem mass spectrometry experiments. In addition to the identification of known molecules, it is also useful for identifying unknowns using its similarity searching technology. All tandem mass spectrometry data comes from the experimental analysis of standards at multiple collision energies and in both positive and negative ionization modes.

Sequence Read Archive

The Sequence Read Archive is a bioinformatics database that provides a public repository for DNA sequencing data, especially the "short reads" generated by high-throughput sequencing, which are typically less than 1,000 base pairs in length. The archive is part of the International Nucleotide Sequence Database Collaboration (INSDC), and is run as a collaboration between the NCBI, the European Bioinformatics Institute (EBI), and the DNA Data Bank of Japan (DDBJ).

Human Metabolome Database: Database of human metabolites

The Human Metabolome Database (HMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in the human body. It has been created by the Human Metabolome Project funded by Genome Canada and is one of the first dedicated metabolomics databases. The HMDB facilitates human metabolomics research, including the identification and characterization of human metabolites using NMR spectroscopy, GC-MS spectrometry and LC/MS spectrometry. To aid in this discovery process, the HMDB contains three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data (Fig. 1–3). The chemical data includes 41,514 metabolite structures with detailed descriptions along with nearly 10,000 NMR, GC-MS and LC/MS spectra.

MetaboAnalyst is a set of online tools for metabolomic data analysis and interpretation, created by members of the Wishart Research Group at the University of Alberta. It was first released in May 2009 and version 2.0 was released in January 2012. MetaboAnalyst provides a variety of analysis methods that have been tailored for metabolomic data. These methods include metabolomic data processing, normalization, multivariate statistical analysis, and data annotation. The current version is focused on biomarker discovery and classification.

European Nucleotide Archive: Online database from the EBI on Nucleotides

The European Nucleotide Archive (ENA) is a repository providing free and unrestricted access to annotated DNA and RNA sequences. It also stores complementary information such as experimental procedures, details of sequence assembly and other metadata related to sequencing projects. The archive is composed of three main databases: the Sequence Read Archive, the Trace Archive and the EMBL Nucleotide Sequence Database. The ENA is produced and maintained by the European Bioinformatics Institute and is a member of the International Nucleotide Sequence Database Collaboration (INSDC) along with the DNA Data Bank of Japan and GenBank.

Christoph Steinbeck: German chemist (born 1966)

Christoph Steinbeck is a German chemist and has a professorship for analytical chemistry, cheminformatics and chemometrics at the Friedrich-Schiller-Universität Jena in Thuringia.

The Yeast Metabolome Database (YMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in or produced by Saccharomyces cerevisiae. The YMDB was designed to facilitate yeast metabolomics research, specifically in the areas of general fermentation as well as wine, beer and fermented food analysis. YMDB supports the identification and characterization of yeast metabolites using NMR spectroscopy, GC-MS spectrometry and Liquid chromatography–mass spectrometry. The YMDB contains two kinds of data: 1) chemical data and 2) molecular biology/biochemistry data. The chemical data includes 2027 metabolite structures with detailed metabolite descriptions along with nearly 4000 NMR, GC-MS and LC/MS spectra.

Experimental factor ontology

Experimental factor ontology, also known as EFO, is an open-access ontology of experimental variables, particularly those used in molecular biology. The ontology covers variables which include aspects of disease, anatomy, cell type, cell lines, chemical compounds and assay information. EFO is developed and maintained at the EMBL-EBI as a cross-cutting resource for the purposes of curation, querying and data integration in resources such as Ensembl, ChEMBL and Expression Atlas.

Metabolite Set Enrichment Analysis (MSEA) is a method designed to help metabolomics researchers identify and interpret patterns of metabolite concentration changes in a biologically meaningful way. It is conceptually similar to another widely used tool developed for transcriptomics called Gene Set Enrichment Analysis or GSEA. GSEA uses a collection of predefined gene sets to rank the lists of genes obtained from gene chip studies. By using this “prior knowledge” about gene sets researchers are able to readily identify significant and coordinated changes in gene expression data while at the same time gaining some biological context. MSEA does the same thing by using a collection of predefined metabolite pathways and disease states obtained from the Human Metabolome Database. MSEA is offered as a service both through a stand-alone web server and as part of a larger metabolomics analysis suite called MetaboAnalyst.

Metabolomic Pathway Analysis, shortened to MetPA, is a freely available, user-friendly web server to assist with the identification, analysis and visualization of metabolic pathways using metabolomic data. MetPA makes use of advances originally developed for pathway analysis in microarray experiments and applies those principles and concepts to the analysis of metabolic pathways. For input, MetPA expects either a list of compound names or a metabolite concentration table with phenotypic labels. The list of compounds can include common names, HMDB IDs or KEGG IDs with one compound per row. Compound concentration tables must have samples in rows and compounds in columns. MetPA's output is a series of tables indicating which pathways are significantly enriched as well as a variety of graphs or pathway maps illustrating where and how certain pathways were enriched. MetPA's graphical output uses a colorful Google-Maps visualization system that allows simple, intuitive data exploration that lets users employ a computer mouse or track pad to select, drag and place images and to seamlessly zoom in and out. Users can explore MetPA's output using three different views or levels: 1) a metabolome view; 2) a pathway view; 3) a compound view.

The Plant Genomics and Phenomics Research Data Repository (PGP) is a data publication infrastructure to comprehensively publish multi-domain plant research data. It is hosted at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) in Gatersleben, Germany. The repository hosts DOI citeable datasets that are not being published in public repositories because of their volume or data scope. PGP enables the publication of gigabyte-scale datasets and is registered as a research data repository at FAIRSharing.org, re3data.org and OpenAIRE as a valid EU Horizon 2020 open data archive. The above features, the programmatic interface and the support of standard metadata formats, enable PGP to fulfil the FAIR data principles—findable, accessible, interoperable, reusable. The PGP repository was created using the e!DAL software infrastructure and applies an on-premises approach to "bring the infrastructure to the data" (I2D).

Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.

Oncometabolism

Oncometabolism is the field of study that focuses on the metabolic changes that occur in cells that make up the tumor microenvironment (TME) and accompany oncogenesis and tumor progression toward a neoplastic state.

Susanna-Assunta Sansone: British-Italian data scientist

Susanna-Assunta Sansone is a British-Italian data scientist who is professor of data readiness at the University of Oxford where she leads the data readiness group and serves as associate director of the Oxford e-Research Centre. Her research investigates techniques for improving the interoperability, reproducibility and integrity of data.

References

  1. Kenneth Haug; Reza M. Salek; Pablo Conesa; et al. (January 2013). "MetaboLights--an open-access general-purpose repository for metabolomics studies and associated meta-data". Nucleic Acids Research. 41 (Database issue): D781-6. doi:10.1093/NAR/GKS1004. ISSN 0305-1048. PMC 3531110. PMID 23109552. Wikidata Q27818909.
  2. Susanna-Assunta Sansone; Philippe Rocca-Serra; Dawn Field; et al. (27 January 2012). "Toward interoperable bioscience data". Nature Genetics. 44 (2): 121–6. doi:10.1038/NG.1054. ISSN 1061-4036. PMC 3428019. PMID 22281772. Wikidata Q28090939.
  3. Haug, Kenneth; Salek, Reza M.; Conesa, Pablo; Mahendraker, Tejasvi; Williams, Mark; Griffin, Julian L.; Steinbeck, Christoph. "MetaboLights - The new EBI Metabolomics database". No. 19. MetaboNews. Retrieved 5 May 2015.
  4. Reza M. Salek; Kenneth Haug; Pablo Conesa; et al. (2013). "The MetaboLights repository: curation challenges in metabolomics". Database. 2013 (0): bat029. doi:10.1093/DATABASE/BAT029. ISSN 1758-0463. PMC 3638156. PMID 23630246. Wikidata Q28707581.
  5. BBSRC Grant BB/I000933/1 "MetaboLights: Creating the missing Metabolomics community resource", http://www.bbsrc.ac.uk/research/grants/grants/AwardDetails.aspx?FundingReference=BB/I000933/1
  6. £30,000 Boost For UK-China Metabolomics Data Sharing, Asian Scientist Newsroom, 2015, retrieved 2015-06-11
  7. "Browse".
  8. Reza M Salek; Kenneth Haug; Christoph Steinbeck (2013). "Dissemination of metabolomics results: role of MetaboLights and COSMOS". GigaScience. 2 (1): 8. doi:10.1186/2047-217X-2-8. ISSN 2047-217X. PMC 3658998. PMID 23683662. Wikidata Q21195826.
  9. Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; et al. (15 September 2010). "ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level". Bioinformatics. 26 (18): 2354–6. doi:10.1093/BIOINFORMATICS/BTQ415. ISSN 1367-4803. PMC 2935443. PMID 20679334. Wikidata Q28749402.
  10. Author Guidelines, http://emboj.embopress.org/authorguide
  11. Recommended Data Repositories, http://www.nature.com/sdata/data-policies/repositories
  12. "The Metabolomics Standards Initiative". Nature Biotechnology. 25 (8): 846–848. August 2007. doi:10.1038/NBT0807-846B. ISSN 1087-0156. PMID 17687353. Wikidata Q56982425.