MetaboLights

Content
Description: Metabolomics database
Data types captured: metabolites from different species, metabolite structure, chemical properties, synonyms, experimental protocols, taxonomy, reactions, pathways, NMR spectra, mass spectra
Contact
Research center: European Molecular Biology Laboratory
Laboratory: European Bioinformatics Institute (United Kingdom)
Primary citation: PMID 23109552
Access
Website: http://www.ebi.ac.uk/metabolights/
Download URL: http://www.ebi.ac.uk/metabolights/download
Tools
Web: MetaboLights Website
Miscellaneous
Data release frequency: live
Curation policy: Manually curated

MetaboLights [1] is a data repository founded in 2012 for cross-species and cross-platform metabolomic studies. It provides primary research data and metadata for metabolomic studies, as well as a knowledge base of the properties of individual metabolites. [2] [3] [4] The database is maintained by the European Bioinformatics Institute (EMBL-EBI), and its development is funded by the Biotechnology and Biological Sciences Research Council (BBSRC). [5] [6] As of July 2018, the MetaboLights browse functionality listed 383 studies across two analytical platforms: NMR spectroscopy and mass spectrometry. [7]

Semantic annotation is based on various ontologies and controlled vocabularies, including the BRENDA tissue ontology and the NCBI taxonomy. The metabolite structure data is linked to chemical databases, including ChemSpider, PubChem, and ChEBI. Links to metabolite databases, however, seem to be missing.

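MetaboLights consists of two components: a repository of primary research data and metadata from metabolomic studies, and a knowledge base of the properties of individual metabolites.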

Fig. 2: MetaboLights Study Protocol
Fig. 3: Metabolite Page

Scope and access

The data stored in MetaboLights is available for download from an FTP site and can be reused by the scientific community, where data sharing is considered an integral part of the scientific method. [8] Copyright and license information, however, is not easily identifiable.

MetaboLights includes user tools for submitting experiments, using the ISA-TAB format for metadata tagging of all submissions. [9] Submitted studies are automatically assigned a stable unique accession number (e.g. MTBLS1) that can be used as a publication reference; MetaboLights is one of the repositories recommended by several scientific journals, including EMBO Journal [10] and Nature's Scientific Data. [11] There is also a guided submission process to help meet the Metabolomics Standards Initiative (MSI) recommendations for high-quality data submissions for NMR and MS experiments. [12]
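As an illustration of how such accession numbers might be handled when citing or cross-referencing studies, the following Python sketch validates and extracts identifiers shaped like the example MTBLS1. The pattern (the prefix "MTBLS" followed by a positive integer) is an assumption inferred from that single example, and the function names are hypothetical; neither is taken from MetaboLights documentation.

```python
import re

# Assumption: an accession is the literal prefix "MTBLS" followed by a
# positive integer without leading zeros, as in the example "MTBLS1".
ACCESSION_RE = re.compile(r"MTBLS([1-9][0-9]*)")


def parse_accession(text: str) -> int:
    """Return the numeric part of an accession such as 'MTBLS1'.

    Raises ValueError if the text does not match the assumed pattern.
    """
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a MetaboLights-style accession: {text!r}")
    return int(match.group(1))


def find_accessions(text: str) -> list[str]:
    """Collect accession-like tokens from free text, in order, without duplicates."""
    seen: dict[str, None] = {}
    for match in re.finditer(r"\bMTBLS[1-9][0-9]*\b", text):
        seen.setdefault(match.group(0), None)
    return list(seen)
```

A function like find_accessions could, for instance, be run over a manuscript's data availability statement to list the studies it cites; the matching is case-sensitive by design, so "mtbls1" is rejected.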

References

  1. Kenneth Haug; Reza M. Salek; Pablo Conesa; et al. (January 2013). "MetaboLights--an open-access general-purpose repository for metabolomics studies and associated meta-data". Nucleic Acids Research. 41 (Database issue): D781-6. doi:10.1093/NAR/GKS1004. ISSN 0305-1048. PMC 3531110. PMID 23109552. Wikidata Q27818909.
  2. Susanna-Assunta Sansone; Philippe Rocca-Serra; Dawn Field; et al. (27 January 2012). "Toward interoperable bioscience data". Nature Genetics. 44 (2): 121–6. doi:10.1038/NG.1054. ISSN 1061-4036. PMC 3428019. PMID 22281772. Wikidata Q28090939.
  3. Haug, Kenneth; Salek, Reza M.; Conesa, Pablo; Mahendraker, Tejasvi; Williams, Mark; Griffin, Julian L.; Steinbeck, Christoph. "MetaboLights - The new EBI Metabolomics database". No. 19. MetaboNews. Retrieved 5 May 2015.
  4. Reza M. Salek; Kenneth Haug; Pablo Conesa; et al. (2013). "The MetaboLights repository: curation challenges in metabolomics". Database. 2013 (0): bat029. doi:10.1093/DATABASE/BAT029. ISSN 1758-0463. PMC 3638156. PMID 23630246. Wikidata Q28707581.
  5. BBSRC Grant BB/I000933/1 "MetaboLights: Creating the missing Metabolomics community resource", http://www.bbsrc.ac.uk/research/grants/grants/AwardDetails.aspx?FundingReference=BB/I000933/1
  6. £30,000 Boost For UK-China Metabolomics Data Sharing, Asian Scientist Newsroom, 2015, retrieved 2015-06-11
  7. "Browse".
  8. Reza M. Salek; Kenneth Haug; Christoph Steinbeck (2013). "Dissemination of metabolomics results: role of MetaboLights and COSMOS". GigaScience. 2 (1): 8. doi:10.1186/2047-217X-2-8. ISSN 2047-217X. PMC 3658998. PMID 23683662. Wikidata Q21195826.
  9. Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; et al. (15 September 2010). "ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level". Bioinformatics. 26 (18): 2354–6. doi:10.1093/BIOINFORMATICS/BTQ415. ISSN 1367-4803. PMC 2935443. PMID 20679334. Wikidata Q28749402.
  10. Author Guidelines, http://emboj.embopress.org/authorguide
  11. Recommended Data Repositories, http://www.nature.com/sdata/data-policies/repositories
  12. "The Metabolomics Standards Initiative". Nature Biotechnology. 25 (8): 846–848. August 2007. doi:10.1038/NBT0807-846B. ISSN 1087-0156. PMID 17687353. Wikidata Q56982425.