EcoCyc

Content
Description: comprehensive database of Escherichia coli biology
Organisms: Escherichia coli
Contact
Research center: SRI International
Laboratory: SRI International
Authors: Peter Karp et al. [1]
Primary citation: Keseler et al. (2011) [1]
Access
Website: ecocyc.org

In bioinformatics, EcoCyc is a biological database for the bacterium Escherichia coli K-12. The EcoCyc project performs literature-based curation of the E. coli genome, and of E. coli transcriptional regulation, transporters, and metabolic pathways. EcoCyc contains written summaries of E. coli genes, distilled from over 36,000 scientific articles. EcoCyc also serves as a structured description of the genome and cellular networks of E. coli on which scientists can carry out computational analyses. [2]

Data objects in the EcoCyc database describe each E. coli gene and gene product. Database objects also describe molecular interactions, including metabolic pathways, transport events, and the regulation of gene expression. EcoCyc provides several genome-scale visualization tools to aid in the analysis of omics data, for example by painting gene expression or metabolomics data onto the full regulatory network of E. coli. [citation needed]
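The idea of data objects for genes, gene products, and regulatory interactions, with omics measurements "painted" onto the resulting network, can be illustrated with a minimal Python sketch. This is not the EcoCyc schema or the Pathway Tools API: the class names, field names, and the `paint_expression` helper are hypothetical, and the expression values are invented for illustration. Only the biology of the three example genes (LacI represses lacZ, CRP activates it) is taken as given.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Gene:
    """Hypothetical, simplified record: a gene and the product it encodes."""
    name: str
    product: str


@dataclass(frozen=True)
class Regulation:
    """Hypothetical record: `regulator` activates ('+') or represses ('-') `target`."""
    regulator: str
    target: str
    mode: str


@dataclass
class MiniKnowledgeBase:
    """Toy stand-in for a pathway/genome database; not the real EcoCyc schema."""
    genes: dict[str, Gene] = field(default_factory=dict)
    regulation: list[Regulation] = field(default_factory=list)

    def add_gene(self, gene: Gene) -> None:
        self.genes[gene.name] = gene

    def targets_of(self, regulator: str) -> list[str]:
        """Names of genes directly regulated by `regulator`."""
        return [r.target for r in self.regulation if r.regulator == regulator]


def paint_expression(kb: MiniKnowledgeBase,
                     log2_fold_change: dict[str, float]) -> dict[str, float | None]:
    """Attach a measurement to every gene in the network (None if unmeasured)."""
    return {name: log2_fold_change.get(name) for name in kb.genes}


def build_example() -> MiniKnowledgeBase:
    kb = MiniKnowledgeBase()
    kb.add_gene(Gene("lacI", "lactose operon repressor"))
    kb.add_gene(Gene("crp", "cAMP receptor protein"))
    kb.add_gene(Gene("lacZ", "beta-galactosidase"))
    kb.regulation.append(Regulation("lacI", "lacZ", "-"))
    kb.regulation.append(Regulation("crp", "lacZ", "+"))
    return kb


if __name__ == "__main__":
    kb = build_example()
    # Invented measurements, for illustration only.
    painted = paint_expression(kb, {"lacZ": 3.2, "lacI": -0.1})
    for name, value in painted.items():
        print(name, kb.genes[name].product, value)
```

Keeping the measurements separate from the curated objects means the same network can be repainted with a different dataset without modifying the curated records.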

EcoCyc can be accessed through the EcoCyc web site, as a set of downloadable files, and in conjunction with the Pathway Tools software that can be installed locally on Macintosh, PC/Windows, and PC/Linux computers. The downloadable software provides capabilities that go well beyond the web version of EcoCyc. [citation needed]

Related Research Articles

Biological database

Biological databases are libraries of biological sciences, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. Information contained in biological databases includes gene function, structure, localization, clinical effects of mutations as well as similarities of biological sequences and structures.

Metabolome

The metabolome refers to the complete set of small-molecule chemicals found within a biological sample. The biological sample can be a cell, a cellular organelle, an organ, a tissue, a tissue extract, a biofluid or an entire organism. The small molecule chemicals found in a given metabolome may include both endogenous metabolites that are naturally produced by an organism as well as exogenous chemicals that are not naturally produced by an organism.

Metabolic network modelling: form of biological modelling

Metabolic network modelling, also known as metabolic network reconstruction or metabolic pathway analysis, allows for an in-depth insight into the molecular mechanisms of a particular organism. In particular, these models correlate the genome with molecular physiology. A reconstruction breaks down metabolic pathways into their respective reactions and enzymes, and analyzes them within the perspective of the entire network. In simplified terms, a reconstruction collects all of the relevant metabolic information of an organism and compiles it in a mathematical model. Validation and analysis of reconstructions can allow identification of key features of metabolism such as growth yield, resource distribution, network robustness, and gene essentiality. This knowledge can then be applied to create novel biotechnology.

Reactome is a free online database of biological pathways. There are several Reactomes that concentrate on specific organisms; the largest of these is focused on human biology, and the following description concentrates on the human Reactome. It is authored by biologists, in collaboration with Reactome editorial staff. The content is cross-referenced to many bioinformatics databases. The rationale behind Reactome is to visually represent biological pathways in full mechanistic detail, while making the source data available in a computationally accessible format.

Mixed acid fermentation: biochemical conversion of six-carbon sugars into acids in bacteria

In biochemistry, mixed acid fermentation is the metabolic process by which a six-carbon sugar is converted into a complex and variable mixture of acids. It is an anaerobic (non-oxygen-requiring) fermentation reaction that is common in bacteria. It is characteristic for members of the Enterobacteriaceae, a large family of Gram-negative bacteria that includes E. coli.

The hisB gene, found in the enterobacteria, in Campylobacter jejuni, and in Xylella/Xanthomonas, encodes a bifunctional protein, imidazoleglycerol-phosphate dehydratase/histidinol-phosphatase, that catalyses two steps in histidine biosynthesis.

The MetaCyc database is one of the largest metabolic pathways and enzymes databases currently available. The data in the database are manually curated from the scientific literature, and cover all domains of life. MetaCyc has extensive information about chemical compounds, reactions, metabolic pathways and enzymes. The data have been curated from more than 58,000 publications.

The BioCyc database collection is an assortment of organism-specific Pathway/Genome Databases (PGDBs) that provide reference to genome and metabolic pathway information for thousands of organisms. As of July 2023, there were over 20,040 databases within BioCyc. SRI International, based in Menlo Park, California, maintains the BioCyc database family.

MicrobesOnline

MicrobesOnline is a publicly and freely accessible website that hosts multiple comparative genomic tools for comparing microbial species at the genomic, transcriptomic and functional levels. MicrobesOnline was developed by the Virtual Institute for Microbial Stress and Survival, which is based at the Lawrence Berkeley National Laboratory in Berkeley, California. The site was launched in 2005, with regular updates until 2011.

STRING

In molecular biology, STRING is a biological database and web resource of known and predicted protein–protein interactions.

RegulonDB is a database of the regulatory network of gene expression in Escherichia coli K-12. RegulonDB also models the organization of the genes in transcription units, operons and regulons. It also includes 120 sRNAs with a total of 231 interactions, which together regulate 192 genes. RegulonDB was founded in 1998 and also contributes data to the EcoCyc database.

The CyberCell Database (CCDB) is a freely available, web-accessible database that provides quantitative genomic, proteomic as well as metabolomic data on Escherichia coli. Escherichia coli is perhaps the best-studied bacterium on the planet and has been the organism of choice for several international efforts in cell simulation. These cell simulation efforts require up-to-date web-accessible resources that provide comprehensive, non-redundant, and quantitative data on this bacterium. The intent of CCDB is to facilitate the collection, revision, coordination and storage of the key information required for in silico E. coli simulation.

The E. coli Metabolome Database (ECMDB) is a freely accessible, online database of small molecule metabolites found in or produced by Escherichia coli. Escherichia coli is perhaps the best-studied bacterium on earth and has served as the "model microbe" in microbiology research for more than 60 years. The ECMDB is essentially an E. coli "omics" encyclopedia containing detailed data on the genome, proteome and metabolome of E. coli. ECMDB is part of a suite of organism-specific metabolomics databases that includes DrugBank, HMDB, YMDB and SMPDB. As a metabolomics resource, the ECMDB is designed to facilitate research in the areas of gut/microbiome metabolomics and environmental metabolomics. The ECMDB contains two kinds of data: 1) chemical data and 2) molecular biology and/or biochemical data. The chemical data includes more than 2700 metabolite structures with detailed metabolite descriptions along with nearly 5000 NMR, GC-MS and LC-MS spectra corresponding to these metabolites. The biochemical data includes nearly 1600 protein sequences and more than 3100 biochemical reactions that are linked to these metabolite entries. Each metabolite entry in the ECMDB contains more than 80 data fields, with approximately 65% of the information devoted to chemical data and the other 35% devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases. The ECMDB also has a variety of structure and pathway viewing applets, and offers a number of text, sequence, spectral, chemical structure and relational query searches.

Monica Riley was an American scientist who contributed to the discovery of messenger RNA in her Ph.D. work with Arthur Pardee, and was later a pioneer in the exploration and computer representation of the Escherichia coli genome.

Model organism databases (MODs) are biological databases, or knowledgebases, dedicated to the provision of in-depth biological data for intensively studied model organisms. MODs allow researchers to easily find background information on large sets of genes, plan experiments efficiently, combine their data with existing knowledge, and construct novel hypotheses. They allow users to analyse results and interpret datasets, and the data they generate are increasingly used to describe less well studied species. Where possible, MODs share common approaches to collect and represent biological information. For example, all MODs use the Gene Ontology (GO) to describe functions, processes and cellular locations of specific gene products. Projects also exist to enable software sharing for curation, visualization and querying between different MODs. Organismal diversity and varying user requirements, however, mean that MODs are often required to customize capture, display, and provision of data.

Julio Collado-Vides is a Guatemalan scientist and Professor of Computational Genomics at the National Autonomous University of Mexico. His research focuses on genomics and bioinformatics.

Peter D. Karp is director of the Bioinformatics Research Group at SRI International in Menlo Park, California. Karp leads the development of the BioCyc database collection. BioCyc databases combine genome, metabolic pathway, and regulatory information for thousands of organisms.

SoyBase is a database created by the United States Department of Agriculture. It contains genetic information about soybeans. It includes genetic maps, information about Mendelian genetics and molecular data regarding genes and sequences. It was started in 1990 and is freely available to individuals and organizations worldwide.

Christos A. Ouzounis is a computational biologist, a director of research at the CERTH, and Professor of Bioinformatics at Aristotle University in Thessaloniki.

References

  1. Keseler IM, Collado-Vides J, Santos-Zavaleta A, Peralta-Gil M, Gama-Castro S, Muñiz-Rascado L, Bonavides-Martinez C, Paley S, Krummenacker M, Altman T, Kaipa P, Spaulding A, Pacheco J, Latendresse M, Fulcher C, Sarker M, Shearer AG, Mackie A, Paulsen I, Gunsalus RP, Karp PD (2011). "EcoCyc: a comprehensive database of Escherichia coli biology". Nucleic Acids Res. 39 (Database issue): D583–90. doi:10.1093/nar/gkq1143. PMC 3013716. PMID 21097882.
  2. Karp PD, Riley M, Paley SM, Pellegrini-Toole A (1996). "EcoCyc: an encyclopedia of Escherichia coli genes and metabolism". Nucleic Acids Res. 24 (1): 32–9. doi:10.1093/nar/24.1.32. PMC 145574. PMID 8594595.