Content | |
---|---|
Description | Molecular and biochemical information on enzymes that have been classified by the IUBMB |
Contact | |
Research center | Technische Universität Braunschweig, BRICS - Braunschweig Integrated Centre of Systems Biology |
Primary citation | PMID 33211880 |
Release date | 2021 |
Access | |
Website | http://www.brenda-enzymes.org |
Download URL | Download BRENDA |
Web service URL | SOAP access |
BRENDA (BRaunschweig ENzyme DAtabase) is the world's most comprehensive online database for functional, biochemical and molecular biological data on enzymes, metabolites and metabolic pathways. It contains data on the properties, function and significance of all enzymes classified by the Enzyme Commission of the International Union of Biochemistry and Molecular Biology (IUBMB). As an ELIXIR Core Data Resource, BRENDA is considered a data resource of critical importance to the international life sciences research community. The database compiles a representative overview of enzymes and metabolites using current research data from primary scientific literature and thus makes it easier for researchers to retrieve information. BRENDA is licensed under the Creative Commons license CC BY 4.0, is accessible worldwide and can be used free of charge. [1] As one of the digital resources of the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, BRENDA is part of the integrated biodata infrastructure DSMZ Digital Diversity.
BRENDA was founded in 1987 by Dietmar Schomburg at the former German Research Centre for Biotechnology, now the Helmholtz Centre for Infection Research, in Braunschweig.
Schomburg's basic idea was to compile the most relevant enzyme data from the primary scientific literature in a standardized form in a generally accessible information system, thus making it easier for researchers to search the literature. He saw researchers facing a growing challenge in obtaining information, as the first large genome sequencing projects rapidly increased the amount of functional enzyme data, while information at that time still had to be extracted manually from printed publications in various journals. [2]
Initially, the enzyme data was published as a series of books. Springer-Verlag published the first of nineteen editions of the "Springer Handbook of Enzymes" in 1990, which contained data on over 3000 EC classes. A second edition with 39 issues containing data on over 4900 EC classes was published from 2001 to 2009. [2] [3]
In 1996, Dietmar Schomburg accepted an appointment at the University of Cologne, where he and his working group further developed the data collection into a globally accessible, free online information system, which became available in 1998 in the SRS system of the European Bioinformatics Institute (EBI). [4] In the following year, a separate full-text database was developed, which was accessible via the BRENDA website of the University of Cologne. [2] [5] In 2004 it was converted into a relational database. In 2007, Schomburg returned to the Technische Universität Braunschweig. Since then, the BRENDA team has been based at the Braunschweig Integrated Centre of Systems Biology (BRICS).
Since 2015, BRENDA has been part of de.NBI, the German network for bioinformatics infrastructure, and is part of the Center for Biological Data (BioData). [2] In June 2018, BRENDA was included in the prestigious list of Core Data Resources maintained by ELIXIR, a European initiative for digital research infrastructure in biomedicine. [6] In 2022, the database was also awarded Global Core Biodata Resource status by the Global Biodata Coalition. [7] Since January 2023, BRENDA has been part of the Leibniz Institute DSMZ and receives permanent funding as part of the networked data services DSMZ Digital Diversity. [8]
The BRENDA content covers organisms from all domains of life and is geared to the broad interests of the scientific community across the life sciences, including systems biology, biotechnology, medicine and pharmaceutical research.
The enzyme-specific data in BRENDA are annotated from scientific literature and assigned an EC number (Enzyme Commission number). The EC numbers are part of a system established by the IUBMB that classifies enzymes according to their catalytic activity, i.e. the chemical reaction they catalyze. The IUBMB Enzyme Commission has so far defined over 8300 EC numbers in seven main classes, all of which, including the obsolete ones, can be found in BRENDA. The data on all enzymes of an EC number are displayed on a common overview page (Enzyme Summary Page) and can be narrowed down to individual enzymes via filter options. The Enzyme Summary Page shows the name defined by the IUBMB for enzymes of this class, the reaction scheme that defines this enzyme class and a commentary by the Enzyme Commission. The information presented here also includes enzyme nomenclature, substrates and products of the catalyzed reactions, inhibiting and activating ligands, enzyme structure, isolation and purification, enzyme stability, kinetic parameters such as Km values and turnover numbers, occurrence and intracellular localization, and mutations.
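The four-field structure of an EC number can be illustrated with a minimal sketch that maps the first field to one of the seven IUBMB main classes. The function name `main_class` is hypothetical and is not part of any BRENDA interface; the class names are those of the IUBMB classification.

```python
# Names of the seven IUBMB main classes, keyed by the first EC field.
EC_MAIN_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def main_class(ec_number: str) -> str:
    """Return the main class name for an EC number such as '1.1.1.1'.

    Hypothetical helper for illustration only.
    """
    text = ec_number.strip()
    # Tolerate an optional "EC" prefix, e.g. "EC 1.1.1.1".
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields: {ec_number!r}")
    first = int(fields[0])  # raises ValueError if the field is not numeric
    if first not in EC_MAIN_CLASSES:
        raise ValueError(f"unknown main class {first} in {ec_number!r}")
    return EC_MAIN_CLASSES[first]
```

Only the first field is interpreted here; the remaining three fields refine the classification within a main class and are left untouched by this sketch.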
The literature base of the data of an EC number can comprise several hundred publications if it contains medically or industrially relevant and thus well-studied enzymes. Each entry is linked to a literature reference and an organism from which the enzyme originates. [9] If the protein sequence is known and has been published, entries are also assigned to a specific protein sequence in the UniProt database. BRENDA provides links to other online information systems to which the entries are linked. In addition to ExplorEnz, the enzyme information system of the IUBMB, these include DSMZ databases such as BacDive and CellDive, protein sequence and protein structure databases such as UniProt and PDB, literature databases such as PubMed and Europe PubMed Central and ontologies such as NCBI-MeSH.
In addition to the enzyme database, BRENDA contains a database with information on ligands, mostly low-molecular compounds that interact with enzymes. Depending on their role in enzymatic reactions, these are categorized as substrate, product, inhibitor, activator, cofactor or as metals and ions (if their function is not specified in the literature). These molecules can have different functions, e.g. they can be metabolites of primary metabolism, naturally occurring antibiotics or synthetic compounds used in the development of drugs or pesticides. All information on a ligand annotated in BRENDA can be accessed centrally on a summary page (Ligand Summary Page). The information presented here includes structural and molecular formula, InChIKey (International Chemical Identifier), synonyms and information on the role in enzymatic reactions including reaction equations and kinetic data such as inhibitor constants. Each entry is linked to a reference and an EC number.
The search bar on the homepage is used to quickly search for terms in specific data categories, while the Advanced Search function can be used to narrow down various search parameters and thus perform a targeted query. The Full-text Search enables an all-encompassing search of terms in all text fields of the database, including the comment fields, whose content is always visible on the Summary Pages, but can only be specifically queried using this search function.
Ligand data can be found not only by querying ligand names, but also by their structure. Via the search mask of the Ligand Structure Search, [10] which uses the JavaScript-based JSME molecule editor, [11] users can draw a chemical structure and search the BRENDA ligand database for substructures, isomers or similar structures.
In addition to these browser-based query options, users can obtain the BRENDA data via a SOAP API or as an SBML download. The manually curated data can also be downloaded in JSON or plain-text (txt) format.
The BRENDA additional functions offer further ways to access the data.
In addition to the annotation of new data, BRENDA is constantly developing new database functions such as ontologies or visualizations, which open up further access paths to the data, show correlations and help to answer specific questions.
The BRENDA Tissue Ontology is a comprehensive and structured ontology with terms for tissues, organs, anatomical structures, plant parts, cell cultures, cell types and cell lines in organisms from all taxonomic groups in which enzymes can occur. It is a hierarchically organized set of controlled terms. [12]
The BRENDA Metabolic Pathways graphically summarize the reaction equations annotated in BRENDA into metabolic pathways. They are drawn manually by the BRENDA curators. The BRENDA Metabolic Pathways visualize metabolic pathways that are largely described scientifically and whose reactions, enzymes and ligands can mostly be found in BRENDA. Search and filter functions can be used to highlight metabolic pathways, EC numbers or ligands (also organism-specific). [4]
BRENDA also provides additional tools, the most important of which are described below.
The data in BRENDA comes from primary scientific literature. The process of integrating new data begins with a manual literature search in PubMed and Scopus and the selection of relevant, high-quality and comprehensive publications. Data from these publications is annotated, and the result is then checked a second time for quality. All these steps are carried out manually by scientific staff with relevant expertise. In parallel, the structural formulas of the ligands are created and curated manually. After curation, the data undergo several hundred automated checks that verify their formal correctness as part of the integration into the database. New data is published twice a year on the BRENDA website.
The seven EC classes are not updated in parallel, but periodically one after the other. For data of newly described enzymes that do not match any existing EC number, new BRENDA-internal EC numbers are created that contain the capital letter "B" before the last digit. These are provisional auxiliary enzyme classes that have not yet been officially approved by the IUBMB. As soon as a sufficient amount of reliable scientific data on a B-number has been annotated in BRENDA, BRENDA staff submit it to the IUBMB Enzyme Commission as a new proposal for review. The BRENDA curators are themselves part of the IUBMB Nomenclature Committee. New EC numbers are immediately added to the BRENDA database and published online with the next release.
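The naming rule for provisional numbers can be sketched as a pattern check. This is a minimal illustration that assumes the "B" appears directly before the digits of the fourth field, as described above; the function name `is_provisional` and the example numbers in the usage note are hypothetical.

```python
import re

# Four dot-separated fields; the last field may carry a "B" prefix,
# which marks a BRENDA-internal, provisional enzyme class.
EC_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.(B?)(\d+)$")


def is_provisional(ec_number: str) -> bool:
    """Return True if the EC number has a 'B' before its last field.

    Hypothetical helper; raises ValueError for strings that do not
    look like an EC number at all.
    """
    match = EC_PATTERN.match(ec_number.strip())
    if match is None:
        raise ValueError(f"not a recognizable EC number: {ec_number!r}")
    return match.group(4) == "B"
```

Under these assumptions, a string such as `"1.1.1.B1"` is reported as provisional and `"1.1.1.1"` as an official number.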
Due to the manual and selective annotation process, the literature base and the associated amount of data in BRENDA are limited in size. In 2006, a computer-aided information retrieval function (text mining) was established to expand the manually curated data core. Computer-aided methods search the specialist literature available online and automatically annotate certain information in the corresponding data categories. Four text mining information systems can be used in BRENDA: FRENDA (Full Reference ENzyme DAta), AMENDA (Automatic Mining of ENzyme DAta), DRENDA (Disease-Related ENzyme information DAtabase) and KENDA (Kinetic ENzyme DAta). The PubMed literature database serves as the basis for text mining. To obtain the information relevant for BRENDA, all titles and abstracts of scientific articles in PubMed are searched for specific text patterns and terms, and the hits are saved and processed for BRENDA. [9] [14] [15] BRENDA staff do not quality-check the data acquired by text mining, but the AMENDA results include an automatic quality assessment that helps users judge the scientific reliability of the results. [14]
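The general idea of searching titles and abstracts for specific terms can be illustrated with a toy dictionary-based co-occurrence search. This is not BRENDA's actual text mining method, which is not specified here; the function names, the record layout (`id`, `title`, `abstract`) and the term lists are all assumptions made for the sketch.

```python
import re


def find_terms(text, terms):
    """Return the terms that occur in text as whole words (case-insensitive)."""
    found = []
    for term in terms:
        # Lookarounds keep the term from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def cooccurrences(records, enzyme_names, disease_terms):
    """List (record id, enzyme, disease) triples found in the same record.

    Each record is a dict with the hypothetical keys 'id', 'title'
    and 'abstract'.
    """
    hits = []
    for record in records:
        text = record["title"] + " " + record["abstract"]
        enzymes = find_terms(text, enzyme_names)
        diseases = find_terms(text, disease_terms)
        for enzyme in enzymes:
            for disease in diseases:
                hits.append((record["id"], enzyme, disease))
    return hits
```

A plain co-occurrence search like this produces many false positives, which is one reason an automatic quality assessment of the kind mentioned for AMENDA is useful to users.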