|Molecules with drug-like properties and biological activity|
|Research center||European Molecular Biology Laboratory|
|Authors||Andrew Leach, Team Leader 2016-Present; John Overington, Team Leader 2008-2015|
|Primary citation||PMID 21948594|
|Web service URL||ChEMBL Webservices|
|Sparql endpoint||ChEMBL EBI-RDF Platform|
|License||The ChEMBL data is made available on a Creative Commons Attribution-Share Alike 3.0 Unported Licence|
ChEMBL or ChEMBLdb is a manually curated chemical database of bioactive molecules with drug-like properties. It is maintained by the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK.
The database, originally known as StARlite, was developed by a biotechnology company called Inpharmatica Ltd., later acquired by Galapagos NV. The data was acquired for EMBL in 2008 with an award from The Wellcome Trust, resulting in the creation of the ChEMBL chemogenomics group at EMBL-EBI, led by John Overington.
The ChEMBL database contains compound bioactivity data against drug targets. Bioactivity is reported as Ki, Kd, IC50, and EC50 values. Data can be filtered and analyzed to develop compound screening libraries for lead identification during drug discovery.
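As a rough illustration of the kind of filtering described above, the following sketch converts IC50 values to the logarithmic pIC50 scale (pIC50 = -log10 of the IC50 in mol/L) and keeps compounds above a potency cutoff. The record layout, the compound identifiers, the values, and the cutoff of 6.0 are all invented for illustration; they are not taken from ChEMBL or its schema.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


# Hypothetical records: names and numbers are made up for this example.
records = [
    {"compound": "cmpd-A", "type": "IC50", "value_nm": 12.0},
    {"compound": "cmpd-B", "type": "IC50", "value_nm": 4500.0},
    {"compound": "cmpd-C", "type": "Ki", "value_nm": 80.0},
    {"compound": "cmpd-D", "type": "IC50", "value_nm": 950.0},
]


def select_potent(rows, activity_type="IC50", min_pic50=6.0):
    """Return compounds of one activity type at or above a potency cutoff."""
    return [
        r["compound"]
        for r in rows
        if r["type"] == activity_type
        and pic50_from_nm(r["value_nm"]) >= min_pic50
    ]


hits = select_potent(records)
print(hits)
```

Only measurements of the same activity type are compared here, since Ki, Kd, IC50, and EC50 values are not directly interchangeable.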
ChEMBL version 2 (ChEMBL_02) was launched in January 2010 with 2.4 million bioassay measurements covering 622,824 compounds, including 24,000 natural products. The data was obtained by curating over 34,000 publications across twelve medicinal chemistry journals. ChEMBL's coverage of available bioactivity data has grown to become "the most comprehensive ever seen in a public database." In October 2010 ChEMBL version 8 (ChEMBL_08) was launched, with over 2.97 million bioassay measurements covering 636,269 compounds.
ChEMBL_10 saw the addition of the PubChem confirmatory assays, in order to integrate data that is comparable to the type and class of data contained within ChEMBL.
ChEMBLdb can be accessed via a web interface or downloaded via File Transfer Protocol (FTP). It is formatted in a manner amenable to computerized data mining, and attempts to standardize activities between different publications to enable comparative analysis. ChEMBL is also integrated into other large-scale chemistry resources, including PubChem and the ChemSpider system of the Royal Society of Chemistry.
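One simple form of the standardization mentioned above is bringing concentrations reported in different units onto a common scale. The sketch below normalizes values to nanomolar; it is only an illustration of the idea, not ChEMBL's actual curation procedure, and the unit spellings and sample values are assumptions.

```python
import math

# Conversion factors to nanomolar. The unit spellings are assumptions
# made for this sketch.
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def to_nanomolar(value: float, unit: str) -> float:
    """Convert a concentration to nanomolar, rejecting unknown units."""
    try:
        return value * TO_NM[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None


# Hypothetical values as they might appear in three different papers.
reported = [(0.5, "uM"), (500.0, "nM"), (5e-7, "M")]
standardized = [to_nanomolar(v, u) for v, u in reported]

# After conversion, all three describe the same concentration
# (up to floating-point rounding).
same = all(math.isclose(x, 500.0) for x in standardized)
print(standardized, same)
```

Rejecting unknown units, rather than guessing, keeps unconvertible measurements from silently entering a comparative analysis.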
In addition to the database, the ChEMBL group have developed tools and resources for data mining. These include Kinase SARfari, an integrated chemogenomics workbench focussed on kinases. The system incorporates and links sequence, structure, compounds and screening data.
GPCR SARfari is a similar workbench focused on GPCRs, and ChEMBL-Neglected Tropical Diseases (ChEMBL-NTD) is a repository for Open Access primary screening and medicinal chemistry data directed at endemic tropical diseases of the developing regions of Africa, Asia, and the Americas. The primary purpose of ChEMBL-NTD is to provide a freely accessible and permanent archive and distribution centre for deposited data.
July 2012 saw the release of a new malaria data service, sponsored by the Medicines for Malaria Venture (MMV), aimed at researchers around the globe. The data in this service includes compounds from the Malaria Box screening set, as well as the other donated malaria data found in ChEMBL-NTD.
myChEMBL, the ChEMBL virtual machine, was released in October 2013 to allow users to access a complete and free, easy-to-install cheminformatics infrastructure.
In December 2013, the operations of the SureChem patent informatics database were transferred to EMBL-EBI. In a portmanteau, SureChem was renamed SureChEMBL.
2014 saw the introduction of a new resource, ADME SARfari, a tool for predicting and comparing cross-species ADME targets.
In the fields of medicine, biotechnology and pharmacology, drug discovery is the process by which new candidate medications are discovered.
Cheminformatics is the use of computer and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used, for example, in pharmaceutical companies and academic settings in the process of drug discovery. These methods can also be used in chemical and allied industries in various other forms.
A biological target is anything within a living organism to which some other entity is directed and/or binds, resulting in a change in its behavior or function. Examples of common classes of biological targets are proteins and nucleic acids. The definition is context-dependent, and can refer to the biological target of a pharmacologically active drug compound, the receptor target of a hormone, or some other target of an external stimulus. Biological targets are most commonly proteins such as enzymes, ion channels, and receptors.
UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature.
The European Bioinformatics Institute (EMBL-EBI) is an International Governmental Organization (IGO) which, as part of the European Molecular Biology Laboratory (EMBL) family, focuses on research and services in bioinformatics. It is located on the Wellcome Genome Campus in Hinxton near Cambridge, and employs over 600 full-time equivalent (FTE) staff.
PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains substance descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. More than 80 database vendors contribute to the growing PubChem database.
Chemical Entities of Biological Interest, also known as ChEBI, is a database and ontology of molecular entities focused on 'small' chemical compounds, that is part of the Open Biomedical Ontologies effort. The term "molecular entity" refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity". The molecular entities in question are either products of nature or synthetic products which have potential bioactivity. Molecules directly encoded by the genome, such as nucleic acids, proteins and peptides derived from proteins by proteolytic cleavage, are not as a rule included in ChEBI.
This page describes mining for molecules. Since molecules may be represented by molecular graphs this is strongly related to graph mining and structured data mining. The main problem is how to represent molecules while discriminating the data instances. One way to do this is chemical similarity metrics, which has a long tradition in the field of cheminformatics.
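A common chemical similarity metric is the Tanimoto (Jaccard) coefficient, the ratio of shared features to total distinct features between two molecular fingerprints. The sketch below computes it on fingerprints represented as sets of "on" bit positions; the bit positions are made up for illustration, and returning 0.0 for two empty fingerprints is a convention chosen here, not a standard.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch
    return len(a & b) / len(a | b)


# Hypothetical fingerprints: each integer is the index of a set bit.
fp1 = {1, 4, 7, 9}
fp2 = {1, 4, 8, 9, 12}

# Three shared bits out of six distinct bits overall.
score = tanimoto(fp1, fp2)
print(score)
```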
Chemogenomics, or chemical genomics, is the systematic screening of targeted chemical libraries of small molecules against individual drug target families with the ultimate goal of identification of novel drugs and drug targets. Typically some members of a target library have been well characterized where both the function has been determined and compounds that modulate the function of those targets have been identified. Other members of the target family may have unknown function with no known ligands and hence are classified as orphan receptors. By identifying screening hits that modulate the activity of the less well characterized members of the target family, the function of these novel targets can be elucidated. Furthermore, the hits for these targets can be used as a starting point for drug discovery. The completion of the human genome project has provided an abundance of potential targets for therapeutic intervention. Chemogenomics strives to study the intersection of all possible drugs on all of these potential targets.
Virtual screening (VS) is a computational technique used in drug discovery to search libraries of small molecules in order to identify those structures which are most likely to bind to a drug target, typically a protein receptor or enzyme.
ChemSpider is a database of chemicals. ChemSpider is owned by the Royal Society of Chemistry.
Collaborative Drug Discovery (CDD) is a software company founded in 2004 as a spin-out of Eli Lilly by Barry Bunin, PhD. CDD offers a web-based database solution for managing drug discovery data, primarily around small molecules and associated bio-assay data.
Aureus Sciences was a research-based company which sold software to the pharmaceutical industry for drug development.
Druggability is a term used in drug discovery to describe a biological target that is known to or is predicted to bind with high affinity to a drug. Furthermore, by definition, the binding of the drug to a druggable target must alter the function of the target with a therapeutic benefit to the patient. The concept of druggability is most often restricted to small molecules but also has been extended to include biologic medical products such as therapeutic monoclonal antibodies.
The TDR Targets database is a bioinformatics project that seeks to exploit the availability of diverse genomic and chemical datasets to facilitate the identification and prioritization of drugs and drug targets in neglected disease pathogens. TDR in the name of the database comes from the popular abbreviation for a special programme within the World Health Organization, whose focus is Tropical Disease Research. The project was jumpstarted by funds from this programme, and the initial focus of the resource was on organisms and diseases of high priority for the programme.
The European Nucleotide Archive (ENA) is a repository providing free and unrestricted access to annotated DNA and RNA sequences. It also stores complementary information such as experimental procedures, details of sequence assembly and other metadata related to sequencing projects. The archive is composed of three main databases: the Sequence Read Archive, the Trace Archive and the EMBL Nucleotide Sequence Database. The ENA is produced and maintained by the European Bioinformatics Institute and is a member of the International Nucleotide Sequence Database Collaboration (INSDC) along with the DNA Data Bank of Japan and GenBank.
Christoph Steinbeck is a chemist, born in Neuwied in 1966, who holds a professorship in analytical chemistry, cheminformatics and chemometrics at the Friedrich-Schiller-Universität Jena in Thuringia, Germany.
The IUPHAR/BPS Guide to PHARMACOLOGY is an open-access website, acting as a portal to information on the biological targets of licensed drugs and other small molecules. The Guide to PHARMACOLOGY is developed as a joint venture between the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). This replaces and expands upon the original 2009 IUPHAR Database. The Guide to PHARMACOLOGY aims to provide a concise overview of all pharmacological targets, accessible to all members of the scientific and clinical communities and the interested public, with links to details on a selected set of targets. The information featured includes pharmacological data, target and gene nomenclature, as well as curated chemical information for ligands. Overviews and commentaries on each target family are included, with links to key references.
Experimental factor ontology, also known as EFO, is an open-access ontology of experimental variables particularly those used in molecular biology. The ontology covers variables which include aspects of disease, anatomy, cell type, cell lines, chemical compounds and assay information. EFO is developed and maintained at the EMBL-EBI as a cross-cutting resource for the purposes of curation, querying and data integration in resources such as Ensembl, ChEMBL and Expression Atlas.
Alexander George Bateman is a computational biologist and Head of Protein Sequence Resources at the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL) in Cambridge, UK. He has led the development of the Pfam biological database and introduced the Rfam database of RNA families. He has also been involved in the use of Wikipedia for community-based annotation of biological databases.