DrugBank

Content
Description: Drug database
Data types captured: Chemical structures, small molecule drugs, biotech drugs, drug targets, drug transporters, drug target sequences, drug target SNPs, drug metabolites, drug descriptions, disease associations, dosage data, food and drug interactions, adverse drug reactions, pharmacology, mechanisms of action, drug metabolism, chemical synthesis, patent and pricing data, chemical properties, nomenclature, synonyms, chemical taxonomy, drug NMR spectra, drug GC-MS spectra, drug LC-MS spectra
Contact
Research center: University of Alberta and The Metabolomics Innovation Centre, Alberta, Canada
Laboratory: David S. Wishart
Primary citation: DrugBank: a comprehensive resource for in silico drug discovery and exploration. [1]
Access
Website: www.drugbank.com
Download URL: www.drugbank.ca/downloads
Miscellaneous
Data release frequency: Every 2 years with monthly corrections and updates
Curation policy: Manually curated

The DrugBank database is a comprehensive, freely accessible, online database containing information on drugs and drug targets, created and maintained by the University of Alberta and The Metabolomics Innovation Centre located in Alberta, Canada. [1] As both a bioinformatics and a cheminformatics resource, DrugBank combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. [1] [2] DrugBank has used content from Wikipedia, and Wikipedia in turn often links to DrugBank, which poses potential circular reporting issues. [3]


The DrugBank Online website is available to the public as a free-to-access resource. However, use and re-distribution of content from DrugBank Online or the underlying DrugBank Data, in whole or part, and for any purpose requires a license. Academic users can apply for a free license for certain use cases while all other users require a paid license.

The latest release of the database (version 5.0) contains 9591 drug entries including 2037 FDA-approved small molecule drugs, 241 FDA-approved biotech (protein/peptide) drugs, 96 nutraceuticals and over 6000 experimental drugs. [4] Additionally, 4270 non-redundant protein (i.e. drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry (Fig. 1) contains more than 200 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data. [4]
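As a rough illustration of the two-part layout described above, the following Python sketch models a DrugCard-like record with one group of drug-side fields and one group of linked protein records. The class names, field names and sample values are hypothetical and chosen for illustration only; they are not DrugBank's actual schema or field list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProteinRecord:
    """Protein-side data for one linked protein (hypothetical fields)."""
    name: str
    role: str           # "target", "enzyme", "transporter" or "carrier"
    sequence: str = ""  # amino-acid sequence, left empty if unknown


@dataclass
class DrugCardSketch:
    """Drug-side data plus linked protein records (hypothetical fields)."""
    name: str
    drug_type: str                                      # e.g. "small molecule" or "biotech"
    groups: List[str] = field(default_factory=list)     # e.g. ["approved"]
    synonyms: List[str] = field(default_factory=list)
    proteins: List[ProteinRecord] = field(default_factory=list)

    def proteins_by_role(self, role: str) -> List[ProteinRecord]:
        """Return the linked proteins that play the given role."""
        return [p for p in self.proteins if p.role == role]


# Made-up example values, not taken from DrugBank.
card = DrugCardSketch(
    name="ExampleDrug",
    drug_type="small molecule",
    groups=["approved"],
    synonyms=["Example-1"],
    proteins=[
        ProteinRecord(name="Example receptor", role="target", sequence="MKT"),
        ProteinRecord(name="Example oxidase", role="enzyme"),
    ],
)

print([p.name for p in card.proteins_by_role("target")])
```

Keeping the protein records in a separate list mirrors the idea that several drug entries can link to the same non-redundant protein sequence, although a real implementation would likely reference proteins by identifier rather than embed them.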

Four additional databases, HMDB, [5] T3DB, [6] SMPDB [7] and FooDB, are also part of a general suite of metabolomic/cheminformatic databases. HMDB contains equivalent information on more than 40,000 human metabolites, T3DB contains information on 3100 common toxins and environmental pollutants, SMPDB contains pathway diagrams for nearly 700 human metabolic and disease pathways, and FooDB contains equivalent information on ~28,000 food components and food additives.

Version history

The first version of DrugBank was released in 2006. [1] This early release contained relatively modest information about 841 FDA-approved small molecule drugs and 113 biotech drugs. It also included information on 2133 drug targets.

The second version of DrugBank was released in 2009. [2] This greatly expanded and improved version of the database included 1344 approved small molecule drugs and 123 biotech drugs as well as 3037 unique drug targets. Version 2.0 also included, for the first time, withdrawn drugs and illicit drugs, extensive food-drug and drug-drug interactions as well as ADMET (absorption, distribution, metabolism, excretion and toxicity) parameters.

Version 3.0 was released in 2011. [8] This version contained 1424 approved small molecule drugs and 132 biotech drugs as well as >4000 unique drug targets. Version 3.0 also included drug transporter data, drug pathway data, drug pricing, patent and manufacturing data as well as data on >5000 experimental drugs.

Version 4.0 was released in 2014. [4] This version included 1558 FDA-approved small molecule drugs, 155 biotech drugs and 4200 unique drug targets. Version 4.0 also incorporated extensive information on drug metabolites (structures and reactions), drug taxonomy, drug spectra, drug binding constants and drug synthesis information. Table 1 provides a more complete statistical summary of the history of DrugBank’s development.

Table 1. Comparison between the coverage in DrugBank 1.0, 2.0, 3.0 and DrugBank 4.0.

| Category | 1.0 | 2.0 | 3.0 | 4.0 |
| --- | --- | --- | --- | --- |
| No. of data fields per DrugCard | 88 | 108 | 148 | 208 |
| No. of search types | 8 | 12 | 16 | 18 |
| No. of illustrated drug-action pathways | 0 | 0 | 168 | 232 |
| No. of drugs with metabolizing enzyme data | 0 | 0 | 762 | 1,037 |
| No. of drug metabolites with structures | 0 | 0 | 0 | 1,239 |
| No. of drug-metabolism reactions | 0 | 0 | 0 | 1,308 |
| No. of illustrated drug metabolism pathways | 0 | 0 | 0 | 53 |
| No. of drugs with drug transporter data | 0 | 0 | 516 | 623 |
| No. of drugs with taxonomic classification information | 0 | 0 | 0 | 6,713 |
| No. of SNP-associated drug effects | 0 | 0 | 113 | 201 |
| No. of drugs with patent/pricing/manufacturer data | 0 | 0 | 1,208 | 1,450 |
| No. of food–drug interactions | 0 | 714 | 1,039 | 1,180 |
| No. of drug–drug interactions | 0 | 13,242 | 13,795 | 14,150 |
| No. of ADMET parameters (Caco-2, LogS) | 0 | 276 | 890 | 6,667 |
| No. of QSAR parameters per drug | 5 | 6 | 14 | 23 |
| No. of drugs with drug-target binding constant data | 0 | 0 | 0 | 791 |
| No. of drugs with NMR spectra | 0 | 0 | 0 | 306 |
| No. of drugs with MS spectra | 0 | 0 | 0 | 384 |
| No. of drugs with chemical synthesis information | 0 | 38 | 38 | 1,285 |
| No. of FDA-approved small molecule drugs | 841 | 1,344 | 1,424 | 1,558 |
| No. of biotech drugs | 113 | 123 | 132 | 155 |
| No. of nutraceutical drugs | 61 | 69 | 82 | 87 |
| No. of withdrawn drugs | 0 | 57 | 68 | 78 |
| No. of illicit drugs | 0 | 188 | 189 | 190 |
| No. of experimental drugs | 2,894 | 3,116 | 5,210 | 6,009 |
| Total No. of experimental and FDA small molecule drugs | 3,796 | 4,774 | 6,684 | 7,561 |
| Total No. of experimental and FDA drugs (all types) | 3,909 | 4,897 | 6,816 | 7,713 |
| No. of all drug targets (unique) | 2,133 | 3,037 | 4,326 | 4,115 |
| No. of approved-drug enzymes/carriers (unique) | 0 | 0 | 164 | 245 |
| No. of all drug enzymes/carriers (unique) | 0 | 0 | 169 | 253 |
| No. of external database links | 12 | 18 | 31 | 33 |
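
To make the growth figures in Table 1 easier to compare, the short Python sketch below computes the change between two DrugBank versions for a few rows. The counts are copied from Table 1; the dictionary layout and the function name `growth` are illustrative assumptions rather than part of any DrugBank tooling.

```python
VERSIONS = ("1.0", "2.0", "3.0", "4.0")

# A few rows copied from Table 1, one count per version.
TABLE_1_ROWS = {
    "FDA-approved small molecule drugs": (841, 1344, 1424, 1558),
    "Biotech drugs": (113, 123, 132, 155),
    "Experimental drugs": (2894, 3116, 5210, 6009),
    "Data fields per DrugCard": (88, 108, 148, 208),
}


def growth(counts, start="1.0", end="4.0"):
    """Return (absolute change, percent change) between two versions.

    Assumes the starting count is non-zero, which holds for the rows above.
    """
    first = counts[VERSIONS.index(start)]
    last = counts[VERSIONS.index(end)]
    return last - first, 100.0 * (last - first) / first


for label, counts in TABLE_1_ROWS.items():
    delta, percent = growth(counts)
    print(f"{label}: +{delta} ({percent:.1f}%)")
```

Rows that start at zero in version 1.0 (for example drug–drug interactions) have no meaningful percent change from that baseline, so they would need a later starting version, such as `growth(counts, start="2.0")`.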

Scope and access

All data in DrugBank is derived from public non-proprietary sources. Nearly every data item is fully traceable and explicitly referenced to the original source. DrugBank data is available through a public web interface. [9]


Related Research Articles

Metabolome

The metabolome refers to the complete set of small-molecule chemicals found within a biological sample. The biological sample can be a cell, a cellular organelle, an organ, a tissue, a tissue extract, a biofluid or an entire organism. The small molecule chemicals found in a given metabolome may include both endogenous metabolites that are naturally produced by an organism as well as exogenous chemicals that are not naturally produced by an organism.

Orciprenaline: Chemical compound

Orciprenaline, also known as metaproterenol, is a bronchodilator used in the treatment of asthma. Orciprenaline is a moderately selective β2 adrenergic receptor agonist that stimulates receptors of the smooth muscle in the lungs, uterus, and vasculature supplying skeletal muscle, with minimal or no effect on α adrenergic receptors. The pharmacologic effects of β adrenergic agonist drugs, such as orciprenaline, are at least in part attributable to stimulation through β adrenergic receptors of intracellular adenylyl cyclase, the enzyme which catalyzes the conversion of ATP to cAMP. Increased cAMP levels are associated with relaxation of bronchial smooth muscle and inhibition of release of mediators of immediate hypersensitivity from many cells, especially from mast cells.

Glycochenodeoxycholic acid: Chemical compound

Glycochenodeoxycholic acid is a bile salt formed in the liver from chenodeoxycholic acid and glycine, usually found as the sodium salt. It acts as a detergent to solubilize fats for absorption.

Therapeutic Targets Database: Database of protein targets in drug design

Therapeutic Target Database (TTD) is a pharmaceutical and medical repository constructed by the Innovative Drug Research and Bioinformatics Group (IDRB) at Zhejiang University, China and the Bioinformatics and Drug Design Group at the National University of Singapore. It provides information about known and explored therapeutic protein and nucleic acid targets, the targeted disease, pathway information and the corresponding drugs directed at each of these targets. It also provides detailed knowledge about target function, sequence, 3D structure, ligand binding properties, enzyme nomenclature and drug structure, therapeutic class, and clinical development status. TTD is freely accessible without any login requirement.

PDBsum is a database that provides an overview of the contents of each 3D macromolecular structure deposited in the Protein Data Bank. The original version of the database was developed around 1995 by Roman Laskowski and collaborators at University College London. As of 2014, PDBsum is maintained by Laskowski and collaborators in the laboratory of Janet Thornton at the European Bioinformatics Institute (EBI).

Druggability is a term used in drug discovery to describe a biological target that is known to or is predicted to bind with high affinity to a drug. Furthermore, by definition, the binding of the drug to a druggable target must alter the function of the target with a therapeutic benefit to the patient. The concept of druggability is most often restricted to small molecules but also has been extended to include biologic medical products such as therapeutic monoclonal antibodies.

Human Metabolome Database: Database of human metabolites

The Human Metabolome Database (HMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in the human body. It has been created by the Human Metabolome Project funded by Genome Canada and is one of the first dedicated metabolomics databases. The HMDB facilitates human metabolomics research, including the identification and characterization of human metabolites using NMR spectroscopy, GC-MS spectrometry and LC/MS spectrometry. To aid in this discovery process, the HMDB contains three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data (Fig. 1–3). The chemical data includes 41,514 metabolite structures with detailed descriptions along with nearly 10,000 NMR, GC-MS and LC/MS spectra.

Toxin and Toxin-Target Database

The Toxin and Toxin-Target Database (T3DB), also known as the Toxic Exposome Database, is a freely accessible online database of common substances that are toxic to humans, along with their protein, DNA or organ targets. The database currently houses nearly 3,700 toxic compounds or poisons described by nearly 42,000 synonyms. This list includes various groups of toxins, including common pollutants, pesticides, drugs, food toxins, household and industrial/workplace toxins, cigarette toxins, and uremic toxins. These toxic substances are linked to 2,086 corresponding protein/DNA target records. In total there are 42,433 toxic substance-toxin target associations. Each toxic compound record (ToxCard) in T3DB contains nearly 100 data fields and holds information such as chemical properties and descriptors, mechanisms of action, toxicity or lethal dose values, molecular and cellular interactions, medical information, NMR and MS spectra, and up- and down-regulated genes. This information has been extracted from over 18,000 sources, which include other databases, government documents, books, and scientific literature.

The Small Molecule Pathway Database (SMPDB) is a comprehensive, high-quality, freely accessible, online database containing more than 600 small molecule (i.e. metabolic) pathways found in humans. SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology. It is able to do so, in part, by providing colorful, detailed, fully searchable, hyperlinked diagrams of five types of small molecule pathways: 1) general human metabolic pathways; 2) human metabolic disease pathways; 3) human metabolite signaling pathways; 4) drug-action pathways and 5) drug metabolism pathways. SMPDB pathways may be navigated, viewed and zoomed interactively using a Google Maps-like interface. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures (Fig. 1). Each small molecule in SMPDB is hyperlinked to detailed descriptions contained in the HMDB or DrugBank and each protein or enzyme complex is hyperlinked to UniProt. Additionally, all SMPDB pathways are accompanied with detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram. Users can browse the SMPDB (Fig. 2) or search its contents by text searching (Fig. 3), sequence searching, or chemical structure searching. More powerful queries are also possible including searching with lists of gene or protein names, drug names, metabolite names, GenBank IDs, Swiss-Prot IDs, Agilent or Affymetrix microarray IDs. These queries will produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein concentration data can also be visualized through SMPDB's mapping interface.

MetaboAnalyst is a set of online tools for metabolomic data analysis and interpretation, created by members of the Wishart Research Group at the University of Alberta. It was first released in May 2009 and version 2.0 was released in January 2012. MetaboAnalyst provides a variety of analysis methods that have been tailored for metabolomic data. These methods include metabolomic data processing, normalization, multivariate statistical analysis, and data annotation. The current version is focused on biomarker discovery and classification.

Forasartan: Chemical compound

Forasartan, otherwise known as the compound SC-52458, is a nonpeptide angiotensin II receptor antagonist (ARB, AT1 receptor blocker).

5-Androstenedione: Chemical compound

5-Androstenedione, also known as androst-5-ene-3,17-dione, is a prohormone of testosterone. The World Anti-Doping Agency prohibits its use in athletes. In the United States, it is a controlled substance.

The Yeast Metabolome Database (YMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in or produced by Saccharomyces cerevisiae. The YMDB was designed to facilitate yeast metabolomics research, specifically in the areas of general fermentation as well as wine, beer and fermented food analysis. YMDB supports the identification and characterization of yeast metabolites using NMR spectroscopy, GC-MS spectrometry and Liquid chromatography–mass spectrometry. The YMDB contains two kinds of data: 1) chemical data and 2) molecular biology/biochemistry data. The chemical data includes 2027 metabolite structures with detailed metabolite descriptions along with nearly 4000 NMR, GC-MS and LC/MS spectra.

Metabolite Set Enrichment Analysis (MSEA) is a method designed to help metabolomics researchers identify and interpret patterns of metabolite concentration changes in a biologically meaningful way. It is conceptually similar to another widely used tool developed for transcriptomics called Gene Set Enrichment Analysis or GSEA. GSEA uses a collection of predefined gene sets to rank the lists of genes obtained from gene chip studies. By using this “prior knowledge” about gene sets researchers are able to readily identify significant and coordinated changes in gene expression data while at the same time gaining some biological context. MSEA does the same thing by using a collection of predefined metabolite pathways and disease states obtained from the Human Metabolome Database. MSEA is offered as a service both through a stand-alone web server and as part of a larger metabolomics analysis suite called MetaboAnalyst.

BASys is a freely available web server that can be used to perform automated, comprehensive annotation of bacterial genomes. With the advent of next generation DNA sequencing it is now possible to sequence the complete genome of a bacterium within a single day. This has led to an explosion in the number of fully sequenced microbes. In fact, as of 2013, there were more than 2700 fully sequenced bacterial genomes deposited with GenBank. However, a continuing challenge with microbial genomics is finding the resources or tools for annotating the large number of newly sequenced genomes. BASys was developed in 2005 in anticipation of these needs. In fact, BASys was the world’s first publicly accessible microbial genome annotation web server. Because of its widespread popularity, the BASys server was updated in 2011 through the addition of multiple server nodes to handle the large number of queries it was receiving.

FooDB is a freely available, open-access database containing chemical composition data on common, unprocessed foods. It also contains extensive data on flavour and aroma constituents, food additives as well as positive and negative health effects associated with food constituents. The database contains information on more than 28,000 chemicals found in more than 1000 raw or unprocessed food products. The data in FooDB was collected from many sources including textbooks, scientific journals, on-line food composition or nutrient databases, flavour and aroma databases and various on-line metabolomic databases. This literature-derived information has been combined with experimentally derived data measured on thousands of compounds from more than 40 very common food products through the Alberta Food Metabolome Project which is led by David S. Wishart. Users are able to browse through the FooDB data by food source, name, descriptors or function. Chemical structures and molecular weights for compounds in FooDB may be searched via a specialized chemical structure search utility. Users are able to view the content of FooDB using two different “Viewing” options: FoodView, which lists foods by their chemical compounds, or ChemView, which lists chemicals by their food sources. Knowledge about the precise chemical composition of foods can be used to guide public health policies, assist food companies with improved food labelling, help dieticians prepare better dietary plans, support nutraceutical companies with their submissions of health claims and guide consumer choices with regard to food purchases.

The E. coli Metabolome Database (ECMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in or produced by Escherichia coli. Escherichia coli is perhaps the best studied bacterium on earth and has served as the "model microbe" in microbiology research for more than 60 years. The ECMDB is essentially an E. coli "omics" encyclopedia containing detailed data on E. coli's genome, proteome and its metabolome. ECMDB is part of a suite of organism-specific metabolomics databases that includes DrugBank, HMDB, YMDB and SMPDB. As a metabolomics resource, the ECMDB is designed to facilitate research in the area of gut/microbiome metabolomics and environmental metabolomics. The ECMDB contains two kinds of data: 1) chemical data and 2) molecular biology and/or biochemical data. The chemical data includes more than 2700 metabolite structures with detailed metabolite descriptions along with nearly 5000 NMR, GC-MS and LC-MS spectra corresponding to these metabolites. The biochemical data includes nearly 1600 protein sequences and more than 3100 biochemical reactions that are linked to these metabolite entries. Each metabolite entry in the ECMDB contains more than 80 data fields with approximately 65% of the information being devoted to chemical data and the other 35% of the information devoted to enzymatic or biochemical data. Many data fields are hyperlinked to other databases. The ECMDB also has a variety of structure and pathway viewing applets. The ECMDB database offers a number of text, sequence, spectral, chemical structure and relational query searches.

David S. Wishart is a Canadian researcher and a Distinguished University Professor in the Department of Biological Sciences and the Department of Computing Science at the University of Alberta. Wishart also holds cross appointments in the Faculty of Pharmacy and Pharmaceutical Sciences and the Department of Laboratory Medicine and Pathology in the Faculty of Medicine and Dentistry. Additionally, Wishart holds a joint appointment in metabolomics at the Pacific Northwest National Laboratory in Richland, Washington. Wishart is well known for his pioneering contributions to the fields of protein NMR spectroscopy, bioinformatics, cheminformatics and metabolomics. In 2011, Wishart founded the Metabolomics Innovation Centre (TMIC), which is Canada's national metabolomics laboratory.

References

  1. Wishart, DS; Knox, C; Guo, AC; et al. (Jan 2006). "DrugBank: a comprehensive resource for in silico drug discovery and exploration". Nucleic Acids Research. 34 (Database issue): D668-72. doi:10.1093/nar/gkj067. PMC 1347430. PMID 16381955.
  2. Wishart, DS; Knox, C; Guo, AC; et al. (Jan 2008). "DrugBank: a knowledgebase for drugs, drug actions and drug targets". Nucleic Acids Research. 36 (Database issue): D901-6. doi:10.1093/nar/gkm958. PMC 2238889. PMID 18048412.
  3. Harrison, Stephen (7 March 2019). "The Dizzying Problem of Citationless Wikipedia "Facts" That Take On a Life of Their Own". Slate Magazine. Retrieved 9 November 2019.
  4. Law, V; Knox, C; Djoumbou, Y; Jewison, T; Guo, AC; Liu, Y; Maciejewski, A; Arndt, D; Wilson, M; Neveu, V; Tang, A; Gabriel, G; Ly, C; Adamjee, S; Dame, ZT; Han, B; Zhou, Y; Wishart, DS (Jan 2014). "DrugBank 4.0: shedding new light on drug metabolism". Nucleic Acids Research. 42 (Database issue): D1091-7. doi:10.1093/nar/gkt1068. PMC 3965102. PMID 24203711.
  5. Wishart, DS; Guo, AC; Eisner, R; Young, N; Gautam, B; Hau, DD; Psychogios, N; Dong, E; Bouatra, S; Mandal, R; Sinelnikov, I; Xia, J; Jia, L; Cruz, JA; Lim, E; Sobsey, CA; Shrivastava, S; Huang, P; Liu, P; Fang, L; Peng, J; Fradette, R; Cheng, D; Tzur, D; Clements, M; Lewis, A; De Souza, A; Zuniga, A; Dawe, M; Xiong, Y; Clive, D; Greiner, R; Nazyrova, A; Shaykhutdinov, R; Li, L; Vogel, HJ; Forsythe, I (Jan 2009). "HMDB: a knowledgebase for the human metabolome". Nucleic Acids Research. 37 (Database issue): D603-10. doi:10.1093/nar/gkn810. PMC 2686599. PMID 18953024.
  6. Lim, E; Pon, A; Djoumbou, Y; Knox, C; Shrivastava, S; Guo, AC; Neveu, V; Wishart, DS (Jan 2010). "T3DB: a comprehensively annotated database of common toxins and their targets". Nucleic Acids Research. 38 (Database issue): D781-6. doi:10.1093/nar/gkp934. PMC 2808899. PMID 19897546.
  7. Jewison, T; Su, Y; Disfany, FM; et al. (Jan 2014). "Small Molecule Pathway Database". Nucleic Acids Research. 42 (Database issue): D478-84. doi:10.1093/nar/gkt1067. PMC 3965088. PMID 24203708.
  8. Knox, C; Law, V; Jewison, T; Liu, P; Ly, S; Frolkis, A; Pon, A; Banco, K; Mak, C; Neveu, V; Djoumbou, Y; Eisner, R; Guo, AC; Wishart, DS (Jan 2011). "DrugBank 3.0: a comprehensive resource for 'omics' research on drugs". Nucleic Acids Research. 39 (Database issue): D1035-41. doi:10.1093/nar/gkq1126. PMC 3013709. PMID 21059682.
  9. Wishart, David S. "DrugBank: a comprehensive resource for in silico drug discovery and exploration". Nucleic Acids Research. 34 (suppl_1): 668–672.