List of chemical databases

This is a list of websites that contain lists of chemicals, or databases of chemical information. There is further detail on the content of these and other resources in a Wikibook of information sources.

Columns, where a value is given: Abbreviation | Full name | Operator | Selects | Contains | ID prefix | Quality | Link | Entries
ACToR | Environmental Protection Agency | toxicology information; occurrence | "ACToR" | 893,280
AtomWork | Inorganic Material Database | National Institute for Materials Science | crystal structures | "AtomWork" | 82,000
Beilstein | Beilstein database | Elsevier | organic compounds | properties | closed access
BIAdb | Benzylisoquinoline Alkaloid Database | "BIAdb" | 846
BindingDB | The Binding Database | Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California, San Diego | noncovalent association of molecules in solution | ChEMBL, SMILES, InChIKey, targets | "BindingDB"
BindingMOAD | Binding Mother of All Databases | protein ligand structures | "BindingMOAD" | 36,047
BMDB | Bovine Metabolome Database | Collaborative Drug Discovery | BMDB | manually selected and checked | "BMDB" | 7,859
BMRB | Biological Magnetic Resonance Data Bank | University of Wisconsin | biological molecules including ligands, cofactors, peptides, saccharides | NMR spectroscopy | "BMRB"
BRENDA | Technical University of Braunschweig | enzymes, ligands | "BRENDA"
Carotenoids Database | carotenoids | CA | "Carotenoids" | 1,195
CCCBDB | Computational Chemistry Comparison and Benchmark DataBase | National Institute of Standards and Technology | gas phase molecules | "CCCBDB" | 2,069
CCRIS | Chemical Carcinogenesis Research Information System | National Library of Medicine | substances that affect tumors | CCRIS | from primary literature, reviewed by experts | "CCRIS subset of PubChem" | 9,562 [1] [2]
CDD Public | drug candidates | limited access | 3,000,000
ChEBI | Chemical Entities of Biological Interest | ELIXIR | small chemical compounds | from PDBeChem, ChEMBL, KEGG, IntEnz | "ChEBI" | 60,000
Chematica | Merck | organic chemicals | reaction pathway calculation; Beilstein, CAS, SMILES | proprietary | 7,000,000
ChEMBL | Chemicals from European Molecular Biology Laboratory | EMBL | molecules with drug-like properties | "ChEMBL" | 1,961,000
cheML.io | Departments of Computer Science and Chemistry at Nazarbayev University | de novo molecules generated by ML models | SMILES, computed properties | artificially generated | "cheML.io" [3] | 2,800,000
ChemDB | chemical database | small molecules | "ChemDB" | 5,000,000
ChemExper | Chemexper Chemical Directory | catalogue chemicals | CAS number, structure, SMILES | "ChemExper"
Chemxpert Database | Chemxpert Chemical Database | small molecules database | buyers, suppliers | "ChemxpertDB" | 1,000,000
Chemical Book | East West University | commercially available compounds | CAS number, suppliers, properties | "Chemical Book" | 200,000
Chemical Register | from 20,000 vendors | CAS number | mainly from larger-scale suppliers | "Chemical Register" | 1,750,000
ChemIDplus | National Library of Medicine | other NLM databases; regulated substances | CAS number, UNII, structure | CMNPD | https://chem.nlm.nih.gov/chemidplus/chemidlite.jsp | 400,000
ChemSpider | Royal Society of Chemistry | from 275 data sources | "ChemSpider" | 88,000,000
ChemIndex | chemical database | substances | CAS search; suppliers | "Chemindex"
Clival Database | Clinical Trial Database | Clinical Trial Data Solutions | 50,000 molecules, clinical trial data | Phase 0 to IV indications | "clival"
CMNPD | Comprehensive Marine Natural Products Database | Peking University | from literature and other databases | structural classification; species | CMNPD | curated | https://www.cmnpd.org/ | 31,561
COD | Crystallography Open Database | Vilnius University | small molecules (open source) | crystal structure atomic coordinates | COD | curated | "COD" | 478,715
Common Chemistry | American Chemical Society | structure, CAS, SMILES, InChI | https://commonchemistry.cas.org/ [4] | ~500,000
Compendium of Pesticide Common Names | British Crop Production Council | Pesticides with ISO common names | structure, CAS number, IUPAC name, SMILES, InChI | curated | "Compendium of Pesticide Common Names" | 1,800
CompTox | CompTox Chemicals Dashboard | US Environmental Protection Agency | chemicals evaluated for potential health risks | "CompTox"
CosIng | Cosmetic Ingredients | European Commission | cosmetic ingredients | "CosIng"
CrystalWorks | Science and Technology Facilities Council | "CrystalWorks"
CSD | Cambridge Structural Database | Cambridge Crystallographic Data Centre | "CSD" | 1,038,250
CSDB | Carbohydrate Structure Database | Zelinsky Institute of Organic Chemistry | carbohydrates | structures, references | CSDB ID | "CSDB"
CTD | Comparative Toxicogenomics Database | Department of Biological Sciences at North Carolina State University | MeSH, CAS number, ChEBI, PubChem, genes, pathways | "CTD"
DDB | Dortmund Data Bank | pure compounds, mixtures, gas hydrates | physical properties | "DDB"
Dissociation Constants | IUPAC Digitized pKa Dataset | IUPAC | dissociation constants | "Dissociation Constants", GitHub
DETHERM | DECHEMA | thermophysical properties | "DETHERM" | 75,000
DrugBank | University of Alberta | drugs | "DrugBank"
DrugCentral | University of New Mexico | pharmaceuticals | products containing substance | "DrugCentral"
DTP/NCI | DTP Open Compound collection | National Cancer Institute Developmental Therapeutics Program | Cancer therapeutics | Cancer Chemotherapy National Service Center number | "DTP/NCI" | 250,000
ECHA | REACH database | European Chemicals Agency | EINECS, ELINCS, NLP, CAS number, H phrases, pictograms, tonnage | "ECHA/REACH" | 245,000
EAWAG-BBD | Biocatalysis/Biodegradation Database | Eawag: Swiss Federal Institute of Aquatic Science and Technology | CAS, SMILES, PubChem, pathways | "EAWAG-BBD" | 1,396
eMolecules | drug screening chemicals | list of suppliers and catalog numbers | "eMolecules" | 8,000,000 [5]
ENCS | Japanese Existing and New Chemical Substances Inventory | regulated chemicals | "ENCS (in Japanese)"
Evaluated Kinetic Data | IUPAC | rate constants | curated | "Evaluated Kinetic Data"
FDA SRS | Food and Drug Administration Substance Registration System | U.S. National Library of Medicine | ingredients in FDA regulated products | UNII, InChIKey | "FDA SRS" | 781,000
FEMA | Flavor Ingredient Library | Flavor and Extract Manufacturers Association | CAS, CFR, FEMA number | "FEMA"
FooDB | Food Database | University of Alberta | Food components and additives | "FooDB" | 70,926
GlyTouCan | international glycan structure repository | Ministry of Education, Culture, Sports, Science & Technology | glycans | WURCS, GlycoCT, PubChem CID | G | "Glycan Repository" | 122,194
Gmelin | Gmelin database | Elsevier | inorganic and organometallic compounds | closed access | 1,500,000
G-SRS | Global Substance Registration System | CAS, PubChem, ChEMBL, INN, UNII | "G-SRS" | 109,260
GMD | Golm Metabolome Database | GC/MS of metabolites | "GMD"
Guide to PHARMACOLOGY | IUPHAR | drugs and targets | INN, CAS, ChEBI, ChEMBL, DrugBank, PubChem | "Guide to PHARMACOLOGY"
Henry's law constants | Max Planck Institute for Chemistry | volatile compounds | Henry's law constants | from literature | "Henry's law constants" | 46,434
HMDB | Human Metabolome Database | Genome Canada | metabolites found in the human body | biochemical data, clinical data | HMDB | "HMDB" | 114,222 [6]
HugeMDB | Huge Molecular Database | Elegant Mathematics LLC | Small molecules (most entries have <100 atoms) | major conformers in 3D with easy search on them | M | correlates well with PubChem where PubChem data are available | "HugeMDB" | 102 million
ICSC | ILO International Chemical Safety Cards | International Labour Organization | CAS, EC number, UN number | "ICSC" | 1,784
ICSD | Inorganic Crystal Structure Database | FIZ Karlsruhe GmbH | "ICSD" | 161,030
IEDB | Immune Epitope Database | National Institute of Allergy and Infectious Diseases | Epitopes, mainly peptides and carbohydrates | "IEDB" | 3,002 non-peptides
IUPAC-NIST Solubility Database | https://srdata.nist.gov/solubility/index.aspx
JECDB | Japan Existing Chemical Database | CAS, EINECS, RTECS, SDBS, TSCA, graph of number of articles per year | "JECDB"
J-GLOBAL | Nikkaji | Japan Science and Technology Agency | "J-GLOBAL"
KEGG | Kyoto Encyclopedia of Genes and Genomes | Kyoto University Bioinformatics Center | Compounds, Glycans (also enzymes, reactions, pathways) | CAS, ChEBI, ChEMBL, MASSBANK, NIKKAJI, PubChem, PDB-CCD | "KEGG"
Ki Database | PDSP | ligand binding | "Ki Database"
KNApSAcK | Nara Institute of Science and Technology | InChI, CAS, SMILES, organisms | C00 | "KNApSAcK"
LINCS | Library of Integrated Network-based Cellular Signatures | small molecules | PubChem, ChEMBL, SMILES, InChI | LSM | "LINCS" | 43,700
LipidBank | Japanese Conference on the Biochemistry of Lipids | lipids | "LipidBank" | 7,009
LMSD | LIPID MAPS Structure Database | Lipids | HMDB, ChEBI, PubChem, InChI | LMFA | "LMSD" | 44,701
LOLI | List of Lists | safety data sheets, regulation | "LOLI"
Mcule | supplied chemicals | InChI, SMILES, SDF, physicochemical properties | "Mcule" | 45,000,000
MediaDB | Institute for Systems Biology | growth media | "MediaDB" | 288
Merck Index | Royal Society of Chemistry | drugs | "Merck-Index" | 11,500
MeSH | Medical Subject Headings | US National Library of Medicine | biomedical thesaurus | hierarchy of descriptors to literature with MeSH ID | "MeSH"
MetaCyc | SRI International | metabolic pathways; metabolites | "MetaCyc"
MetaboLights | EMBL-EBI | MTBL | "MetaboLights"
MetaNetX | SIB Swiss Institute of Bioinformatics | metabolic networks, metabolites, biochemical reactions, cellular compartments | metabolic models, SBML, InChI, InChIKey, SMILES | MNXM | unified namespace for metabolites and biochemical reactions in the context of metabolic models | "MetaNetX" | 240 metabolic models, 1,292,154 metabolites, 74,613 reactions, 44 compartments
METLIN | Metabolite and Chemical Entity Database | tandem mass spectrometry of metabolites | "METLIN" | 960,000
MINAS | Metal Ions in Nucleic AcidS | University of Zurich | https://www.minas.uzh.ch/
ModelSeed | KEGG, MetaCyc | metabolic pathways | CPD | "ModelSeed"
MolPort | catalog chemicals | "MolPort"
MoNA | Mass Bank of North America | mass spectra | splash, legg, chemspider, pubchem, chebi, CAS | "MoNA" | 200,000
npatlas | The Natural Products Atlas | Simon Fraser University | microbial and fungal products | SMILES, organism | NPA | npatlas [7] | 33,434
NIOSH pocket guide | NIOSH Pocket Guide to Chemical Hazards | National Institute for Occupational Safety and Health | commonly used chemicals | exposure limits | "NIOSH", 2 August 2024 | 677
NIST Webbook | NIST Chemistry Webbook | National Institute of Standards and Technology | spectra, CAS, ionization energy, mass spectrum, InChI | C+CAS | "NIST Webbook"
NMRShiftDB | University of Cologne | organic | nuclear magnetic resonance spectra | "NMRShiftDB" | 43,581
NORMAN SLE | NORMAN Suspect List Exchange | environmental monitoring | "NORMAN SLE" | 110,000
OMG | Open Macromolecular Genome | Jackson group at University of Illinois at Urbana-Champaign | synthetically accessible linear homopolymers | SMILES of linear homopolymers | GitHub / Zenodo | 12,886,131
ORD | Open Reaction Database | ORD consortium | Organic reactions | machine-readable reaction schemes | "ORD" [8] | 2,000,000
OrgSyn | Organic Syntheses | Organic Syntheses, Inc. | Reliable chemical reactions | Searchable experimental procedures | Peer reviewed | "OrgSyn search"
PDB PDBe | Protein Data Bank in Europe | EMBL-EBI | has some chemicals as well as proteins | "PDBe"
PATENTSCOPE | WIPO | "PATENTSCOPE" | 16,000,000
PDB | RCSB Protein Data Bank | "PDB" | 166,891
PharmGKB | Shriram Center for Bioengineering and Chemical Engineering | drugs, targets | prescribing info | curated | "PharmGKB"
PHAROS | Illuminating the Druggable Genome | National Institutes of Health | drug ligands; targets [9] | https://pharos.nih.gov/ | 355,932 ligands, 20,412 targets
Phenol-Explorer | polyphenols found in food | "Phenol-Explorer" | 500
Phosida | PHOsphorylation SIte DAtabase | protein modifications | "Phosida"
PoLyInfo | Polymer Database | National Institute for Materials Science | physical properties | "PoLyInfo" | 26,000
PPDB | Pesticide Properties Database | Agriculture & Environment Research Unit, University of Hertfordshire | Pesticides and their metabolites | Chemical structure, physicochemical properties, human health and ecotoxicological data | curated | "PPDB" | 2,000 [10]
Probes and Drugs
ProCarDB | Prokaryotic Bacterial Carotenoid DataBase | IMTECH | spectra, references | "ProCarDB" | 1,800
PubChem | National Library of Medicine National Center for Biotechnology Information | from 748 data sources | Structures, Names and Identifiers, Chemical and Physical Properties, Spectral Information, Related Records, Chemical Vendors, Pharmacology and Biochemistry, Use and Manufacturing, Safety and Hazards, Toxicity, Literature, Patents, Biomolecular Interactions and Pathways, Biological Test Results | "PubChem" | 103,000,000
Reaxys | Elsevier | chemical compounds | Searchable chemical reactions | "About Reaxys" | 118,000,000
Ref-DB | Re-referenced Protein Chemical shift Database | proteins from BioMagResBank | Re-referenced NMR shift | "Ref-DB" | 2,162
Rhea | Swiss Institute of Bioinformatics | biochemical reactions | ChEBI | curated | "Rhea"
RÖMPP | Thieme Gruppe | "RÖMPP"
RTECS | Registry of Toxic Effects of Chemical Substances | Dassault Systèmes | Toxicity, Literature | "Biovia-RTECS", 8 September 2023 | 160,000
RxNav | U.S. National Library of Medicine | drugs | interactions | "RxNav"
SaguaroChem | De Novo Chem | Chemical reactions from the patent literature | Chemical reaction SMILES, annotated procedures, characterization data, reference metadata | Curated from patent literature | "SaguaroChem", 4 July 2024 | 2,091,105
SciFinder | Chemical Abstracts Service of American Chemical Society | organic, inorganic chemicals, proteins | CAS number | paid access only | 130,000,000
ScrubChem | scraped from PubChem | "ScrubChem" | 2,282,992
SDBS | Spectral Database for Organic Compounds | National Institute of Advanced Industrial Science and Technology (AIST), Japan | Organic compounds | Spectra: IR, Raman, MASS, ESR, 1H NMR, 13C NMR | SDBS No | curated | "SDBS" | 34,000
Serum Metabolome Database | The Metabolomics Innovation Centre | found in blood serum | "Serum Metabolome DB" | 4,651
Solvent Selection Tool | ACS Green Chemistry Institute | Solvents | Principal components analysis of physical properties | curated | "Solvent Selection Tool" | 272 [11]
SPRESIweb | InfoChem Gesellschaft für chemische Information mbH | organic molecules and reactions | organic structures | from literature | "SPRESI" | 5,800,000
SpringerMaterials | Springer | solid materials | CAS, InChI, physical properties | from literature | "SpringerMaterials" | 155,165 + 494,942
STITCH | EMBL | from Biocarta, BioCyc, GO, KEGG, and Reactome | Chemical-Protein Interactions | curated and predicted | "STITCH" | 500,000
SuperDRUG2 | Structural Bioinformatics Group | drugs, targets | targets, dose, side effects, Canonical SMILES, Standard InChI, Standard InChIKey, DrugBank, ChEMBL, DrugCentral, KEGG, PubChem, CASRN | SD | "SuperDRUG2" | 4,600
Super Natural II | natural product chemicals | SMILES, vendors | SN00 | "Super Natural II" | 325,508
SureChEMBL | European Molecular Biology Laboratory | substances in patents | patent text | "SureChEMBL"
SwissLipids | Swiss Institute of Bioinformatics | lipids | SLM: | "SwissLipids"
TDR Targets | Tropical Disease Research | Trypanosomatics Laboratory | drugs and targets | "TDR Targets" | 2,000,000
TTD | Therapeutic Targets Database | Zhejiang University | drugs and targets | SMILES, InChI, CAS, PubChem | "TTD" | 37,316
T3DB | Toxin and Toxin-Target Database, Toxic Exposome Database | University of Alberta | toxins and toxin targets | T3D | "T3DB" | 3,678
UniChem | EMBL-EBI | pointers to existing chemicals; indexes 41 databases [12] | Structure; StdInChI; links to databases | automated loads | "Compound Sources Search" | >2,000,000
UniProt | UniProt Knowledgebase | proteins | sequence, modifications, location, organism, similar | "UniProt"
US DOT | US Department of Transportation | Emergency response guidebook | DOT + others | bulk transported chemicals | UN number (United Nations ID number), hazard response guide | "Emergency response guidebook" (PDF) | 3,000
UV/VIS Spectral Atlas | The MPI-Mainz UV/VIS spectral atlas of gaseous molecules of atmospheric interest | Max Planck Institute for Chemistry | gaseous molecules | absorption cross sections | from literature | "UV/VIS Spectral Atlas" | 7,313
YMDB | Yeast Metabolome Database | The Metabolomics Innovation Centre | metabolites of yeast | 48 data fields | YMDB | "YMDB" | 16,042
ZINC | ZINC is not commercial | University of California, San Francisco | purchasable substances | EPA DSSTox, ChEMBL, HMDB, KEGG, PDB, SMILES | "ZINC" [13] | 37 × 10^9

References

  1. "Chemical Carcinogenesis Research Information System (CCRIS) - PubChem Data Source". pubchem.ncbi.nlm.nih.gov. Retrieved 2020-08-07.
  2. "Download CCRIS (Chemical Carcinogenesis Research Information System) Data". www.nlm.nih.gov. Retrieved 2020-08-07.
  3. Zhumagambetov, Rustam; Kazbek, Daniyar; Shakipov, Mansur; Maksut, Daulet; Peshkov, Vsevolod A.; Fazli, Siamac (2020-12-17). "cheML.io: an online database of ML-generated molecules". RSC Advances. 10 (73): 45189–45198. Bibcode:2020RSCAd..1045189Z. doi:10.1039/D0RA07820D. ISSN 2046-2069. PMC 9058596. PMID 35516285.
  4. Jacobs, Andrea; Williams, Dustin; Hickey, Katherine; Patrick, Nathan; Williams, Antony J.; Chalk, Stuart; McEwen, Leah; Willighagen, Egon; Walker, Martin; Bolton, Evan; Sinclair, Gabriel; Sanford, Adam (2022-05-13). "CAS Common Chemistry in 2021: Expanding Access to Trusted Chemical Information for the Scientific Community". Journal of Chemical Information and Modeling. 62 (11): 2737–2743. doi:10.1021/acs.jcim.2c00268. PMC 9199008. PMID 35559614.
  5. "Vision - eMolecules". www.emolecules.com. Retrieved 2020-07-27.
  6. "Human Metabolome Database: About the Human Metabolome Database". hmdb.ca. Retrieved 2020-07-27.
  7. Van Santen, Jeffrey A.; Jacob, Grégoire; Singh, Amrit Leen; et al. (2019). "The Natural Products Atlas: An Open Access Knowledge Base for Microbial Natural Products Discovery". ACS Central Science. 5 (11): 1824–1833. doi:10.1021/acscentsci.9b00806. PMC 6891855. PMID 31807684.
  8. Kearnes, Steven M.; Maser, Michael R.; Wleklinski, Michael; et al. (2021). "The Open Reaction Database". Journal of the American Chemical Society. 143 (45): 18820–18826. doi:10.1021/jacs.1c09820.
  9. "Pharos: Illuminating the Druggable Genome". pharos.nih.gov. Retrieved 2024-10-02.
  10. Lewis, Kathleen A.; Tzilivakis, John; Warner, Douglas J.; Green, Andrew (2016). "An international database for pesticide risk assessments and management". Human and Ecological Risk Assessment. 22 (4): 1050–1064. Bibcode:2016HERA...22.1050L. doi:10.1080/10807039.2015.1133242. hdl:2299/17565. S2CID 87599872.
  11. Diorazio, Louis J.; Hose, David R. J.; Adlington, Neil K. (2016). "Toward a More Holistic Framework for Solvent Selection". Organic Process Research & Development. 20 (4): 760–773. doi:10.1021/acs.oprd.6b00015.
  12. "UniChem". www.ebi.ac.uk. Retrieved 2024-10-02.
  13. Tingle, Benjamin I.; Tang, Khanh G.; Castanon, Mar; Gutierrez, John J.; Khurelbaatar, Munkhzul; Dandarchuluun, Chinzorig; Moroz, Yurii S.; Irwin, John J. (2023). "ZINC-22: A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery". Journal of Chemical Information and Modeling. 63 (4): 1166–1176. doi:10.1021/acs.jcim.2c01253. PMC 9976280. PMID 36790087.