ChemSpider

Content
Description A chemical structure database providing fast access to over 63 million structures, properties and associated information.
Contact
Research center Raleigh, North Carolina, United States
Laboratory
Access
Website www.chemspider.com
Tools
Standalone https://itunes.apple.com/us/app/chemspider/id458878661
Miscellaneous
License Creative Commons Attribution Share-alike [2]

ChemSpider is a database of chemicals owned by the Royal Society of Chemistry. [3] [4] [5] [6] [7]

A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.

Royal Society of Chemistry UK learned society

The Royal Society of Chemistry (RSC) is a learned society in the United Kingdom with the goal of "advancing the chemical sciences". It was formed in 1980 from the amalgamation of the Chemical Society, the Royal Institute of Chemistry, the Faraday Society, and the Society for Analytical Chemistry with a new Royal Charter and the dual role of learned society and professional body. At its inception, the Society had a combined membership of 34,000 in the UK and a further 8,000 abroad. The headquarters of the Society are at Burlington House, Piccadilly, London. It also has offices in Thomas Graham House in Cambridge, where RSC Publishing is based. The Society has further offices in the United States at the University City Science Center, Philadelphia; in Beijing and Shanghai, China; and in Bangalore, India. The organisation carries out research, publishes journals, books and databases, and hosts conferences, seminars and workshops. It is the professional body for chemistry in the UK, with the ability to award the status of Chartered Chemist (CChem) and, through the Science Council, the awards of Chartered Scientist (CSci), Registered Scientist (RSci) and Registered Science Technician (RScTech) to suitably qualified candidates. The designation FRSC is given to a group of elected Fellows of the society who have made major contributions to chemistry and other interface disciplines such as biological chemistry. The names of Fellows are published each year in The Times (London). Honorary Fellowship of the Society ("HonFRSC") is awarded for distinguished service in the field of chemistry.

Database

The database contains information on more than 63 million molecules from over 280 data sources, [8] including the Human Metabolome Database, the Journal of Heterocyclic Chemistry and KEGG.

Molecule Electrically neutral entity consisting of more than one atom (n > 1); rigorously, a molecule, in which n > 1 must correspond to a depression on the potential energy surface that is deep enough to confine at least one vibrational state

A molecule is an electrically neutral group of two or more atoms held together by chemical bonds. Molecules are distinguished from ions by their lack of electrical charge. In the kinetic theory of gases, the term molecule is often used for any gaseous particle regardless of its composition. According to this definition, noble gas atoms are considered molecules, as they are monatomic molecules.

Human Metabolome Database database of human metabolites

The Human Metabolome Database (HMDB) is a comprehensive, high-quality, freely accessible, online database of small molecule metabolites found in the human body. It was created by the Human Metabolome Project, funded by Genome Canada. One of the first dedicated metabolomics databases, the HMDB facilitates human metabolomics research, including the identification and characterization of human metabolites using NMR spectroscopy, GC-MS spectrometry and LC/MS spectrometry. To aid in this discovery process, the HMDB contains three kinds of data: 1) chemical data, 2) clinical data, and 3) molecular biology/biochemistry data. The chemical data includes 41,514 metabolite structures with detailed descriptions along with nearly 10,000 NMR, GC-MS and LC/MS spectra.

Journal of Heterocyclic Chemistry is a peer-reviewed scientific journal summarizing progress in the field of heterocycle chemistry. It is a source for the ChemSpider database.

KEGG biological database

KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

Each chemical is given a unique identifier, which forms part of a corresponding URL. For example, acetone is 175, and thus has the URL http://www.chemspider.com/Chemical-Structure.175.html
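The relationship between identifier and URL can be illustrated with a short Python sketch. It assumes only the URL pattern shown in the acetone example above; the function names `chemspider_url` and `chemspider_id` are hypothetical and are not part of any ChemSpider API.

```python
import re

# URL pattern taken from the acetone example; assumed to hold for other records.
URL_TEMPLATE = "http://www.chemspider.com/Chemical-Structure.{csid}.html"
URL_PATTERN = re.compile(
    r"^https?://www\.chemspider\.com/Chemical-Structure\.(\d+)\.html$"
)


def chemspider_url(csid: int) -> str:
    """Build a record URL from a numeric identifier (hypothetical helper)."""
    if csid <= 0:
        raise ValueError("identifier must be a positive integer")
    return URL_TEMPLATE.format(csid=csid)


def chemspider_id(url: str) -> int:
    """Recover the numeric identifier from a record URL (hypothetical helper)."""
    match = URL_PATTERN.match(url)
    if match is None:
        raise ValueError(f"not a recognised record URL: {url!r}")
    return int(match.group(1))


print(chemspider_url(175))
print(chemspider_id("http://www.chemspider.com/Chemical-Structure.175.html"))
```

Because the identifier is embedded directly in the URL, the mapping works in both directions without any look-up table.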

With reference to a given set of objects, a unique identifier (UID) is any identifier which is guaranteed to be unique among all identifiers used for those objects and for a specific purpose. The concept was formalized early in computer science and information systems, generally by associating it with an atomic data type.

Acetone chemical compound

Acetone, or propanone, is the organic compound with the formula (CH3)2CO. It is a colorless, volatile, flammable liquid and is the simplest and smallest ketone.

Crowdsourcing

The ChemSpider database can be updated with user contributions including chemical structure deposition, spectra deposition and user curation. This is a crowdsourcing approach to developing an online chemistry database. Crowdsourced curation of the data has produced a dictionary of chemical names associated with chemical structures that has been used in text-mining applications of the biomedical and chemical literature. [10]

A chemical structure determination includes a chemist's specifying the molecular geometry and, when feasible and necessary, the electronic structure of the target molecule or other solid. Molecular geometry refers to the spatial arrangement of atoms in a molecule and the chemical bonds that hold the atoms together, and can be represented using structural formulae and by molecular models; complete electronic structure descriptions include specifying the occupation of a molecule's molecular orbitals. Structure determination can be applied to a range of targets from very simple molecules, to very complex ones.

Crowdsourcing obtaining services, ideas, or content from a group of people, rather than from employees or suppliers

Crowdsourcing is a sourcing model in which individuals or organizations obtain goods and services, including ideas and finances, from a large, relatively open and often rapidly-evolving group of internet users; it divides work between participants to achieve a cumulative result. The word crowdsourcing itself is a portmanteau of crowd and outsourcing, and was coined in 2006. As a mode of sourcing, crowdsourcing existed prior to the digital age.

Dictionary collection of words and their meanings

A dictionary, sometimes known as a wordbook, is a collection of words in one or more specific languages, often arranged alphabetically, which may include information on definitions, usage, etymologies, pronunciations, translation, etc. or a book of words in one language with their equivalents in another, sometimes known as a lexicon. It is a lexicographical reference that shows inter-relationships among the data.

However, database rights are not waived and no data dump is available; the FAQ states that only limited downloads are allowed. [11] The right to fork is therefore not guaranteed, and the project cannot be considered free or open.

Free content Work or artwork with few or no restrictions on how it may be used

Free content, libre content, or free information, is any kind of functional work, work of art, or other creative content that meets the definition of a free cultural work.

Open data practice of sharing data publicly and reusably

Open data is the idea that some data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. The goals of the open-source data movement are similar to those of other "open(-source)" movements such as open-source software, hardware, open content, open education, open educational resources, open government, open knowledge, open access, open science, and the open web. Paradoxically, the growth of the open data movement is paralleled by a rise in intellectual property rights. The philosophy behind open data has been long established, but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives such as Data.gov, Data.gov.uk and Data.gov.in.

Searching

A number of available search modules are provided:

Chemistry document mark-up

The ChemSpider database has been used in combination with text mining as the basis of chemistry document markup. ChemMantis, [14] the Chemistry Markup And Nomenclature Transformation Integrated System, uses algorithms to identify and extract chemical names from documents and web pages, and converts the chemical names to chemical structures using name-to-structure conversion algorithms and dictionary look-ups in the ChemSpider database. The result is an integrated system between chemistry documents and information look-up via ChemSpider into over 150 data sources.
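The dictionary look-up step of such a markup system can be sketched in a few lines of Python. This is a toy illustration and not the ChemMantis implementation: the function `mark_up` and the two-entry dictionary are hypothetical, and only the acetone identifier (175) comes from this article. Name-to-structure conversion is not attempted.

```python
import re

# Hypothetical, tiny name dictionary. Only the acetone/propanone record
# (identifier 175) is taken from the article; a real dictionary would be
# far larger and curated.
NAME_TO_ID = {"acetone": 175, "propanone": 175}


def mark_up(text, name_to_id=NAME_TO_ID):
    """Return (matched name, start, end, identifier) for each dictionary hit."""
    if not name_to_id:
        return []
    # Try longer names first so a long name wins over a shorter prefix of it.
    names = sorted(name_to_id, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.group(0), m.start(), m.end(), name_to_id[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


for hit in mark_up("Acetone (propanone) is a common solvent."):
    print(hit)
```

Each hit carries the character span of the name, so a caller could wrap that span in a link to the corresponding record.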

History

ChemSpider was acquired by the Royal Society of Chemistry (RSC) in May 2009. [15] Prior to the acquisition by RSC, ChemSpider was controlled by a private corporation, ChemZoo Inc. The system was first launched in March 2007 in a beta release form and transitioned to release in March 2008. ChemSpider has expanded the generic support of a chemistry database to include support of the Wikipedia chemical structure collection via their WiChempedia implementation.

Services

A number of services are made available online. These include the conversion of chemical names to chemical structures, the generation of SMILES and InChI strings, the prediction of many physicochemical parameters, and integration with a web service allowing NMR prediction.

SyntheticPages

SyntheticPages is a free interactive database of synthetic chemistry procedures operated by the Royal Society of Chemistry. [16] Users submit synthetic procedures which they have conducted themselves for publication on the site. These procedures may be original works, but they are more often based on literature reactions. Citations to the original published procedure are made where appropriate. They are checked by a scientific editor before posting. The pages do not undergo formal peer review as a scientific journal article would, but logged-in users can comment. The comments are also moderated by scientific editors. The intention is to collect practical experience of how to conduct useful chemical synthesis in the lab. While experimental methods published in an ordinary academic journal are listed formally and concisely, the procedures in ChemSpider SyntheticPages are given with more practical detail. Informality is encouraged. Comments by submitters are included as well. Other publications with comparable amounts of detail include Organic Syntheses and Inorganic Syntheses. The SyntheticPages site was originally set up by Professors Kevin Booker-Milburn (University of Bristol), Stephen Caddick (University College London), Peter Scott (University of Warwick) and Dr Max Hammond. In February 2010 a merger was announced [17] with the Royal Society of Chemistry's chemical structure search engine ChemSpider and the formation of ChemSpider|SyntheticPages (CS|SP).

Open PHACTS

ChemSpider serves as the chemical compound repository for the Open PHACTS project, which is funded by the Innovative Medicines Initiative. Open PHACTS will deploy an open standards, open access, semantic web approach to address bottlenecks in small molecule drug discovery: disparate information sources, lack of standards and information overload. [18]

Related Research Articles

Heterocyclic compound cyclic chemical compound having as ring members atoms of at least two different elements

A heterocyclic compound or ring structure is a cyclic compound that has atoms of at least two different elements as members of its ring(s). Heterocyclic chemistry is the branch of organic chemistry dealing with the synthesis, properties, and applications of these heterocycles.

Cheminformatics is the use of computer and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used, for example, in pharmaceutical companies and academic settings in the process of drug discovery. These methods can also be used in chemical and allied industries in various other forms.

Organic synthesis is a special branch of chemical synthesis and is concerned with the intentional construction of organic compounds. Organic molecules are often more complex than inorganic compounds, and their synthesis has developed into one of the most important branches of organic chemistry. There are several main areas of research within the general area of organic synthesis: total synthesis, semisynthesis, and methodology.

The IUPAC International Chemical Identifier is a textual identifier for chemical substances, designed to provide a standard way to encode molecular information and to facilitate the search for such information in databases and on the web. Initially developed by IUPAC and NIST from 2000 to 2005, the format and algorithms are non-proprietary.

Butyronitrile chemical compound

Butyronitrile or butanenitrile or propyl cyanide, is a nitrile with the formula C3H7CN. This colorless liquid is miscible with most polar organic solvents.

PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains substance descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. More than 80 database vendors contribute to the growing PubChem database.

Dalton Transactions chemistry journal

Dalton Transactions is a peer-reviewed scientific journal publishing original (primary) research and review articles on all aspects of the chemistry of inorganic, bioinorganic, and organometallic compounds. It is published weekly by the Royal Society of Chemistry. The journal was named after the English chemist, John Dalton, best known for his work on modern atomic theory. Authors can elect to have accepted articles published as open access. The editor is Andrew Shore. Dalton Transactions was named a "rising star" by In-cites from Thomson Scientific in 2006.

Chemical Entities of Biological Interest, also known as ChEBI, is a database and ontology of molecular entities focused on 'small' chemical compounds, that is part of the Open Biomedical Ontologies effort. The term "molecular entity" refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity". The molecular entities in question are either products of nature or synthetic products which have potential bioactivity. Molecules directly encoded by the genome, such as nucleic acids, proteins and peptides derived from proteins by proteolytic cleavage, are not as a rule included in ChEBI.

Steven Victor Ley CBE FRS FRSC is Professor of Organic Chemistry in the Department of Chemistry at the University of Cambridge, and is a Fellow of Trinity College, Cambridge. He was President of the Royal Society of Chemistry (2000–2002) and was made a CBE in January 2002, in the process. In 2011, he was included by The Times in the list of the "100 most important people in British science".

Simbiosys Toronto-based chemistry software company

SimBioSys is a Toronto-based chemistry software company focusing on structure based drug discovery and retrosynthetic analysis tools. It has established a strong reputation as one of the leading developers of flexible docking applications, virtual screening methods and computer aided organic synthesis design.

ChEMBL chemical database of bioactive molecules with drug-like properties

ChEMBL or ChEMBLdb is a manually curated chemical database of bioactive molecules with drug-like properties. It is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK.

Antony John Williams Welsh chemist

Antony John Williams is a British chemist and expert in the fields of both nuclear magnetic resonance (NMR) spectroscopy and cheminformatics at the United States Environmental Protection Agency. He is the founder of the ChemSpider website that was purchased by the Royal Society of Chemistry in May 2009. He is a science blogger, an author, and one of the hosts of the SciMobileApps wiki, a community-based wiki for scientific mobile apps.

Chemicalize

Chemicalize is an online platform for chemical calculations, search, and text processing. It is developed and owned by ChemAxon and offers various cheminformatics tools in a freemium model: chemical property predictions, structure-based and text-based search, chemical text processing, and checking compounds with respect to national regulations of different countries.

The ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database is a publicly available tool developed by the National Institute of Allergy and Infectious Diseases to compile preclinical data on small molecules with potential therapeutic action against HIV/AIDS and related opportunistic infections.

Sean Ekins is a British pharmacologist and expert in the fields of ADME/Tox, computational toxicology and cheminformatics at Collaborations in Chemistry, a division of corporate communications firm Collaborations in Communications. He is also the editor of four books and a book series for John Wiley & Sons.

Dotmatics is a scientific informatics company, focusing on data management, analysis and visualization. Founded in 2005, the company is headquartered in Bishops Stortford, Hertfordshire, England, and has two US offices in San Diego, CA and Woburn, MA. Dotmatics provides software to half of the world's 20 largest drugmakers.

The IUPHAR/BPS Guide to PHARMACOLOGY is an open-access website, acting as a portal to information on the biological targets of licensed drugs and other small molecules. The Guide to PHARMACOLOGY is developed as a joint venture between the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS). This replaces and expands upon the original 2009 IUPHAR Database. The Guide to PHARMACOLOGY aims to provide a concise overview of all pharmacological targets, accessible to all members of the scientific and clinical communities and the interested public, with links to details on a selected set of targets. The information featured includes pharmacological data, target and gene nomenclature, as well as curated chemical information for ligands. Overviews and commentaries on each target family are included, with links to key references.

Open PHACTS is a European public–private partnership between academia, publishers, enterprises, pharmaceutical companies and other organisations working to enable better, cheaper and faster drug discovery. It has been funded by the Innovative Medicines Initiative, selected as part of three projects to "design methods for common standards and sharing of data for more efficient drug development and patient treatment in the future".

CompTox Chemicals Dashboard chemical database

The CompTox Chemicals Dashboard is a freely accessible online database created and maintained by the U.S. Environmental Protection Agency (EPA). The database provides access to multiple types of data including physicochemical properties, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay. EPA and other scientists use the data and models contained within the dashboard to help identify chemicals that require further testing and reduce the use of animals in chemical testing. The Dashboard is also used to provide public access to information from EPA Action Plans, e.g. around perfluorinated alkylated substances.

References

  1. Van Noorden, R. (2012). "Chemistry's web of data expands". Nature. 483 (7391): 524. Bibcode:2012Natur.483..524V. doi:10.1038/483524a. PMID 22460877.
  2. "ChemSpider Blog » Blog Archive » ChemSpider Adopts Creative Commons Licenses". www.chemspider.com. Archived from the original on 2015-04-02. Retrieved 2014-03-21.
  3. Williams, A. J. (Jan–Feb 2008). "ChemSpider and Its Expanding Web: Building a Structure-Centric Community for Chemists". Chemistry International. 30 (1).
  4. Williams, A. J. (2008). "Public chemical compound databases". Current Opinion in Drug Discovery & Development. 11 (3): 393–404. PMID 18428094.
  5. Brumfiel, G. (2008). "Chemists spin a web of data". Nature. 453 (7192): 139. Bibcode:2008Natur.453..139B. doi:10.1038/453139a. PMID 18464701.
  6. Williams, A. J. (2011). "Chemspider: A Platform for Crowdsourced Collaboration to Curate Data Derived from Public Compound Databases". Collaborative Computational Technologies for Biomedical Research. pp. 363–386. doi:10.1002/9781118026038.ch22. ISBN 9781118026038.
  7. Pence, H. E.; Williams, A. (2010). "ChemSpider: An Online Chemical Information Resource". Journal of Chemical Education. 87 (11): 1123. Bibcode:2010JChEd..87.1123P. doi:10.1021/ed100697w.
  8. "Data Sources". ChemSpider. Retrieved May 16, 2019.
  9. "ChemSpider Blog » Blog Archive » The US EPA DSSTox Browser Connects to ChemSpider". ChemSpider. August 23, 2008. Archived from the original on 7 November 2017. Retrieved 7 November 2017.
  10. Hettne, K. M.; Williams, A. J.; Van Mulligen, E. M.; Kleinjans, J.; Tkachenko, V.; Kors, J. A. (2010). "Automatic vs. Manual curation of a multi-source chemical dictionary: The impact on text mining". Journal of Cheminformatics. 2 (1): 3. doi:10.1186/1758-2946-2-3. PMC 2848622. PMID 20331846.
  11. "ChemSpider Blog » Blog Archive » Who Would Like to Have the Entire ChemSpider Database?". www.chemspider.com. Archived from the original on 2015-09-24. Retrieved 2014-04-18.
  12. "ChemSpider on the App Store". App Store.
  13. "ChemSpider Mobile - Android Apps on Google Play". play.google.com.
  14. Williams, A. (2008-10-23). "Welcome ChemMantis to ChemZoo and a Call for Contributions from the Community". Blog post. Archived 2015-09-24 at the Wayback Machine.
  15. "RSC acquires ChemSpider". Royal Society of Chemistry. 11 May 2009. Retrieved 2009-05-11.
  16. "ChemSpider SyntheticPages". ChemSpider SyntheticPages. Royal Society of Chemistry. Retrieved 26 June 2012.
  17. "ChemSpider and SyntheticPages support synthetic chemistry". RSC Publishing. Royal Society of Chemistry. 2010-02-05. Archived from the original on 26 July 2012. Retrieved 2012-06-26.
  18. Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; Mons, B. (2012). "Open PHACTS: Semantic interoperability for drug discovery". Drug Discovery Today. 17 (21–22): 1188–1198. doi:10.1016/j.drudis.2012.05.016. PMID 22683805.