ChemSpider

ChemSpider
Content
Description A chemical structure database providing fast access to over 100 million structures, properties and associated information.
Contact
Research center Raleigh, North Carolina, United States
Laboratory
Access
Website www.chemspider.com
Tools
Standalone https://itunes.apple.com/us/app/chemspider/id458878661
Miscellaneous
License Creative Commons Attribution Share-alike [2]

ChemSpider is a database of chemicals owned by the Royal Society of Chemistry. [3] [4] [5] [6] [7]

Database

The database contains information on more than 100 million molecules from over 270 data sources. [8]

Each chemical is given a unique identifier, which forms part of a corresponding URL. For example, acetone is 175, and thus has the URL http://www.chemspider.com/Chemical-Structure.175.html
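The identifier-to-URL mapping described above can be sketched in a few lines of Python. This is only an illustration: the URL pattern is copied from the acetone example, while the function names `chemspider_url` and `chemspider_id` are hypothetical helpers, not part of any official ChemSpider tooling.

```python
import re

# URL pattern taken from the acetone example in the text.
# Helper names below are illustrative assumptions, not an official API.
BASE_URL = "http://www.chemspider.com/Chemical-Structure.{csid}.html"
_URL_RE = re.compile(
    r"^https?://www\.chemspider\.com/Chemical-Structure\.(\d+)\.html$"
)


def chemspider_url(csid: int) -> str:
    """Build the record URL for a ChemSpider identifier."""
    if isinstance(csid, bool) or not isinstance(csid, int) or csid <= 0:
        raise ValueError("ChemSpider ID must be a positive integer")
    return BASE_URL.format(csid=csid)


def chemspider_id(url: str) -> int:
    """Recover the identifier from a record URL of the same form."""
    match = _URL_RE.match(url)
    if match is None:
        raise ValueError(f"not a ChemSpider record URL: {url!r}")
    return int(match.group(1))


print(chemspider_url(175))
# prints http://www.chemspider.com/Chemical-Structure.175.html
```

Because the identifier is embedded in the URL, the mapping works in both directions: `chemspider_id` returns 175 when given the acetone URL.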

Crowdsourcing

The ChemSpider database can be updated with user contributions, including chemical structure deposition, spectra deposition and user curation. This is a crowdsourcing approach to developing an online chemistry database. Crowdsourced curation of the data has produced a dictionary of chemical names associated with chemical structures, which has been used in text-mining applications of the biomedical and chemical literature. [10]

However, database rights are not waived and no data dump is available; the FAQ states that only limited downloads are allowed. [11] The right to fork is therefore not guaranteed, and the project cannot be considered free or open.

Searching

A number of search modules are provided.

Chemistry document mark-up

The ChemSpider database has been used in combination with text mining as the basis of chemistry document markup. ChemMantis, [14] the Chemistry Markup And Nomenclature Transformation Integrated System, uses algorithms to identify and extract chemical names from documents and web pages, and converts those names to chemical structures using name-to-structure conversion algorithms and dictionary look-ups in the ChemSpider database. The result is an integrated system linking chemistry documents to information look-up via ChemSpider across over 150 data sources.
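The dictionary look-up step of this kind of markup can be illustrated with a minimal Python sketch. It is not ChemMantis's actual algorithm, which is not described here: the dictionary `NAME_TO_CSID` and the function `mark_up` are hypothetical, only the acetone identifier (175) comes from this article, and algorithmic name-to-structure conversion is omitted entirely.

```python
import re

# Hypothetical mini-dictionary: lower-cased chemical name -> ChemSpider ID.
# Only the acetone ID (175) comes from the article; "propan-2-one" is the
# systematic name of acetone, added to show that synonyms share one record.
NAME_TO_CSID = {"acetone": 175, "propan-2-one": 175}


def mark_up(text: str, dictionary: dict[str, int]) -> str:
    """Append the ChemSpider record URL after each known chemical name."""
    if not dictionary:
        return text
    # Try longer names first so they win over any shorter name they contain.
    names = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(re.escape(n) for n in names) + r")(?![\w-])",
        re.IGNORECASE,
    )

    def annotate(match: re.Match) -> str:
        csid = dictionary[match.group(1).lower()]
        url = f"http://www.chemspider.com/Chemical-Structure.{csid}.html"
        return f"{match.group(1)} ({url})"

    return pattern.sub(annotate, text)


print(mark_up("Wash with Acetone, then dry.", NAME_TO_CSID))
# prints Wash with Acetone (http://www.chemspider.com/Chemical-Structure.175.html), then dry.
```

The look-around assertions keep the pattern from matching inside longer words or hyphenated names, which matters for a dictionary of chemical names where one name is often a fragment of another.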

History

ChemSpider was acquired by the Royal Society of Chemistry (RSC) in May 2009. [15] Before the acquisition, ChemSpider was controlled by a private corporation, ChemZoo Inc. The system was first launched as a beta release in March 2007 and transitioned to release in March 2008.

Services

A number of services are available online. These include the conversion of chemical names to chemical structures, the generation of SMILES and InChI strings, the prediction of many physicochemical parameters, and integration with a web service for NMR prediction.

SyntheticPages

SyntheticPages is a free interactive database of synthetic chemistry procedures operated by the Royal Society of Chemistry. [16] Users submit synthetic procedures which they have conducted themselves for publication on the site. These procedures may be original works, but they are more often based on literature reactions. Citations to the original published procedure are made where appropriate. They are checked by a scientific editor before posting. The pages do not undergo formal peer-review like a scientific journal article but comments can be made by logged-in users. The comments are also moderated by scientific editors.

The intention is to collect practical experience of how to conduct useful chemical synthesis in the lab. While experimental methods published in an ordinary academic journal are listed formally and concisely, the procedures in ChemSpider SyntheticPages are given with more practical detail. Informality is encouraged. Comments by submitters are included as well. Other publications with comparable amounts of detail include Organic Syntheses and Inorganic Syntheses.

The SyntheticPages site was originally set up by Professors Kevin Booker-Milburn (University of Bristol), Stephen Caddick (University College London), Peter Scott (University of Warwick) and Dr Max Hammond. In February 2010 a merger was announced [17] with the Royal Society of Chemistry's chemical structure search engine ChemSpider and the formation of ChemSpider|SyntheticPages (CS|SP).

Open PHACTS

ChemSpider served as the chemical compound repository for the Open PHACTS project, an Innovative Medicines Initiative project. Open PHACTS was developed to open standards, with an open-access, semantic web approach, to address bottlenecks in small-molecule drug discovery: disparate information sources, lack of standards and information overload. [18]

References

  1. Van Noorden, R. (2012). "Chemistry's web of data expands". Nature. 483 (7391): 524. Bibcode:2012Natur.483..524V. doi:10.1038/483524a. PMID 22460877.
  2. "ChemSpider Blog » Blog Archive » ChemSpider Adopts Creative Commons Licenses". www.chemspider.com. Archived from the original on 2015-04-02. Retrieved 2014-03-21.
  3. Williams, A. J. (Jan–Feb 2008). "ChemSpider and Its Expanding Web: Building a Structure-Centric Community for Chemists". Chemistry International. 30 (1).
  4. Williams, A. J. (2008). "Public chemical compound databases". Current Opinion in Drug Discovery & Development. 11 (3): 393–404. PMID 18428094.
  5. Brumfiel, G. (2008). "Chemists spin a web of data". Nature. 453 (7192): 139. Bibcode:2008Natur.453..139B. doi:10.1038/453139a. PMID 18464701.
  6. Williams, A. J. (2011). "Chemspider: A Platform for Crowdsourced Collaboration to Curate Data Derived from Public Compound Databases". Collaborative Computational Technologies for Biomedical Research. pp. 363–386. doi:10.1002/9781118026038.ch22. ISBN 9781118026038.
  7. Pence, H. E.; Williams, A. (2010). "ChemSpider: An Online Chemical Information Resource". Journal of Chemical Education. 87 (11): 1123. Bibcode:2010JChEd..87.1123P. doi:10.1021/ed100697w.
  8. "Data Sources". Chemspider. Retrieved 2019-05-16.
  9. "ChemSpider Blog » Blog Archive » The US EPA DSSTox Browser Connects to ChemSpider". ChemSpider. 2008-08-23. Archived from the original on 2017-11-07. Retrieved 2017-11-07.
  10. Hettne, K. M.; Williams, A. J.; Van Mulligen, E. M.; Kleinjans, J.; Tkachenko, V.; Kors, J. A. (2010). "Automatic vs. Manual curation of a multi-source chemical dictionary: The impact on text mining". Journal of Cheminformatics. 2 (1): 3. doi:10.1186/1758-2946-2-3. PMC 2848622. PMID 20331846.
  11. "ChemSpider Blog » Blog Archive » Who Would Like to Have the Entire ChemSpider Database?". www.chemspider.com. Archived from the original on 2015-09-24. Retrieved 2014-04-18.
  12. "ChemSpider on the App Store". App Store.
  13. "ChemSpider Mobile - Android Apps on Google Play". play.google.com.
  14. Williams, A. (2008-10-23). "Welcome ChemMantis to ChemZoo and a Call for Contributions from the Community". Blog post. Archived 2015-09-24 at the Wayback Machine.
  15. "RSC acquires ChemSpider". Royal Society of Chemistry. 2009-05-11. Retrieved 2009-05-11.
  16. "ChemSpider SyntheticPages". ChemSpider SyntheticPages. Royal Society of Chemistry. Retrieved 2012-06-26.
  17. "ChemSpider and SyntheticPages support synthetic chemistry". RSC Publishing. Royal Society of Chemistry. 2010-02-05. Archived from the original on 2012-07-26. Retrieved 2012-06-26.
  18. Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; Mons, B. (2012). "Open PHACTS: Semantic interoperability for drug discovery". Drug Discovery Today. 17 (21–22): 1188–1198. doi:10.1016/j.drudis.2012.05.016. PMID 22683805.