ChemSpider

Description: More than 100 million chemical structures, properties and associated information
Research center: Cambridge, United Kingdom
Website: www.chemspider.com
License: Creative Commons Attribution Share-alike [2]

ChemSpider is a freely accessible online database of chemicals owned by the Royal Society of Chemistry. [3] [4] [5] [6] [7] It contains information on more than 100 million molecules from over 270 data sources; each molecule is assigned a unique identifier called the ChemSpider Identifier.

Sources

The database sources include: [8]

Professional databases

Crowdsourcing

The ChemSpider database can be updated with user contributions, including chemical structure deposition, spectra deposition and user curation. This is a crowdsourcing approach to developing an online chemistry database. Crowdsourced curation of the data has produced a dictionary of chemical names associated with chemical structures, which has been used in text-mining applications of the biomedical and chemical literature. [10]

However, database rights are not waived and no data dump is available; the FAQ states that only limited downloads are allowed. [11] The right to fork is therefore not guaranteed, and the project cannot be considered free/open.

Features

Searching

A number of available search modules are provided:

Chemistry document mark-up

The ChemSpider database has been used in combination with text mining as the basis of chemistry document markup. ChemMantis, [14] the Chemistry Markup And Nomenclature Transformation Integrated System, uses algorithms to identify and extract chemical names from documents and web pages, then converts those names to chemical structures using name-to-structure conversion algorithms and dictionary look-ups in the ChemSpider database. The result is a system that links chemistry documents to information look-up via ChemSpider across more than 150 data sources.
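The dictionary look-up step described above can be illustrated with a minimal sketch. This is not ChemMantis's actual algorithm, which is not detailed here; the small name-to-SMILES dictionary and the function name mark_up are hypothetical and chosen only for illustration. A real system would also fall back on name-to-structure conversion for names missing from the dictionary.

```python
import re

# Hypothetical mini-dictionary (an assumption for illustration only):
# chemical name -> SMILES string for the corresponding structure.
NAME_TO_SMILES = {
    "ethanol": "CCO",
    "acetic acid": "CC(=O)O",
    "benzene": "c1ccccc1",
}


def mark_up(text, dictionary=NAME_TO_SMILES):
    """Return (start, end, name, smiles) for each dictionary name found in text."""
    # Put longer names first so a multi-word name is preferred over any
    # shorter entry that could match at the same position.
    names = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        name = match.group(1).lower()
        hits.append((match.start(), match.end(), name, dictionary[name]))
    return hits


sample = "Ethanol was oxidised to acetic acid; benzene was used as solvent."
for start, end, name, smiles in mark_up(sample):
    print(f"{start}-{end}: {name} -> {smiles}")
```

Each hit carries the character offsets of the name in the source text, which is what a markup layer would need in order to attach a structure or a database link to that span.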

SyntheticPages

SyntheticPages is a free interactive database of synthetic chemistry procedures operated by the Royal Society of Chemistry. [15] Users submit synthetic procedures that they have carried out themselves for publication on the site. These procedures may be original works, but they are more often based on literature reactions. Citations to the original published procedure are given where appropriate. Submissions are checked by a scientific editor before posting. The pages do not undergo formal peer review like a scientific journal article, but logged-in users can comment on them, and the comments are also moderated by scientific editors. The intention is to collect practical experience of how to conduct useful chemical syntheses in the lab. Whereas experimental methods published in an ordinary academic journal are written formally and concisely, the procedures in ChemSpider SyntheticPages are given in more practical detail. Informality is encouraged, and comments by submitters are included as well. Other publications with comparable amounts of detail include Organic Syntheses and Inorganic Syntheses.

The SyntheticPages site was originally set up by Professors Kevin Booker-Milburn (University of Bristol), Stephen Caddick (University College London), Peter Scott (University of Warwick) and Max Hammond. In February 2010 a merger with the Royal Society of Chemistry's chemical structure search engine ChemSpider was announced, [16] forming ChemSpider|SyntheticPages (CS|SP).

Other services

A number of services are made available online. These include the conversion of chemical names to chemical structures, the generation of SMILES and InChI strings, the prediction of many physicochemical parameters, and integration with a web service for NMR prediction.

History

ChemSpider was acquired by the Royal Society of Chemistry (RSC) in May 2009. [17] Prior to the acquisition, ChemSpider was controlled by a private corporation, ChemZoo Inc. The system was first launched in March 2007 as a beta release and transitioned to a full release in March 2008.

Open PHACTS

ChemSpider served as the chemical compound repository for the Open PHACTS project, an Innovative Medicines Initiative project. Open PHACTS was developed to open standards, with an open-access, semantic web approach, to address bottlenecks in small-molecule drug discovery: disparate information sources, lack of standards and information overload. [18]

References

  1. Van Noorden, R. (2012). "Chemistry's web of data expands". Nature. 483 (7391): 524. Bibcode:2012Natur.483..524V. doi:10.1038/483524a. PMID 22460877.
  2. "ChemSpider Blog » Blog Archive » ChemSpider Adopts Creative Commons Licenses". www.chemspider.com. Archived from the original on 2015-04-02. Retrieved 2014-03-21.
  3. Williams, A. J. (Jan–Feb 2008). "ChemSpider and Its Expanding Web: Building a Structure-Centric Community for Chemists". Chemistry International. 30 (1).
  4. Williams, A. J. (2008). "Public chemical compound databases". Current Opinion in Drug Discovery & Development. 11 (3): 393–404. PMID 18428094.
  5. Brumfiel, G. (2008). "Chemists spin a web of data". Nature. 453 (7192): 139. Bibcode:2008Natur.453..139B. doi:10.1038/453139a. PMID 18464701.
  6. Williams, A. J. (2011). "Chemspider: A Platform for Crowdsourced Collaboration to Curate Data Derived from Public Compound Databases". Collaborative Computational Technologies for Biomedical Research. pp. 363–386. doi:10.1002/9781118026038.ch22. ISBN 9781118026038.
  7. Pence, H. E.; Williams, A. (2010). "ChemSpider: An Online Chemical Information Resource". Journal of Chemical Education. 87 (11): 1123. Bibcode:2010JChEd..87.1123P. doi:10.1021/ed100697w.
  8. "Data Sources". ChemSpider. Retrieved 2019-05-16.
  9. "ChemSpider Blog » Blog Archive » The US EPA DSSTox Browser Connects to ChemSpider". ChemSpider. 2008-08-23. Archived from the original on 2017-11-07. Retrieved 2017-11-07.
  10. Hettne, K. M.; Williams, A. J.; Van Mulligen, E. M.; Kleinjans, J.; Tkachenko, V.; Kors, J. A. (2010). "Automatic vs. Manual curation of a multi-source chemical dictionary: The impact on text mining". Journal of Cheminformatics. 2 (1): 3. doi:10.1186/1758-2946-2-3. PMC 2848622. PMID 20331846.
  11. "ChemSpider Blog » Blog Archive » Who Would Like to Have the Entire ChemSpider Database?". www.chemspider.com. Archived from the original on 2015-09-24. Retrieved 2014-04-18.
  12. "ChemSpider on the App Store". App Store.
  13. "ChemSpider Mobile - Android Apps on Google Play". play.google.com.
  14. Williams, A. (2008-10-23). "Welcome ChemMantis to ChemZoo and a Call for Contributions from the Community". Blog post. Archived 2015-09-24 at the Wayback Machine.
  15. "ChemSpider SyntheticPages". Royal Society of Chemistry. Retrieved 2012-06-26.
  16. "ChemSpider and SyntheticPages support synthetic chemistry". RSC Publishing. Royal Society of Chemistry. 2010-02-05. Archived from the original on 2012-07-26. Retrieved 2012-06-26.
  17. "RSC acquires ChemSpider". Royal Society of Chemistry. 2009-05-11. Retrieved 2009-05-11.
  18. Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; Mons, B. (2012). "Open PHACTS: Semantic interoperability for drug discovery". Drug Discovery Today. 17 (21–22): 1188–1198. doi:10.1016/j.drudis.2012.05.016. PMID 22683805.