World Wide Molecular Matrix

The World Wide Molecular Matrix (WWMM) was a proposed electronic repository for unpublished chemical data. First introduced in 2002 by Peter Murray-Rust and his colleagues in the chemistry department at the University of Cambridge in the United Kingdom, WWMM was intended to provide a free, easily searchable database of information about thousands of complicated molecules, data that would otherwise remain inaccessible to scientists.

Murray-Rust, a chemical informatics specialist, has estimated that 80% of the results produced by chemists around the world are never published in scientific journals.[1] Most of this data is not ground-breaking, yet it could conceivably be of use to scientists doing related projects, if they could access it. The WWMM was proposed as a solution to this problem. It would house the results of experiments on over 100,000 molecules in physical chemistry, organic chemistry, biochemistry and medicinal chemistry.

In other scientific fields, the need for a similar repository to house inaccessible information could be more acute. In a presentation at the "CERN Workshop on Innovations in Scholarly Communications (OAI4)", Murray-Rust said that chemistry actually leads other fields in published data. He estimated that the majority of the data in some scientific fields never reaches publication.[1]

Although scientific in nature, the WWMM was part of the broader open archives and open source movements, which push to make more information freely available to any user via the Internet or World Wide Web. In his CERN presentation, Murray-Rust stated that the WWMM was a "response to the expense of [scientific] journals", and he asked the rhetorical question, "Can we win the war to make data open, or will it be absorbed into the publishing and pseudo-publishing world?" Murray-Rust and his colleagues are also responsible for the development of the Chemical Markup Language (CML), a variant of XML intended for chemists.

Related Research Articles

Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of molecules, groups of molecules, and solids. It is essential because, apart from relatively recent results concerning the hydrogen molecular ion, the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, they can in some cases predict hitherto unobserved chemical phenomena. Computational chemistry is widely used in the design of new drugs and materials.

The following outline is provided as an overview of and topical guide to chemistry:

Open Archives Initiative

The Open Archives Initiative (OAI) was an informal organization, centered on the colleagues Herbert Van de Sompel, Carl Lagoze, Michael L. Nelson and Simeon Warner, that developed and applied technical interoperability standards for archives to share catalogue information (metadata). The group came together in the late 1990s and was active for around twenty years. OAI coordinated three specification activities in particular: OAI-PMH, OAI-ORE and ResourceSync. Throughout, the group worked towards building a "low-barrier interoperability framework" for archives containing digital content, to allow people to harvest metadata. Such sets of metadata have since been harvested to provide "value-added services", often by combining different data sets.

A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.

Cheminformatics refers to the use of physical chemistry theory with computer and information science techniques (so-called "in silico" techniques) applied to a range of descriptive and prescriptive problems in the field of chemistry, including its applications to biology and related molecular fields. Such in silico techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and in such fields as environmental science and pharmacology, where chemical processes are involved or studied.

Chemical Markup Language (CML) is an approach to managing molecular information using tools such as XML and Java. It was the first domain-specific implementation based strictly on XML, first based on a DTD and later on an XML Schema, the most robust and widely used system for precise information management in many areas. It has been developed over more than a decade by Murray-Rust, Rzepa and others and has been tested in many areas and on a variety of machines.
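Because CML is plain XML, a generic XML parser can read it. The following is a minimal sketch in Python, assuming a hand-written sample molecule (the `SAMPLE_CML` string, its atom identifiers and its coordinates are illustrative, not taken from any WWMM data) and a hypothetical helper named `read_atoms`; it extracts each atom's identifier and element symbol.

```python
import xml.etree.ElementTree as ET

# Namespace used by CML schema documents.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Illustrative sample only: a water molecule with made-up coordinates.
SAMPLE_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="m1">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>"""


def read_atoms(cml_text):
    """Return a list of (atom id, element symbol) pairs from a CML string."""
    root = ET.fromstring(cml_text)
    return [
        (atom.get("id"), atom.get("elementType"))
        for atom in root.iterfind(".//cml:atom", CML_NS)
    ]


if __name__ == "__main__":
    for atom_id, element in read_atoms(SAMPLE_CML):
        print(atom_id, element)
```

A real CML document may carry many more elements and attributes than this sketch reads; it is intended only to show that standard XML tooling applies.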

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.
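An OAI-PMH request is an ordinary HTTP request whose query string names a protocol verb. The sketch below, in Python, builds a `ListRecords` request asking for Dublin Core metadata (`oai_dc`). The base URL `https://archive.example.org/oai` and the helper name `list_records_url` are assumptions made for illustration; a real archive publishes its own base URL.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the archive's OAI-PMH endpoint (hypothetical in this sketch).
    metadata_prefix selects the metadata format; oai_dc is Dublin Core.
    set_spec optionally restricts harvesting to one set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the shape of the request.
    print(list_records_url("https://archive.example.org/oai"))
```

A harvester would fetch this URL, parse the XML response, and repeat the request with any resumption token the archive returns; those steps are omitted here.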

A chemical file format is a type of data file used specifically for depicting molecular data. One of the most widely used is the chemical table file format, which is similar to Structure Data Format (SDF) files. They are text files that represent multiple chemical structure records and associated data fields. The XYZ file format is a simple format that usually gives the number of atoms in the first line, a comment on the second, followed by a number of lines with atomic symbols and Cartesian coordinates. The Protein Data Bank Format is commonly used for proteins but is also used for other types of molecules. There are many other types which are detailed below. Various software systems are available to convert from one format to another.
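The XYZ layout described above is simple enough to read in a few lines. The following Python sketch assumes a single-structure file with whitespace-separated columns; the function name `parse_xyz` and the sample coordinates in `WATER_XYZ` are illustrative assumptions, and real XYZ files vary in their details.

```python
def parse_xyz(text):
    """Parse a single-structure XYZ document.

    Returns (comment, atoms), where atoms is a list of
    (symbol, x, y, z) tuples with float coordinates.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])          # line 1: number of atoms
    comment = lines[1].strip()       # line 2: free-text comment
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the header line")
    return comment, atoms


# Illustrative sample only: approximate water geometry.
WATER_XYZ = """3
water (illustrative coordinates)
O  0.000  0.000  0.117
H  0.000  0.757 -0.469
H  0.000 -0.757 -0.469
"""

if __name__ == "__main__":
    comment, atoms = parse_xyz(WATER_XYZ)
    print(comment)
    for atom in atoms:
        print(atom)
```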

Open Babel

Open Babel is computer software, a chemical expert system mainly used to interconvert chemical file formats.

JOELib

JOELib is computer software, a chemical expert system used mainly to interconvert chemical file formats. Because of its strong relationship to informatics, this program belongs more to the category of cheminformatics than to molecular modelling. It is available for Windows, Unix and other operating systems supporting the programming language Java. It is free and open-source software distributed under the GNU General Public License (GPL) 2.0.

Henry Rzepa

Henry Stephen Rzepa is a chemist and Emeritus Professor of Computational chemistry at Imperial College London.

Peter Murray-Rust, chemist and open-access research activist

Peter Murray-Rust is a chemist currently working at the University of Cambridge. In addition to his work in chemistry, Murray-Rust is known for his support of open access and open data.

MDL Information Systems, Inc. was a provider of R&D informatics products for the life sciences and chemicals industries. The company was launched as a computer-aided drug design firm in January 1978 in Hayward, California. The company was acquired by Symyx Technologies, Inc. in 2007. Subsequently Accelrys merged with Symyx. The Accelrys name was retained for the combined company. In 2014 Accelrys was acquired by Dassault Systemes. The Accelrys business unit was renamed BIOVIA.

The Willard Gibbs Award, presented by the Chicago Section of the American Chemical Society, was established in 1910 by William A. Converse (1862–1940), a former Chairman and Secretary of the Chicago Section of the society and named for Professor Josiah Willard Gibbs (1839–1903) of Yale University. Gibbs, whose formulation of the Phase Rule founded a new science, is considered by many to be the only American-born scientist whose discoveries are as fundamental in nature as those of Newton and Galileo.

Chemaxon is a cheminformatics and bioinformatics software development company, headquartered in Budapest with 250 employees. The company also has offices in Cambridge, San Diego, Basel and Prague, and it has distributors in China, India, Japan, South Korea, Singapore, and Australia.

Blue Obelisk

Blue Obelisk is an informal group of chemists who promote open data, open source, and open standards; it was initiated by Peter Murray-Rust and others in 2005. Multiple open source cheminformatics projects associate themselves with the Blue Obelisk, including, in alphabetical order, Avogadro, Bioclipse, cclib, Chemistry Development Kit, GaussSum, JChemPaint, JOELib, Kalzium, Open Babel, OpenSMILES, and UsefulChem.

William Klemperer, American chemist

William A. Klemperer (October 6, 1927 – November 5, 2017) was an American chemist who was one of the most influential chemical physicists and molecular spectroscopists in the second half of the 20th century. Klemperer is most widely known for three contributions: introducing molecular beam methods into chemical physics research; greatly increasing the understanding of nonbonding interactions between atoms and molecules through development of the microwave spectroscopy of van der Waals molecules formed in supersonic expansions; and pioneering astrochemistry, including developing the first gas phase chemical models of cold molecular clouds, which predicted an abundance of the molecular HCO+ ion that was later confirmed by radio astronomy.

Christoph Steinbeck, German chemist

Christoph Steinbeck is a German chemist who holds a professorship in analytical chemistry, cheminformatics and chemometrics at the Friedrich-Schiller-Universität Jena in Thuringia.

ChemWindow is a chemical structure drawing, molecule editing and publishing program, published by John Wiley & Sons as of 2020 and previously by Bio-Rad Laboratories, Inc. It was first developed by SoftShell International in the 1990s. Bio-Rad acquired this technology in 1996 and eventually made it part of their KnowItAll software product line, offering a specific ChemWindow edition of their software for structure drawing and publishing. They have also incorporated ChemWindow structure drawing components into their KnowItAll spectroscopy software packages with their DrawIt, ReportIt, and MineIt tools.

Panton Principles

The Panton Principles are a set of principles which were written to promote open science. They were first drafted in July 2009 at the Panton Arms pub in Cambridge.

References

  1. Peter Murray-Rust and Henry S. Rzepa (2004). "The Next Big Thing: From Hypermedia to Datuments". Journal of Digital Information, Vol 5, No 1. Texas Digital Library.