Peter Murray-Rust | |
---|---|
Born | 1941 (age 82–83) Guildford, England |
Alma mater | Balliol College, Oxford |
Known for | |
Awards | Herman Skolnik Award |
Scientific career | |
Fields | |
Institutions | |
Thesis | A structural investigation of some compounds showing charge-transfer properties (1969) |
Website | www-pmr |
Peter Murray-Rust is a chemist currently working at the University of Cambridge. Besides his work in chemistry, Murray-Rust is known for his support of open access and open data.
He was educated at Bootham School,[citation needed] a private school in York, and at Balliol College, Oxford. After obtaining a Doctor of Philosophy with a thesis entitled A structural investigation of some compounds showing charge-transfer properties, he became a lecturer in chemistry at the (new) University of Stirling and was the first warden of Andrew Stewart Hall of Residence. In 1982, he moved to Glaxo Group Research at Greenford to head Molecular Graphics, [1] Computational Chemistry and later protein structure determination. He was Professor of Pharmacy at the University of Nottingham from 1996 to 2000, where he set up the Virtual School of Molecular Sciences. He is now Reader Emeritus in Molecular Informatics at the University of Cambridge and Senior Research Fellow Emeritus at Churchill College, Cambridge.
His research interests have involved the automated analysis of data in scientific publications, the creation of virtual communities (e.g. the Virtual School of Natural Sciences in the Globewide Network Academy), and the Semantic Web. With Henry Rzepa, he has extended this to chemistry through the development of markup languages, especially Chemical Markup Language. [2] He campaigns for open data, particularly in science, and is on the advisory board of Open Knowledge International and a co-author of the Panton Principles for open scientific data. [3] Together with a few other chemists, he was a founder member of the Blue Obelisk movement in 2005. [4] [5] [6]
In 2002, Peter Murray-Rust and his colleagues proposed an electronic repository for unpublished chemical data called the World Wide Molecular Matrix (WWMM). In January 2011, a symposium on his career and vision, called Visions of a Semantic Molecular Future, was held. [7] [8] [9] [10] In 2011, he and Henry Rzepa were joint recipients of the Herman Skolnik Award of the American Chemical Society. [11] In 2014, he was awarded a Fellowship by the Shuttleworth Foundation to develop the automated mining of science from the literature.
In 2009, Murray-Rust coined the term "Doctor Who" model for a phenomenon exhibited by the Blue Obelisk project and other Open Science projects: when a project leader no longer has the resources to lead a project (e.g. because he or she has moved to another university with other tasks), someone else steps up as the new leader and continues the project. [12] [13] This is a reference to the long-running British science fiction television series Doctor Who, in which the main character periodically regenerates into a different form, played by a different actor.
In 2014, Murray-Rust was granted a Fellowship by the Shuttleworth Foundation for the ContentMine project, which aims to use machines to liberate 100,000,000 facts from the scientific literature. [14]
Murray-Rust is also known for his work on making scientific knowledge from the literature freely available and, in doing so, for taking a stance against publishers that are not fully compliant with the Berlin Declaration on Open Access. In 2014, he actively raised awareness of glitches in Elsevier's publishing system, which imposed restrictions on the reuse of papers even after the authors had paid Elsevier to make them freely available. [15]
A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.
Cheminformatics refers to the use of physical chemistry theory with computer and information science techniques (so-called "in silico" techniques) in application to a range of descriptive and prescriptive problems in the field of chemistry, including in its applications to biology and related molecular fields. Such in silico techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and in fields such as environmental science and pharmacology, where chemical processes are involved or studied.
Chemical Markup Language is an approach to managing molecular information using tools such as XML and Java. It was the first domain-specific implementation based strictly on XML, first on a DTD and later on an XML Schema, the most robust and widely used system for precise information management in many areas. It has been developed over more than a decade by Murray-Rust, Rzepa and others and has been tested in many areas and on a variety of machines.
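As a rough illustration of what XML-based molecular markup looks like in practice, the following minimal sketch reads a small CML-style document with Python's standard library. The element and attribute names (`molecule`, `atomArray`, `atom`, `elementType`, `bondArray`, `bond`, `atomRefs2`) and the namespace URI follow common CML usage as best recalled here; they are assumptions that should be checked against the published CML schema, and the water molecule is an invented example.

```python
import xml.etree.ElementTree as ET

# Assumed CML namespace; verify against the CML schema before relying on it.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hypothetical example document: a water molecule without coordinates.
WATER_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>"""

root = ET.fromstring(WATER_CML)

# Map each atom id to its element symbol.
elements = {
    atom.get("id"): atom.get("elementType")
    for atom in root.findall("cml:atomArray/cml:atom", CML_NS)
}

# Each bond refers to its two atoms by id, separated by whitespace.
bonds = [
    tuple(bond.get("atomRefs2").split())
    for bond in root.findall("cml:bondArray/cml:bond", CML_NS)
]

print(elements)
print(bonds)
```

Because the markup is plain XML, any generic XML parser can read it; a schema-aware tool would additionally validate the document.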
A chemical file format is a type of data file which is used specifically for depicting molecular data. One of the most widely used is the chemical table file format, which is similar to Structure Data Format (SDF) files. They are text files that represent multiple chemical structure records and associated data fields. The XYZ file format is a simple format that usually gives the number of atoms in the first line, a comment on the second, followed by a number of lines with atomic symbols and Cartesian coordinates. The Protein Data Bank Format is commonly used for proteins but is also used for other types of molecules. There are many other types. Various software systems are available to convert from one format to another.
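The XYZ layout described above (atom count, comment line, then one line per atom) is simple enough to read by hand. The sketch below is a minimal, illustrative parser for a single-molecule XYZ string; the function name `parse_xyz`, the `Atom` tuple and the water coordinates are hypothetical, and real-world XYZ files may carry extra columns or multiple frames that this sketch does not handle.

```python
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    symbol: str
    x: float
    y: float
    z: float


def parse_xyz(text: str) -> Tuple[str, List[Atom]]:
    """Parse one XYZ record into (comment, atoms)."""
    lines = text.strip("\n").splitlines()
    if len(lines) < 2:
        raise ValueError("an XYZ record needs a count line and a comment line")
    count = int(lines[0].split()[0])   # line 1: number of atoms
    comment = lines[1].strip()         # line 2: free-form comment
    atom_lines = lines[2:2 + count]    # remaining lines: symbol x y z
    if len(atom_lines) != count:
        raise ValueError(f"expected {count} atom lines, found {len(atom_lines)}")
    atoms = []
    for line in atom_lines:
        symbol, x, y, z = line.split()[:4]
        atoms.append(Atom(symbol, float(x), float(y), float(z)))
    return comment, atoms


# Illustrative input only; the coordinates are approximate.
WATER_XYZ = """3
water (illustrative coordinates, in angstroms)
O   0.000   0.000   0.117
H   0.000   0.757  -0.469
H   0.000  -0.757  -0.469
"""

comment, atoms = parse_xyz(WATER_XYZ)
print(comment, [a.symbol for a in atoms])
```

Checking the declared atom count against the number of coordinate lines catches truncated files early, which is one of the few consistency checks the format allows.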
Mathematical chemistry is the area of research engaged in novel applications of mathematics to chemistry; it concerns itself principally with the mathematical modeling of chemical phenomena. Mathematical chemistry has also sometimes been called computer chemistry, but should not be confused with computational chemistry.
Open Babel is free chemical informatics software designed to facilitate the conversion of chemical file formats and manage molecular data. It serves as a chemical expert system, widely used in fields such as cheminformatics, molecular modelling, and computational chemistry. Open Babel provides both a comprehensive library and command-line utilities, making it a versatile tool for researchers, developers, and professionals.
The World Wide Molecular Matrix (WWMM) was a proposed electronic repository for unpublished chemical data. First introduced in 2002 by Peter Murray-Rust and his colleagues in the chemistry department at the University of Cambridge in the United Kingdom, WWMM provided a free, easily searchable database for information about thousands of complicated molecules, data that would otherwise remain inaccessible to scientists.
JOELib is computer software, a chemical expert system used mainly to interconvert chemical file formats. Because of its strong relationship to informatics, this program belongs more to the category cheminformatics than to molecular modelling. It is available for Windows, Unix and other operating systems supporting the programming language Java. It is free and open-source software distributed under the GNU General Public License (GPL) 2.0.
The Chemistry Development Kit (CDK) is computer software, a library in the programming language Java, for chemoinformatics and bioinformatics. It is available for Windows, Linux, Unix, and macOS. It is free and open-source software distributed under the GNU Lesser General Public License (LGPL) 2.0.
Henry Stephen Rzepa is a chemist and Emeritus Professor of Computational Chemistry at Imperial College London.
Substructure search (SSS) is a method to retrieve from a database only those chemicals matching a pattern of atoms and bonds which a user specifies. It is an application of graph theory, specifically subgraph matching in which the query is a hydrogen-depleted molecular graph. The mathematical foundations for the method were laid in the 1870s, when it was suggested that chemical structure drawings were equivalent to graphs with atoms as vertices and bonds as edges. SSS is now a standard part of cheminformatics and is widely used by pharmaceutical chemists in drug discovery.
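To make the graph-matching idea concrete, here is a minimal backtracking sketch of substructure search over hydrogen-depleted molecular graphs. It is a simplification under stated assumptions: atoms are matched by element symbol only, bond orders, aromaticity and stereochemistry are ignored, and the names `MolGraph`, `find_substructure` and the tiny `DATABASE` are hypothetical. Production systems typically add pre-screening and more refined matching algorithms.

```python
from typing import Dict, List, Optional, Set, Tuple


class MolGraph:
    """Hydrogen-depleted graph: atoms are vertices, bonds are edges."""

    def __init__(self, atoms: List[str], bonds: List[Tuple[int, int]]):
        self.atoms = atoms
        self.adj: Dict[int, Set[int]] = {i: set() for i in range(len(atoms))}
        for a, b in bonds:
            self.adj[a].add(b)
            self.adj[b].add(a)


def find_substructure(query: MolGraph, target: MolGraph) -> Optional[Dict[int, int]]:
    """Return one mapping of query atom index -> target atom index, or None."""
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(q: int) -> bool:
        if q == len(query.atoms):
            return True  # every query atom has been placed
        for t in range(len(target.atoms)):
            if t in used or target.atoms[t] != query.atoms[q]:
                continue
            # Each already-mapped neighbour of q must be bonded to t.
            if all(mapping[n] in target.adj[t] for n in query.adj[q] if n in mapping):
                mapping[q] = t
                used.add(t)
                if extend(q + 1):
                    return True
                del mapping[q]   # backtrack
                used.discard(t)
        return False

    return dict(mapping) if extend(0) else None


# Hypothetical mini database of heavy-atom skeletons (bond orders ignored).
DATABASE = {
    "ethane": MolGraph(["C", "C"], [(0, 1)]),
    "ethanol": MolGraph(["C", "C", "O"], [(0, 1), (1, 2)]),
    "acetic acid": MolGraph(["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
}

QUERY = MolGraph(["C", "O"], [(0, 1)])  # pattern: a carbon bonded to an oxygen

hits = [name for name, mol in DATABASE.items()
        if find_substructure(QUERY, mol) is not None]
print(hits)
```

The search only requires that every bond in the query exists in the target, so the target may contain additional atoms and bonds around the matched pattern, which is the behaviour a substructure query needs.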
ChemSpider is a freely accessible online database of chemicals owned by the Royal Society of Chemistry. It contains information on more than 100 million molecules from over 270 data sources, each of them receiving a unique identifier called ChemSpider Identifier.
Blue Obelisk is an informal group of chemists who promote open data, open source, and open standards; it was initiated by Peter Murray-Rust and others in 2005. Multiple open source cheminformatics projects associate themselves with the Blue Obelisk, including, in alphabetical order, Avogadro, Bioclipse, cclib, Chemistry Development Kit, GaussSum, JChemPaint, JOELib, Kalzium, Open Babel, OpenSMILES, and UsefulChem.
Avogadro is a molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It is extensible via a plugin architecture.
The Journal of Cheminformatics is a peer-reviewed open access scientific journal that covers cheminformatics and molecular modelling. It was established in 2009 with David Wild and Christoph Steinbeck as founding editors-in-chief, and was originally published by Chemistry Central. At the end of 2015, the Chemistry Central brand was retired and its titles, including Journal of Cheminformatics, were merged with the SpringerOpen portfolio of open access journals.
Antony John Williams is a British chemist and expert in the fields of both nuclear magnetic resonance (NMR) spectroscopy and cheminformatics at the United States Environmental Protection Agency. He is the founder of the ChemSpider website that was purchased by the Royal Society of Chemistry in May 2009. He is a science blogger and an author.
Christoph Steinbeck is a German chemist who holds a professorship in analytical chemistry, cheminformatics and chemometrics at the Friedrich-Schiller-Universität Jena in Thuringia.
The CompTox Chemicals Dashboard is a freely accessible online database created and maintained by the U.S. Environmental Protection Agency (EPA). The database provides access to multiple types of data including physicochemical properties, environmental fate and transport, exposure, usage, in vivo toxicity, and in vitro bioassay. EPA and other scientists use the data and models contained within the dashboard to help identify chemicals that require further testing and reduce the use of animals in chemical testing. The Dashboard is also used to provide public access to information from EPA Action Plans, e.g. around perfluorinated alkylated substances.
A chemical graph generator is a software package to generate computer representations of chemical structures adhering to certain boundary conditions. The development of such software packages is a research topic of cheminformatics. Chemical graph generators are used in areas such as virtual library generation in drug design, in molecular design with specified properties, called inverse QSAR/QSPR, as well as in organic synthesis design, retrosynthesis or in systems for computer-assisted structure elucidation (CASE). CASE systems have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.