This page describes mining for molecules. Since molecules may be represented by molecular graphs, this is strongly related to graph mining and structured data mining. The main problem is how to represent molecules while discriminating the data instances. One way to do this is chemical similarity metrics, which have a long tradition in the field of cheminformatics.
Typical approaches to calculating chemical similarity use chemical fingerprints, but this loses the underlying information about the molecule topology. Mining the molecular graphs directly avoids this problem. So does the inverse QSAR problem, which is preferable for vectorial mappings.
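As a minimal sketch of fingerprint-based similarity, the snippet below computes the Tanimoto coefficient, one widely used measure, on two fingerprints represented as sets of on-bit positions. The function name `tanimoto` and the bit positions are illustrative assumptions, not taken from any particular toolkit.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two fingerprints.

    Each fingerprint is given as the set of indices of its "on" bits.
    """
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit positions for two molecules (illustrative values only).
mol_a = {1, 4, 7, 9}
mol_b = {1, 4, 8, 9, 12}

similarity = tanimoto(mol_a, mol_b)  # 3 shared bits out of 6 distinct bits
print(similarity)
```

Because only the presence or absence of bits is compared, two molecules with different connectivity can yield the same score, which is the loss of topological information mentioned above.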
A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.
Cheminformatics refers to the use of physical chemistry theory with computer and information science techniques (so-called "in silico" techniques) in application to a range of descriptive and prescriptive problems in the field of chemistry, including in its applications to biology and related molecular fields. Such in silico techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and in such fields as environmental science and pharmacology, where chemical processes are involved or studied.
Chemical Markup Language is an approach to managing molecular information using tools such as XML and Java. It was the first domain-specific implementation based strictly on XML, the most robust and widely used system for precise information management in many areas; it was first based on a DTD and later on an XML Schema. It has been developed over more than a decade by Murray-Rust, Rzepa and others and has been tested in many areas and on a variety of machines.
Quantitative structure–activity relationship models are regression or classification models used in the chemical and biological sciences and engineering. Like other regression models, QSAR regression models relate a set of "predictor" variables (X) to the potency of the response variable (Y), while classification QSAR models relate the predictor variables to a categorical value of the response variable.
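To make the regression case concrete, the following is a minimal sketch of a one-descriptor QSAR model fitted by ordinary least squares. The descriptor and potency values are hypothetical numbers chosen for illustration, and `fit_line` is a helper defined here, not part of any QSAR package; real models typically use many descriptors and a dedicated statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: one predictor (X) and a measured potency (Y).
descriptor = [1.0, 2.0, 3.0, 4.0]
potency = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(descriptor, potency)

# Predict the potency of a new compound from its descriptor value.
predicted = intercept + slope * 5.0
print(slope, intercept, predicted)
```

A classification QSAR model would replace the numeric response with a category label (for example active or inactive) and use a classifier in place of the line fit.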
Retrosynthetic analysis is a technique for solving problems in the planning of organic syntheses. This is achieved by transforming a target molecule into simpler precursor structures regardless of any potential reactivity/interaction with reagents. Each precursor material is examined using the same method. This procedure is repeated until simple or commercially available structures are reached. These simpler/commercially available compounds can be used to form a synthesis of the target molecule. E.J. Corey formalized this concept in his book The Logic of Chemical Synthesis.
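The recursive character of this procedure can be sketched as follows. This is a toy illustration only: the compound names, the `DISCONNECTIONS` table and the `AVAILABLE` set are hypothetical placeholders, whereas a real planner would derive precursors from chemical transforms.

```python
# Hypothetical table mapping a compound to the simpler precursors it is split into.
DISCONNECTIONS = {
    "target": ["intermediate_1", "intermediate_2"],
    "intermediate_1": ["reagent_a", "reagent_b"],
    "intermediate_2": ["reagent_c", "reagent_a"],
}

# Hypothetical set of simple or commercially available compounds.
AVAILABLE = {"reagent_a", "reagent_b", "reagent_c"}


def retrosynthesize(compound):
    """Return a tree of precursors, stopping at available compounds."""
    if compound in AVAILABLE:
        return compound
    precursors = DISCONNECTIONS.get(compound)
    if precursors is None:
        raise ValueError(f"no known disconnection for {compound!r}")
    # Each precursor is examined with the same method.
    return {compound: [retrosynthesize(p) for p in precursors]}


def starting_materials(tree):
    """Collect the leaves of the tree, i.e. the compounds to purchase."""
    if isinstance(tree, str):
        return {tree}
    leaves = set()
    for children in tree.values():
        for child in children:
            leaves |= starting_materials(child)
    return leaves


tree = retrosynthesize("target")
print(sorted(starting_materials(tree)))
```

Reading the resulting tree from the leaves back to the root gives one candidate synthesis route for the target.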
A structural analog, also known as a chemical analog or simply an analog, is a compound having a structure similar to that of another compound, but differing from it in respect to a certain component.
JOELib is computer software, a chemical expert system used mainly to interconvert chemical file formats. Because of its strong relationship to informatics, this program belongs more to the category cheminformatics than to molecular modelling. It is available for Windows, Unix and other operating systems supporting the programming language Java. It is free and open-source software distributed under the GNU General Public License (GPL) 2.0.
ISIS/Draw was a chemical structure drawing program developed by MDL Information Systems. It introduced a number of file formats for the storage of chemical information that have become industry standards.
A unimolecular rectifier is a single organic molecule which functions as a rectifier of electric current. The idea was first proposed in 1974 by Arieh Aviram, then at IBM, and Mark Ratner, then at New York University. Their publication was the first serious and concrete theoretical proposal in the new field of molecular electronics (UE, for unimolecular electronics). Based on the mesomeric effect of certain chemical compounds on organic molecules, a molecular rectifier was built by simulating the p–n junction with the help of chemical compounds.
ChemSpider is a freely accessible online database of chemicals owned by the Royal Society of Chemistry. It contains information on more than 100 million molecules from over 270 data sources, each of them receiving a unique identifier called ChemSpider Identifier.
SMILES arbitrary target specification (SMARTS) is a language for specifying substructural patterns in molecules. The SMARTS line notation is expressive and allows extremely precise and transparent substructural specification and atom typing.
Chemical similarity refers to the similarity of chemical elements, molecules or chemical compounds with respect to either structural or functional qualities, i.e. the effect that the chemical compound has on reaction partners in inorganic or biological settings. Biological effects and thus also similarity of effects are usually quantified using the biological activity of a compound. In general terms, function can be related to the chemical activity of compounds.
LigandScout is computer software that allows creating three-dimensional (3D) pharmacophore models from structural data of macromolecule–ligand complexes, or from training and test sets of organic molecules. It incorporates a complete definition of 3D chemical features that describe the interaction of a bound small organic molecule (ligand) and the surrounding binding site of the macromolecule. These pharmacophores can be overlaid and superimposed using a pattern-matching based alignment algorithm that is solely based on pharmacophoric feature points instead of chemical structure. From such an overlay, shared features can be interpolated to create a so-called shared-feature pharmacophore that shares all common interactions of several binding sites/ligands or extended to create a so-called merged-feature pharmacophore. The software has been successfully used to predict new lead structures in drug design, e.g., predicting biological activity of novel human immunodeficiency virus (HIV) reverse transcriptase inhibitors.
This is a list of notable computer programs that are used for nucleic acids simulations.
Louis Hodes was an American mathematician, computer scientist, and cancer researcher.
Periodic systems of molecules are charts of molecules similar to the periodic table of the elements. Construction of such charts was initiated in the early 20th century and is still ongoing.
Matched molecular pair analysis (MMPA) is a method in cheminformatics that compares the properties of two molecules that differ only by a single chemical transformation, such as the substitution of a hydrogen atom by a chlorine atom. Such pairs of compounds are known as matched molecular pairs (MMP). Because the structural difference between the two molecules is small, any experimentally observed change in a physical or biological property between the matched molecular pair can more easily be interpreted. The term was first coined by Kenny and Sadowski in the book Chemoinformatics in Drug Discovery.
In chemical graph theory, the Padmakar–Ivan (PI) index is a topological index of a molecule, used in biochemistry. The Padmakar–Ivan index is a generalization, introduced by Padmakar V. Khadikar and Iván Gutman, of the concept of the Wiener index, introduced by Harry Wiener. The Padmakar–Ivan index of a graph G is the sum over all edges uv of G of the number of edges which are not equidistant from u and v. Let G be a graph and e = uv an edge of G. Here $n_{eu}(e|G)$ denotes the number of edges lying closer to the vertex u than the vertex v, and $n_{ev}(e|G)$ is the number of edges lying closer to the vertex v than the vertex u. The Padmakar–Ivan index of a graph G is defined as
$$PI(G) = \sum_{e \in E(G)} \left[ n_{eu}(e|G) + n_{ev}(e|G) \right]$$
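The definition can be sketched in code as follows. This is a minimal illustration assuming a connected, undirected graph given as a list of edges. The distance from an edge xy to a vertex w is taken as the smaller of the distances from x and from y to w, which is the usual convention; under it the edge uv itself is equidistant from u and v and is never counted. The function names are chosen here for illustration.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def pi_index(edges):
    """Padmakar-Ivan index of a connected graph given as a list of (u, v) edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    dist = {vertex: bfs_distances(adjacency, vertex) for vertex in adjacency}

    def edge_to_vertex(edge, vertex):
        # Distance from an edge to a vertex: the nearer of its two endpoints.
        x, y = edge
        return min(dist[x][vertex], dist[y][vertex])

    total = 0
    for u, v in edges:
        for f in edges:
            # Count every edge that is strictly closer to one end of uv.
            if edge_to_vertex(f, u) != edge_to_vertex(f, v):
                total += 1
    return total


# Carbon skeletons as plain graphs: a three-atom chain and a four-membered ring.
chain = [("a", "b"), ("b", "c")]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(pi_index(chain), pi_index(ring))
```

The double loop over edges is quadratic in the number of edges, which is adequate for molecule-sized graphs.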
A chemical graph generator is a software package to generate computer representations of chemical structures adhering to certain boundary conditions. The development of such software packages is a research topic of cheminformatics. Chemical graph generators are used in areas such as virtual library generation in drug design, in molecular design with specified properties, called inverse QSAR/QSPR, as well as in organic synthesis design, retrosynthesis or in systems for computer-assisted structure elucidation (CASE). CASE systems have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.