Crystallography Open Database

Crystallography Open Database (COD)
Content
Description Crystal structures and platform for world-wide collaboration
Contact
Research center Vilnius University
Authors Saulius Gražulis
Primary citation Gražulis & al. (2012) [1]
Release date 2003
Access
Data format Crystallographic Information File (.cif)
Website http://www.crystallography.net
Public SQL access http://wiki.crystallography.net/howtoquerycod/

The Crystallography Open Database (COD) is a database of crystal structures. [1] Unlike comparable crystallographic databases, it is entirely open-access: registered users can contribute published and unpublished structures of small molecules and of crystals with small to medium-sized unit cells. As of May 2016, the database held more than 360,000 entries. [2] Entries come from various contributors and are stored as Crystallographic Information Files (CIF), a format defined by the International Union of Crystallography (IUCr). Five sites worldwide currently mirror the database. The 3D structures of compounds can be converted to input files for 3D printers. [3]
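As a rough illustration of what a Crystallographic Information File holds, the following Python sketch reads the unit-cell parameters from a small CIF fragment and computes the cell volume. The fragment is invented for illustration and is not an actual COD entry. The helper names `parse_cell` and `cell_volume` are hypothetical, and the parser only handles simple one-line `tag value` items; a real CIF reader would also need to handle loops, quoted strings, and multi-line values.

```python
import math
import re

# Hypothetical CIF fragment for illustration; not an actual COD entry.
SAMPLE_CIF = """\
data_example
_cell_length_a    5.6402(2)
_cell_length_b    5.6402(2)
_cell_length_c    5.6402(2)
_cell_angle_alpha 90
_cell_angle_beta  90
_cell_angle_gamma 90
"""

# Data names for the unit cell as used in CIF files.
CELL_TAGS = (
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
)


def parse_cell(cif_text):
    """Collect the six unit-cell parameters from simple 'tag value' lines."""
    cell = {}
    for line in cif_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in CELL_TAGS:
            # Drop a trailing standard uncertainty such as "(2)".
            cell[parts[0]] = float(re.sub(r"\(\d+\)$", "", parts[1]))
    return cell


def cell_volume(cell):
    """Volume of a general (triclinic) unit cell, in cubic angstroms."""
    a, b, c = (cell[tag] for tag in CELL_TAGS[:3])
    ca, cb, cg = (math.cos(math.radians(cell[tag])) for tag in CELL_TAGS[3:])
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


cell = parse_cell(SAMPLE_CIF)
volume = cell_volume(cell)
print(f"a = {cell['_cell_length_a']} Å, V = {volume:.2f} Å^3")
```

For a cubic cell such as the one in the sample, the general formula reduces to the cube of the edge length, which gives a quick sanity check on the result.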

Related Research Articles

<span class="mw-page-title-main">Crystallography</span> Scientific study of crystal structures

Crystallography is the branch of science devoted to the study of molecular and crystalline structure and properties. The word crystallography is derived from the Ancient Greek words κρύσταλλος and γράφειν. In July 2012, the United Nations recognised the importance of the science of crystallography by proclaiming 2014 the International Year of Crystallography.

<span class="mw-page-title-main">X-ray crystallography</span> Technique used for determining crystal structures and identifying mineral compounds

X-ray crystallography is the experimental science of determining the atomic and molecular structure of a crystal, in which the crystalline structure causes a beam of incident X-rays to diffract in specific directions. By measuring the angles and intensities of the X-ray diffraction, a crystallographer can produce a three-dimensional picture of the density of electrons within the crystal and the positions of the atoms, as well as their chemical bonds, crystallographic disorder, and other information.
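The link between a measured diffraction angle and the spacing of lattice planes is usually expressed through Bragg's law, nλ = 2d sin θ. The sketch below is a minimal Python illustration of that relation. The function names are hypothetical, and the default wavelength assumes a copper Kα1 source, which is an assumption of this example and not something stated in the text above.

```python
import math

# Assumed X-ray wavelength in angstroms (Cu K-alpha1); change for other sources.
CU_K_ALPHA1 = 1.5406


def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA1, order=1):
    """Lattice-plane spacing d from a diffraction angle 2-theta (degrees)."""
    theta = math.radians(two_theta_deg / 2)
    return order * wavelength / (2 * math.sin(theta))


def two_theta(d, wavelength=CU_K_ALPHA1, order=1):
    """Diffraction angle 2-theta (degrees) expected for a plane spacing d."""
    return 2 * math.degrees(math.asin(order * wavelength / (2 * d)))


d = d_spacing(30.0)
print(f"2θ = 30.0° corresponds to d = {d:.4f} Å")
print(f"round trip: 2θ = {two_theta(d):.4f}°")
```

The two functions are inverses of each other, so converting an angle to a spacing and back should reproduce the original angle up to floating-point error.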

The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules such as proteins and nucleic acids, which is overseen by the Worldwide Protein Data Bank (wwPDB). These structural data are obtained and deposited by biologists and biochemists worldwide through the use of experimental methodologies such as X-ray crystallography, NMR spectroscopy, and, increasingly, cryo-electron microscopy. All submitted data are reviewed by expert biocurators and, once approved, are made freely available on the Internet under the CC0 Public Domain Dedication. Global access to the data is provided by the websites of the wwPDB member organisations.

<span class="mw-page-title-main">Structural bioinformatics</span> Bioinformatics subfield

Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods of analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge.

BioJava is an open-source software project dedicated to providing Java tools for processing biological data. BioJava is a set of library functions written in the programming language Java for manipulating sequences, protein structures, file parsers, Common Object Request Broker Architecture (CORBA) interoperability, Distributed Annotation System (DAS), access to AceDB, dynamic programming, and simple statistical routines. BioJava supports a range of data, from DNA and protein sequences to 3D protein structures. The BioJava libraries are useful for automating many routine bioinformatics tasks, such as parsing a Protein Data Bank (PDB) file or interacting with Jmol. This application programming interface (API) provides various file parsers, data models and algorithms to facilitate working with the standard data formats and enables rapid application development and analysis.

<span class="mw-page-title-main">Kinemage</span>

A kinemage is an interactive graphic scientific illustration. It is often used to visualize molecules, especially proteins, although it can also represent other types of 3-dimensional data. The kinemage system is designed to optimize ease of use, interactive performance, and the perception and communication of detailed 3D information. The kinemage information is stored in a human- and machine-readable text file that describes the hierarchy of display objects and their properties, and includes optional explanatory text. The kinemage format is a defined chemical MIME type of 'chemical/x-kinemage' with the file extension '.kin'.

<span class="mw-page-title-main">Tom Blundell</span> British biochemist

Sir Thomas Leon Blundell is a British biochemist, structural biologist, and science administrator. He was a member of the team of Dorothy Hodgkin that in 1969 solved the first structure of a protein hormone, insulin. Blundell has made contributions to the structural biology of polypeptide hormones, growth factors, receptor activation, signal transduction, and DNA double-strand break repair, subjects important in cancer, tuberculosis, and familial diseases. He has developed software for protein modelling and understanding the effects of mutations on protein function, leading to new approaches to structure-guided and fragment-based lead discovery. In 1999 he co-founded the oncology company Astex Therapeutics, which has moved ten drugs into clinical trials. Blundell has played central roles in restructuring British research councils and, as President of the UK Science Council, in developing professionalism in the practice of science.

<span class="mw-page-title-main">Jane S. Richardson</span> American biophysicist

Jane Shelby Richardson is an American biophysicist best known for developing the Richardson diagram, or ribbon diagram, a method of representing the 3D structure of proteins. Ribbon diagrams have become a standard representation of protein structures that has facilitated further investigation of protein structure and function globally. With interests in astronomy, math, physics, botany, and philosophy, Richardson took an unconventional route to establishing a science career. Richardson is a professor in biochemistry at Duke University.

<span class="mw-page-title-main">Cambridge Structural Database</span>

The Cambridge Structural Database (CSD) is both a repository and a validated and curated resource for the three-dimensional structural data of molecules generally containing at least carbon and hydrogen, comprising a wide range of organic, metal-organic and organometallic molecules. Its entries are complementary to those of other crystallographic databases such as the Protein Data Bank (PDB), the Inorganic Crystal Structure Database and the International Centre for Diffraction Data. The data, typically obtained by X-ray crystallography and less frequently by electron diffraction or neutron diffraction, and submitted by crystallographers and chemists from around the world, are freely accessible on the Internet via the CSD's parent organization's website. The CSD is overseen by the Cambridge Crystallographic Data Centre (CCDC), a not-for-profit incorporated company.

Acta Crystallographica is a series of peer-reviewed scientific journals, with articles centred on crystallography, published by the International Union of Crystallography (IUCr). Originally established in 1948 as a single journal called Acta Crystallographica, the series now comprises six independent Acta Crystallographica titles.

<span class="mw-page-title-main">Helen M. Berman</span> American chemist

Helen Miriam Berman is a Board of Governors Professor of Chemistry and Chemical Biology at Rutgers University and a former director of the RCSB Protein Data Bank. A structural biologist, her work includes structural analysis of protein-nucleic acid complexes, and the role of water in molecular interactions. She is also the founder and director of the Nucleic Acid Database, and led the Protein Structure Initiative Structural Genomics Knowledgebase.

A crystallographic database is a database specifically designed to store information about the structure of molecules and crystals. Crystals are solids having, in all three dimensions of space, a regularly repeating arrangement of atoms, ions, or molecules. They are characterized by symmetry, morphology, and directionally dependent physical properties. A crystal structure describes the arrangement of atoms, ions, or molecules in a crystal.

PDBsum is a database that provides an overview of the contents of each 3D macromolecular structure deposited in the Protein Data Bank (PDB).

<span class="mw-page-title-main">Structure validation</span> Process of evaluating 3-dimensional atomic models of biomacromolecules

Macromolecular structure validation is the process of evaluating the reliability of 3-dimensional atomic models of large biological molecules such as proteins and nucleic acids. These models, which provide 3D coordinates for each atom in the molecule, come from structural biology experiments such as X-ray crystallography or nuclear magnetic resonance (NMR). The validation has three aspects: 1) checking the validity of the thousands to millions of measurements in the experiment; 2) checking how consistent the atomic model is with those experimental data; and 3) checking the consistency of the model with known physical and chemical properties.

<span class="mw-page-title-main">Disordered Structure Refinement</span>

The Disordered Structure Refinement program (DSR), written by Daniel Kratzert, is designed to simplify the modeling of molecular disorder in crystal structures using SHELXL by George M. Sheldrick. It has a database of approximately 120 standard solvent molecules and molecular moieties. These can be inserted into the crystal structure with little effort, while at the same time chemically meaningful binding and angular restraints are set. DSR was developed because the previous description of disorder in crystal structures with SHELXL was very lengthy and error-prone. Instead of editing large text files manually and defining restraints manually, this process is automated with DSR.

Non-canonical base pairs are planar hydrogen-bonded pairs of nucleobases whose hydrogen bonding patterns differ from those observed in Watson-Crick base pairs, as in the classic double helical DNA. The structures of polynucleotide strands of both DNA and RNA molecules can be understood in terms of sugar-phosphate backbones consisting of phosphodiester-linked D-2′-deoxyribofuranose sugar moieties, with purine or pyrimidine nucleobases covalently linked to them. Here, the N9 atoms of the purines, guanine and adenine, and the N1 atoms of the pyrimidines, cytosine and thymine, respectively, form glycosidic linkages with the C1′ atom of the sugars. These nucleobases can be schematically represented as triangles with one of their vertices linked to the sugar, and the three sides accounting for three edges through which they can form hydrogen bonds with other moieties, including other nucleobases. The side opposite the sugar-linked vertex is traditionally called the Watson-Crick edge, since it is involved in forming the Watson-Crick base pairs that constitute the building blocks of double helical DNA. The two sides adjacent to the sugar-linked vertex are referred to, respectively, as the Sugar and Hoogsteen edges.

<span class="mw-page-title-main">Mercury (crystallography)</span>

Mercury is freeware developed by the Cambridge Crystallographic Data Centre, originally designed as a crystal structure visualization tool. It supports three-dimensional visualization of crystal structures and assists in the drawing and analysis of crystal packing and intermolecular interactions. The current version of Mercury can read ".cif", ".mol", ".mol2", ".pdb", ".res", ".sd" and ".xyz" files. Mercury also has its own file format, with the filename extension ".mryx".

<span class="mw-page-title-main">CrystalExplorer</span> Crystal structure analysis software

CrystalExplorer (CE) is freeware designed to analyse crystal structures supplied in the *.cif file format.

References

  1. Gražulis, Saulius; Daškevič, Adriana; Merkys, Andrius; Chateigner, Daniel; Lutterotti, Luca; Quirós, Miguel; Serebryanaya, Nadezhda R.; Moeck, Peter; Downs, Robert T.; Le Bail, Armel (Jan 2012). "Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration". Nucleic Acids Res. 40 (Database issue). England: D420-7. doi:10.1093/nar/gkr900. PMC 3245043. PMID 22070882.
  2. Oxford Journal - Nucleic Acids Research. "Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration". October 5, 2011.
  3. Scalfani, Vincent F.; Williams, Antony J.; Tkachenko, Valery; Karapetyan, Karen; Pshenichnov, Alexey; Hanson, Robert M.; Liddie, Jahred M.; Bara, Jason E. (23 November 2016). "Programmatic conversion of crystal structures into 3D printable files using Jmol". Journal of Cheminformatics. 8 (1): 66. doi:10.1186/s13321-016-0181-z. PMC 5122160. PMID 27933103.