Crystallographic Information File

Filename extension: .cif
Internet media type: chemical/x-cif
Developed by: International Union of Crystallography (IUCr)
Type of format: chemical file format
Extended from: Self-defining Text Archive and Retrieval
Extended to: mmCIF
Website: www.iucr.org/resources/cif

Crystallographic Information File (CIF) is a standard text file format for representing crystallographic information, promulgated by the International Union of Crystallography (IUCr). CIF was developed by the IUCr Working Party on Crystallographic Information in an effort sponsored by the IUCr Commission on Crystallographic Data and the IUCr Commission on Journals. The file format was initially published by Hall, Allen, and Brown [1] and has since been revised, most recently as versions 1.1 and 2.0. [2] Full specifications for the format are available at the IUCr website. Many molecular viewing programs, including Jmol, can read this format.
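Because CIF is a plain text format, a small file can be read with very little code. The following is a minimal sketch in Python, not a conforming parser: it assumes a file that contains only single-line tag/value pairs inside `data_` blocks, and it ignores loops, multi-line text fields, and trailing comments. The function name `parse_simple_cif` and the sample values are hypothetical and serve only as illustration; the data names are taken from the core CIF dictionary.

```python
def parse_simple_cif(text):
    """Parse single-line tag/value pairs into {block_name: {tag: value}}.

    Simplifying assumptions (hypothetical helper, not a full CIF parser):
    no loop_ constructs, no semicolon-delimited text fields, and no
    comments after a value on the same line.
    """
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        if line.lower().startswith("data_"):
            # A data block header: everything after "data_" is the block name
            current = blocks.setdefault(line[5:], {})
            continue
        if line.startswith("_") and current is not None:
            parts = line.split(None, 1)  # data name, then the rest of the line
            if len(parts) == 2:
                value = parts[1].strip()
                # Remove one pair of matching quotes around the value
                if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                    value = value[1:-1]
                current[parts[0]] = value
    return blocks


# Hypothetical sample content, not a real deposited structure
SAMPLE = """\
# illustrative example only
data_example
_cell_length_a    5.431
_cell_length_b    5.431
_cell_length_c    5.431
_cell_angle_alpha 90
_symmetry_space_group_name_H-M 'F d -3 m'
"""

cif = parse_simple_cif(SAMPLE)
print(cif["example"]["_cell_length_a"])
print(cif["example"]["_symmetry_space_group_name_H-M"])
```

For real files, an established CIF library is the safer choice, since the full syntax described in the IUCr specifications covers many cases this sketch skips.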

mmCIF

Closely related is mmCIF (macromolecular CIF), [3] which is intended as a successor to the Protein Data Bank (PDB) format. It is now the default format used by the Protein Data Bank. [4] [5]

Also closely related is the Crystallographic Information Framework, a broader system of exchange protocols based on data dictionaries and relational rules that can be expressed in different machine-readable forms, including, but not restricted to, Crystallographic Information File and XML.
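To make the mmCIF style mentioned above more concrete, the sketch below reads one tabular `loop_` block whose data names follow the `_category.item` pattern used by mmCIF. It is an illustration under stated assumptions, not a reference implementation: each row is assumed to sit on a single line with whitespace-separated, unquoted values, and the function name `read_first_loop` and the coordinate values are hypothetical.

```python
def read_first_loop(text):
    """Return the rows of the first loop_ block as a list of dicts.

    Simplifying assumptions (hypothetical helper): one row per line,
    whitespace-separated values, no quoted values containing spaces.
    """
    names, rows = [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        if line == "loop_":
            if in_loop:
                break  # a second loop starts, so the first one is complete
            in_loop = True
            continue
        if not in_loop:
            continue
        if line.startswith("_") and not rows:
            names.append(line)  # column header such as _atom_site.id
        elif line.startswith("_") or line.startswith("data_"):
            break  # a new item or data block ends the loop
        else:
            values = line.split()
            if len(values) != len(names):
                raise ValueError("row does not match the number of columns")
            rows.append(dict(zip(names, values)))
    return rows


# Hypothetical sample content, not taken from a real PDB entry
SAMPLE = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N 10.000 11.000 12.000
ATOM 2 C 10.500 11.500 12.500
"""

atoms = read_first_loop(SAMPLE)
for atom in atoms:
    print(atom["_atom_site.id"], atom["_atom_site.type_symbol"], atom["_atom_site.Cartn_x"])
```

Keeping the full `_category.item` names as dictionary keys mirrors how the dictionaries define each item, at the cost of slightly longer lookups.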

Related Research Articles

<span class="mw-page-title-main">X-ray crystallography</span> Technique used for determining crystal structures and identifying mineral compounds

X-ray crystallography is the experimental science determining the atomic and molecular structure of a crystal, in which the crystalline structure causes a beam of incident X-rays to diffract into many specific directions. By measuring the angles and intensities of these diffracted beams, a crystallographer can produce a three-dimensional picture of the density of electrons within the crystal. From this electron density, the positions of the atoms in the crystal can be determined, as well as their chemical bonds, crystallographic disorder, and various other information.

The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or, increasingly, cryo-electron microscopy, and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations. The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.

Electron crystallography is a method to determine the arrangement of atoms in solids using a transmission electron microscope (TEM). It can involve the use of high-resolution transmission electron microscopy images, electron diffraction patterns (including convergent-beam electron diffraction), or combinations of these. It has been successful in determining some bulk structures as well as surface structures. Two related methods are low-energy electron diffraction, which has solved the structure of many surfaces, and reflection high-energy electron diffraction, which is often used to monitor surfaces during growth.

The Protein Data Bank (PDB) file format is a textual file format describing the three-dimensional structures of molecules held in the Protein Data Bank. The format provides for the description and annotation of protein and nucleic acid structures, including atomic coordinates, secondary structure assignments, and atomic connectivity; experimental metadata are also stored. It is the legacy file format for the Protein Data Bank, which now keeps data on biological macromolecules in the newer mmCIF file format.

<span class="mw-page-title-main">Jane S. Richardson</span> American biophysicist

Jane Shelby Richardson is an American biophysicist best known for developing the Richardson diagram, or ribbon diagram, a method of representing the 3D structure of proteins. Ribbon diagrams have become a standard representation of protein structures that has facilitated further investigation of protein structure and function globally. With interests in astronomy, math, physics, botany, and philosophy, Richardson took an unconventional route to establishing a science career. Today Richardson is a professor in biochemistry at Duke University.

<span class="mw-page-title-main">Cambridge Structural Database</span>

The Cambridge Structural Database (CSD) is both a repository and a validated and curated resource for the three-dimensional structural data of molecules generally containing at least carbon and hydrogen, comprising a wide range of organic, metal-organic and organometallic molecules. The specific entries are complementary to the other crystallographic databases such as the Protein Data Bank (PDB), Inorganic Crystal Structure Database and International Centre for Diffraction Data. The data, typically obtained by X-ray crystallography and less frequently by electron diffraction or neutron diffraction, and submitted by crystallographers and chemists from around the world, are freely accessible on the Internet via the CSD's parent organization's website. The CSD is overseen by the not-for-profit incorporated company called the Cambridge Crystallographic Data Centre, CCDC.

Acta Crystallographica is a series of peer-reviewed scientific journals, with articles centred on crystallography, published by the International Union of Crystallography (IUCr). Originally established in 1948 as a single journal called Acta Crystallographica, the series now comprises six independent Acta Crystallographica titles.

<span class="mw-page-title-main">Helen M. Berman</span> American chemist

Helen Miriam Berman is a Board of Governors Professor of Chemistry and Chemical Biology at Rutgers University and a former director of the RCSB Protein Data Bank. A structural biologist, her work includes structural analysis of protein-nucleic acid complexes, and the role of water in molecular interactions. She is also the founder and director of the Nucleic Acid Database, and led the Protein Structure Initiative Structural Genomics Knowledgebase.

In biology, a protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, giving the biological community useful access to the experimental data. Data included in protein structure databases often include three-dimensional coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries, in this case either proteins or specific structure determinations of a protein, also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology, such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset that some methods use to gain insights into the function of a protein.

A crystallographic database is a database specifically designed to store information about the structure of molecules and crystals. Crystals are solids having, in all three dimensions of space, a regularly repeating arrangement of atoms, ions, or molecules. They are characterized by symmetry, morphology, and directionally dependent physical properties. A crystal structure describes the arrangement of atoms, ions, or molecules in a crystal.

Eleanor Joy Dodson FRS is an Australian-born biologist who specialises in the computational modelling of protein crystallography. She holds a chair in the Department of Chemistry at the University of York. She is the widow of the scientist Guy Dodson.

Cryo bio-crystallography is the application of crystallography to biological macromolecules at cryogenic temperatures.

<span class="mw-page-title-main">M. Vijayan</span> Indian structural biologist (1941–2022)

Mamannamana Vijayan was an Indian structural biologist.

Gerard Jacob Kleywegt is a Dutch X-ray crystallographer and the former team leader of the Protein Data Bank in Europe at the EBI, a member of the Worldwide Protein Data Bank.

<span class="mw-page-title-main">Structure validation</span> Process of evaluating 3-dimensional atomic models of biomacromolecules

Macromolecular structure validation is the process of evaluating the reliability of three-dimensional atomic models of large biological molecules such as proteins and nucleic acids. These models, which provide 3D coordinates for each atom in the molecule, come from structural biology experiments such as X-ray crystallography or nuclear magnetic resonance (NMR). The validation has three aspects: 1) checking the validity of the thousands to millions of measurements in the experiment; 2) checking how consistent the atomic model is with those experimental data; and 3) checking the consistency of the model with known physical and chemical properties.

<span class="mw-page-title-main">Randy Read</span> Canadian-British scientist (1957–)

Randy John Read is a Wellcome Trust Principal Research Fellow and professor of protein crystallography at the University of Cambridge.

<span class="mw-page-title-main">Mercury (crystallography)</span>

Mercury is freeware developed by the Cambridge Crystallographic Data Centre, originally designed as a crystal structure visualization tool. Mercury supports three-dimensional visualization of crystal structures and assists in drawing and analysing crystal packing and intermolecular interactions. The current version of Mercury can read ".cif", ".mol", ".mol2", ".pdb", ".res", ".sd" and ".xyz" files. Mercury also has its own file format, with the filename extension ".mryx".

<span class="mw-page-title-main">Wladek Minor</span> Polish-American structural biologist

Władysław Minor, also known as Wladek Minor, is a Polish-American biophysicist and a specialist in structural biology and protein crystallography. He is a Harrison Distinguished Professor of Molecular Physiology and Biological Physics at the University of Virginia. Minor is a co-author of HKL2000/HKL3000, crystallographic data processing and structure solution software used to process data and solve structures of macromolecules as well as small molecules. He is a co-founder of HKL Research, a company that distributes the software. He is also a co-author of a public repository of diffraction images (proteindiffraction.org) for some of the protein structures available in the Protein Data Bank, and of other software tools for structural biology.

John R. Helliwell is a British crystallographer known for his pioneering work in the use of synchrotron radiation in macromolecular crystallography.

<span class="mw-page-title-main">Macromolecular Crystallographic Information File</span> File format used for macromolecular structure data

The Macromolecular Crystallographic Information File (mmCIF), also known as PDBx/mmCIF, is a standard text file format for representing macromolecular structure data, developed by the International Union of Crystallography (IUCr) and the Protein Data Bank. It is an extension of the Crystallographic Information File (CIF) designed specifically for macromolecular data, such as proteins and nucleic acids, and it incorporates elements from the PDB file format.

References

  1. Hall SR, Allen FH, Brown ID (1991). "The Crystallographic Information File (CIF): a new standard archive file for crystallography". Acta Crystallographica Section A. 47 (6): 655–685. doi:10.1107/S010876739101067X.
  2. Brown ID, McMahon B (2002). "CIF: the computer language of crystallography". Acta Crystallographica Section B: Structural Science. 58 (Pt 3 Pt 1): 317–24. doi:10.1107/s0108768102003464. PMID 12037350.
  3. Bourne, PE; Berman, HM; McMahon, B; Watenpaugh, KD; Westbrook, JD; Fitzgerald, PM (1997). Macromolecular Crystallographic Information File. Methods in Enzymology. Vol. 277. pp. 571–90. doi:10.1016/s0076-6879(97)77032-0. PMID 18488325.
  4. Adams, PD; Afonine, PV; Baskaran, K; Berman, HM; Berrisford, J; Bricogne, G; Brown, DG; Burley, SK; Chen, M; Feng, Z; Flensburg, C; Gutmanas, A; Hoch, JC; Ikegawa, Y; Kengaku, Y; Krissinel, E; Kurisu, G; Liang, Y; Liebschner, D; Mak, L; Markley, JL; Moriarty, NW; Murshudov, GN; Noble, M; Peisach, E; Persikova, I; Poon, BK; Sobolev, OV; Ulrich, EL; Velankar, S; Vonrhein, C; Westbrook, J; Wojdyr, M; Yokochi, M; Young, JY (1 April 2019). "Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB)". Acta Crystallographica Section D. 75 (Pt 4): 451–454. doi:10.1107/S2059798319004522. PMC 6465986. PMID 30988261.
  5. Bichler, Martin (1997), "Grundlegende WWW-Techniken", Aufbau unternehmensweiter WWW-Informationssysteme, Wiesbaden: Vieweg+Teubner Verlag, pp. 8–32, doi:10.1007/978-3-322-86597-7_2, ISBN 978-3-322-86598-4, retrieved 2021-09-17