Coot (software)

Coot
[Figure: the Coot main window (version 0.5pre)]
Developer(s): Paul Emsley, Kevin D. Cowtan
Initial release: 2002
Stable release: 0.9.4.1 [1] / 2 February 2021
Operating system: Windows, Linux, OS X, Unix
Type: Molecular modelling
License: GNU General Public License
Website: http://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot and http://www.biop.ox.ac.uk/coot/

The program Coot (Crystallographic Object-Oriented Toolkit) [2] [3] is used to display and manipulate atomic models of macromolecules, typically of proteins or nucleic acids, using 3D computer graphics. It is primarily focused on building and validation of atomic models into three-dimensional electron density maps obtained by X-ray crystallography methods, although it has also been applied to data from electron microscopy.

Overview

Coot displays electron density maps and atomic models and allows model manipulations such as idealization, real space refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations, rotamers, and Ramachandran idealization. The software is designed to be easy to learn for novice users: tools for common tasks are 'discoverable' through familiar user interface elements (menus and toolbars) or through intuitive behaviour (mouse controls). Recent developments have enhanced the usability of the software for expert users, with customisable key bindings, extensions, and an extensive scripting interface.

Coot is free software, distributed under the GNU GPL. It is available from the Coot web site, [4] originally hosted at the University of York and now at the MRC Laboratory of Molecular Biology. Pre-compiled binaries are available for Linux and Windows from the web site and from CCP4, and for Mac OS X through Fink and CCP4. Additional support is available through the Coot wiki and an active COOT mailing list. [5] [6]

The primary author is Paul Emsley (MRC-LMB at Cambridge). Other contributors include Kevin Cowtan, Bernhard Lohkamp and Stuart McNicholas (University of York), William Scott (University of California at Santa Cruz), and Eugene Krissinel (Daresbury Laboratory).

Features

Coot can be used to read files containing 3D atomic coordinate models of macromolecular structures in a number of formats, including pdb, mmcif, and Shelx files. The model may then be rotated in 3D and viewed from any viewpoint. The atomic model is represented by default using a stick-model, with vectors representing chemical bonds. The two halves of each bond are coloured according to the element of the atom at that end of the bond, allowing chemical structure and identity to be visualised in a manner familiar to most chemists.
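
As a rough illustration of the two ideas in this paragraph, the following Python sketch reads atom records from PDB-format text and splits a bond into two half-bond segments, each coloured by the element at its end. It is not Coot's own code: the Atom class, the function names, the colour table and the two sample atoms are all hypothetical, and only the fixed-column layout of ATOM/HETATM records is taken from the PDB format itself.

```python
from dataclasses import dataclass

# Hypothetical colour table; real programs let the user configure this.
ELEMENT_COLOURS = {"C": "yellow", "N": "blue", "O": "red", "S": "green"}


@dataclass
class Atom:
    name: str
    element: str
    xyz: tuple


def format_atom_record(serial, name, res_name, chain, res_seq, xyz, element):
    """Build one fixed-column PDB ATOM record (used here only for demo data)."""
    x, y, z = xyz
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}{'':4s}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}{'':10s}{element:>2s}"
    )


def parse_pdb_atoms(text):
    """Read atom name, element and coordinates from ATOM/HETATM records."""
    atoms = []
    for line in text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append(Atom(
                name=line[12:16].strip(),          # columns 13-16
                element=line[76:78].strip(),       # columns 77-78
                xyz=(float(line[30:38]),           # columns 31-38
                     float(line[38:46]),           # columns 39-46
                     float(line[46:54])),          # columns 47-54
            ))
    return atoms


def half_bonds(a, b):
    """Split the bond a-b at its midpoint; colour each half by its own atom."""
    mid = tuple((p + q) / 2 for p, q in zip(a.xyz, b.xyz))
    return [
        (a.xyz, mid, ELEMENT_COLOURS.get(a.element, "grey")),
        (mid, b.xyz, ELEMENT_COLOURS.get(b.element, "grey")),
    ]


# Two made-up backbone atoms of an alanine residue.
sample = "\n".join([
    format_atom_record(1, " N", "ALA", "A", 1, (11.104, 6.134, -6.504), "N"),
    format_atom_record(2, " CA", "ALA", "A", 1, (11.639, 6.071, -5.147), "C"),
])
atoms = parse_pdb_atoms(sample)
segments = half_bonds(atoms[0], atoms[1])
for start, end, colour in segments:
    print(colour, start, end)
```

Drawing each bond as two segments means no separate atom markers are needed: the colour change at the midpoint already shows which element sits at each end.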

Coot can also display electron density, which is the result of structure determination experiments such as X-ray crystallography and EM reconstruction. The density is contoured using a 3D mesh. The contour level is controlled using the mouse wheel for easy manipulation; this provides a simple way for the user to get an idea of the 3D electron density profile without the visual clutter of multiple contour levels. Electron density may be read into the program from CCP4 or CNS map formats, though it is more common to calculate an electron density map directly from the X-ray diffraction data, read from an mtz, hkl, fcf or mmcif file.
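
Contour levels are often quoted in units of the map's standard deviation ("sigma") rather than as absolute density values. The Python sketch below converts such a level to an absolute threshold and reports how much of a toy map lies above it as the level is stepped up, much as a user would do with the mouse wheel. The function names, the step size and the density values are illustrative assumptions and are not taken from Coot.

```python
import statistics


def sigma_to_absolute(density, n_sigma):
    """Convert a contour level in sigma units to an absolute density value.

    Assumption: the level is measured from the map mean. Crystallographic
    maps usually have a mean close to zero, so this is often n_sigma * sigma.
    """
    mean = statistics.fmean(density)
    sigma = statistics.pstdev(density)
    return mean + n_sigma * sigma


def fraction_enclosed(density, level):
    """Fraction of grid points whose density lies above the contour level."""
    return sum(1 for value in density if value > level) / len(density)


# Toy one-dimensional "map" with a single peak (made-up values).
density = [0.0, 0.1, 0.4, 1.2, 2.5, 1.1, 0.3, 0.0, -0.1, -0.2]

n_sigma = 0.5
for _ in range(4):                      # four hypothetical mouse-wheel clicks
    level = sigma_to_absolute(density, n_sigma)
    inside = fraction_enclosed(density, level)
    print(f"{n_sigma:.1f} sigma -> level {level:.3f}, {inside:.0%} of points inside")
    n_sigma += 0.5
```

Sweeping a single level up and down in this way shows where the density is strongest without drawing several nested contours at once.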

Coot provides extensive features for model building and refinement (i.e. adjusting the model to better fit the electron density), and for validation (i.e. checking that the atomic model agrees with the experimentally derived electron density and makes chemical sense). The most important of these tools is the real space refinement engine, which will optimize the fit of a section of atomic model to the electron density in real time, with graphical feedback. The user may also intervene in this process, dragging the atoms into the right places if the initial model is too far away from the corresponding electron density.
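
The idea of optimizing the fit of atoms to density while keeping sensible geometry can be illustrated with a deliberately small example. The Python sketch below moves two bonded atoms along a line by gradient ascent on a score that rewards density at each atom position and penalizes deviation from an ideal bond length. It is a toy under stated assumptions (one dimension, Gaussian density peaks, a single bond restraint, a fixed step size) and is not the algorithm or target function that Coot uses; all names and numbers are hypothetical.

```python
import math

PEAKS = (1.0, 2.5)    # hypothetical centres of two density peaks
WIDTH = 0.3           # hypothetical Gaussian width of each peak
IDEAL_BOND = 1.5      # hypothetical ideal distance between the two atoms
WEIGHT = 1.0          # relative weight of the geometry restraint


def density(x):
    """Toy density: a sum of Gaussian peaks along one axis."""
    return sum(math.exp(-((x - p) ** 2) / (2 * WIDTH ** 2)) for p in PEAKS)


def density_gradient(x):
    """Analytic derivative of the toy density with respect to x."""
    return sum(
        -(x - p) / WIDTH ** 2 * math.exp(-((x - p) ** 2) / (2 * WIDTH ** 2))
        for p in PEAKS
    )


def score(x1, x2):
    """Density at both atoms minus a penalty for a distorted bond length."""
    return density(x1) + density(x2) - WEIGHT * (x2 - x1 - IDEAL_BOND) ** 2


def refine(x1, x2, step=0.02, cycles=2000):
    """Gradient ascent on the score; returns the refined positions."""
    for _ in range(cycles):
        stretch = x2 - x1 - IDEAL_BOND
        g1 = density_gradient(x1) + 2 * WEIGHT * stretch
        g2 = density_gradient(x2) - 2 * WEIGHT * stretch
        x1 += step * g1
        x2 += step * g2
    return x1, x2


start = (0.4, 2.0)                       # atoms placed off their peaks
x1, x2 = refine(*start)
print(f"start {start}, score {score(*start):.3f}")
print(f"end   ({x1:.3f}, {x2:.3f}), score {score(x1, x2):.3f}")
```

Gradient methods like this one only find the nearest local optimum, which is one way to see why manual dragging helps when the starting model is far from the density: it moves the atoms into the region from which the optimizer can converge.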

Model building tools

[Figure: Coot Real Space Refinement]
[Figure: Coot Add Terminal Residue]

Tools for general model building:

Tools for moving existing atoms:

Tools for adding atoms to the model:

Validation tools

[Figure: Coot Ramachandran plot validation tool]
[Figure: Coot density fit validation tool]

In macromolecular crystallography, the observed data are often weak and the observation-to-parameter ratio is near 1. As a result, it is possible in some cases to build an incorrect atomic model into the electron density. To avoid this, careful validation is required. Coot provides a range of validation tools. Having built an initial model, it is usual to check all of these and to reconsider any parts of the model that are highlighted as problematic before depositing the atomic coordinates with a public database.
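
The observation-to-parameter ratio mentioned above is simple arithmetic. The Python sketch below assumes four refined parameters per atom (x, y, z and an isotropic B-factor) and counts each unique reflection as one observation. The atom and reflection counts are made-up numbers chosen only to show a ratio near 1, and geometric restraints, which in practice add further information, are ignored.

```python
def observation_to_parameter_ratio(n_reflections, n_atoms, params_per_atom=4):
    """Ratio of measured reflections to refined model parameters.

    Assumption: each atom contributes x, y, z and one isotropic B-factor.
    """
    if n_atoms <= 0 or params_per_atom <= 0:
        raise ValueError("the model must have at least one parameter")
    return n_reflections / (n_atoms * params_per_atom)


# Hypothetical medium-resolution data set: 8400 reflections, 2000 atoms.
ratio = observation_to_parameter_ratio(n_reflections=8400, n_atoms=2000)
print(f"observations per parameter: {ratio:.2f}")
```

With so few observations per parameter, the data alone cannot pin down every atom, which is why independent checks of geometry and density fit matter.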

Program architecture

[Figure: Coot structure]

Coot is built upon a number of libraries. Crystallographic tools include the Clipper library [7] for manipulating electron density and providing crystallographic algorithms, and the MMDB library [8] for the manipulation of atomic models. Other dependencies include FFTW and the GNU Scientific Library.

Much of the program's functionality is available through a scripting interface, which provides access from both the Python and Guile scripting languages.

Relation to CCP4mg

The CCP4mg molecular graphics software [9] [10] from Collaborative Computational Project Number 4 is a related project with which Coot shares some code. The projects are focused on slightly different problems, with CCP4mg dealing with presentation graphics and movies, whereas Coot deals with model building and validation.

Impact in the crystallographic computing community

The software has gained considerable popularity, overtaking widely used packages such as 'O', [11] XtalView, [12] and Turbo Frodo. [13] The primary publication has been cited in over 25,000 independent scientific papers since 2004. [14]

References

  1. "Release 0.9.4.1". 2 February 2021. Retrieved 4 March 2021.
  2. P. Emsley; B. Lohkamp; W.G. Scott; K. Cowtan (2010). "Features and Development of Coot". Acta Crystallographica. D66 (4): 486–501. doi:10.1107/s0907444910007493. PMC 2852313. PMID 20383002.
  3. P. Emsley; K. Cowtan (2004). "Coot: model-building tools for molecular graphics". Acta Crystallographica. D60 (12): 2126–2132. doi:10.1107/s0907444904019158. PMID 15572765.
  4. "Coot". Mrc-lmb.cam.ac.uk. Retrieved 2017-02-27.
  5. "Coot - CCP4 wiki". Strucbio.biologie.uni-konstanz.de. Retrieved 2017-02-27.
  6. "Coot List At Www.Jiscmail.Ac.Uk". JISCMail. Retrieved 2017-02-27.
  7. "Dr Kevin Cowtan - About staff, The University of York". Ysbl.york.ac.uk. 2014-10-23. Retrieved 2017-02-27.
  8. "CCP4 Coordinate Library Project". www.ebi.ac.uk. Archived from the original on 10 June 2002. Retrieved 17 January 2022.
  9. L. Potterton; S. McNicholas; E. Krissinel; J. Gruber; K. Cowtan; P. Emsley; G. N. Murshudov; S. Cohen; A. Perrakis; M. Noble (2004). "Developments in the CCP4 molecular-graphics project". Acta Crystallographica. D60: 2288–2294.
  10. "Archived copy". www.ysbl.york.ac.uk. Archived from the original on 10 June 2005. Retrieved 17 January 2022.
  11. "Home Page of Alwyn Jones". Xray.bmc.uu.se. Retrieved 2017-02-27.
  12. "CCMS Software - XtalView". Sdsc.edu. 2006-08-09. Retrieved 2017-02-27.
  13. "Turbo Frodo Description". Csb.yale.edu. 1999-03-26. Retrieved 2017-02-27.
  14. "Coot model building tools for molecular graphics - Google Scholar". Scholar.google.co.uk. Retrieved 2017-02-27.