CING (biomolecular NMR structure)

A Janin plot generated by CING for chain A, arginine residue 18 of the protein Dynein Light Chain (PDB ID 1y4o). The blue regions show likely angle combinations for helical residues, while the yellow areas show regions that are common in strand-like stretches. Some green background can be seen for residues that are in other types of regions. The image was taken from the residue page in the NRG-CING archive of validation reports.

In biomolecular structure, CING stands for the Common Interface for NMR structure Generation and is known for structure and NMR data validation. [2]


NMR spectroscopy provides diverse data on the solution structure of biomolecules. CING combines many external programs and internalized algorithms to direct an author of a new structure or a biochemist interested in an existing structure to regions of the molecule that might be problematic in relation to the experimental data.

The source code is publicly available at Google Code. A secure web interface, iCing, is available for submitting new data.

Applications

Validated NMR data

Software

The following software is used internally or externally by CING:

Algorithms

Funding

The NRG-CING project was supported by the European Community grants 213010 (eNMR) and 261572 (WeNMR).

Related Research Articles

The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or, increasingly, cryo-electron microscopy, and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations. The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.

Structural bioinformatics: Bioinformatics subfield

Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods of analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge.

Ramachandran plot: Visual representation of allowable protein conformations

In biochemistry, a Ramachandran plot, originally developed in 1963 by G. N. Ramachandran, C. Ramakrishnan, and V. Sasisekharan, is a way to visualize energetically allowed regions for backbone dihedral angles ψ against φ of amino acid residues in protein structure. The figure on the left illustrates the definition of the φ and ψ backbone dihedral angles. The ω angle at the peptide bond is normally 180°, since the partial-double-bond character keeps the peptide bond planar. The figure in the top right shows the allowed φ,ψ backbone conformational regions from the Ramachandran et al. 1963 and 1968 hard-sphere calculations: full radius in solid outline, reduced radius in dashed, and relaxed tau (N-Cα-C) angle in dotted lines. Because dihedral angle values are circular and 0° is the same as 360°, the edges of the Ramachandran plot "wrap" right-to-left and bottom-to-top. For instance, the small strip of allowed values along the lower-left edge of the plot is a continuation of the large, extended-chain region at upper left.
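The circular nature of dihedral angles described above can be illustrated with a short sketch. It computes a signed dihedral angle from four 3D points and shows how two angles on opposite edges of the plot can be close neighbours once wrapping is taken into account. The function names (`dihedral_deg`, `wrap_deg`, `angular_difference_deg`) are hypothetical and chosen only for this illustration; they are not part of CING or any other package mentioned here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    For phi the points would be C(i-1), N(i), CA(i), C(i);
    for psi they would be N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)  # central bond, the rotation axis
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1) / math.sqrt(_dot(b1, b1))
    return math.degrees(math.atan2(y, x))


def wrap_deg(angle):
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def angular_difference_deg(a, b):
    """Smallest absolute difference between two angles, respecting wrap-around."""
    return abs(wrap_deg(a - b))
```

With these helpers, angles of 179° and -179° differ by 2° rather than 358°, which is why a strip at one edge of the plot continues a region at the opposite edge.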

Nuclear magnetic resonance spectroscopy of proteins is a field of structural biology in which NMR spectroscopy is used to obtain information about the structure and dynamics of proteins, and also nucleic acids, and their complexes. The field was pioneered by Richard R. Ernst and Kurt Wüthrich at the ETH, by Ad Bax, Marius Clore, and Angela Gronenborn at the NIH, and by Gerhard Wagner at Harvard University, among others. Structure determination by NMR spectroscopy usually consists of several phases, each using a separate set of highly specialized techniques. The sample is prepared, measurements are made, interpretive approaches are applied, and a structure is calculated and validated.

Journal of Biomolecular NMR: Academic journal

The Journal of Biomolecular NMR publishes research on technical developments and innovative applications of nuclear magnetic resonance spectroscopy for the study of structure and dynamic properties of biopolymers in solution, liquid crystals, solids and mixed environments. Some of the main topics include experimental and computational approaches for the determination of three-dimensional structures of proteins and nucleic acids, advancements in the automated analysis of NMR spectra, and new methods to probe and interpret molecular motions.

WHAT IF is a computer program used in a wide variety of computational macromolecular structure research fields. The software provides a flexible environment to display, manipulate, and analyze small and large molecules, proteins, nucleic acids, and their interactions.

Helen M. Berman: American chemist

Helen Miriam Berman is a Board of Governors Professor of Chemistry and Chemical Biology at Rutgers University and a former director of the RCSB Protein Data Bank. A structural biologist, her work includes structural analysis of protein-nucleic acid complexes, and the role of water in molecular interactions. She is also the founder and director of the Nucleic Acid Database, and led the Protein Structure Initiative Structural Genomics Knowledgebase.

Gerard Jacob Kleywegt is a Dutch X-ray crystallographer and the former team leader of the Protein Data Bank in Europe at the EBI, a member of the Worldwide Protein Data Bank.

The Re-referenced Protein Chemical shift Database (RefDB) is an NMR spectroscopy database of carefully corrected or re-referenced chemical shifts, derived from the BioMagResBank (BMRB). The database was assembled by using a structure-based chemical shift calculation program to calculate expected protein 1H, 13C and 15N chemical shifts from X-ray or NMR coordinate data of previously assigned proteins reported in the BMRB. The comparison is performed automatically by a program called SHIFTCOR. The RefDB database currently provides reference-corrected chemical shift data on more than 2000 assigned peptides and proteins. Data from the database indicate that nearly 25% of BMRB entries with 13C protein assignments and 27% of BMRB entries with 15N protein assignments require significant chemical shift reference readjustments. Additionally, nearly 40% of protein entries deposited in the BioMagResBank appear to have at least one assignment error. Users may download, search or browse the database through a number of methods available through the RefDB website. RefDB provides a standard chemical shift resource for biomolecular NMR spectroscopists wishing to derive or compute chemical shift trends in peptides and proteins.

WeNMR: Worldwide e-Infrastructure for NMR spectroscopy and structural biology

WeNMR is a worldwide e-Infrastructure for NMR spectroscopy and structural biology. It is the largest virtual organization in the life sciences and is supported by EGI.

Structure validation: Process of evaluating 3-dimensional atomic models of biomacromolecules

Macromolecular structure validation is the process of evaluating the reliability of 3-dimensional atomic models of large biological molecules such as proteins and nucleic acids. These models, which provide 3D coordinates for each atom in the molecule, come from structural biology experiments such as X-ray crystallography or nuclear magnetic resonance (NMR). The validation has three aspects: 1) checking the validity of the thousands to millions of measurements in the experiment; 2) checking how consistent the atomic model is with those experimental data; and 3) checking the consistency of the model with known physical and chemical properties.

GeNMR

The GeNMR method is the first fully automated template-based method of protein structure determination that utilizes both NMR chemical shifts and NOE-based distance restraints.

Protein Structure Evaluation Suite & Server: System for validating protein structures

Protein Structure Evaluation Suite & Server (PROSESS) is a freely available web server for protein structure validation. It has been designed at the University of Alberta to assist with the process of evaluating and validating protein structures solved by NMR spectroscopy.

CS23D

CS23D is a web server to generate 3D structural models from NMR chemical shifts. CS23D combines maximal fragment assembly with chemical shift threading, de novo structure generation, chemical shift-based torsion angle prediction, and chemical shift refinement. CS23D makes use of RefDB and ShiftX.

Chemical shift index: Laboratory technique

The chemical shift index or CSI is a widely employed technique in protein nuclear magnetic resonance spectroscopy that can be used to display and identify the location as well as the type of protein secondary structure found in proteins using only backbone chemical shift data. The technique was invented by David S. Wishart in 1992 for analyzing 1Hα chemical shifts and then later extended by him in 1994 to incorporate 13C backbone shifts. The original CSI method makes use of the fact that 1Hα chemical shifts of amino acid residues tend to be shifted upfield relative to their random coil values in helices and downfield in beta strands. Similar kinds of upfield and downfield trends are also detectable in backbone 13C chemical shifts.
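The upfield/downfield comparison described above can be sketched in a few lines. The following is a simplified illustration, not the published CSI implementation: the random coil values in `RANDOM_COIL_HA` are illustrative placeholders for four residue types, and the ±0.1 ppm tolerance and the minimum run lengths (four residues for a helix, three for a strand) are assumptions based on common descriptions of the method. All function names are hypothetical.

```python
# Illustrative random coil 1H-alpha shifts in ppm (placeholder values,
# not an authoritative reference table).
RANDOM_COIL_HA = {"ALA": 4.35, "GLY": 3.97, "LEU": 4.17, "VAL": 3.95}


def csi_indices(residues, shifts, tolerance=0.1):
    """Assign -1 (upfield), +1 (downfield) or 0 to each residue.

    `tolerance` is the assumed dead band around the random coil value, in ppm.
    """
    indices = []
    for residue, shift in zip(residues, shifts):
        delta = shift - RANDOM_COIL_HA[residue]
        if delta < -tolerance:
            indices.append(-1)
        elif delta > tolerance:
            indices.append(1)
        else:
            indices.append(0)
    return indices


def find_runs(indices, value, min_length):
    """Return (start, end) positions of runs of `value` at least `min_length` long."""
    runs = []
    start = None
    # A trailing sentinel closes a run that reaches the end of the sequence.
    for i, index in enumerate(list(indices) + [None]):
        if index == value:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    return runs


def assign_secondary_structure(residues, shifts):
    """Label helix-like and strand-like stretches from 1H-alpha shifts."""
    indices = csi_indices(residues, shifts)
    return {
        "helix": find_runs(indices, -1, 4),   # assumed minimum run of four
        "strand": find_runs(indices, 1, 3),   # assumed minimum run of three
    }
```

A run of upfield-shifted residues is reported as helix-like and a run of downfield-shifted residues as strand-like; isolated deviations are ignored, which is what makes the index tolerant of single noisy shifts.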

Protein chemical shift prediction is a branch of biomolecular nuclear magnetic resonance spectroscopy that aims to accurately calculate protein chemical shifts from protein coordinates. Protein chemical shift prediction was first attempted in the late 1960s using semi-empirical methods applied to protein structures solved by X-ray crystallography. Since that time protein chemical shift prediction has evolved to employ much more sophisticated approaches including quantum mechanics, machine learning and empirically derived chemical shift hypersurfaces. The most recently developed methods exhibit remarkable precision and accuracy.

Nuclear magnetic resonance chemical shift re-referencing is a chemical analysis method for chemical shift referencing in biomolecular nuclear magnetic resonance (NMR). It has been estimated that up to 20% of 13C and up to 35% of 15N shift assignments are improperly referenced. Given that the structural and dynamic information contained within chemical shifts is often quite subtle, it is critical that protein chemical shifts be properly referenced so that these subtle differences can be detected. Fundamentally, the problem with chemical shift referencing comes from the fact that chemical shifts are relative frequency measurements rather than absolute frequency measurements. Because of the historic problems with chemical shift referencing, chemical shifts are perhaps the most precisely measurable but the least accurately measured parameters in all of NMR spectroscopy.

Protein chemical shift re-referencing is a post-assignment process of adjusting the assigned NMR chemical shifts to match IUPAC and BMRB recommended standards in protein chemical shift referencing. In NMR, chemical shifts are normally referenced to an internal standard that is dissolved in the NMR sample. These internal standards include tetramethylsilane (TMS), 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) and trimethylsilyl propionate (TSP). For protein NMR spectroscopy the recommended standard is DSS, which is insensitive to pH variations. Furthermore, the DSS 1H signal may be used to indirectly reference 13C and 15N shifts using a simple ratio calculation [1]. Unfortunately, many biomolecular NMR spectroscopy labs use non-standard methods for determining the 1H, 13C or 15N "zero-point" chemical shift position. This lack of standardization makes it difficult to compare chemical shifts for the same protein between different laboratories. It also makes it difficult to use chemical shifts to properly identify or assign secondary structures or to improve 3D structures via chemical shift refinement. Chemical shift re-referencing offers a means to correct these referencing errors and to standardize the reporting of protein chemical shifts across laboratories.
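The ratio calculation mentioned above can be sketched as follows. The idea is that the 0 ppm frequency of a heteronucleus is obtained by multiplying the absolute frequency of the DSS 1H signal by a fixed ratio, usually written Ξ. The Ξ values in the code are the commonly quoted ratios for DSS-based referencing and are given here as an assumption; they should be checked against the published IUPAC recommendations before use. The function names are hypothetical.

```python
# Frequency ratios (Xi) relative to the DSS 1H signal.
# Assumed values; verify against the published recommendations.
XI = {
    "13C": 0.251449530,
    "15N": 0.101329118,
}


def zero_ppm_frequency_mhz(dss_1h_frequency_mhz, nucleus):
    """Absolute frequency (MHz) corresponding to 0 ppm for `nucleus`.

    `dss_1h_frequency_mhz` is the measured absolute frequency of the
    DSS methyl 1H signal in the same sample.
    """
    return dss_1h_frequency_mhz * XI[nucleus]


def chemical_shift_ppm(observed_mhz, zero_mhz):
    """Chemical shift in ppm of a signal relative to the 0 ppm frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


def rereference(shifts_ppm, offset_ppm):
    """Apply a constant referencing correction to a list of shifts."""
    return [shift + offset_ppm for shift in shifts_ppm]
```

For example, with the DSS 1H signal at exactly 600 MHz, the assumed 13C ratio places the 13C zero point at about 150.87 MHz. A data set that was referenced incorrectly can then be corrected by a single constant offset per nucleus, which is what `rereference` does.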

Resolution by Proxy (ResProx) is a method for assessing the equivalent X-ray resolution of NMR-derived protein structures. ResProx calculates resolution from coordinate data rather than from electron density or other experimental inputs. This makes it possible to calculate the resolution of a structure regardless of how it was solved. ResProx was originally designed to serve as a simple, single-number evaluation that allows straightforward comparison between the quality/resolution of X-ray structures and the quality of a given NMR structure. However, it can also be used to assess the reliability of an experimentally reported X-ray structure resolution, to evaluate protein structures solved by unconventional or hybrid means and to identify fraudulent structures deposited in the PDB. ResProx incorporates more than 25 different structural features to determine a single resolution-like value. ResProx values are reported in Angstroms. Tests on thousands of X-ray structures show that ResProx values match very closely to resolution values reported by X-ray crystallographers. Resolution-by-proxy values can be calculated for newly determined protein structures using a freely accessible ResProx web server. This server accepts protein coordinate data and generates a resolution estimate for that input structure.

The Biological Magnetic Resonance Data Bank (BMRB) is an open access repository of nuclear magnetic resonance (NMR) spectroscopic data from peptides, proteins, nucleic acids and other biologically relevant molecules. The database is operated by the University of Wisconsin–Madison and is supported by the National Library of Medicine. The BMRB is part of the Research Collaboratory for Structural Bioinformatics and, since 2006, has been a partner in the Worldwide Protein Data Bank (wwPDB). The repository accepts NMR spectral data from laboratories around the world and, once the data are validated, makes them available online at the BMRB website. The database also has an FTP site from which data can be downloaded in bulk. The BMRB has two mirror sites, one at the Protein Database Japan (PDBj) at Osaka University and one at the Magnetic Resonance Research Center (CERM) at the University of Florence in Italy. The site in Japan accepts and processes data depositions.

References

  1. Doreleijers, J. F.; Vranken, W. F.; Schulte, C.; Markley, J. L.; Ulrich, E. L.; Vriend, G.; Vuister, G. W. (2011). "NRG-CING: Integrated validation reports of remediated experimental biomolecular NMR data and coordinates in wwPDB". Nucleic Acids Research. 40 (Database issue): D519–D524. doi:10.1093/nar/gkr1134. PMC 3245154. PMID 22139937.
  2. CING; an integrated residue-based structure validation program suite, Jurgen F. Doreleijers, Alan W. Sousa da Silva, Elmar Krieger, Sander B. Nabuurs, Chris Spronk, Tim Stevens, Wim F. Vranken, Gert Vriend, Geerten W. Vuister (to be submitted).
  3. Lu and Olson. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nature Protocols (2008) vol. 3 (7) pp. 1213-27
  4. Koradi et al. MOLMOL: a program for display and analysis of macromolecular structures. J Mol Graph (1996) vol. 14 pp. 51-55
  5. Laskowski et al. AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J Biomol NMR (1996) vol. 8 (4) pp. 477-486
  6. Neal et al. Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J Biomol NMR (2003) vol. 26 (3) pp. 215-240
  7. Shen et al. TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J Biomol NMR (2009) vol. 44 (4) pp. 213-23
  8. Hooft et al. Errors in protein structures. Nature (1996) vol. 381 (6580) pp. 272-272
  9. Doreleijers, J. F.; Nederveen, A. J.; Vranken, W.; Lin, J.; Bonvin, A. M. J. J.; Kaptein, R.; Markley, J. L.; Ulrich, E. L. (2005). "BioMagResBank databases DOCR and FRED containing converted and filtered sets of experimental NMR restraints and coordinates from over 500 protein PDB structures". Journal of Biomolecular NMR. 32 (1): 1–12. doi:10.1007/s10858-005-2195-0. hdl:1874/14810. PMID 16041478. S2CID 25252823.
  10. Kumar and Nussinov. Relationship between Ion Pair Geometries and Electrostatic Strengths in Proteins. Biophys.J. (2002) vol. 83 pp. 1595–1612
  11. Dombkowski and Crippen. Disulfide recognition in an optimized threading potential. Protein Engineering Design and Selection (2000) vol. 13 (10) pp. 679-689
  12. Ross. Peirce's criterion for the elimination of suspect experimental data. Journal of Engineering Technology (2003)