Collaborative Computing Project for NMR

The CCPN logo.

The Collaborative Computing Project for NMR (CCPN) is a project that aims to bring together the computational side of the scientific community involved in NMR spectroscopy, especially those who work in the field of protein NMR. Its general aims are to link new and existing NMR software via a common data standard and to provide a forum within the community for the discussion of NMR software and the scientific methods it supports. CCPN was started in 1999 in the United Kingdom and collaborates with NMR and software development groups worldwide.

The Collaborative Project for the NMR Community

The Collaborative Computing Project for NMR spectroscopy was set up with three main aims: to create a common standard for representing NMR spectroscopy-related data; to create a suite of new open-source NMR software packages; and to arrange meetings for the NMR community, including conferences, workshops and courses, in order to discuss and spread best practice in both computational and non-computational aspects of NMR. Primary financial support for CCPN comes from the BBSRC, the UK Biotechnology and Biological Sciences Research Council. CCPN is part of an array of collaborative computing projects (CCPs) [1] and follows in a similar vein to the successful and well-established CCP4 project for X-ray crystallography. CCPN is also supported by European Union grants, most recently as part of the Extend-NMR project, [2] which links together several software-producing groups from across Europe.

CCPN is governed by an executive committee which draws its members from academics throughout the UK NMR community. This committee is chosen at the CCPN Assembly Meeting, where all UK-based NMR groups may participate and vote. The day-to-day work of CCPN, including the organisation of meetings and software development, is handled by an informal working group coordinated by Ernest Laue at the University of Cambridge. This group comprises the core staff and developers, as well as a growing number of collaborators throughout the world who contribute to coordinated NMR software development.

NMR Data Standards

The many different software packages available to the NMR spectroscopy community have traditionally employed a number of different data formats and standards to represent computational information. CCPN was founded partly to address this situation and to develop a more unified approach. It was deemed that multiple, informally connected data standards not only made it more difficult for a user to move from one program to the next, but also adversely affected data fidelity, harvesting and database deposition. [3] To this end CCPN has developed a common data standard for NMR, referred to as the CCPN data model, as well as software routines and libraries that allow access, manipulation and storage of the data. The CCPN system works alongside the BioMagResBank, [4] which continues to handle the archiving of NMR database depositions; the CCPN standard is for active data exchange and in-program manipulation.

Although NMR spectroscopy remains at the core of the data standard, it naturally expands into other related areas of science that support and complement NMR. These include molecular and macromolecular description, three-dimensional biological structures, sample preparation, workflow management and software setup. The CCPN libraries are created using the principles of model-driven architecture and automatic code generation: the CCPN data model provides a specification from which APIs are generated automatically in multiple languages. To date CCPN provides APIs to its data model in the Python, Java and C programming languages. Through its collaborations, CCPN continues to link new and existing software via its data standards. To enable interaction with as much external software as possible, CCPN has created a format conversion program. This allows data to enter from outside the CCPN scheme and provides a mechanism to translate between existing data formats. The open-source CcpNmr FormatConverter software was first released in 2005 and is available for download (from CCPN and SourceForge); more recently it has also become accessible as a web application.
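The idea of generating an API from a data model can be illustrated with a minimal Python sketch. It does not reproduce the CCPN data model or its generated APIs: the model description, the class names (`Peak`, `Resonance`), their attributes and the helper functions are all hypothetical, chosen only to show how a single specification can drive the creation of type-checked classes.

```python
# Hypothetical, heavily simplified model description: class name -> {attribute: type}.
# These names are illustrative only and are not taken from the CCPN data model.
MODEL = {
    "Peak": {"position": float, "height": float},
    "Resonance": {"name": str, "shift": float},
}


def generate_class_source(class_name, attributes):
    """Return Python source code for a class with type-checked properties."""
    params = ", ".join(attributes)
    lines = [
        f"class {class_name}:",
        f"    def __init__(self, {params}):",
    ]
    # Assigning through the properties means the constructor is type-checked too.
    for attr in attributes:
        lines.append(f"        self.{attr} = {attr}")
    for attr, attr_type in attributes.items():
        type_name = attr_type.__name__
        lines += [
            "",
            "    @property",
            f"    def {attr}(self):",
            f"        return self._{attr}",
            "",
            f"    @{attr}.setter",
            f"    def {attr}(self, value):",
            f"        if not isinstance(value, {type_name}):",
            f"            raise TypeError('{class_name}.{attr} must be {type_name}')",
            f"        self._{attr} = value",
        ]
    return "\n".join(lines) + "\n"


def build_api(model):
    """Generate and load one class per model entry; return them by name."""
    namespace = {}
    for class_name, attributes in model.items():
        exec(generate_class_source(class_name, attributes), namespace)
    return {name: namespace[name] for name in model}


api = build_api(MODEL)
peak = api["Peak"](position=8.25, height=1200.0)
```

In this sketch, adding an attribute to `MODEL` changes every generated class consistently. The same description could in principle be fed to generators that emit source code for other languages, which is the general appeal of a model-driven approach.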

CCPN Software Suite

Three-dimensional protein NMR spectra viewed with CCPN software. The illustrated spectra are from HNcoCA and HNCA experiments, used here to assign the sequence of amino acids in a protein chain.

As well as enabling data exchange, CCPN aims to develop software for the processing, analysis and interpretation of macromolecular NMR data. To this end CCPN has created CcpNmr Analysis, a graphical program for spectrum visualisation, assignment and NMR data analysis. The requirement was for a program that used a modern graphical user interface and could run on many types of computer. It would be supported and maintained by CCPN and would allow modification and extension, including for new NMR techniques. The first version of Analysis was released in 2005 and is now at version 2.1. Analysis is built directly on the CCPN data model and its design is partly inspired by the older ANSIG [5] and SPARKY [6] programs, but it has continued to develop from the suggestions, requirements and computational contributions of its user community. Analysis is freely available to academic and non-profit institutions. Commercial users are required to subscribe to CCPN for a moderate fee. CCPN software, including Analysis, is available for download at the CCPN web site [7] and is supported by an active JISC email discussion group.

CCPN Meetings

Through its meetings CCPN provides a forum for the discussion of computational and experimental NMR techniques. The aim is to debate and spread best practice in the determination of macromolecular information, including structure, dynamics and biological chemistry. CCPN continues to arrange annual conferences for the UK NMR community (the current being the ninth) and a series of workshops to discuss and promote data standards. Its software developers run courses to teach the use of CCPN software and its development framework, as this is vital to the success of CCPN both as a software project and as a coordinated NMR community. They also arrange visits to NMR groups to introduce the CCPN program suite and to gain an understanding of the requirements of users.

CCPN is especially keen to enable young scientists to contribute to and attend its meetings. Accordingly, wherever possible CCPN tries to keep conference fees to a minimum by using contributions that come from its industrial sponsorship and software subscriptions.

Footnotes

  1. BBSRC Collaborative Computational Projects
  2. The Extend-NMR project
  3. "The CCPN project: an interim report on a data model for the NMR community." (2002) Nat Struct Biol. 9(6):416-8
  4. BioMagResBank
  5. P. J. Kraulis, "ANSIG: A Program for the Assignment of Protein 1H 2D NMR spectra by Interactive Graphics" (1989) J. Magn. Reson. 24, pp 627-633
  6. T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco
  7. CCPN Downloads Archived 2009-12-28 at the Wayback Machine
