ShiftX

Content
Description: Protein chemical shift calculation server
Contact
Research center: University of Alberta
Laboratory: Dr. David Wishart
Primary citation: [1] [2]
Access
Data format: Input: X-ray or NMR coordinates (PDB format); Output: 1H, 13C and 15N chemical shifts (Shifty or BMRB format)
Website: http://shiftx.wishartlab.com/; http://www.shiftx2.ca/; http://www.shiftx2.ca/download.html
Miscellaneous
Curation policy: Manually curated

ShiftX (Shifts from X-ray structures) is a freely available web server for rapidly calculating protein chemical shifts from protein X-ray (or NMR) coordinates. Protein chemical shift prediction (also known as protein chemical shift calculation) is particularly useful for verifying protein chemical shift assignments, adjusting mis-referenced chemical shifts, refining NMR protein structures (via chemical shifts) and assisting with the NMR assignment of unassigned proteins whose structures (or the structures of a homologous protein) have been determined by X-ray or NMR methods.

The ShiftX web server takes atomic coordinates of proteins (PDB format) as input and quickly (<1 sec) generates the chemical shifts of both backbone (1H, 13C and 15N) and side chain (1H only) atoms as output (BMRB or Shifty format). The server is optimized to work with diamagnetic proteins rather than paramagnetic proteins (i.e. proteins with paramagnetic centers). The ShiftX web server is based on a program of the same name that was developed in 2003 by members of Dr. David Wishart’s laboratory. [1]

Both the ShiftX program and the ShiftX web server make use of pre-calculated, empirically derived chemical shift tables relating 1H, 13C and 15N chemical shifts to backbone torsion angles, side chain orientations, local secondary structure and nearest neighbor effects. These tables were derived using data mining techniques from a large database of reference-corrected protein chemical shifts called RefDB. [3] These sequence/structure dependencies of chemical shifts, which cannot easily be converted to analytical formulae, are combined with standard classical or semi-classical equations (for ring current effects and hydrogen bond effects) to further improve the 1H, 13C and 15N chemical shift calculations. ShiftX differs from other protein chemical shift calculation techniques in that it blends empirical observations with classical or semi-quantum mechanical approaches. Most other protein chemical shift calculation methods use either empirical (such as SPARTA [5] ) or quantum mechanical (such as ShiftS [4] ) approaches, exclusively.

ShiftX is both fast and accurate. Its correlation coefficients (r) between measured and calculated shifts are 0.91 (1HA), 0.98 (13CA), 0.99 (13CB), 0.86 (13CO), 0.91 (15N), 0.74 (1HN), and 0.907 (side chain 1H), with corresponding RMS errors of 0.23, 0.98, 1.10, 1.16, 2.43, 0.49, and 0.30 ppm. ShiftX is used in several programs or web servers including ShiftCor. It is also used in the generation and updating of the re-referenced chemical shift database known as RefDB.
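The two accuracy figures quoted above, the correlation coefficient (r) and the RMS error, can be computed for any set of measured and calculated shifts. The following is a minimal sketch in Python. The function names and the sample 1HA values are made up for illustration and are not ShiftX output.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rms_error(observed, predicted):
    """Root-mean-square difference (same units as the inputs, here ppm)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative (invented) 1HA shifts in ppm for five residues.
observed_ha = [4.35, 4.62, 3.98, 4.10, 5.02]
predicted_ha = [4.30, 4.70, 4.05, 4.00, 4.90]

print(f"r = {pearson_r(observed_ha, predicted_ha):.3f}")
print(f"RMS error = {rms_error(observed_ha, predicted_ha):.3f} ppm")
```

A high r with a large RMS error usually points to a systematic offset, which is one reason both numbers are reported per nucleus.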

In 2011, substantial improvements to the performance of ShiftX were achieved by using machine learning methods to better integrate protein structure features (including solvent accessible surface area) and local or nearest-neighbor interactions. This led to the release of an updated version of ShiftX called ShiftX2. [2] ShiftX2 is substantially more accurate than ShiftX and is able to calculate a much larger collection of side chain chemical shifts (1H, 13C and 15N). It is also available as a freely accessible web server, although it is 2 to 3 times slower than ShiftX. ShiftX2 achieves correlation coefficients between experimentally observed and predicted backbone chemical shifts of 0.98 (15N), 0.99 (13CA), 0.999 (13CB), 0.97 (13CO), 0.97 (1HN), and 0.98 (1HA), with corresponding RMS errors of 1.12, 0.44, 0.51, 0.53, 0.17, and 0.12 ppm.

Related Research Articles

Nuclear magnetic resonance spectroscopy of proteins is a field of structural biology in which NMR spectroscopy is used to obtain information about the structure and dynamics of proteins, and also nucleic acids, and their complexes. The field was pioneered by Richard R. Ernst and Kurt Wüthrich at the ETH, and by Ad Bax, Marius Clore, Angela Gronenborn at the NIH, and Gerhard Wagner at Harvard University, among others. Structure determination by NMR spectroscopy usually consists of several phases, each using a separate set of highly specialized techniques. The sample is prepared, measurements are made, interpretive approaches are applied, and a structure is calculated and validated.

Residual dipolar coupling

The residual dipolar coupling between two spins in a molecule occurs if the molecules in solution exhibit a partial alignment leading to an incomplete averaging of spatially anisotropic dipolar couplings.

Journal of Biomolecular NMR

The Journal of Biomolecular NMR publishes research on technical developments and innovative applications of nuclear magnetic resonance spectroscopy for the study of structure and dynamic properties of biopolymers in solution, liquid crystals, solids and mixed environments. Some of the main topics include experimental and computational approaches for the determination of three-dimensional structures of proteins and nucleic acids, advancements in the automated analysis of NMR spectra, and new methods to probe and interpret molecular motions.

Carbohydrate NMR spectroscopy is the application of nuclear magnetic resonance (NMR) spectroscopy to the structural and conformational analysis of carbohydrates. This method allows scientists to elucidate the structure of monosaccharides, oligosaccharides, polysaccharides, glycoconjugates and other carbohydrate derivatives from synthetic and natural sources. Among the structural properties that can be determined by NMR are primary structure, saccharide conformation, stoichiometry of substituents, and the ratio of individual saccharides in a mixture. Modern high field NMR instruments used for carbohydrate samples, typically 500 MHz or higher, are able to run a suite of 1D, 2D, and 3D experiments to determine the structure of carbohydrate compounds.

CING (biomolecular NMR structure)

In biomolecular structure, CING stands for the Common Interface for NMR structure Generation and is known for structure and NMR data validation.

The Re-referenced Protein Chemical shift Database (RefDB) is an NMR spectroscopy database of carefully corrected or re-referenced chemical shifts, derived from the BioMagResBank (BMRB). The database was assembled by using a structure-based chemical shift calculation program to calculate expected protein 1H, 13C and 15N chemical shifts from X-ray or NMR coordinate data of previously assigned proteins reported in the BMRB. The comparison between calculated and reported shifts is performed automatically by a program called SHIFTCOR. The RefDB database currently provides reference-corrected chemical shift data on more than 2000 assigned peptides and proteins. Data from the database indicate that nearly 25% of BMRB entries with 13C protein assignments and 27% of BMRB entries with 15N protein assignments require significant chemical shift reference readjustments. Additionally, nearly 40% of protein entries deposited in the BioMagResBank appear to have at least one assignment error. Users may download, search or browse the database through a number of methods available through the RefDB website. RefDB provides a standard chemical shift resource for biomolecular NMR spectroscopists wishing to derive or compute chemical shift trends in peptides and proteins.

SHIFTCOR is a freely available web server as well as a stand-alone computer program for protein chemical shift re-referencing. Chemical shift referencing is a particularly widespread problem in biomolecular NMR with up to 25% of existing NMR chemical shift assignments being improperly referenced. Some of these referencing problems can lead to systematic errors of between 1.0 and 2.5 ppm. Errors of this magnitude can play havoc with any attempt to compare assignments between proteins or to structurally interpret chemical shifts. Identifying which proteins are mis-assigned or improperly referenced can be challenging, as can correcting the errors once they are found. The SHIFTCOR program was designed to assist with identifying and fixing these chemical shift referencing problems. Specifically, it identifies, corrects and re-references 1H, 13C and 15N backbone chemical shifts of peptides and proteins by comparing the observed chemical shifts with the predicted chemical shifts derived from the 3D structure of the protein(s) of interest [1]. The predicted chemical shifts are calculated using the ShiftX program. The SHIFTCOR program was originally used to construct a database of properly re-referenced protein chemical shift assignments called RefDB. RefDB is a web-accessible database of more than 2000 correctly referenced protein chemical shift assignments. While originally available as a stand-alone program only, SHIFTCOR has since been released for general use as a web server.
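The idea of correcting a referencing error by comparing observed shifts with structure-derived predictions can be illustrated with a deliberately simplified sketch. It assumes that a referencing error appears as a constant offset for one nucleus type, estimates that offset as the mean observed-minus-predicted difference, and subtracts it when it exceeds a threshold. This is not the actual SHIFTCOR algorithm; the function names and the 0.5 ppm threshold are hypothetical.

```python
from statistics import mean


def referencing_offset(observed, predicted):
    """Mean (observed - predicted) difference in ppm for one nucleus type."""
    return mean(o - p for o, p in zip(observed, predicted))


def rereference(observed, predicted, threshold=0.5):
    """Subtract the estimated offset if it is larger than `threshold` ppm.

    Returns (corrected_shifts, applied_offset). The threshold is an
    illustrative assumption, not a published SHIFTCOR parameter.
    """
    offset = referencing_offset(observed, predicted)
    if abs(offset) < threshold:
        return list(observed), 0.0
    return [o - offset for o in observed], offset


# Invented 13CA shifts: the "observed" set is shifted by about +2 ppm.
predicted_ca = [58.1, 54.3, 62.0, 45.2]
observed_ca = [60.2, 56.2, 64.1, 47.1]

corrected, offset = rereference(observed_ca, predicted_ca)
print(f"applied offset: {offset:.2f} ppm")
```

A real tool would also need to separate referencing offsets from individual mis-assignments, for example by excluding outliers before averaging.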

Random coil index

Random coil index (RCI) predicts protein flexibility by calculating an inverse weighted average of backbone secondary chemical shifts and predicting values of model-free order parameters as well as per-residue RMSD of NMR and molecular dynamics ensembles from this parameter.

CS-ROSETTA is a framework for structure calculation of biological macromolecules on the basis of conformational information from NMR, which is built on top of the biomolecular modeling and design software called ROSETTA. The name CS-ROSETTA for this branch of ROSETTA stems from its origin in combining NMR chemical shift (CS) data with ROSETTA structure prediction protocols. The software package was later extended to include additional NMR conformational parameters, such as Residual Dipolar Couplings (RDC), NOE distance restraints, pseudocontact chemical shifts (PCS) and restraints derived from homologous proteins. This software can be used together with other molecular modeling protocols, such as docking to model protein oligomers. In addition, CS-ROSETTA can be combined with chemical shift resonance assignment algorithms to create a fully automated NMR structure determination pipeline. The CS-ROSETTA software is freely available for academic use and can be licensed for commercial use. A software manual and tutorials are provided on the supporting website https://csrosetta.chemistry.ucsc.edu/.

Structure validation

Macromolecular structure validation is the process of evaluating the reliability of 3-dimensional atomic models of large biological molecules such as proteins and nucleic acids. These models, which provide 3D coordinates for each atom in the molecule, come from structural biology experiments such as X-ray crystallography or nuclear magnetic resonance (NMR). The validation has three aspects: 1) checking on the validity of the thousands to millions of measurements in the experiment; 2) checking how consistent the atomic model is with those experimental data; and 3) checking consistency of the model with known physical and chemical properties.

GeNMR

GeNMR is the first fully automated template-based method of protein structure determination that utilizes both NMR chemical shifts and NOE-based distance restraints.

Protein Structure Evaluation Suite & Server

Protein Structure Evaluation Suite & Server (PROSESS) is a freely available web server for protein structure validation. It has been designed at the University of Alberta to assist with the process of evaluating and validating protein structures solved by NMR spectroscopy.

CS23D

CS23D is a web server to generate 3D structural models from NMR chemical shifts. CS23D combines maximal fragment assembly with chemical shift threading, de novo structure generation, chemical shift-based torsion angle prediction, and chemical shift refinement. CS23D makes use of RefDB and ShiftX.

Chemical shift index

The chemical shift index or CSI is a widely employed technique in protein nuclear magnetic resonance spectroscopy that can be used to display and identify the location as well as the type of protein secondary structure found in proteins using only backbone chemical shift data. The technique was invented by David S. Wishart in 1992 for analyzing 1Hα chemical shifts and then later extended by him in 1994 to incorporate 13C backbone shifts. The original CSI method makes use of the fact that 1Hα chemical shifts of amino acid residues in helices tend to be shifted upfield relative to their random coil values and downfield in beta strands. Similar kinds of upfield/downfield trends are also detectable in backbone 13C chemical shifts.
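The upfield/downfield rule for 1Hα shifts can be sketched as a three-state index per residue. The sketch below assumes a ±0.1 ppm tolerance around the random coil value, and the random coil values in the example are placeholders rather than a published reference table. The consecutive-residue rules that the full method uses to call helices and strands are not shown.

```python
def csi_index(observed_ha, random_coil_ha, threshold=0.1):
    """Return -1 (upfield, helix-like), +1 (downfield, strand-like) or 0.

    `threshold` is the tolerance in ppm around the random coil value;
    0.1 ppm is assumed here for 1H-alpha shifts.
    """
    delta = observed_ha - random_coil_ha
    if delta > threshold:
        return 1
    if delta < -threshold:
        return -1
    return 0


# Invented example: (residue, observed 1HA, placeholder random coil 1HA).
residues = [
    ("ALA", 4.05, 4.32),
    ("LEU", 4.10, 4.34),
    ("VAL", 4.55, 4.12),
    ("GLY", 3.98, 3.96),
]

indices = [csi_index(obs, rc) for _, obs, rc in residues]
print(indices)
```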

Protein chemical shift prediction is a branch of biomolecular nuclear magnetic resonance spectroscopy that aims to accurately calculate protein chemical shifts from protein coordinates. Protein chemical shift prediction was first attempted in the late 1960s using semi-empirical methods applied to protein structures solved by X-ray crystallography. Since that time protein chemical shift prediction has evolved to employ much more sophisticated approaches including quantum mechanics, machine learning and empirically derived chemical shift hypersurfaces. The most recently developed methods exhibit remarkable precision and accuracy.

Nuclear magnetic resonance chemical shift re-referencing is a chemical analysis method for chemical shift referencing in biomolecular nuclear magnetic resonance (NMR). It has been estimated that up to 20% of 13C and up to 35% of 15N shift assignments are improperly referenced. Given that the structural and dynamic information contained within chemical shifts is often quite subtle, it is critical that protein chemical shifts be properly referenced so that these subtle differences can be detected. Fundamentally, the problem with chemical shift referencing comes from the fact that chemical shifts are relative frequency measurements rather than absolute frequency measurements. Because of the historic problems with chemical shift referencing, chemical shifts are perhaps the most precisely measurable but the least accurately measured parameters in all of NMR spectroscopy.

Protein chemical shift re-referencing is a post-assignment process of adjusting the assigned NMR chemical shifts to match IUPAC and BMRB recommended standards in protein chemical shift referencing. In NMR, chemical shifts are normally referenced to an internal standard that is dissolved in the NMR sample. These internal standards include tetramethylsilane (TMS), 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) and trimethylsilyl propionate (TSP). For protein NMR spectroscopy the recommended standard is DSS, which is insensitive to pH variations. Furthermore, the DSS 1H signal may be used to indirectly reference 13C and 15N shifts using a simple ratio calculation [1]. Unfortunately, many biomolecular NMR spectroscopy labs use non-standard methods for determining the 1H, 13C or 15N “zero-point” chemical shift position. This lack of standardization makes it difficult to compare chemical shifts for the same protein between different laboratories. It also makes it difficult to use chemical shifts to properly identify or assign secondary structures or to improve protein 3D structures via chemical shift refinement. Chemical shift re-referencing offers a means to correct these referencing errors and to standardize the reporting of protein chemical shifts across laboratories.
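The "simple ratio calculation" for indirect referencing can be sketched as follows: the 0 ppm frequency of a heteronucleus is obtained by multiplying the absolute 1H frequency of DSS by a fixed frequency ratio (Ξ). The Ξ values below are those commonly quoted from the IUPAC recommendations for DSS-referenced protein NMR and should be checked against the published table before use. The function names and the 500 MHz example are illustrative assumptions.

```python
# Frequency ratios (Xi) relative to the DSS 1H signal; values as commonly
# quoted from the IUPAC recommendations (verify before relying on them).
XI = {
    "13C": 0.251449530,
    "15N": 0.101329118,
}


def zero_frequency_mhz(nucleus, dss_1h_mhz):
    """Absolute frequency (MHz) that corresponds to 0 ppm for `nucleus`."""
    return XI[nucleus] * dss_1h_mhz


def chemical_shift_ppm(observed_mhz, zero_mhz):
    """Chemical shift in ppm of a signal relative to the 0 ppm frequency."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


# Example: DSS 1H signal measured at exactly 500.000000 MHz (assumed).
zero_13c = zero_frequency_mhz("13C", 500.0)
zero_15n = zero_frequency_mhz("15N", 500.0)
print(f"13C zero point: {zero_13c:.6f} MHz")
print(f"15N zero point: {zero_15n:.6f} MHz")
```

Because only the 1H frequency of DSS has to be measured, the same sample standard references all three nuclei consistently.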

Resolution by Proxy (ResProx) is a method for assessing the equivalent X-ray resolution of NMR-derived protein structures. ResProx calculates resolution from coordinate data rather than from electron density or other experimental inputs. This makes it possible to calculate the resolution of a structure regardless of how it was solved. ResProx was originally designed to serve as a simple, single-number evaluation that allows straightforward comparison between the quality/resolution of X-ray structures and the quality of a given NMR structure. However, it can also be used to assess the reliability of an experimentally reported X-ray structure resolution, to evaluate protein structures solved by unconventional or hybrid means and to identify fraudulent structures deposited in the PDB. ResProx incorporates more than 25 different structural features to determine a single resolution-like value. ResProx values are reported in Angstroms. Tests on thousands of X-ray structures show that ResProx values match very closely to resolution values reported by X-ray crystallographers. Resolution-by-proxy values can be calculated for newly determined protein structures using a freely accessible ResProx web server. This server accepts protein coordinate data and generates a resolution estimate for that input structure.

Probabilistic Approach for protein NMR Assignment Validation (PANAV) is a freely available stand-alone program that is used for protein chemical shift re-referencing. Chemical shift referencing is a problem in protein nuclear magnetic resonance as >20% of reported NMR chemical shift assignments appear to be improperly referenced. For certain nuclei these referencing issues can cause systematic chemical shift errors of between 1.0 and 2.5 ppm. Chemical shift errors of this magnitude often make it very difficult to compare NMR chemical shift assignments between proteins. They also make it very hard to structurally interpret chemical shifts. Unlike most other chemical shift re-referencing tools, PANAV employs a structure-independent protocol. That is, with PANAV there is no need to know the structure of the protein in advance of correcting any chemical shift referencing errors. This makes PANAV particularly useful for NMR studies involving novel or newly assigned proteins, where the structure has yet to be determined. Indeed, this scenario represents the vast majority of assignment cases in biomolecular NMR. PANAV uses residue-specific and secondary structure-specific chemical shift distributions that were calculated over short fragments of correctly referenced proteins to identify mis-assigned resonances. More specifically, PANAV compares the initial chemical shift assignments to the expected chemical shifts based on their local sequence and expected/predicted secondary structure. In this way, PANAV is able to identify and re-reference mis-referenced chemical shift assignments. PANAV can also identify potentially mis-assigned resonances. PANAV has been extensively tested and compared against a large number of existing re-referencing or mis-assignment detection programs. These assessments indicate that PANAV is equal to or superior to existing approaches.

PREDITOR is a freely available web server for the prediction of protein torsion angles from chemical shifts. For many years it has been known that protein chemical shifts are sensitive to protein secondary structure, which in turn is sensitive to backbone torsion angles. Torsion angles are internal coordinates that can be used to describe the conformation of a polypeptide chain. They can also be used as constraints to help determine or refine protein structures via NMR spectroscopy. In proteins there are four major torsion angles of interest: phi, psi, omega and chi-1. Traditionally, protein NMR spectroscopists have used vicinal J-coupling information and the Karplus relation to determine approximate backbone torsion angle constraints for phi and chi-1 angles. However, several studies in the early 1990s pointed out the strong relationship between 1H and 13C chemical shifts and torsion angles, especially with backbone phi and psi angles. Later, a number of other papers pointed out additional chemical shift relationships with chi-1 and even omega angles. PREDITOR was designed to exploit these experimental observations and to help NMR spectroscopists easily predict protein torsion angles from chemical shift assignments. Specifically, PREDITOR accepts protein sequence and/or chemical shift data as input and generates torsion angle predictions for phi, psi, omega and chi-1 angles. The algorithm that PREDITOR uses combines sequence alignment, chemical shift alignment and a number of related chemical shift analysis techniques to predict torsion angles. PREDITOR is unusually fast and exhibits a very high level of accuracy. In a series of tests 88% of PREDITOR’s phi/psi predictions were within 30 degrees of the correct values, 84% of chi-1 predictions were correct and 99.97% of PREDITOR’s predicted omega angles were correct. PREDITOR also estimates the torsion angle errors so that its torsion angle constraints can be used with standard protein structure refinement software, such as CYANA, CNS, XPLOR and AMBER. PREDITOR also supports automated protein chemical shift re-referencing and the prediction of proline cis/trans states. PREDITOR is not the only torsion angle prediction software available. Several other computer programs including TALOS, TALOS+ and DANGLE have also been developed to predict backbone torsion angles from protein chemical shifts. These stand-alone programs exhibit similar prediction performance to PREDITOR but are substantially slower.
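To make the notion of a torsion angle as an internal coordinate concrete, the sketch below computes the dihedral angle defined by four atom positions using a common atan2-based formulation. For the backbone phi angle the four atoms would be C(i-1), N(i), CA(i) and C(i). The helper names and the coordinates are illustrative and are not taken from PREDITOR or from any real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle (degrees, -180..180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Four made-up points forming a right-angle twist about the z axis.
angle = dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                         (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(f"dihedral = {angle:.1f} degrees")
```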

References

  1. Neal, S.; Nip, A.; Zhang, H.; Wishart, D.S. (July 2003). "Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts". J. Biomol. NMR. 26 (3): 215–240. doi:10.1023/A:1023812930288. PMID 12766419. S2CID 29425090.
  2. Han, B.; Liu, Y.; Ginzinger, S.; Wishart, D.S. (May 2011). "SHIFTX2: significantly improved protein chemical shift prediction". J. Biomol. NMR. 50 (1): 43–57. doi:10.1007/s10858-011-9478-4. PMC 3085061. PMID 21448735.
  3. Zhang, H.; Neal, S.; Wishart, D.S. (March 2003). "RefDB: A database of uniformly referenced protein chemical shifts". J. Biomol. NMR. 25 (3): 173–195. doi:10.1023/A:1022836027055. PMID 12652131. S2CID 12786364.
  4. Xu, X.P.; Case, D.A. (December 2001). "Automated prediction of 15N, 13Calpha, 13Cbeta and 13C' chemical shifts in proteins using a density functional database". J. Biomol. NMR. 21 (4): 321–333. doi:10.1023/A:1013324104681. PMID 11824752. S2CID 665000.
  5. Shen, Y.; Bax, A. (August 2007). "Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology". J. Biomol. NMR. 38 (4): 289–302. doi:10.1007/s10858-007-9166-6. PMID 17610132. S2CID 12886163.