RaptorX

Developer(s)	XU Jinbo; LI Ming; PENG Jian; ZHU Jianwei; WANG Sheng
Initial release	2012
Available in	English
Type	Bioinformatics tool for protein structure prediction
License	Proprietary noncommercial, case-by-case
Website	raptorx.uchicago.edu

Last updated September 09, 2024

RaptorX is a software package and web server for protein structure and function prediction that is free for non-commercial use. It is among the most popular methods for protein structure prediction.^[1]^[2]^[3]^[4]^[5] Like other remote homology recognition and protein threading techniques, RaptorX can regularly generate reliable protein models when the widely used PSI-BLAST cannot. Unlike profile-based methods (e.g., HHpred and Phyre2), however, RaptorX exploits structure information and therefore excels at modeling protein sequences that lack a large number of sequence homologs. RaptorX Server is designed to offer a user-friendly interface for users who are not experts in protein structure prediction methods.

Description

The RaptorX project was started in 2008 and RaptorX Server^[6] was released to the public in 2011.

Standard usage

After pasting a protein sequence into the RaptorX submission form, a user typically waits a couple of hours (depending on sequence length) for the prediction to complete. The user then receives an email with a link to a web page of results. RaptorX Server currently generates the following results: 3-state and 8-state secondary structure prediction, sequence-template alignment, 3D structure prediction, solvent accessibility prediction, disorder prediction, and binding site prediction. The predicted results are displayed for visual examination, and the result files are also available for download.
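
The relationship between 8-state and 3-state secondary structure can be illustrated with a minimal Python sketch. It assumes the DSSP 8-state alphabet and one commonly used reduction (helices H/G/I to H, strands E/B to E, everything else to coil C). Both the mapping and the helper name `reduce_to_three_state` are assumptions made for illustration; the source does not state which reduction RaptorX Server uses.

```python
# Assumed mapping: one common DSSP 8-state -> 3-state reduction.
# This is NOT confirmed to be the scheme used by RaptorX Server.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # alpha, 3-10 and pi helices -> helix
    "E": "E", "B": "E",             # extended strand and isolated bridge -> strand
    "T": "C", "S": "C", "L": "C",   # turn, bend, loop -> coil
    "-": "C", "C": "C",             # unassigned or already coil -> coil
}


def reduce_to_three_state(ss8: str) -> str:
    """Collapse an 8-state secondary structure string to 3 states (H/E/C)."""
    try:
        return "".join(EIGHT_TO_THREE[symbol] for symbol in ss8.upper())
    except KeyError as exc:
        # Reject symbols outside the assumed alphabet instead of guessing.
        raise ValueError(f"unknown secondary-structure symbol: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(reduce_to_three_state("HHGGEEBTS-"))
```

Other reductions exist (for example, treating G or B as coil), so any comparison of 3-state accuracy figures should first confirm which mapping was applied.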

RaptorX Server also produces confidence scores that indicate the quality of the predicted 3D models in the absence of their corresponding native structures. For example, it reports a P-value for the relative global quality of a 3D model, global distance test (GDT) and unnormalized GDT (uGDT) scores for its absolute global quality, and a per-position root-mean-square deviation (RMSD) for the absolute local quality at each residue.
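
How a normalized and an unnormalized GDT-style score relate can be sketched in Python. The sketch assumes the conventional GDT_TS cutoffs of 1, 2, 4 and 8 Å and takes per-residue Cα distances (or estimated errors) as given from a single fixed superposition; a full GDT calculation searches over many superpositions, which is omitted here. The function names `ugdt` and `gdt`, and the reading of "unnormalized" as "not divided by sequence length", are assumptions for illustration rather than RaptorX's documented formulas.

```python
from typing import Optional, Sequence

# Conventional GDT_TS distance cutoffs in angstroms (assumed here).
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def ugdt(distances: Sequence[float]) -> float:
    """Unnormalized score: mean over cutoffs of the residue count within each cutoff."""
    counts = [sum(1 for d in distances if d <= cutoff) for cutoff in CUTOFFS]
    return sum(counts) / len(CUTOFFS)


def gdt(distances: Sequence[float], length: Optional[int] = None) -> float:
    """Normalized score on a 0-100 scale: uGDT divided by the sequence length."""
    n = len(distances) if length is None else length
    if n <= 0:
        raise ValueError("length must be positive")
    return 100.0 * ugdt(distances) / n


if __name__ == "__main__":
    per_residue = [0.5, 1.5, 3.0, 6.0, 10.0]  # made-up distances in angstroms
    print(ugdt(per_residue))  # 2.5
    print(gdt(per_residue))   # 50.0
```

Under these assumptions the unnormalized value grows with the number of well-modeled residues, while the normalized value expresses the same information as a percentage of the sequence length.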

Applications and performance

Applications of RaptorX include protein structure prediction, function prediction, protein sequence-structure alignment, evolutionary classification of proteins, guiding site-directed mutagenesis, and solving protein crystal structures by molecular replacement. In CASP9, the ninth round of the Critical Assessment of Structure Prediction (CASP) blind protein structure prediction experiment, RaptorX ranked 2nd out of about 80 automatic structure prediction servers. RaptorX also generated the best alignments for the 50 hardest CASP9 template-based modeling (TBM) targets. In CASP10, RaptorX was the only server group among the top 10 human/server groups for the 15 most difficult CASP10 TBM targets.

History

RaptorX is the successor to the RAPTOR protein structure prediction system. RAPTOR was designed and developed by Dr. Jinbo Xu and Dr. Ming Li at the University of Waterloo. RaptorX was designed and developed by a research group led by Prof. Jinbo Xu at the Toyota Technological Institute at Chicago.


References

[1] Peng, Jian; Xu, Jinbo (October 2011). "RaptorX: exploiting structure information for protein alignment by statistical inference". Proteins. 79 (Suppl 10): 161–171. doi:10.1002/prot.23175. PMC 3226909. PMID 21987485.
[2] Peng, Jian; Xu, Jinbo (July 2010). "Low-homology protein threading". Bioinformatics. 26 (12): i294–i300. doi:10.1093/bioinformatics/btq192. PMC 2881377. PMID 20529920.
[3] Peng, Jian; Xu, Jinbo (April 2011). "A multiple-template approach to protein threading". Proteins. 79 (6): 1930–1939. doi:10.1002/prot.23016. PMC 3092796. PMID 21465564.
[4] Peng, Jian; Xu, Jinbo (January 2009). "Boosting Protein Threading Accuracy". Research in Computational Molecular Biology. Lecture Notes in Computer Science. Vol. 5541. pp. 31–45. doi:10.1007/978-3-642-02008-7_3. ISBN 978-3-642-02007-0. PMC 3325114. PMID 22506254.
[5] Ma, Jianzhu; Wang, Sheng; Xu, Jinbo (June 2012). "A conditional neural fields model for protein threading". Bioinformatics. 28 (12): i59–i66. doi:10.1093/bioinformatics/bts213. PMC 3371845. PMID 22689779.
[6] Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo (July 2012). "Template-based protein structure modeling using the RaptorX web server". Nature Protocols. 7 (8): 1511–1522. doi:10.1038/nprot.2012.085. PMC 4730388. PMID 22814390.

