Protein secondary structure

Figure: Diagram of protein structure, using PCNA as an example (PDB: 1AXC).

Protein secondary structure is the local spatial conformation of the polypeptide backbone excluding the side chains. [1] The two most common secondary structural elements are alpha helices and beta sheets, though beta turns and omega loops occur as well. Secondary structure elements typically form spontaneously as an intermediate before the protein folds into its three-dimensional tertiary structure.

Secondary structure is formally defined by the pattern of hydrogen bonds between the amino hydrogen and carboxyl oxygen atoms in the peptide backbone. Secondary structure may alternatively be defined based on the regular pattern of backbone dihedral angles in a particular region of the Ramachandran plot regardless of whether it has the correct hydrogen bonds.
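As a rough illustration of the dihedral-angle definition, the sketch below labels a residue from its (φ, ψ) pair alone. The rectangular region boundaries and the function name `classify_dihedrals` are assumptions chosen for illustration; they are not taken from the text or from any assignment program.

```python
def classify_dihedrals(phi: float, psi: float) -> str:
    """Coarse Ramachandran-region label for one residue (angles in degrees).

    The boundaries below are illustrative assumptions, not a standard.
    """
    # Broad right-handed helical region (assumed box).
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "helical"
    # Broad extended (beta) region; psi wraps around at +/-180 degrees.
    if -180.0 <= phi <= -45.0 and (psi >= 90.0 or psi <= -150.0):
        return "extended"
    return "other"
```

A label of this kind depends only on backbone geometry, so it can disagree with a hydrogen-bond-based assignment for the same residue.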

The concept of secondary structure was first introduced by Kaj Ulrik Linderstrøm-Lang at Stanford in 1952. [2] [3] Other types of biopolymers such as nucleic acids also possess characteristic secondary structures.

Types

Structural features of the three major forms of protein helices [4] [5]

Geometry attribute        α-helix           3₁₀ helix         π-helix
Residues per turn         3.6               3.0               4.4
Translation per residue   1.5 Å (0.15 nm)   2.0 Å (0.20 nm)   1.1 Å (0.11 nm)
Radius of helix           2.3 Å (0.23 nm)   1.9 Å (0.19 nm)   2.8 Å (0.28 nm)
Pitch                     5.4 Å (0.54 nm)   6.0 Å (0.60 nm)   4.8 Å (0.48 nm)
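
The rows of the table above are related: the pitch of a helix is the number of residues per turn multiplied by the translation per residue. The short sketch below checks this arithmetic against the tabulated values; the dictionary and function names are illustrative.

```python
# (residues per turn, translation per residue in angstroms), from the table above.
HELICES = {
    "alpha": (3.6, 1.5),
    "3_10": (3.0, 2.0),
    "pi": (4.4, 1.1),
}


def pitch(residues_per_turn: float, rise_per_residue: float) -> float:
    """Pitch in angstroms = residues per turn x translation per residue."""
    return residues_per_turn * rise_per_residue


def rotation_per_residue(residues_per_turn: float) -> float:
    """Rotation about the helix axis per residue, in degrees."""
    return 360.0 / residues_per_turn


PITCHES = {name: pitch(n, rise) for name, (n, rise) in HELICES.items()}
```

For the π-helix the product is 4.84 Å, which the table rounds to 4.8 Å.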
Figure: Hydrogen bonds in protein secondary structure. Cartoon above, atoms below, with nitrogen in blue and oxygen in red (PDB: 1AXC).

The most common secondary structures are alpha helices and beta sheets. Other helices, such as the 3₁₀ helix and π helix, are calculated to have energetically favorable hydrogen-bonding patterns but are rarely observed in natural proteins except at the ends of α helices, because of unfavorable backbone packing in the center of the helix. Other extended structures such as the polyproline helix and alpha sheet are rare in native state proteins but are often hypothesized as important protein folding intermediates. Tight turns and loose, flexible loops link the more "regular" secondary structure elements. The random coil is not a true secondary structure, but is the class of conformations that indicates an absence of regular secondary structure.

Amino acids vary in their ability to form the various secondary structure elements. Proline and glycine are sometimes known as "helix breakers" because they disrupt the regularity of the α helical backbone conformation; however, both have unusual conformational abilities and are commonly found in turns. Amino acids that prefer to adopt helical conformations in proteins include methionine, alanine, leucine, glutamate and lysine ("MALEK" in amino-acid 1-letter codes); by contrast, the large aromatic residues (tryptophan, tyrosine and phenylalanine) and Cβ-branched amino acids (isoleucine, valine, and threonine) prefer to adopt β-strand conformations. However, these preferences are not strong enough to produce a reliable method of predicting secondary structure from sequence alone.
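
The residue groupings named above can be tallied for any sequence. The sketch below only counts composition; as the paragraph notes, such preferences are too weak to serve as a predictor, and the function name and output keys are illustrative assumptions.

```python
# One-letter codes for the residue groups named in the text.
HELIX_FORMERS = frozenset("MALEK")    # Met, Ala, Leu, Glu, Lys
STRAND_FORMERS = frozenset("WYFIVT")  # Trp, Tyr, Phe, Ile, Val, Thr
HELIX_BREAKERS = frozenset("PG")      # Pro, Gly


def residue_class_fractions(sequence: str) -> dict[str, float]:
    """Fraction of residues in each of the three groups above."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    return {
        "helix_formers": sum(r in HELIX_FORMERS for r in seq) / n,
        "strand_formers": sum(r in STRAND_FORMERS for r in seq) / n,
        "helix_breakers": sum(r in HELIX_BREAKERS for r in seq) / n,
    }
```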

Low frequency collective vibrations are thought to be sensitive to local rigidity within proteins, revealing beta structures to be generically more rigid than alpha or disordered proteins. [6] [7] Neutron scattering measurements have directly connected the spectral feature at ~1 THz to collective motions of the secondary structure of beta-barrel protein GFP. [8]

Hydrogen bonding patterns in secondary structures may be significantly distorted, which makes automatic determination of secondary structure difficult. There are several methods for formally defining protein secondary structure (e.g., DSSP [9], DEFINE [10], STRIDE [11], ScrewFit [12], SST [13]).

DSSP classification

Figure: Distribution obtained from non-redundant pdb_select dataset (March 2006); secondary structure assigned by DSSP; 8 conformational states reduced to 3 states: H=HGI, E=EB, C=STC. Visible are mixtures of (Gaussian) distributions, resulting also from the reduction of DSSP states.

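The Dictionary of Protein Secondary Structure, DSSP for short, is commonly used to describe protein secondary structure with single-letter codes. The secondary structure is assigned based on hydrogen bonding patterns such as those initially proposed by Pauling et al. in 1951 (before any protein structure had ever been experimentally determined). There are eight types of secondary structure that DSSP defines:

G: 3-turn helix (3₁₀ helix)
H: 4-turn helix (α helix)
I: 5-turn helix (π helix)
T: hydrogen-bonded turn
E: extended strand in a parallel or antiparallel β-sheet
B: residue in an isolated β-bridge
S: bend
C: coil (residues not in any of the above conformations)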

'Coil' is often codified as ' ' (space), C (coil) or '–' (dash). The helices (G, H and I) and sheet conformations are all required to have a reasonable length. This means that 2 adjacent residues in the primary structure must form the same hydrogen bonding pattern. If the helix or sheet hydrogen bonding pattern is too short, it is designated as T or B, respectively. Other protein secondary structure assignment categories exist (sharp turns, omega loops, etc.), but they are less frequently used.
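
The reduction from eight DSSP states to three quoted in the figure caption above (H=HGI, E=EB, C=STC) can be sketched as a lookup table, with run lengths extracted afterwards. The function names are illustrative, and mapping the space and dash codes to coil follows the convention described in the preceding paragraph.

```python
from itertools import groupby

# Eight-state to three-state reduction: H=HGI, E=EB, C=STC.
REDUCTION = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "S": "C", "T": "C", "C": "C",
    " ": "C", "-": "C",  # alternative spellings of coil
}


def reduce_states(dssp: str) -> str:
    """Map a string of DSSP codes onto the three states H, E and C."""
    return "".join(REDUCTION[code] for code in dssp)


def segments(states: str) -> list[tuple[str, int]]:
    """Run-length encode a state string as (state, length) pairs."""
    return [(state, len(list(run))) for state, run in groupby(states)]
```

For example, `segments(reduce_states("HHHHGGG TTEEEEB-S"))` merges the adjacent H and G runs into a single helical segment of length 7.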

Secondary structure is defined by hydrogen bonding, so the exact definition of a hydrogen bond is critical. The standard hydrogen-bond definition for secondary structure is that of DSSP, which is a purely electrostatic model. It assigns charges of ±q1 ≈ 0.42e to the carbonyl carbon and oxygen, respectively, and charges of ±q2 ≈ 0.20e to the amide hydrogen and nitrogen, respectively. The electrostatic energy is

$$E = q_1 q_2 \left( \frac{1}{r_{ON}} + \frac{1}{r_{CH}} - \frac{1}{r_{OH}} - \frac{1}{r_{CN}} \right) \cdot 332\ \text{kcal/mol},$$

where r_AB is the distance in ångströms between atoms A and B.

According to DSSP, a hydrogen bond exists if and only if E is less than −0.5 kcal/mol (−2.1 kJ/mol). Although the DSSP formula is a relatively crude approximation of the physical hydrogen-bond energy, it is generally accepted as a tool for defining secondary structure.
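
A minimal sketch of this criterion follows. It uses the partial charges and the −0.5 kcal/mol cutoff given above; the dimensional factor of 332 (for distances in Å and charges in units of e) is the value used in the DSSP energy expression and is stated here as an assumption of the sketch, as are the function names and the tuple representation of coordinates.

```python
import math

Q1 = 0.42        # partial charge on carbonyl C (+) and O (-), in units of e
Q2 = 0.20        # partial charge on amide H (+) and N (-), in units of e
FACTOR = 332.0   # assumed conversion factor; gives kcal/mol for distances in angstroms
CUTOFF = -0.5    # kcal/mol


def hbond_energy(c, o, n, h) -> float:
    """Electrostatic energy between a C=O group and an N-H group (kcal/mol).

    Each argument is an (x, y, z) coordinate tuple in angstroms.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn) * FACTOR


def is_hbond(c, o, n, h) -> bool:
    """True if the energy is below the -0.5 kcal/mol cutoff."""
    return hbond_energy(c, o, n, h) < CUTOFF
```

With a collinear geometry such as C at 0 Å, O at 1.23 Å, H at 3.13 Å and N at 4.13 Å along one axis, the energy is well below the cutoff; moving the N–H group 10 Å further away brings it close to zero.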

SST [13] classification

SST is a Bayesian method to assign secondary structure to protein coordinate data using the Shannon information criterion of Minimum Message Length (MML) inference. SST treats any assignment of secondary structure as a potential hypothesis that attempts to explain (compress) given protein coordinate data. The core idea is that the best secondary structural assignment is the one that can explain (compress) the coordinates of a given protein in the most economical way, thus linking the inference of secondary structure to lossless data compression. SST delineates any protein chain into regions, each associated with one of its assignment types. [14]

SST detects π and 3₁₀ helical caps to standard α-helices, and automatically assembles the various extended strands into consistent β-pleated sheets. It provides a readable output of dissected secondary structural elements, and a corresponding PyMOL-loadable script to visualize the assigned secondary structural elements individually.

Experimental determination

The rough secondary-structure content of a biopolymer (e.g., "this protein is 40% α-helix and 20% β-sheet") can be estimated spectroscopically. [15] For proteins, a common method is far-ultraviolet (far-UV, 170–250 nm) circular dichroism. A pronounced double minimum at 208 and 222 nm indicates α-helical structure, whereas a single minimum at 204 nm or 217 nm reflects random-coil or β-sheet structure, respectively. A less common method is infrared spectroscopy, which detects differences in the bond oscillations of amide groups due to hydrogen bonding. Finally, secondary-structure contents may be estimated accurately using the chemical shifts of an initially unassigned NMR spectrum. [16]
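
The circular dichroism rules of thumb above can be written as a toy classifier over the positions of spectral minima. This is a sketch only: the 3 nm tolerance, the function names and the "undetermined" fallback are assumptions, and real spectra are normally analysed by fitting to reference spectra rather than by locating minima.

```python
def local_minima(wavelengths_nm, signal):
    """Wavelengths at which the signal is strictly lower than both neighbours."""
    return [
        wavelengths_nm[i]
        for i in range(1, len(signal) - 1)
        if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]
    ]


def cd_hint(wavelengths_nm, signal, tol_nm=3.0):
    """Coarse structural hint from the minima of a far-UV CD spectrum."""
    if len(wavelengths_nm) != len(signal):
        raise ValueError("wavelengths and signal must have the same length")
    minima = local_minima(wavelengths_nm, signal)

    def near(target):
        return any(abs(w - target) <= tol_nm for w in minima)

    if near(208.0) and near(222.0):
        return "alpha-helix"   # double minimum at 208 and 222 nm
    if len(minima) == 1 and near(217.0):
        return "beta-sheet"    # single minimum at 217 nm
    if len(minima) == 1 and near(204.0):
        return "random coil"   # single minimum at 204 nm
    return "undetermined"
```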

Prediction

Predicting protein tertiary structure from only its amino acid sequence is a very challenging problem (see protein structure prediction), but predicting the more simply defined secondary structure is more tractable.

Early methods of secondary-structure prediction were restricted to predicting the three predominant states: helix, sheet, or random coil. These methods were based on the helix- or sheet-forming propensities of individual amino acids, sometimes coupled with rules for estimating the free energy of forming secondary structure elements. The first widely used techniques to predict protein secondary structure from the amino acid sequence were the Chou–Fasman method [17] [18] [19] and the GOR method. [20] Although such methods claimed to achieve ~60% accuracy in predicting which of the three states (helix/sheet/coil) a residue adopts, blind computing assessments later showed that the actual accuracy was much lower. [21]
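
The accuracy figures quoted here are per-residue, three-state accuracies (a measure often called Q3): the fraction of residues whose predicted state matches the observed one. A minimal sketch, with an illustrative function name:

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state (H, E or C) matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

For instance, `q3_accuracy("HHHHCCEEEE", "HHHCCCEEEC")` compares ten residues, two of which disagree.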

A significant increase in accuracy (to nearly 80%) was made by exploiting multiple sequence alignment; knowing the full distribution of amino acids that occur at a position (and in its vicinity, typically ~7 residues on either side) throughout evolution provides a much better picture of the structural tendencies near that position. [22] [23] For illustration, a given protein might have a glycine at a given position, which by itself might suggest a random coil there. However, multiple sequence alignment might reveal that helix-favoring amino acids occur at that position (and nearby positions) in 95% of homologous proteins spanning nearly a billion years of evolution. Moreover, by examining the average hydrophobicity at that and nearby positions, the same alignment might also suggest a pattern of residue solvent accessibility consistent with an α-helix. Taken together, these factors would suggest that the glycine of the original protein adopts α-helical structure, rather than random coil. Several types of methods are used to combine all the available data to form a 3-state prediction, including neural networks, hidden Markov models and support vector machines. Modern prediction methods also provide a confidence score for their predictions at every position.
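
The glycine example can be made concrete with a small sketch that builds per-column residue frequencies from an alignment and averages the helix-favoring fraction over a window of ~7 residues on either side. It illustrates only the input features, not any published predictor; the function names, the gap character "-" and the use of the "MALEK" residues as helix formers are assumptions.

```python
from collections import Counter


def column_profile(alignment: list[str]) -> list[dict[str, float]]:
    """Per-column residue frequencies of an alignment, ignoring gaps ('-')."""
    if not alignment:
        raise ValueError("alignment must not be empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()} if total else {})
    return profile


def helix_former_fraction(profile, center, half_width=7, formers="MALEK"):
    """Mean frequency of helix-favoring residues in a window around `center`."""
    lo = max(0, center - half_width)
    hi = min(len(profile), center + half_width + 1)
    window = profile[lo:hi]
    return sum(sum(col.get(aa, 0.0) for aa in formers) for col in window) / len(window)
```

A position that is glycine in one sequence can still score highly if its homologs and neighbors are dominated by helix-favoring residues.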

Secondary-structure prediction methods were evaluated by the Critical Assessment of protein Structure Prediction (CASP) experiments and continuously benchmarked, e.g. by EVA. Based on these tests, the most accurate methods were Psipred, SAM, [24] PORTER, [25] PROF, [26] and SABLE. [27] The chief area for improvement appears to be the prediction of β-strands; residues confidently predicted as β-strand are likely to be so, but the methods are apt to overlook some β-strand segments (false negatives). There is likely an upper limit of ~90% prediction accuracy overall, due to the idiosyncrasies of the standard method (DSSP) for assigning secondary-structure classes (helix/strand/coil) to PDB structures, against which the predictions are benchmarked. [28]

Accurate secondary-structure prediction is a key element in the prediction of tertiary structure, in all but the simplest (homology modeling) cases. For example, a confidently predicted pattern of six secondary structure elements βαββαβ is the signature of a ferredoxin fold. [29]
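
Checking a prediction for such a signature amounts to collapsing the three-state string into its order of elements. The sketch below does this for strings over H, E and C; the minimum segment length of 2 and the function names are illustrative assumptions.

```python
from itertools import groupby

SYMBOLS = {"H": "α", "E": "β"}
FERREDOXIN_PATTERN = "βαββαβ"


def element_order(states: str, min_len: int = 2) -> str:
    """Order of helices and strands in a three-state string, ignoring coil."""
    return "".join(
        SYMBOLS[state]
        for state, run in groupby(states)
        if state in SYMBOLS and len(list(run)) >= min_len
    )


def has_ferredoxin_signature(states: str) -> bool:
    """True if the element order is exactly βαββαβ."""
    return element_order(states) == FERREDOXIN_PATTERN
```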

Applications

Both protein and nucleic acid secondary structures can be used to aid in multiple sequence alignment. These alignments can be made more accurate by the inclusion of secondary structure information in addition to simple sequence information. This is sometimes less useful in RNA because base pairing is much more highly conserved than sequence. Distant relationships between proteins whose primary structures are unalignable can sometimes be found by secondary structure. [22]

It has been shown that α-helices are more stable, more robust to mutations, and more designable than β-strands in natural proteins, [30] so designing functional all-α proteins is likely to be easier than designing proteins with both helices and strands; this has recently been confirmed experimentally. [31]

Related Research Articles

Alpha helix: Type of secondary structure of proteins

An alpha helix is a sequence of amino acids in a protein that are twisted into a coil.

Beta sheet: Protein structural motif

The beta sheet is a common motif of the regular protein secondary structure. Beta sheets consist of beta strands (β-strands) connected laterally by at least two or three backbone hydrogen bonds, forming a generally twisted, pleated sheet. A β-strand is a stretch of polypeptide chain typically 3 to 10 amino acids long with backbone in an extended conformation. The supramolecular association of β-sheets has been implicated in the formation of the fibrils and protein aggregates observed in amyloidosis, Alzheimer's disease and other proteinopathies.

Collagen helix: Main protein structure of fibrous collagen

In molecular biology, the collagen triple helix or type-2 helix is the main secondary structure of various types of fibrous collagen, including type I collagen. In 1954, Ramachandran & Kartha advanced a structure for the collagen triple helix on the basis of fiber diffraction data. It consists of a triple helix made of the repetitious amino acid sequence glycine-X-Y, where X and Y are frequently proline or hydroxyproline. Collagen folded into a triple helix is known as tropocollagen. Collagen triple helices are often bundled into fibrils which themselves form larger fibres, as in tendons.

Protein structure prediction: Type of biological prediction

Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; and it is important in medicine and biotechnology.

Protein structure: Three-dimensional arrangement of atoms in an amino acid-chain molecule

Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers – specifically polypeptides – formed from sequences of amino acids, which are the monomers of the polymer. A single amino acid monomer may also be called a residue, which indicates a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond. By convention, a chain under 30 amino acids is often identified as a peptide, rather than a protein. To be able to perform their biological function, proteins fold into one or more specific spatial conformations driven by a number of non-covalent interactions, such as hydrogen bonding, ionic interactions, Van der Waals forces, and hydrophobic packing. To understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography, NMR spectroscopy, cryo-electron microscopy (cryo-EM) and dual polarisation interferometry, to determine the structure of proteins.

Protein contact map

A protein contact map represents the distance between all possible amino acid residue pairs of a three-dimensional protein structure using a binary two-dimensional matrix. For two residues i and j, the ij element of the matrix is 1 if the two residues are closer than a predetermined threshold, and 0 otherwise. Various contact definitions have been proposed: the distance between Cα-Cα atoms with a threshold of 6–12 Å; the distance between Cβ-Cβ atoms with a threshold of 6–12 Å; and the distance between the side-chain centers of mass.
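
A minimal sketch of the Cα-based definition follows. The 8 Å default is an assumed value picked from within the 6–12 Å range mentioned above, and the function name and coordinate format are illustrative.

```python
import math


def contact_map(ca_coords, threshold=8.0):
    """Binary contact matrix from C-alpha coordinates (angstroms).

    Element [i][j] is 1 if residues i and j are closer than `threshold`, else 0.
    """
    n = len(ca_coords)
    return [
        [1 if math.dist(ca_coords[i], ca_coords[j]) < threshold else 0 for j in range(n)]
        for i in range(n)
    ]
```

The matrix is symmetric, and its diagonal is always 1 because each residue is at zero distance from itself.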

A turn is an element of secondary structure in proteins where the polypeptide chain reverses its overall direction.

A polyproline helix is a type of protein secondary structure which occurs in proteins comprising repeating proline residues. A left-handed polyproline II helix is formed when sequential residues all adopt (φ,ψ) backbone dihedral angles of roughly (−75°, 150°) and have trans isomers of their peptide bonds. This PPII conformation is also common in proteins and polypeptides with other amino acids apart from proline. Similarly, a more compact right-handed polyproline I helix is formed when sequential residues all adopt (φ,ψ) backbone dihedral angles of roughly (−75°, 160°) and have cis isomers of their peptide bonds. Of the twenty common naturally occurring amino acids, only proline is likely to adopt the cis isomer of the peptide bond, specifically the X-Pro peptide bond; steric and electronic factors heavily favor the trans isomer in most other peptide bonds. However, peptide bonds that replace proline with another N-substituted amino acid are also likely to adopt the cis isomer.

Pi helix

A pi helix is a type of secondary structure found in proteins. Discovered by crystallographer Barbara Low in 1952 and once thought to be rare, short π-helices are found in 15% of known protein structures and are believed to be an evolutionary adaptation derived by the insertion of a single amino acid into an α-helix. Because such insertions are highly destabilizing, the formation of π-helices would tend to be selected against unless it provided some functional advantage to the protein. π-helices therefore are typically found near functional sites of proteins.

3₁₀ helix: Type of secondary structure

A 3₁₀ helix is a type of secondary structure found in proteins and polypeptides. Of the numerous protein secondary structures present, the 3₁₀-helix is the fourth most common type observed, following α-helices, β-sheets and reverse turns. 3₁₀-helices constitute nearly 10–15% of all helices in protein secondary structures, and are typically observed as extensions of α-helices found at either their N- or C-termini. Because of the α-helix's tendency to consistently fold and unfold, it has been proposed that the 3₁₀-helix serves as an intermediary conformation of sorts, and provides insight into the initiation of α-helix folding.

Beta hairpin

The beta hairpin is a simple protein structural motif involving two beta strands that look like a hairpin. The motif consists of two strands that are adjacent in primary structure, oriented in an antiparallel direction, and linked by a short loop of two to five amino acids. Beta hairpins can occur in isolation or as part of a series of hydrogen bonded strands that collectively comprise a beta sheet.

The Chou–Fasman method is an empirical technique for the prediction of secondary structures in proteins, originally developed in the 1970s by Peter Y. Chou and Gerald D. Fasman. The method is based on analyses of the relative frequencies of each amino acid in alpha helices, beta sheets, and turns based on known protein structures solved with X-ray crystallography. From these frequencies a set of probability parameters was derived for the appearance of each amino acid in each secondary structure type, and these parameters are used to predict the probability that a given sequence of amino acids would form a helix, a beta strand, or a turn in a protein. The method is at most about 50–60% accurate in identifying correct secondary structures, which is significantly less accurate than the modern machine learning–based techniques.

The GOR method is an information theory-based method for the prediction of secondary structures in proteins. It was developed in the late 1970s shortly after the simpler Chou–Fasman method. Like Chou–Fasman, the GOR method is based on probability parameters derived from empirical studies of known protein tertiary structures solved by X-ray crystallography. However, unlike Chou–Fasman, the GOR method takes into account not only the propensities of individual amino acids to form particular secondary structures, but also the conditional probability of the amino acid to form a secondary structure given that its immediate neighbors have already formed that structure. The method is therefore essentially Bayesian in its analysis.

The DSSP algorithm is the standard method for assigning secondary structure to the amino acids of a protein, given the atomic-resolution coordinates of the protein. The abbreviation is only mentioned once in the 1983 paper describing this algorithm, where it is the name of the Pascal program that implements the algorithm, Define Secondary Structure of Proteins.

Alpha sheet: Secondary protein structure

Alpha sheet is an atypical secondary structure in proteins, first proposed by Linus Pauling and Robert Corey in 1951. The hydrogen bonding pattern in an alpha sheet is similar to that of a beta sheet, but the orientation of the carbonyl and amino groups in the peptide bond units is distinctive; in a single strand, all the carbonyl groups are oriented in the same direction on one side of the pleat, and all the amino groups are oriented in the same direction on the opposite side of the sheet. Thus the alpha sheet accumulates an inherent separation of electrostatic charge, with one edge of the sheet exposing negatively charged carbonyl groups and the opposite edge exposing positively charged amino groups. Unlike the alpha helix and beta sheet, the alpha sheet configuration does not require all component amino acid residues to lie within a single region of dihedral angles; instead, the alpha sheet contains residues of alternating dihedrals in the traditional right-handed (αR) and left-handed (αL) helical regions of Ramachandran space. Although the alpha sheet is only rarely observed in natural protein structures, it has been speculated to play a role in amyloid disease and it was found to be a stable form for amyloidogenic proteins in molecular dynamics simulations. Alpha sheets have also been observed in X-ray crystallography structures of designed peptides.

In protein structure, STRIDE is an algorithm for the assignment of protein secondary structure elements given the atomic coordinates of the protein, as defined by X-ray crystallography, protein NMR, or another protein structure determination method. In addition to the hydrogen bond criteria used by the more common DSSP algorithm, the STRIDE assignment criteria also include dihedral angle potentials. As such, its criteria for defining individual secondary structures are more complex than those of DSSP. The STRIDE energy function contains a hydrogen-bond term containing a Lennard-Jones-like 8-6 distance-dependent potential and two angular dependence factors reflecting the planarity of the optimized hydrogen bond geometry. The criteria for individual secondary structural elements, which are divided into the same groups as those reported by DSSP, also contain statistical probability factors derived from empirical examinations of solved structures with visually assigned secondary structure elements extracted from the Protein Data Bank.

Protein fold class: Categories of protein tertiary structure

In molecular biology, protein fold classes are broad categories of protein tertiary structure topology. They describe groups of proteins that share similar amino acid and secondary structure proportions. Each class contains multiple, independent protein superfamilies.

Chemical shift index: Laboratory technique

The chemical shift index or CSI is a widely employed technique in protein nuclear magnetic resonance spectroscopy that can be used to display and identify the location as well as the type of protein secondary structure found in proteins using only backbone chemical shift data. The technique was invented by David S. Wishart in 1992 for analyzing ¹Hα chemical shifts and then later extended by him in 1994 to incorporate ¹³C backbone shifts. The original CSI method makes use of the fact that ¹Hα chemical shifts of amino acid residues in helices tend to be shifted upfield relative to their random coil values and downfield in beta strands. Similar kinds of upfield and downfield trends are also detectable in backbone ¹³C chemical shifts.
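
The upfield/downfield rule for ¹Hα shifts can be sketched as a three-valued index per residue. The 0.1 ppm tolerance, the function name and the idea of passing the random-coil reference as an argument are assumptions of this sketch; reference values must come from a published random-coil table and are not supplied here.

```python
def csi_index(observed_ppm: float, random_coil_ppm: float, tolerance_ppm: float = 0.1) -> int:
    """Three-valued index for one 1H-alpha shift relative to its random-coil value.

    -1: shifted upfield (lower ppm), the helix-like direction
    +1: shifted downfield (higher ppm), the strand-like direction
     0: within the tolerance of the random-coil value
    """
    delta = observed_ppm - random_coil_ppm
    if delta < -tolerance_ppm:
        return -1
    if delta > tolerance_ppm:
        return 1
    return 0
```

Runs of consecutive −1 or +1 values along the sequence, rather than single residues, are what suggest a helix or a strand.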

PSI-blast based secondary structure PREDiction (PSIPRED) is a method used to investigate protein structure. It uses artificial neural network machine learning methods in its algorithm. It is a server-side program, featuring a website serving as a front-end interface, which can predict a protein's secondary structure from the primary sequence.

Volume, Area, Dihedral Angle Reporter (VADAR) is a freely available protein structure validation web server that was developed as a collaboration between Dr. Brian Sykes and Dr. David Wishart at the University of Alberta. VADAR consists of over 15 different algorithms and programs for assessing and validating peptide and protein structures from their PDB coordinate data. VADAR is capable of determining secondary structure, identifying and classifying six different types of beta turns, determining and calculating the strength of C=O -- N-H hydrogen bonds, calculating residue-specific accessible surface areas (ASA), calculating residue volumes, determining backbone and side chain torsion angles, assessing local structure quality, evaluating global structure quality, and identifying residue "outliers". The results have been validated through extensive comparison to published data and careful visual inspection. VADAR produces both text and graphical output with most of the quantitative data presented in easily viewed tables. In particular, VADAR's output is presented in a vertical, tabular format with most of the sequence data, residue numbering and any other calculated property or feature presented from top to bottom, rather than from left to right.

References

  1. Sun PD, Foster CE, Boyington JC (May 2004). "Overview of protein structural and functional folds". Current Protocols in Protein Science. 17 (1): Unit 17.1. doi:10.1002/0471140864.ps1701s35. PMC 7162418. PMID 18429251.
  2. Linderstrøm-Lang KU (1952). Lane Medical Lectures: Proteins and Enzymes. Stanford University Press. p. 115. ASIN B0007J31SC.
  3. Schellman JA, Schellman CG (1997). "Kaj Ulrik Linderstrøm-Lang (1896–1959)". Protein Sci. 6 (5): 1092–100. doi:10.1002/pro.5560060516. PMC 2143695. PMID 9144781. He had already introduced the concepts of the primary, secondary, and tertiary structure of proteins in the third Lane Lecture (Linderstrøm-Lang, 1952)
  4. Bottomley S (2004). "Interactive Protein Structure Tutorial". Archived from the original on March 1, 2011. Retrieved January 9, 2011.
  5. Schulz GE, Schirmer RH (1979). Principles of protein structure. New York: Springer-Verlag. ISBN 0-387-90386-0. OCLC 4498269.
  6. Perticaroli S, Nickels JD, Ehlers G, O'Neill H, Zhang Q, Sokolov AP (October 2013). "Secondary structure and rigidity in model proteins". Soft Matter. 9 (40): 9548–56. Bibcode:2013SMat....9.9548P. doi:10.1039/C3SM50807B. PMID 26029761.
  7. Perticaroli S, Nickels JD, Ehlers G, Sokolov AP (June 2014). "Rigidity, secondary structure, and the universality of the boson peak in proteins". Biophysical Journal. 106 (12): 2667–74. Bibcode:2014BpJ...106.2667P. doi:10.1016/j.bpj.2014.05.009. PMC 4070067. PMID 24940784.
  8. Nickels JD, Perticaroli S, O'Neill H, Zhang Q, Ehlers G, Sokolov AP (2013). "Coherent neutron scattering and collective dynamics in the protein, GFP". Biophys. J. 105 (9): 2182–87. Bibcode:2013BpJ...105.2182N. doi:10.1016/j.bpj.2013.09.029. PMC 3824694. PMID 24209864.
  9. Kabsch W, Sander C (Dec 1983). "Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features". Biopolymers. 22 (12): 2577–637. doi:10.1002/bip.360221211. PMID 6667333. S2CID 29185760.
  10. Richards FM, Kundrot CE (1988). "Identification of structural motifs from protein coordinate data: secondary structure and first-level supersecondary structure". Proteins. 3 (2): 71–84. doi:10.1002/prot.340030202. PMID 3399495. S2CID 29126855.
  11. Frishman D, Argos P (Dec 1995). "Knowledge-based protein secondary structure assignment" (PDF). Proteins. 23 (4): 566–79. CiteSeerX 10.1.1.132.9420. doi:10.1002/prot.340230412. PMID 8749853. S2CID 17487756. Archived from the original (PDF) on 2010-06-13.
  12. Calligari PA, Kneller GR (December 2012). "ScrewFit: combining localization and description of protein secondary structure". Acta Crystallographica Section D. 68 (Pt 12): 1690–3. doi:10.1107/s0907444912039029. PMID 23151634.
  13. Konagurthu AS, Lesk AM, Allison L (Jun 2012). "Minimum message length inference of secondary structure from protein coordinate data". Bioinformatics. 28 (12): i97–i105. doi:10.1093/bioinformatics/bts223. PMC 3371855. PMID 22689785.
  14. "SST web server". Retrieved 17 April 2018.
  15. Pelton JT, McLean LR (2000). "Spectroscopic methods for analysis of protein secondary structure". Anal. Biochem. 277 (2): 167–76. doi:10.1006/abio.1999.4320. PMID 10625503.
  16. Meiler J, Baker D (2003). "Rapid protein fold determination using unassigned NMR data". Proc. Natl. Acad. Sci. U.S.A. 100 (26): 15404–09. Bibcode:2003PNAS..10015404M. doi:10.1073/pnas.2434121100. PMC 307580. PMID 14668443.
  17. Chou PY, Fasman GD (Jan 1974). "Prediction of protein conformation". Biochemistry. 13 (2): 222–45. doi:10.1021/bi00699a002. PMID 4358940.
  18. Chou PY, Fasman GD (1978). "Empirical predictions of protein conformation". Annual Review of Biochemistry. 47: 251–76. doi:10.1146/annurev.bi.47.070178.001343. PMID 354496.
  19. Chou PY, Fasman GD (1978). "Prediction of the secondary structure of proteins from their amino acid sequence". Advances in Enzymology and Related Areas of Molecular Biology. Vol. 47. pp. 45–148. doi:10.1002/9780470122921.ch2. ISBN 9780470122921. PMID 364941.
  20. Garnier J, Osguthorpe DJ, Robson B (March 1978). "Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins". Journal of Molecular Biology. 120 (1): 97–120. doi:10.1016/0022-2836(78)90297-8. PMID 642007.
  21. Kabsch W, Sander C (May 1983). "How good are predictions of protein secondary structure?". FEBS Letters. 155 (2): 179–82. doi:10.1016/0014-5793(82)80597-8. PMID 6852232. S2CID 41477827.
  22. Simossis VA, Heringa J (Aug 2004). "Integrating protein secondary structure prediction and multiple sequence alignment". Current Protein & Peptide Science. 5 (4): 249–66. doi:10.2174/1389203043379675. PMID 15320732.
  23. Pirovano W, Heringa J (2010). "Protein Secondary Structure Prediction". Data Mining Techniques for the Life Sciences. Methods in Molecular Biology. Vol. 609. pp. 327–48. doi:10.1007/978-1-60327-241-4_19. ISBN 978-1-60327-240-7. PMID 20221928.
  24. Karplus K (2009). "SAM-T08, HMM-based protein structure prediction". Nucleic Acids Res. 37 (Web Server issue): W492–97. doi:10.1093/nar/gkp403. PMC 2703928. PMID 19483096.
  25. Pollastri G, McLysaght A (2005). "Porter: a new, accurate server for protein secondary structure prediction". Bioinformatics. 21 (8): 1719–20. doi:10.1093/bioinformatics/bti203. hdl:2262/39594. PMID 15585524.
  26. Yachdav G, Kloppmann E, Kajan L, Hecht M, Goldberg T, Hamp T, Hönigschmid P, Schafferhans A, Roos M, Bernhofer M, Richter L, Ashkenazy H, Punta M, Schlessinger A, Bromberg Y, Schneider R, Vriend G, Sander C, Ben-Tal N, Rost B (2014). "PredictProtein—an open resource for online prediction of protein structural and functional features". Nucleic Acids Res. 42 (Web Server issue): W337–43. doi:10.1093/nar/gku366. PMC 4086098. PMID 24799431.
  27. Adamczak R, Porollo A, Meller J (2005). "Combining prediction of secondary structure and solvent accessibility in proteins". Proteins. 59 (3): 467–75. doi:10.1002/prot.20441. PMID 15768403. S2CID 13267624.
  28. Kihara D (Aug 2005). "The effect of long-range interactions on the secondary structure formation of proteins". Protein Science. 14 (8): 1955–963. doi:10.1110/ps.051479505. PMC 2279307. PMID 15987894.
  29. Qi Y, Grishin NV (2005). "Structural classification of thioredoxin-like fold proteins" (PDF). Proteins. 58 (2): 376–88. CiteSeerX 10.1.1.644.8150. doi:10.1002/prot.20329. PMID 15558583. S2CID 823339. Since the fold definition should include only the core secondary structural elements that are present in the majority of homologs, we define the thioredoxin-like fold as a two-layer α/β sandwich with the βαβββα secondary-structure pattern.
  30. Abrusán G, Marsh JA (December 2016). "Alpha Helices Are More Robust to Mutations than Beta Strands". PLOS Computational Biology. 12 (12): e1005242. Bibcode:2016PLSCB..12E5242A. doi:10.1371/journal.pcbi.1005242. PMC 5147804. PMID 27935949.
  31. Rocklin GJ, Chidyausiku TM, Goreshnik I, Ford A, Houliston S, Lemak A, et al. (July 2017). "Global analysis of protein folding using massively parallel design, synthesis, and testing". Science. 357 (6347): 168–175. Bibcode:2017Sci...357..168R. doi:10.1126/science.aan0693. PMC 5568797. PMID 28706065.
