Protein tertiary structure

This diagram of protein structure uses PCNA as an example (PDB: 1AXC).
The tertiary structure of a protein is the way a polypeptide folds into a complex molecular shape. This is caused by R-group interactions such as ionic and hydrogen bonds, disulphide bridges, and hydrophobic and hydrophilic interactions.

Protein tertiary structure is the three-dimensional shape of a protein. The tertiary structure has a single polypeptide chain "backbone" carrying one or more protein secondary structures, which are arranged into one or more protein domains. Amino acid side chains and the backbone may interact and bond in a number of ways. The interactions and bonds of side chains within a particular protein determine its tertiary structure. The protein tertiary structure is defined by its atomic coordinates. These coordinates may refer either to a protein domain or to the entire tertiary structure. [1] [2] A number of these structures may bind to each other, forming a quaternary structure. [3]
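
Because a tertiary structure is defined by its atomic coordinates, a structure file can be reduced to a list of points in space. The following is only an illustrative sketch: it reads alpha-carbon positions from ATOM records using the fixed-column layout of the PDB file format. The names `CaAtom`, `parse_ca_atoms` and `make_atom_line` are invented for this example, and the sample records are hypothetical values, not data from 1AXC.

```python
from typing import List, NamedTuple


class CaAtom(NamedTuple):
    chain: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_ca_atoms(pdb_text: str) -> List[CaAtom]:
    """Collect alpha-carbon coordinates from the ATOM records of PDB-format text."""
    atoms = []
    for line in pdb_text.splitlines():
        # PDB records are fixed-width; comments give the 1-based columns.
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":      # columns 13-16: atom name
            continue
        atoms.append(CaAtom(
            chain=line[21],                  # column 22: chain identifier
            res_name=line[17:20].strip(),    # columns 18-20: residue name
            res_seq=int(line[22:26]),        # columns 23-26: residue number
            x=float(line[30:38]),            # columns 31-38: x in angstroms
            y=float(line[38:46]),            # columns 39-46: y in angstroms
            z=float(line[46:54]),            # columns 47-54: z in angstroms
        ))
    return atoms


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format one hypothetical ATOM record with the standard column widths."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical coordinates, used only to exercise the parser.
sample = "\n".join([
    make_atom_line(1, "N", "MET", "A", 1, 11.104, 6.134, -6.504),
    make_atom_line(2, "CA", "MET", "A", 1, 11.639, 6.071, -5.147),
    make_atom_line(9, "CA", "PHE", "A", 2, 13.559, 8.682, -3.401),
])

for atom in parse_ca_atoms(sample):
    print(atom)
```

Keeping only the alpha carbons gives one point per residue, which is a common simplification when comparing overall folds; a full description of the tertiary structure would keep every atom.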

History

The science of the tertiary structure of proteins has progressed from one of hypothesis to one of detailed definition. Although Emil Fischer had suggested proteins were made of polypeptide chains and amino acid side chains, it was Dorothy Maud Wrinch who incorporated geometry into the prediction of protein structures. Wrinch demonstrated this with the Cyclol model, the first prediction of the structure of a globular protein. [4] Contemporary methods can determine tertiary structures, without prediction, to within 5 Å (0.5 nm) for small proteins (<120 residues) and, under favorable conditions, can give confident secondary structure predictions.

Determinants

Stability of native states

Thermostability

A protein folded into its native state or native conformation typically has a lower Gibbs free energy (a combination of enthalpy and entropy) than the unfolded conformation. A protein will tend towards low-energy conformations, which will determine the protein's fold in the cellular environment. Because many similar conformations will have similar energies, protein structures are dynamic, fluctuating between these similar structures.
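
The link between free energy and the population of conformations can be made concrete with a two-state (folded/unfolded) approximation. The sketch below assumes that simplification, uses ΔG = ΔH − TΔS, and weights the two states by their Boltzmann factors. The function names and the thermodynamic values are hypothetical and chosen only for illustration; real proteins populate many more than two states.

```python
import math

R = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def gibbs_free_energy(delta_h: float, delta_s: float, temperature: float) -> float:
    """Return dG = dH - T*dS in kJ/mol (dS in kJ/(mol*K), T in kelvin)."""
    return delta_h - temperature * delta_s


def folded_fraction(delta_g_folding: float, temperature: float) -> float:
    """Two-state Boltzmann population of the folded state.

    delta_g_folding is G(folded) - G(unfolded); negative values favour folding.
    """
    return 1.0 / (1.0 + math.exp(delta_g_folding / (R * temperature)))


# Hypothetical folding enthalpy and entropy, not measured values.
DELTA_H = -200.0   # kJ/mol
DELTA_S = -0.6     # kJ/(mol*K)

for temperature in (298.15, 320.0, 350.0):
    dg = gibbs_free_energy(DELTA_H, DELTA_S, temperature)
    print(f"T = {temperature:6.2f} K  dG = {dg:7.2f} kJ/mol  "
          f"folded fraction = {folded_fraction(dg, temperature):.4f}")
```

With these assumed values the folded state dominates at low temperature and loses out at high temperature, because the entropy term grows with T until it outweighs the favourable enthalpy.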

Globular proteins have a core of hydrophobic amino acid residues and a surface region of water-exposed, charged, hydrophilic residues. This arrangement may stabilize interactions within the tertiary structure. For example, in secreted proteins, which are not bathed in cytoplasm, disulfide bonds between cysteine residues help to maintain the tertiary structure. There is a commonality of stable tertiary structures seen in proteins of diverse function and diverse evolution. For example, the TIM barrel, named for the enzyme triosephosphate isomerase, is a common tertiary structure, as is the highly stable, dimeric, coiled coil structure. Hence, proteins may be classified by the structures they hold. Databases of proteins which use such a classification include SCOP and CATH.
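
The tendency of hydrophobic residues to sit in the core and hydrophilic residues on the surface is often explored from sequence alone with a hydropathy profile. The sketch below averages the Kyte–Doolittle hydropathy scale over a sliding window; the function name, window size and example sequence are assumptions made for this illustration. A high score suggests a segment that is more likely to be buried, but it does not by itself establish where a residue sits in the folded structure.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 5) -> list:
    """Mean hydropathy of each window of consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical sequence: a charged stretch followed by a hydrophobic stretch.
example = "KKDEKRLIVAIVLF"
for start, score in enumerate(hydropathy_profile(example), start=1):
    print(f"window starting at residue {start:2d}: {score:+.2f}")
```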

Kinetic traps

Folding kinetics may trap a protein in a high-energy conformation, i.e. a high-energy intermediate conformation blocks access to the lowest-energy conformation. The high-energy conformation may contribute to the function of the protein. For example, the influenza hemagglutinin protein is a single polypeptide chain which, when activated, is proteolytically cleaved to form two polypeptide chains. The two chains are held in a high-energy conformation. When the local pH drops, the protein undergoes an energetically favorable conformational rearrangement that enables it to penetrate the host cell membrane.
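
The idea of a kinetic trap can be illustrated with a toy one-dimensional energy landscape. This is not a model of any real protein: the energy function, step size and starting points below are all invented for the example. A search that can only move downhill settles in whichever minimum lies on its side of the barrier, even when a lower-energy minimum exists elsewhere.

```python
def energy(x: float) -> float:
    """Toy tilted double-well landscape with two minima of different depth."""
    return x**4 - 2.0 * x**2 + 0.5 * x


def gradient(x: float) -> float:
    """Derivative of the toy energy function."""
    return 4.0 * x**3 - 4.0 * x + 0.5


def descend(x: float, step: float = 0.01, n_steps: int = 5000) -> float:
    """Follow the energy downhill only; the barrier can never be crossed."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


trapped = descend(1.5)    # starts on the side of the shallower minimum
lowest = descend(-1.5)    # starts on the side of the deeper minimum

print(f"trapped state: x = {trapped:.3f}, energy = {energy(trapped):.3f}")
print(f"lowest state:  x = {lowest:.3f}, energy = {energy(lowest):.3f}")
```

In this picture, a trigger such as the pH drop described for hemagglutinin would correspond to reshaping the landscape so that the barrier no longer holds the system in the higher-energy well.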

Metastability

Some tertiary protein structures may exist in long-lived states that are not the expected most stable state. For example, many serpins (serine protease inhibitors) show this metastability. They undergo a conformational change when a loop of the protein is cut by a protease. [5] [6] [7]

Chaperone proteins

It is commonly assumed that the native state of a protein is also the most thermodynamically stable and that a protein will reach its native state, given its chemical kinetics, before it is translated. Protein chaperones within the cytoplasm of a cell assist a newly synthesised polypeptide to attain its native state. Some chaperone proteins are highly specific in their function, for example, protein disulfide isomerase; others are general in their function and may assist most globular proteins, for example, the prokaryotic GroEL/GroES system of proteins and the homologous eukaryotic heat shock proteins (the Hsp60/Hsp10 system).

Cytoplasmic environment

Prediction of protein tertiary structure relies on knowing the protein's primary structure and comparing the possible predicted tertiary structure with known tertiary structures in protein data banks. This only takes into account the cytoplasmic environment present at the time of protein synthesis to the extent that a similar cytoplasmic environment may also have influenced the structure of the proteins recorded in the protein data bank.

Ligand binding

The structure of a protein, such as an enzyme, may change upon binding of its natural ligands, for example a cofactor. In this case, the structure of the protein bound to the ligand is known as the holo structure, while the unbound protein has an apo structure. [8]
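
One common way to quantify how much a structure changes between its apo and holo forms is the root-mean-square deviation (RMSD) between matching atoms. The sketch below assumes the two coordinate lists contain the same atoms in the same order and have already been superimposed; in practice an optimal superposition step (for example the Kabsch algorithm) would come first. The function name and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD in the input units between two pre-aligned, equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical alpha-carbon positions (angstroms) for the same three residues.
apo = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
holo = [(0.0, 0.0, 0.0), (3.8, 0.5, 0.0), (7.0, 2.0, 0.0)]

print(f"apo/holo RMSD: {rmsd(apo, holo):.2f} angstroms")
```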

- The structure is stabilized by the formation of weak bonds between amino acid side chains.
- It is determined by the folding of the polypeptide chain on itself (nonpolar residues are located inside the protein, while polar residues are mainly located outside).
- Folding of the protein brings together amino acids located in distant regions of the sequence.
- Acquisition of the tertiary structure leads to the formation of pockets and sites suitable for the recognition and binding of specific molecules (biospecificity).

Determination

The knowledge of the tertiary structure of soluble globular proteins is more advanced than that of membrane proteins because the former are easier to study with available technology.

X-ray crystallography

X-ray crystallography is the most common tool used to determine protein structure. It provides a high-resolution view of the structure, but it does not give information about the protein's conformational flexibility.

NMR

Protein NMR gives comparatively lower resolution of protein structure. It is limited to smaller proteins. However, it can provide information about conformational changes of a protein in solution.

Cryogenic electron microscopy

Cryogenic electron microscopy (cryo-EM) can give information about both a protein's tertiary and quaternary structure. It is particularly well-suited to large proteins and symmetrical complexes of protein subunits.

Dual polarisation interferometry

Dual polarisation interferometry provides complementary information about surface-captured proteins. It assists in determining structure and conformational changes over time.

Projects

Prediction algorithm

The Folding@home project at the University of Pennsylvania is a distributed computing research effort which uses approximately 5 petaFLOPS (≈10 x86 petaFLOPS) of available computing. It aims to find an algorithm which will consistently predict protein tertiary and quaternary structures given the protein's amino acid sequence and its cellular conditions. [9] [10]

A list of software for protein tertiary structure prediction can be found at List of protein structure prediction software.

Protein aggregation diseases

Protein aggregation diseases such as Alzheimer's disease and Huntington's disease, and prion diseases such as bovine spongiform encephalopathy, can be better understood by constructing (and reconstructing) disease models. This is done by causing the disease in laboratory animals, for example by administering a toxin, such as MPTP to cause Parkinson's disease, or through genetic manipulation. [11] [12] Protein structure prediction is a new way to create disease models, which may avoid the use of animals. [13]

Protein Tertiary Structure Retrieval Project (CoMOGrad)

Matching patterns in the tertiary structure of a given protein against a huge number of known protein tertiary structures, and retrieving the most similar ones in ranked order, is at the heart of many research areas such as function prediction of novel proteins, the study of evolution, disease diagnosis, drug discovery and antibody design. The CoMOGrad project at BUET is a research effort to devise an extremely fast and highly precise method for protein tertiary structure retrieval and to develop an online tool based on the research outcome. [14] [15]
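
The general shape of ranked structure retrieval can be sketched as follows: reduce each structure to a fixed-length feature vector, then sort the known structures by their distance to the query vector. The descriptor used here, a normalised histogram of pairwise alpha-carbon distances, is a simple stand-in chosen for brevity and is not the CoMOGrad feature. All names, bin settings and coordinates are hypothetical.

```python
import math
from itertools import combinations


def distance_histogram(ca_coords, bin_width=4.0, n_bins=10):
    """Normalised histogram of pairwise C-alpha distances (a stand-in descriptor)."""
    counts = [0] * n_bins
    for a, b in combinations(ca_coords, 2):
        d = math.dist(a, b)
        # Distances beyond the last bin edge are counted in the last bin.
        counts[min(int(d // bin_width), n_bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def rank_by_similarity(query, database):
    """Return database names sorted from most to least similar to the query."""
    q = distance_histogram(query)
    scored = [
        (math.dist(q, distance_histogram(coords)), name)
        for name, coords in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical C-alpha traces: an extended chain and a compact cluster.
query = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
database = {
    "compact": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    "extended": [(1.0, 1.0, 1.0), (4.8, 1.0, 1.0), (8.6, 1.0, 1.0), (12.4, 1.0, 1.0)],
}

print(rank_by_similarity(query, database))
```

Because the descriptor depends only on internal distances, it is unchanged by translating or rotating a structure, so no superposition is needed before comparing vectors. That property is one reason fixed-length descriptors suit fast searches over large collections.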

References

  1. IUPAC, Compendium of Chemical Terminology, 2nd ed. (the "Gold Book") (1997). Online corrected version: (2006) "tertiary structure". doi:10.1351/goldbook.T06282
  2. Branden C. and Tooze J. "Introduction to Protein Structure" Garland Publishing, New York. 1990 and 1991.
  3. Kyte, J. "Structure in Protein Chemistry." Garland Publishing, New York. 1995. ISBN 0-8153-1701-8
  4. Senechal M. "I died for beauty: Dorothy Wrinch and the cultures of science." Oxford University Press, 2012. Chapter 14. ISBN 0-19-991083-9, 9780199910830. Accessed at Google Books 8 December 2013.
  5. Whisstock J (2006). "Molecular gymnastics: serpiginous structure, folding and scaffolding". Current Opinion in Structural Biology. 16 (6): 761–68. doi:10.1016/j.sbi.2006.10.005. PMID 17079131.
  6. Gettins PG (2002). "Serpin structure, mechanism, and function". Chem Rev. 102 (12): 4751–804. doi:10.1021/cr010170. PMID 12475206.
  7. Whisstock JC, Skinner R, Carrell RW, Lesk AM (2000). "Conformational changes in serpins: I. The native and cleaved conformations of alpha(1)-anti-trypsin". J Mol Biol. 296 (2): 685–99. doi:10.1006/jmbi.1999.3520. PMID 10669617.
  8. Seeliger, D; De Groot, B. L. (2010). "Conformational transitions upon ligand binding: Holo-structure prediction from apo conformations". PLOS Computational Biology. 6 (1): e1000634. Bibcode:2010PLSCB...6E0634S. doi:10.1371/journal.pcbi.1000634. PMC 2796265. PMID 20066034.
  9. "Folding@home – Fighting disease with a world wide distributed super computer". Retrieved 2024-04-23.
  10. "Bowman Lab – University of Pennsylvania". Retrieved 2024-04-23.
  11. Schober A (October 2004). "Classic toxin-induced animal models of Parkinson's disease: 6-OHDA and MPTP". Cell Tissue Res. 318 (1): 215–24. doi:10.1007/s00441-004-0938-y. PMID 15503155. S2CID 1824912.
  12. "Tp53 Knockout Rat". Cancer. Retrieved 2010-12-18.
  13. "Feature – What is Folding and Why Does it Matter?". Archived from the original on December 12, 2013. Retrieved December 18, 2010.
  14. "Comograd :: Protein Tertiary Matching".
  15. Karim, Rezaul; Aziz, Mohd Momin Al; Shatabda, Swakkhar; Rahman, M. Sohel; Mia, Md Abul Kashem; Zaman, Farhana; Rakin, Salman (21 August 2015). "CoMOGrad and PHOG: From Computer Vision to Fast and Accurate Protein Tertiary Structure Retrieval". Scientific Reports. 5 (1): 13275. arXiv:1409.0814. Bibcode:2015NatSR...513275K. doi:10.1038/srep13275. PMC 4543952. PMID 26293226.