Conformational ensembles

This movie depicts the 3-D structures of each of the representative conformations of the Markov State Model of Pin1 WW domain.

In computational chemistry, conformational ensembles, also known as structural ensembles, are experimentally constrained computational models describing the structure of intrinsically unstructured proteins. [1] [2] Such proteins are flexible in nature, lacking a stable tertiary structure, and therefore cannot be described with a single structural representation. [3] The techniques of ensemble calculation are relatively new in the field of structural biology and still face limitations that must be addressed before they become comparable to classical structural description methods such as biological macromolecular crystallography. [4]

Purpose

Ensembles are models consisting of a set of conformations that together attempt to describe the structure of a flexible protein. Even though the degree of conformational freedom is extremely high, flexible or disordered proteins generally differ from fully random coil structures. [5] [6] The main purpose of these models is to gain insight into the function of the flexible protein, extending the structure-function paradigm from folded proteins to intrinsically disordered proteins.

Calculation techniques

The calculation of ensembles relies on experimental measurements, mostly from nuclear magnetic resonance (NMR) spectroscopy and small-angle X-ray scattering (SAXS). These measurements yield short- and long-range structural information.

Short-range

Long-range

Constrained molecular dynamics simulations

The structure of disordered proteins may be approximated by running constrained molecular dynamics (MD) simulations in which the conformational sampling is biased by experimentally derived constraints. [7]
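
The effect of an experimental constraint on sampling can be illustrated with a one-dimensional toy model. The sketch below is not a molecular dynamics engine and does not reproduce the method of any cited study: it assumes a hypothetical quadratic "force field" with its minimum at d = 1, adds a harmonic restraint towards an assumed experimental value `d_exp`, and relaxes the coordinate by simple steepest descent instead of integrating equations of motion. All function names and numbers are illustrative.

```python
def physical_force(d):
    """Force from a toy 'force field' with energy 0.5*(d - 1)^2, minimum at d = 1."""
    return -(d - 1.0)


def restraint_force(d, d_exp, k):
    """Force from the harmonic restraint energy 0.5*k*(d - d_exp)^2."""
    return -k * (d - d_exp)


def relax(d0, d_exp, k, step=0.01, n_steps=5000):
    """Follow the total force downhill until the coordinate settles."""
    d = d0
    for _ in range(n_steps):
        d += step * (physical_force(d) + restraint_force(d, d_exp, k))
    return d


# Hypothetical experimental value of 3.0 for the coordinate d.
unrestrained = relax(d0=0.0, d_exp=3.0, k=0.0)
restrained = relax(d0=0.0, d_exp=3.0, k=10.0)
print(f"unrestrained: {unrestrained:.3f}, restrained: {restrained:.3f}")
```

With the restraint switched off (k = 0) the coordinate settles at the force-field minimum, d = 1. With k = 10 it settles at (1 + k·d_exp)/(1 + k) ≈ 2.82, a compromise between the force field and the experimental value. In this toy model the force constant k sets how strongly the experimental data pull the coordinate away from the force-field minimum.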

Fitting experimental data

Another approach uses selection algorithms such as ENSEMBLE and ASTEROIDS. [8] [9] These procedures first generate a pool of random conformers (the initial pool) large enough to sample the conformational space sufficiently. The selection algorithm then chooses a smaller set of conformers (an ensemble) from the initial pool. Experimental parameters (NMR/SAXS) are calculated, usually with theoretical prediction methods, for each conformer of the chosen ensemble and averaged over the ensemble. The difference between these calculated parameters and the measured experimental parameters defines an error function, and the algorithm selects the final ensemble that minimises it.
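
The select-average-compare loop described above can be sketched in a few lines. This is a minimal illustration only, not the actual ENSEMBLE or ASTEROIDS algorithm: it assumes each conformer is already reduced to a list of predicted observables, uses a plain sum of squared differences as the error function, and searches with a simple random-swap scheme that accepts a swap only when the error does not increase. The pool and the "experimental" values are synthetic.

```python
import random


def ensemble_average(pool, members):
    """Average each predicted observable over the selected conformers."""
    n_obs = len(pool[0])
    return [sum(pool[i][j] for i in members) / len(members) for j in range(n_obs)]


def error(pool, members, experimental):
    """Sum of squared differences between ensemble averages and experiment."""
    avg = ensemble_average(pool, members)
    return sum((a - e) ** 2 for a, e in zip(avg, experimental))


def select_ensemble(pool, experimental, size, n_iter=2000, seed=0):
    """Random-swap search: replace one member at a time, keep non-worsening swaps."""
    rng = random.Random(seed)
    members = rng.sample(range(len(pool)), size)
    best = error(pool, members, experimental)
    for _ in range(n_iter):
        candidate = rng.randrange(len(pool))
        if candidate in members:
            continue  # keep the ensemble free of duplicate conformers
        trial = list(members)
        trial[rng.randrange(size)] = candidate
        trial_error = error(pool, trial, experimental)
        if trial_error <= best:
            members, best = trial, trial_error
    return sorted(members), best


# Synthetic pool: 50 conformers, each with 3 predicted observables.
data_rng = random.Random(42)
pool = [[data_rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(50)]
# Synthetic "experimental" data: the average over a hidden five-member ensemble.
experimental = ensemble_average(pool, [0, 1, 2, 3, 4])

members, best = select_ensemble(pool, experimental, size=5)
print("selected conformers:", members, "error:", best)
```

The selected ensemble need not coincide with the hidden one that generated the synthetic data; any set whose averages match the target gives a comparably low error.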

Limitations

The determination of a structural ensemble for an intrinsically disordered protein (IDP) from NMR/SAXS experimental parameters involves generating structures that agree with the parameters and determining their respective weights in the ensemble. Usually, the available experimental data are few compared with the number of variables to be determined, which makes the problem under-determined. For this reason, several structurally very different ensembles may describe the experimental data equally well, and there are currently no exact methods to discriminate between ensembles of equally good fit. This problem has to be solved either by bringing in more experimental data or by improving the prediction methods with more rigorous computational approaches.
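
A small numerical example makes the under-determination concrete. The values below are hypothetical: three conformers with assumed radii of gyration of 10, 20 and 30 Å, and a single measured ensemble average of 20 Å. One observable cannot fix three weights, so very different weightings fit the measurement equally well.

```python
def weighted_average(values, weights):
    """Ensemble average of one observable under the given conformer weights."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical per-conformer radius of gyration (in angstroms).
rg = [10.0, 20.0, 30.0]
experimental_rg = 20.0

# Three structurally different weightings of the same three conformers.
candidates = {
    "middle conformer only": [0.0, 1.0, 0.0],
    "two extremes, equal weight": [0.5, 0.0, 0.5],
    "uniform": [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0],
}

for name, weights in candidates.items():
    deviation = abs(weighted_average(rg, weights) - experimental_rg)
    print(f"{name}: deviation from experiment = {deviation:.2e}")
```

All three weightings reproduce the measured average to within floating-point rounding, although the first describes a single compact state and the second a mixture of two very different states. Additional independent observables would be needed to tell them apart.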

References

  1. Fisher CK, Stultz CM (June 2011). "Constructing ensembles for intrinsically disordered proteins". Current Opinion in Structural Biology. 21 (3): 426–31. doi:10.1016/j.sbi.2011.04.001. hdl:1721.1/99137. PMC 3112268. PMID 21530234.
  2. Varadi M, Kosol S, Lebrun P, Valentini E, Blackledge M, Dunker AK, Felli IC, Forman-Kay JD, Kriwacki RW, Pierattelli R, Sussman J, Svergun DI, Uversky VN, Vendruscolo M, Wishart D, Wright PE, Tompa P (January 2014). "pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins". Nucleic Acids Research. 42 (Database issue): D326–35. doi:10.1093/nar/gkt960. PMC 3964940. PMID 24174539.
  3. Dyson HJ, Wright PE (March 2005). "Intrinsically unstructured proteins and their functions". Nature Reviews Molecular Cell Biology. 6 (3): 197–208. doi:10.1038/nrm1589. PMID 15738986. S2CID 18068406.
  4. Tompa P (June 2011). "Unstructural biology coming of age". Current Opinion in Structural Biology. 21 (3): 419–25. doi:10.1016/j.sbi.2011.03.012. PMID 21514142.
  5. Communie G, Habchi J, Yabukarski F, Blocquel D, Schneider R, Tarbouriech N, Papageorgiou N, Ruigrok RW, Jamin M, Jensen MR, Longhi S, Blackledge M (2013). "Atomic resolution description of the interaction between the nucleoprotein and phosphoprotein of Hendra virus". PLOS Pathogens. 9 (9): e1003631. doi:10.1371/journal.ppat.1003631. PMC 3784471. PMID 24086133.
  6. Kurzbach D, Platzer G, Schwarz TC, Henen MA, Konrat R, Hinderberger D (August 2013). "Cooperative unfolding of compact conformations of the intrinsically disordered protein osteopontin". Biochemistry. 52 (31): 5167–75. doi:10.1021/bi400502c. PMC 3737600. PMID 23848319.
  7. Allison JR, Varnai P, Dobson CM, Vendruscolo M (December 2009). "Determination of the free energy landscape of alpha-synuclein using spin label nuclear magnetic resonance measurements". Journal of the American Chemical Society. 131 (51): 18314–26. doi:10.1021/ja904716h. PMID 20028147.
  8. Krzeminski M, Marsh JA, Neale C, Choy WY, Forman-Kay JD (February 2013). "Characterization of disordered proteins with ENSEMBLE". Bioinformatics. 29 (3): 398–9. doi:10.1093/bioinformatics/bts701. PMID 23233655.
  9. Jensen MR, Salmon L, Nodet G, Blackledge M (February 2010). "Defining conformational ensembles of intrinsically disordered and partially folded proteins directly from chemical shifts". Journal of the American Chemical Society. 132 (4): 1270–2. doi:10.1021/ja909973n. PMID 20063887.