In molecular biology, proteins are generally thought to adopt unique structures determined by their amino acid sequences. However, proteins are not strictly static objects, but rather populate ensembles of (sometimes similar) conformations. Transitions between these states occur on a variety of length scales (tenths of angstroms to nm) and time scales (ns to s), and have been linked to functionally relevant phenomena such as allosteric signaling [1] and enzyme catalysis. [2] [3]
The study of protein dynamics is most directly concerned with the transitions between these states, but can also involve the nature and equilibrium populations of the states themselves. These two perspectives—kinetics and thermodynamics, respectively—can be conceptually synthesized in an "energy landscape" paradigm: [4] highly populated states and the kinetics of transitions between them can be described by the depths of energy wells and the heights of energy barriers, respectively.
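The link between well depths, barrier heights, populations, and rates can be illustrated with a minimal two-state sketch. The free energies below are hypothetical values chosen for illustration, and the rate expression is the Eyring (transition state theory) formula with a transmission coefficient of 1. That is a simplifying assumption; conformational transitions in proteins are often better described by diffusive (Kramers-type) models.

```python
import math

R = 8.314462618e-3          # gas constant in kJ/(mol*K)
KB_OVER_H = 2.083661912e10  # Boltzmann constant / Planck constant in 1/(s*K)

def boltzmann_populations(free_energies_kj_mol, temperature_k=298.15):
    """Equilibrium populations of states from their free energies (well depths)."""
    rt = R * temperature_k
    g_min = min(free_energies_kj_mol)
    # Shift by the lowest free energy for numerical stability.
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]

def eyring_rate(barrier_kj_mol, temperature_k=298.15):
    """Rate constant (1/s) for crossing a barrier of the given height,
    assuming transition state theory with transmission coefficient 1."""
    rt = R * temperature_k
    return KB_OVER_H * temperature_k * math.exp(-barrier_kj_mol / rt)

# Hypothetical landscape: state A at 0, state B at 5, barrier top at 50 kJ/mol.
g_a, g_b, g_ts = 0.0, 5.0, 50.0
pop_a, pop_b = boltzmann_populations([g_a, g_b])
k_ab = eyring_rate(g_ts - g_a)  # A -> B crosses the full barrier
k_ba = eyring_rate(g_ts - g_b)  # B -> A starts from a shallower well

print(f"populations: A={pop_a:.3f}, B={pop_b:.3f}")
print(f"rates: k_AB={k_ab:.3e} 1/s, k_BA={k_ba:.3e} 1/s")
```

In this sketch the ratio of the forward and reverse rates equals the ratio of the equilibrium populations (detailed balance), which is the sense in which the landscape picture unifies kinetics and thermodynamics.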
Portions of protein structures often deviate from the equilibrium state. Some such excursions are harmonic, such as stochastic fluctuations of chemical bonds and bond angles. Others are anharmonic, such as sidechains that jump between separate discrete energy minima, or rotamers. [5]
Evidence for local flexibility is often obtained from NMR spectroscopy. Flexible and potentially disordered regions of a protein can be detected using the random coil index. Flexibility in folded proteins can be identified by analyzing the spin relaxation of individual atoms in the protein. Flexibility can also be observed in very high-resolution electron density maps produced by X-ray crystallography, [6] particularly when diffraction data is collected at room temperature instead of the traditional cryogenic temperature (typically near 100 K). [7] Information on the frequency distribution and dynamics of local protein flexibility can be obtained using Raman and optical Kerr-effect spectroscopy [8] as well as anisotropic microspectroscopy [9] in the terahertz frequency domain.
Many residues are in close spatial proximity in protein structures. This is true for most residues that are contiguous in the primary sequence, but also for many that are distal in sequence yet are brought into contact in the final folded structure. Because of this proximity, these residues' energy landscapes become coupled through various biophysical interactions such as hydrogen bonds, ionic bonds, and van der Waals interactions (see figure).
Transitions between states for such sets of residues therefore become correlated. [10]
This is perhaps most obvious for surface-exposed loops, which often shift collectively to adopt different conformations in different crystal structures (see figure). However, coupled conformational heterogeneity is also sometimes evident in secondary structure. [11] For example, consecutive residues and residues offset by 4 in the primary sequence often interact in α helices. Also, residues offset by 2 in the primary sequence point their sidechains toward the same face of β sheets and are close enough to interact sterically, as are residues on adjacent strands of the same β sheet. Some of these conformational changes are induced by post-translational modifications in protein structure, such as phosphorylation and methylation. [11] [12]
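One common way to quantify correlated motion between residues is a normalized cross-correlation matrix of their positional fluctuations over an ensemble of conformations. The sketch below uses synthetic coordinates rather than real structural data: residues 0 and 1 share a common displacement (a stand-in for a coupled loop), while residues 2 and 3 fluctuate independently. The function name and the array layout are assumptions made for illustration.

```python
import numpy as np

def cross_correlation(coords):
    """Normalized cross-correlation of positional fluctuations.

    coords: array of shape (n_frames, n_residues, 3), one position per residue.
    Returns an (n_residues, n_residues) matrix with entries in [-1, 1].
    """
    disp = coords - coords.mean(axis=0)  # fluctuation about the mean position
    # Average dot product of displacement vectors for every residue pair.
    cov = np.einsum("fid,fjd->ij", disp, disp) / coords.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

# Synthetic ensemble (hypothetical data, not derived from any real protein).
rng = np.random.default_rng(0)
n_frames = 2000
coords = rng.normal(scale=0.3, size=(n_frames, 4, 3))  # independent noise
shared = rng.normal(size=(n_frames, 3))                # collective motion
coords[:, 0] += shared
coords[:, 1] += shared

corr = cross_correlation(coords)
print(np.round(corr, 2))
```

Entries near 1 indicate residues that move together, entries near 0 indicate independent motion, and negative entries indicate anticorrelated motion. For real ensembles, the structures would first need to be superimposed to remove overall rotation and translation.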
When these coupled residues form pathways linking functionally important parts of a protein, they may participate in allosteric signaling. For example, when a molecule of oxygen binds to one subunit of the hemoglobin tetramer, that information is allosterically propagated to the other three subunits, thereby enhancing their affinity for oxygen. In this case, the coupled flexibility in hemoglobin allows for cooperative oxygen binding, which is physiologically useful because it allows rapid oxygen loading in lung tissue and rapid oxygen unloading in oxygen-deprived tissues (e.g. muscle).
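The physiological benefit of cooperative binding can be illustrated with the Hill equation, a phenomenological model that is not part of the description above. The parameter values are approximate textbook figures used as assumptions: a half-saturation pressure of about 26 mmHg, a Hill coefficient of about 2.8 for hemoglobin, and oxygen partial pressures of about 100 mmHg in the lungs and 20 mmHg in working muscle.

```python
def hill_saturation(p_o2, p50=26.0, n=2.8):
    """Fractional oxygen saturation from the Hill equation.

    p_o2: oxygen partial pressure (mmHg)
    p50:  pressure at half saturation (mmHg), assumed value
    n:    Hill coefficient; n = 1 corresponds to non-cooperative binding
    """
    return p_o2**n / (p50**n + p_o2**n)

P_LUNG, P_MUSCLE = 100.0, 20.0  # assumed partial pressures in mmHg

# Fraction of binding sites that release oxygen between lung and muscle.
delivered_cooperative = hill_saturation(P_LUNG) - hill_saturation(P_MUSCLE)
delivered_independent = (hill_saturation(P_LUNG, n=1.0)
                         - hill_saturation(P_MUSCLE, n=1.0))

print(f"cooperative (n=2.8): {delivered_cooperative:.2f}")
print(f"non-cooperative (n=1): {delivered_independent:.2f}")
```

Under these assumptions the cooperative curve loads more oxygen in the lungs and releases more in muscle than a non-cooperative binder with the same half-saturation pressure.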
The presence of multiple domains in proteins gives rise to a great deal of flexibility and mobility, leading to protein domain dynamics. [1] Domain motions can be inferred by comparing different structures of a protein (as in Database of Molecular Motions), or they can be directly observed using spectra [13] [2] measured by neutron spin echo spectroscopy. They can also be suggested by sampling in extensive molecular dynamics trajectories [14] and principal component analysis. [15] Domain motions are important for:
One of the largest observed domain motions is the "swivelling" mechanism in pyruvate phosphate dikinase. The phosphohistidine domain swivels between two states in order to bring a phosphate group from the active site of the nucleotide-binding domain to that of the phosphoenolpyruvate/pyruvate domain. [24] The phosphate group is moved over a distance of 45 Å, involving a domain motion of about 100 degrees around a single residue. In enzymes, the closure of one domain onto another captures a substrate by an induced fit, allowing the reaction to take place in a controlled way. A detailed analysis by Gerstein led to the classification of two basic types of domain motion: hinge and shear. [21] Only a relatively small portion of the chain, namely the inter-domain linker and side chains, undergoes significant conformational changes upon domain rearrangement. [25]
A study by Hayward [26] found that the termini of α-helices and β-sheets form hinges in a large number of cases. Many hinges were found to involve two secondary structure elements acting like the hinges of a door, allowing an opening and closing motion to occur. This can arise when two neighbouring strands within a β-sheet situated in one domain diverge as they join the other domain. The two resulting termini then form the bending regions between the two domains. α-helices that preserve their hydrogen bonding network when bent are found to behave as mechanical hinges, storing "elastic energy" that drives the closure of domains for rapid capture of a substrate. [26] Khade et al. worked on the prediction of hinges [27] in any conformation and further built an elastic network model called hdANM [28] that can model those motions.
The interconversion of helical and extended conformations at the site of a domain boundary is not uncommon. In calmodulin, torsion angles change for five residues in the middle of a domain-linking α-helix. The helix is split into two smaller, almost perpendicular helices separated by four residues of an extended strand. [29] [30]
Shear motions involve a small sliding movement of domain interfaces, controlled by the amino acid side chains within the interface. Proteins displaying shear motions often have a layered architecture in which secondary structures are stacked. The interdomain linker merely keeps the domains in close proximity. [citation needed]
The analysis of the internal dynamics of structurally different but functionally similar enzymes has highlighted a common relationship between the positioning of the active site and the two principal protein sub-domains. In fact, for several members of the hydrolase superfamily, the catalytic site is located close to the interface separating the two principal quasi-rigid domains. [14] Such positioning appears instrumental for maintaining the precise geometry of the active site, while allowing for an appreciable, functionally oriented modulation of the flanking regions resulting from the relative motion of the two sub-domains. [citation needed]
Evidence suggests that protein dynamics are important for function, e.g. enzyme catalysis in dihydrofolate reductase (DHFR), and they are also posited to facilitate the acquisition of new functions by molecular evolution. [31] This argument suggests that proteins have evolved to have stable, mostly unique folded structures, but that the unavoidable residual flexibility leads to some degree of functional promiscuity, which can be amplified, harnessed, or diverted by subsequent mutations. [citation needed] Research on promiscuous proteins within the BCL-2 family revealed that nanosecond-scale protein dynamics can play a crucial role in protein binding behaviour and thus promiscuity. [32]
However, there is growing awareness that intrinsically unstructured proteins are quite prevalent in eukaryotic genomes, [33] casting further doubt on the simplest interpretation of Anfinsen's dogma: "sequence determines structure (singular)". In effect, the new paradigm is characterized by the addition of two caveats: "sequence and cellular environment determine structural ensemble".