In chemical thermodynamics, conformational entropy is the entropy associated with the number of conformations of a molecule. The concept is most commonly applied to biological macromolecules such as proteins and RNA, but also be used for polysaccharides and other molecules. To calculate the conformational entropy, the possible conformations of the molecule may first be discretized into a finite number of states, usually characterized by unique combinations of certain structural parameters, each of which has been assigned an energy. In proteins, backbone dihedral angles and side chain rotamers are commonly used as parameters, and in RNA the base pairing pattern may be used. These characteristics are used to define the degrees of freedom (in the statistical mechanics sense of a possible "microstate"). The conformational entropy associated with a particular structure or state, such as an alpha-helix, a folded or an unfolded protein structure, is then dependent on the probability of the occupancy of that structure.
The entropy of heterogeneous random coil or denatured proteins is significantly higher than that of the tertiary structure of its folded native state. In particular, the conformational entropy of the amino acid side chains in a protein is thought to be a major contributor to the energetic stabilization of the denatured state and thus a barrier to protein folding. [1] However, a recent study has shown that side-chain conformational entropy can stabilize native structures among alternative compact structures. [2] The conformational entropy of RNA and proteins can be estimated; for example, empirical methods to estimate the loss of conformational entropy in a particular side chain on incorporation into a folded protein can roughly predict the effects of particular point mutations in a protein. Side-chain conformational entropies can be defined as Boltzmann sampling over all possible rotameric states: [3]
where R is the gas constant and pi is the probability of a residue being in rotamer i. [3]
The limited conformational range of proline residues lowers the conformational entropy of the denatured state and thus stabilizes the native states. A correlation has been observed between the thermostability of a protein and its proline residue content. [4]
The alpha helix (α-helix) is a common motif in the secondary structure of proteins and is a right hand-helix conformation in which every backbone N−H group hydrogen bonds to the backbone C=O group of the amino acid located four residues earlier along the protein sequence.
Protein tertiary structure is the three dimensional shape of a protein. The tertiary structure will have a single polypeptide chain "backbone" with one or more protein secondary structures, the protein domains. Amino acid side chains may interact and bond in a number of ways. The interactions and bonds of side chains within a particular protein determine its tertiary structure. The protein tertiary structure is defined by its atomic coordinates. These coordinates may refer either to a protein domain or to the entire tertiary structure. A number of tertiary structures may fold into a quaternary structure.
Proline (symbol Pro or P) is an organic acid classed as a proteinogenic amino acid (used in the biosynthesis of proteins), although it does not contain the amino group -NH
2 but is rather a secondary amine. The secondary amine nitrogen is in the protonated form (NH2+) under biological conditions, while the carboxyl group is in the deprotonated −COO− form. The "side chain" from the α carbon connects to the nitrogen forming a pyrrolidine loop, classifying it as a aliphatic amino acid. It is non-essential in humans, meaning the body can synthesize it from the non-essential amino acid L-glutamate. It is encoded by all the codons starting with CC (CCU, CCC, CCA, and CCG).
Protein folding is the physical process by which a protein chain is translated to its native three-dimensional structure, typically a "folded" conformation by which the protein becomes biologically functional. Via an expeditious and reproducible process, a polypeptide folds into its characteristic three-dimensional structure from a random coil. Each protein exists first as an unfolded polypeptide or random coil after being translated from a sequence of mRNA to a linear chain of amino acids. At this stage the polypeptide lacks any stable (long-lasting) three-dimensional structure. As the polypeptide chain is being synthesized by a ribosome, the linear chain begins to fold into its three-dimensional structure.
In molecular biology, the term molten globule (MG) refers to protein states that are more or less compact, but are lacking the specific tight packing of amino acid residues which creates the solid state-like tertiary structure of completely folded proteins. It was found, for example, in cytochrome c, which conserves a native-like secondary structure content but without the tightly packed protein interior, under low pH and high salt concentration. For cytochrome c and some other proteins, it has been shown that the molten globule state is a "thermodynamic state" clearly different both from the native and the denatured state, demonstrating for the first time the existence of a third equilibirum state.
Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; and it is important in medicine and biotechnology.
In polymer chemistry, a random coil is a conformation of polymers where the monomer subunits are oriented randomly while still being bonded to adjacent units. It is not one specific shape, but a statistical distribution of shapes for all the chains in a population of macromolecules. The conformation's name is derived from the idea that, in the absence of specific, stabilizing interactions, a polymer backbone will "sample" all possible conformations randomly. Many unbranched, linear homopolymers — in solution, or above their melting temperatures — assume (approximate) random coils.
Protein design is the rational design of new protein molecules to design novel activity, behavior, or purpose, and to advance basic understanding of protein function. Proteins can be designed from scratch or by making calculated variants of a known protein structure and its sequence. Rational protein design approaches make protein-sequence predictions that will fold to specific structures. These predicted sequences can then be validated experimentally through methods such as peptide synthesis, site-directed mutagenesis, or artificial gene synthesis.
In computational biology, protein pKa calculations are used to estimate the pKa values of amino acids as they exist within proteins. These calculations complement the pKa values reported for amino acids in their free state, and are used frequently within the fields of molecular modeling, structural bioinformatics, and computational biology.
Bovine pancreatic ribonuclease, also often referred to as bovine pancreatic ribonuclease A or simply RNase A, is a pancreatic ribonuclease enzyme that cleaves single-stranded RNA. Bovine pancreatic ribonuclease is one of the classic model systems of protein science. Two Nobel Prizes in Chemistry have been awarded in recognition of work on bovine pancreatic ribonuclease: in 1972, the Prize was awarded to Christian Anfinsen for his work on protein folding and to Stanford Moore and William Stein for their work on the relationship between the protein's structure and its chemical mechanism; in 1984, the Prize was awarded to Robert Bruce Merrifield for development of chemical synthesis of proteins.
The self-consistent mean field (SCMF) method is an adaptation of mean field theory used in protein structure prediction to determine the optimal amino acid side chain packing given a fixed protein backbone. It is faster but less accurate than dead-end elimination and is generally used in situations where the protein of interest is too large for the problem to be tractable by DEE.
The beta hairpin is a simple protein structural motif involving two beta strands that look like a hairpin. The motif consists of two strands that are adjacent in primary structure, oriented in an antiparallel direction, and linked by a short loop of two to five amino acids. Beta hairpins can occur in isolation or as part of a series of hydrogen bonded strands that collectively comprise a beta sheet.
Prolyl isomerase is an enzyme found in both prokaryotes and eukaryotes that interconverts the cis and trans isomers of peptide bonds with the amino acid proline. Proline has an unusually conformationally restrained peptide bond due to its cyclic structure with its side chain bonded to its secondary amine nitrogen. Most amino acids have a strong energetic preference for the trans peptide bond conformation due to steric hindrance, but proline's unusual structure stabilizes the cis form so that both isomers are populated under biologically relevant conditions. Proteins with prolyl isomerase activity include cyclophilin, FKBPs, and parvulin, although larger proteins can also contain prolyl isomerase domains.
In biochemistry, equilibrium unfolding is the process of unfolding a protein or RNA molecule by gradually changing its environment, such as by changing the temperature or pressure, pH, adding chemical denaturants, or applying force as with an atomic force microscope tip. If the equilibrium was maintained at all steps, the process theoretically should be reversible during equilibrium folding. Equilibrium unfolding can be used to determine the thermodynamic stability of the protein or RNA structure, i.e. free energy difference between the folded and unfolded states.
The folding funnel hypothesis is a specific version of the energy landscape theory of protein folding, which assumes that a protein's native state corresponds to its free energy minimum under the solution conditions usually encountered in cells. Although energy landscapes may be "rough", with many non-native local minima in which partially folded proteins can become trapped, the folding funnel hypothesis assumes that the native state is a deep free energy minimum with steep walls, corresponding to a single well-defined tertiary structure. The term was introduced by Ken A. Dill in a 1987 article discussing the stabilities of globular proteins.
In computational biology, de novo protein structure prediction refers to an algorithmic process by which protein tertiary structure is predicted from its amino acid primary sequence. The problem itself has occupied leading scientists for decades while still remaining unsolved. According to Science, the problem remains one of the top 125 outstanding issues in modern science. At present, some of the most successful methods have a reasonable probability of predicting the folds of small, single-domain proteins within 1.5 angstroms over the entire structure.
FoldX is a protein design algorithm that uses an empirical force field. It can determine the energetic effect of point mutations as well as the interaction energy of protein complexes. FoldX can mutate protein and DNA side chains using a probability-based rotamer library, while exploring alternative conformations of the surrounding side chains.
Protein backbone fragment libraries have been used successfully in a variety of structural biology applications, including homology modeling, de novo structure prediction, and structure determination. By reducing the complexity of the search space, these fragment libraries enable more rapid search of conformational space, leading to more efficient and accurate models.
Graphical models have become powerful frameworks for protein structure prediction, protein–protein interaction, and free energy calculations for protein structures. Using a graphical model to represent the protein structure allows the solution of many problems including secondary structure prediction, protein-protein interactions, protein-drug interaction, and free energy calculations.
In biochemistry, a backbone-dependent rotamer library provides the frequencies, mean dihedral angles, and standard deviations of the discrete conformations of the amino acid side chains in proteins as a function of the backbone dihedral angles φ and ψ of the Ramachandran map. By contrast, backbone-independent rotamer libraries express the frequencies and mean dihedral angles for all side chains in proteins, regardless of the backbone conformation of each residue type. Backbone-dependent rotamer libraries have been shown to have significant advantages over backbone-independent rotamer libraries, principally when used as an energy term, by speeding up search times of side-chain packing algorithms used in protein structure prediction and protein design.