Crystal structures of protein and nucleic acid molecules and their complexes are central to the practice of most parts of biophysics, and have shaped much of what we understand scientifically at the atomic-detail level of biology. Their importance is underlined by the United Nations declaring 2014 as the International Year of Crystallography, as the 100th anniversary of Max von Laue's 1914 Nobel prize for discovering the diffraction of X-rays by crystals. This chronological list of biophysically notable protein and nucleic acid structures is loosely based on a review in the Biophysical Journal. [1] The list includes all the first dozen distinct structures, those that broke new ground in subject or method, and those that became model systems for work in future biophysical areas of research.
1958 – Myoglobin was the very first crystal structure of a protein molecule. [2] Myoglobin cradles an iron-containing heme group that reversibly binds oxygen for use in powering muscle fibers, and those first crystals were of myoglobin from the sperm whale, whose muscles need copious oxygen storage for deep dives. The myoglobin 3-dimensional structure is made up of 8 alpha-helices, and the crystal structure showed that their conformation was right-handed and very closely matched the geometry proposed by Linus Pauling, with 3.6 residues per turn and backbone hydrogen bonds from the peptide CO of residue i to the peptide NH of residue i+4. Myoglobin is a model system for many types of biophysical studies, [3] especially involving the binding process of small ligands such as oxygen and carbon monoxide.
1960 – The hemoglobin crystal structure [4] showed a tetramer of two related chain types and was solved at much lower resolution than the monomeric myoglobin, but it clearly had the same basic 8-helix architecture (now called the "globin fold"). Further hemoglobin crystal structures at higher resolution (PDB 1MHB, 1DHB) soon showed the coupled change of both local and quaternary conformation between the oxy and deoxy states of hemoglobin, [5] which explains the cooperativity of oxygen binding in the blood and the allosteric effect of factors such as pH and DPG. For decades hemoglobin was the primary teaching example for the concept of allostery, as well as being an intensive focus of research and discussion on allostery. In 1909, hemoglobin crystals from more than 100 species were used to relate taxonomy to molecular properties. [6] That book was cited by Perutz in the 1938 report [7] of horse hemoglobin crystals that began his long saga to solve the crystal structure. Hemoglobin crystals are pleochroic (dark red in two directions and pale red in the third) [6] because of the orientation of the hemes, and the bright Soret band of the heme porphyrin groups is used in spectroscopic analysis of hemoglobin ligand binding.
1965 – Hen-egg-white lysozyme (PDB file 1lyz) [8] was the first crystal structure of an enzyme (it cleaves small carbohydrates into simple sugars), used for early studies of enzyme mechanism. [9] It contained antiparallel beta sheet as well as helices, and was also the first macromolecular structure to have its atomic coordinates refined (in real space). [10] The starting material for preparation can be bought at the grocery store, and hen-egg lysozyme crystallizes very readily in many different space groups; it is the favorite test case for new crystallographic experiments and instruments. Recent examples are nanocrystals of lysozyme for free-electron laser data collection [11] and microcrystals for micro electron diffraction. [12]
1967 – Ribonuclease A (PDB file 2RSA) [13] is an RNA-cleaving enzyme stabilized by 4 disulfide bonds. It was used in Anfinsen's seminal research on protein folding which led to the concept that a protein's 3-dimensional structure was determined by its amino-acid sequence. Ribonuclease S, the cleaved, two-component form studied by Fred Richards, was also enzymatically active, had a nearly identical crystal structure (PDB file 1RNS), [14] and was shown to be catalytically active even in the crystal, [15] helping dispel doubts about the relevance of protein crystal structures to biological function.
1967 – The serine proteases are a historically very important group of enzyme structures, because collectively they illuminated catalytic mechanism (in their case, by the Ser-His-Asp "catalytic triad"), the basis of differing substrate specificities, and the activation mechanism by which a controlled enzymatic cleavage buries the new chain end to properly rearrange the active site. [16] The early crystal structures included chymotrypsin (PDB file 2CHA), [17] chymotrypsinogen (PDB file 1CHG), [18] trypsin (PDB file 1PTN), [19] and elastase (PDB file 1EST). [20] They also were the first protein structures that showed two near-identical domains, presumably related by gene duplication. One reason for their wide use as textbook and classroom examples was the insertion-code numbering system, which made Ser195 and His57 consistent and memorable despite the protein-specific sequence differences.
1968 – Papain
1969 – Carboxypeptidase A is a zinc metalloprotease. Its crystal structure (PDB file 1CPA) [21] showed the first parallel beta structure: a large, twisted, central sheet of 8 strands with the active-site Zn located at the C-terminal end of the middle strands and the sheet flanked on both sides with alpha helices. It is an exopeptidase that cleaves peptides or proteins from the carboxy-terminal end rather than internal to the sequence. Later a small protein inhibitor of carboxypeptidase was solved (PDB file 4CPA) [22] that mechanically stops the catalysis by presenting its C-terminal end just sticking out from between a ring of disulfide bonds with tight structure behind it, preventing the enzyme from sucking in the chain past the first residue.
1969 – Subtilisin (PDB file 1sbt) [23] was a second type of serine protease with a near-identical active site to the trypsin family of enzymes, but with a completely different overall fold. This gave the first view of convergent evolution at the atomic level. Later, an intensive mutational study on subtilisin documented the effects of all 19 other amino acids at each individual position. [24]
1970 – Lactate dehydrogenase
1970 – Basic pancreatic trypsin inhibitor, or BPTI (PDB file 2pti), [25] is a small, very stable protein that has been a highly productive model system for study of super-tight binding, disulfide bond (SS) formation, protein folding, molecular stability probed by amino-acid mutations or hydrogen-deuterium exchange, and fast local dynamics by NMR. Biologically, BPTI binds and inhibits trypsin while stored in the pancreas, allowing activation of protein digestion only after trypsin is released into the small intestine.
1970 – Rubredoxin (PDB file 2rxn) [26] was the first redox structure solved, a minimalist protein with the iron bound by 4 Cys sidechains from 2 loops at the tops of β hairpins. It diffracted to 1.2 Å, enabling the first reciprocal-space refinement of a protein (PDB files 4rxn and 5rxn). [27] (Note that 4rxn was done without geometry restraints.) Archaeal rubredoxins account for many of the highest-resolution small structures in the PDB.
1971 – Insulin (PDB file 1INS) [28] is a hormone central to the metabolism of sugar and fat storage, and important in human diseases such as obesity and diabetes. It is biophysically notable for its Zn binding, its equilibrium between monomer, dimer, and hexamer states, its ability to form crystals in vivo, and its synthesis as a longer "pro" form which is then cleaved to fold up as the active 2-chain, SS-linked monomer. Insulin was a success of NASA's crystal-growth program on the Space Shuttle, producing bulk preparations of very uniform tiny crystals for controlled dosage.
1971 – Staphylococcal nuclease
1971 – Cytochrome C
1974 – T4 phage lysozyme
1974 – Immunoglobulins
1975 – Cu,Zn Superoxide dismutase
1976 – Transfer RNA
1976 – Triose phosphate isomerase
Proteolysis is the breakdown of proteins into smaller polypeptides or amino acids. Uncatalysed, the hydrolysis of peptide bonds is extremely slow, taking hundreds of years. Proteolysis is typically catalysed by cellular enzymes called proteases, but may also occur by intra-molecular digestion.
Sir John Cowdery Kendrew was an English biochemist, crystallographer, and science administrator. Kendrew shared the 1962 Nobel Prize in Chemistry with Max Perutz, for their work at the Cavendish Laboratory to investigate the structure of haem-containing proteins.
Lysozyme is an antimicrobial enzyme produced by animals that forms part of the innate immune system. It is a glycoside hydrolase that catalyzes the following process:
Serine proteases are enzymes that cleave peptide bonds in proteins. Serine serves as the nucleophilic amino acid at the (enzyme's) active site. They are found ubiquitously in both eukaryotes and prokaryotes. Serine proteases fall into two broad categories based on their structure: chymotrypsin-like (trypsin-like) or subtilisin-like.
The Structural Classification of Proteins (SCOP) database is a largely manual classification of protein structural domains based on similarities of their structures and amino acid sequences. A motivation for this classification is to determine the evolutionary relationship between proteins. Proteins with the same shapes but having little sequence or functional similarity are placed in different superfamilies, and are assumed to have only a very distant common ancestor. Proteins having the same shape and some similarity of sequence and/or function are placed in "families", and are assumed to have a closer common ancestor.
In biochemistry, a Ramachandran plot, originally developed in 1963 by G. N. Ramachandran, C. Ramakrishnan, and V. Sasisekharan, is a way to visualize energetically allowed regions for backbone dihedral angles ψ against φ of amino acid residues in protein structure. The figure on the left illustrates the definition of the φ and ψ backbone dihedral angles. The ω angle at the peptide bond is normally 180°, since the partial-double-bond character keeps the peptide bond planar. The figure in the top right shows the allowed φ,ψ backbone conformational regions from the Ramachandran et al. 1963 and 1968 hard-sphere calculations: full radius in solid outline, reduced radius in dashed, and relaxed tau (N-Cα-C) angle in dotted lines. Because dihedral angle values are circular and 0° is the same as 360°, the edges of the Ramachandran plot "wrap" right-to-left and bottom-to-top. For instance, the small strip of allowed values along the lower-left edge of the plot is a continuation of the large, extended-chain region at upper left.
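The wrap-around of the plot edges can be illustrated with a short Python sketch. It is only an illustration, not part of any published validation tool: the function names are hypothetical, the choice of the interval [-180°, 180°) for the axes is an assumed plotting convention, and the sample φ,ψ values are made up to sit on either side of the top and bottom edges.

```python
import math


def wrap_angle(deg: float) -> float:
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def angular_difference(a: float, b: float) -> float:
    """Smallest signed rotation in degrees that takes angle b to angle a."""
    return wrap_angle(a - b)


def phi_psi_distance(p: tuple, q: tuple) -> float:
    """Distance between two (phi, psi) points, measured across the plot edges."""
    dphi = angular_difference(p[0], q[0])
    dpsi = angular_difference(p[1], q[1])
    return math.hypot(dphi, dpsi)


# Two illustrative extended-chain conformations: one just below the top
# edge of the plot (psi = 179) and one just above the bottom edge (psi = -179).
top_edge = (-120.0, 179.0)
bottom_edge = (-120.0, -179.0)

naive = abs(top_edge[1] - bottom_edge[1])            # ignores wrapping
wrapped = phi_psi_distance(top_edge, bottom_edge)    # respects wrapping
print(naive, wrapped)
```

Treating the axes as ordinary straight lines would put the two points 358° apart in ψ, whereas accounting for circularity shows they differ by only 2°, which is why the strip along the bottom edge belongs to the same region as the area at the top.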
Robert Huber is a German biochemist and Nobel laureate, known for his work crystallizing an intramembrane protein important in photosynthesis and subsequently applying X-ray crystallography to elucidate the protein's structure.
A catalytic triad is a set of three coordinated amino acids that can be found in the active site of some enzymes. Catalytic triads are most commonly found in hydrolase and transferase enzymes. An acid-base-nucleophile triad is a common motif for generating a nucleophilic residue for covalent catalysis. The residues form a charge-relay network to polarise and activate the nucleophile, which attacks the substrate, forming a covalent intermediate which is then hydrolysed to release the product and regenerate free enzyme. The nucleophile is most commonly a serine or cysteine amino acid, but occasionally threonine or even selenocysteine. The 3D structure of the enzyme brings together the triad residues in a precise orientation, even though they may be far apart in the sequence.
Barnase (a portmanteau of "BActerial" "RiboNucleASE") is a bacterial protein that consists of 110 amino acids and has ribonuclease activity. It is synthesized and secreted by the bacterium Bacillus amyloliquefaciens, but is lethal to the cell when expressed without its inhibitor barstar. The inhibitor binds to and occludes the ribonuclease active site, preventing barnase from damaging the cell's RNA after it has been synthesized but before it has been secreted. The barnase/barstar complex is noted for its extraordinarily tight protein-protein binding, with an on-rate of 10⁸ s⁻¹ M⁻¹.
Subtilisin is a protease initially obtained from Bacillus subtilis.
Frederic Middlebrook Richards, commonly referred to as Fred Richards, was an American biochemist and biophysicist known for solving the pioneering crystal structure of the ribonuclease S enzyme in 1967 and for defining the concept of solvent-accessible surface. He contributed many key experimental and theoretical results and developed new methods, garnering over 20,000 journal citations in several quite distinct research areas. In addition to the protein crystallography and biochemistry of ribonuclease S, these included solvent accessibility and internal packing of proteins, the first side-chain rotamer library, high-pressure crystallography, new types of chemical tags such as biotin/avidin, the nuclear magnetic resonance (NMR) chemical shift index, and structural and biophysical characterization of the effects of mutations.
Dame Louise Napier Johnson was a British biochemist and protein crystallographer. She was David Phillips Professor of Molecular Biophysics at the University of Oxford from 1990 to 2007, and later an emeritus professor.
Ribonuclease pancreatic is an enzyme that in humans is encoded by the RNASE1 gene.
Subtilases are a family of subtilisin-like serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like in the trypsin serine proteases. The structure of proteins in this family shows that they have an alpha/beta fold containing a 7-stranded parallel beta sheet.
In molecular biology, SSI (Streptomyces subtilisin inhibitor) is a subtilisin-inhibitor-like protein domain that acts as a protease inhibitor. Such inhibitors are often synthesised as part of a larger precursor protein, as a prepropeptide. The function of this protein domain is to prevent access of the substrate to the active site. It is found only in bacteria.
Streptogrisin B is an enzyme. This enzyme catalyses the following chemical reaction
A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred. Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similarity is evident. Sequence homology can then be deduced even if not apparent. Superfamilies typically contain several protein families which show sequence similarity within each family. The term protein clan is commonly used for protease and glycosyl hydrolase superfamilies based on the MEROPS and CAZy classification systems.
Macromolecular structure validation is the process of evaluating the reliability of 3-dimensional atomic models of large biological molecules such as proteins and nucleic acids. These models, which provide 3D coordinates for each atom in the molecule, come from structural biology experiments such as X-ray crystallography or nuclear magnetic resonance (NMR). The validation has three aspects: 1) checking on the validity of the thousands to millions of measurements in the experiment; 2) checking how consistent the atomic model is with those experimental data; and 3) checking consistency of the model with known physical and chemical properties.
Cryogenic electron microscopy (cryo-EM) is a cryomicroscopy technique applied to samples cooled to cryogenic temperatures. For biological specimens, the structure is preserved by embedding in an environment of vitreous ice. An aqueous sample solution is applied to a grid-mesh and plunge-frozen in liquid ethane or a mixture of liquid ethane and propane. While development of the technique began in the 1970s, recent advances in detector technology and software algorithms have allowed for the determination of biomolecular structures at near-atomic resolution. This has attracted wide attention to the approach as an alternative to X-ray crystallography or NMR spectroscopy for macromolecular structure determination without the need for crystallization.
George N. Phillips, Jr. is a biochemist, researcher, and academic. He is the Ralph and Dorothy Looney Professor of Biochemistry and Cell Biology at Rice University, where he also serves as Associate Dean for Research at the Wiess School of Natural Sciences and as a professor of chemistry. Additionally, he holds the title of professor emeritus of biochemistry at the University of Wisconsin-Madison.