Protein fold class

A summary of functional annotation of the most ancestral translation protein folds

In molecular biology, protein fold classes are broad categories of protein tertiary structure topology. They describe groups of proteins that share similar amino acid and secondary structure proportions. Each class contains multiple, independent protein superfamilies (i.e. its members are not necessarily evolutionarily related to one another). [1] [2] [3]

Generally recognised classes

Four large classes of protein are generally agreed upon by the two main structure classification databases (SCOP and CATH).

all-α

All-α proteins are a class of structural domains in which the secondary structure is composed entirely of α-helices, with the possible exception of a few isolated β-sheets on the periphery.

Common examples include the bromodomain, the globin fold and the homeodomain fold.

all-β

All-β proteins are a class of structural domains in which the secondary structure is composed entirely of β-sheets, with the possible exception of a few isolated α-helices on the periphery.

Common examples include the SH3 domain, the beta-propeller domain, the immunoglobulin fold and the B3 DNA-binding domain.
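
The all-α and all-β definitions above both come down to which element dominates the secondary structure. A minimal sketch of that check, assuming a per-residue secondary structure string in which 'H' marks helix residues and 'E' marks strand residues (a DSSP-like convention), with any other character treated as coil. The function names and the 10% tolerance for the minority element are illustrative assumptions, not values taken from SCOP or CATH:

```python
def count_elements(ss):
    """Count helix ('H') and strand ('E') residues in a per-residue string."""
    return ss.count("H"), ss.count("E")


def single_type_class(ss, tolerance=0.10):
    """Return 'all-alpha', 'all-beta', or None if neither label fits.

    `tolerance` is the largest share of regular secondary structure that the
    minority element may occupy. The default of 0.10 is an assumption made
    for illustration only.
    """
    helix, strand = count_elements(ss)
    regular = helix + strand
    if regular == 0:
        return None  # no helices or strands at all
    if strand / regular <= tolerance:
        return "all-alpha"
    if helix / regular <= tolerance:
        return "all-beta"
    return None  # mixed: a candidate for the alpha+beta or alpha/beta classes


# Hypothetical strings, not taken from real structures.
print(single_type_class("--HHHHHHHHHH---HHHHHHHHHHHH--"))
print(single_type_class("-EEEEEE--EEEEEEE---EEEEE-"))
print(single_type_class("-EEEE--HHHHHHHH--EEEE--HHHHHHH-"))
```

A result of None for a mixed string only says that the domain is neither all-α nor all-β; it does not decide between the two mixed classes.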

α+β

α+β proteins are a class of structural domains in which the secondary structure is composed of α-helices and β-strands that occur separately along the backbone. The β-strands are therefore mostly antiparallel. [4]

Common examples include the ferredoxin fold, ribonuclease A, and the SH2 domain.

α/β

α/β proteins are a class of structural domains in which the secondary structure is composed of alternating α-helices and β-strands along the backbone. The β-strands are therefore mostly parallel. [4]

Common examples include the flavodoxin fold, the TIM barrel and leucine-rich-repeat (LRR) proteins such as ribonuclease inhibitor.
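
The α+β and α/β definitions both tie the class to strand orientation: mostly antiparallel strands in the first case, mostly parallel in the second. A rough sketch of that rule, assuming the orientation of each pair of neighbouring strands has already been extracted by some upstream tool. The function name `mixed_class`, the input format and the simple majority vote are all assumptions made for illustration; the curated databases rely on expert inspection of the full topology rather than a count like this:

```python
from collections import Counter


def mixed_class(pair_orientations):
    """Label a mixed helix/strand domain from its strand-pair orientations.

    `pair_orientations` holds one entry per pair of neighbouring strands in
    the sheet, each either "parallel" or "antiparallel" (hypothetical input
    format).
    """
    counts = Counter(pair_orientations)
    unknown = set(counts) - {"parallel", "antiparallel"}
    if unknown:
        raise ValueError(f"unexpected orientation labels: {sorted(unknown)}")
    if not counts:
        return None  # no paired strands, nothing to decide
    if counts["parallel"] > counts["antiparallel"]:
        return "alpha/beta"
    if counts["antiparallel"] > counts["parallel"]:
        return "alpha+beta"
    return None  # a tie is left unclassified


# Hypothetical inputs: a fully parallel sheet and a mostly antiparallel one.
print(mixed_class(["parallel"] * 8))
print(mixed_class(["antiparallel", "antiparallel", "antiparallel", "parallel"]))
```

Working from strand-pair orientation rather than from the order of helices and strands in the sequence keeps the sketch close to the wording of the definitions, since some α+β folds also show helices and strands alternating along the backbone.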

Additional classes

Membrane proteins

Membrane proteins interact with biological membranes either by inserting into them or by being tethered via a covalently attached lipid. They are one of the common types of protein along with soluble globular proteins, fibrous proteins, and disordered proteins. [5] They are targets of over 50% of all modern medicinal drugs. [6] It is estimated that 20–30% of all genes in most genomes encode membrane proteins. [7]

Intrinsically disordered proteins

Intrinsically disordered proteins (IDPs) lack a fixed or ordered three-dimensional structure. [8] [9] [10] IDPs cover a spectrum of states from fully unstructured to partially structured and include random coils, (pre-)molten globules, and large multi-domain proteins connected by flexible linkers. They constitute one of the main types of protein (alongside globular, fibrous and membrane proteins). [5]

Coiled coil proteins

Coiled coil proteins form long, insoluble fibers involved in the extracellular matrix. There are many scleroprotein superfamilies including keratin, collagen, elastin, and fibroin. The roles of such proteins include protection and support, forming connective tissue, tendons, bone matrices, and muscle fiber.

Small proteins

Small proteins typically have a tertiary structure that is maintained by disulphide bridges (cysteine-rich proteins), metal ligands (metal-binding proteins), and/or cofactors such as heme.

Designed proteins

Numerous protein structures are the result of rational design and do not exist in nature. Proteins can be designed from scratch (de novo design) or by making calculated variations on a known protein structure and its sequence (known as protein redesign). Rational protein design approaches predict protein sequences that will fold into specific structures. These predicted sequences can then be validated experimentally through methods such as peptide synthesis, site-directed mutagenesis, or artificial gene synthesis.

References

  1. Hubbard, Tim J. P.; Murzin, Alexey G.; Brenner, Steven E.; Chothia, Cyrus (1997-01-01). "SCOP: a Structural Classification of Proteins database". Nucleic Acids Research. 25 (1): 236–239. doi:10.1093/nar/25.1.236. ISSN 0305-1048. PMC 146380. PMID 9016544.
  2. Greene, Lesley H.; Lewis, Tony E.; Addou, Sarah; Cuff, Alison; Dallman, Tim; Dibley, Mark; Redfern, Oliver; Pearl, Frances; Nambudiry, Rekha (2007-01-01). "The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution". Nucleic Acids Research. 35 (suppl 1): D291–D297. doi:10.1093/nar/gkl959. ISSN 0305-1048. PMC 1751535. PMID 17135200.
  3. Fox, Naomi K.; Brenner, Steven E.; Chandonia, John-Marc (2014-01-01). "SCOPe: Structural Classification of Proteins—extended, integrating SCOP and ASTRAL data and classification of new structures". Nucleic Acids Research. 42 (D1): D304–D309. doi:10.1093/nar/gkt1240. ISSN 0305-1048. PMC 3965108. PMID 24304899.
  4. Efimov, Alexander V. (1995). "Structural Similarity between Two-layer α/β and β-Proteins". Journal of Molecular Biology. 245 (4): 402–415. doi:10.1006/jmbi.1994.0033. PMID 7837272.
  5. Andreeva, A (2014). "SCOP2 prototype: a new approach to protein structure mining". Nucleic Acids Res. 42 (Database issue): D310–4. doi:10.1093/nar/gkt1242. PMC 3964979. PMID 24293656.
  6. Overington JP, Al-Lazikani B, Hopkins AL (December 2006). "How many drug targets are there?". Nat Rev Drug Discov. 5 (12): 993–6. doi:10.1038/nrd2199. PMID 17139284. S2CID 11979420.
  7. Krogh, A.; Larsson, B. R.; Von Heijne, G.; Sonnhammer, E. L. L. (2001). "Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes". Journal of Molecular Biology. 305 (3): 567–580. doi:10.1006/jmbi.2000.4315. PMID 11152613. S2CID 15769874.
  8. Dunker, A. K.; Lawson, J. D.; Brown, C. J.; Williams, R. M.; Romero, P; Oh, J. S.; Oldfield, C. J.; Campen, A. M.; Ratliff, C. M.; Hipps, K. W.; Ausio, J; Nissen, M. S.; Reeves, R; Kang, C; Kissinger, C. R.; Bailey, R. W.; Griswold, M. D.; Chiu, W; Garner, E. C.; Obradovic, Z (2001). "Intrinsically disordered protein". Journal of Molecular Graphics & Modelling. 19 (1): 26–59. CiteSeerX 10.1.1.113.556. doi:10.1016/s1093-3263(00)00138-8. PMID 11381529.
  9. Dyson HJ, Wright PE (March 2005). "Intrinsically unstructured proteins and their functions". Nat. Rev. Mol. Cell Biol. 6 (3): 197–208. doi:10.1038/nrm1589. PMID 15738986. S2CID 18068406.
  10. Dunker AK, Silman I, Uversky VN, Sussman JL (December 2008). "Function and structure of inherently disordered proteins". Curr. Opin. Struct. Biol. 18 (6): 756–64. doi:10.1016/j.sbi.2008.10.002. PMID 18952168.