Pentapeptide repeat

Structure of the pentapeptide repeat protein HetL. [1]
Identifiers
Symbol: Pentapeptide
Pfam: PF00805
InterPro: IPR001646

Pentapeptide repeats are a family of sequence motifs found in multiple tandem copies in protein molecules. [2] [3] Pentapeptide repeat proteins are found in all species, but they occur in especially many copies in cyanobacterial genomes. The repeats were first identified by Black and colleagues in the HglK protein. [4] Later, Bateman et al. showed that a large family of related pentapeptide repeat proteins exists. [3] The function of these repeats is uncertain in most proteins. However, for the MfpA protein, a DNA gyrase inhibitor, it has been suggested that the pentapeptide repeat structure mimics the structure of DNA. [5] The repeats form a regular right-handed, four-sided beta helix structure known as the Rfr-fold.

Sequence features

Multiple sequence alignment of pentapeptide repeat proteins.
Dot plot of the HglK protein against itself showing repeats as diagonal lines.

The pentapeptide repeat is a feature of the protein sequence. It can be approximately described in the one-letter amino acid code as A(D/N)LXX, where X can be any amino acid. This repeating sequence can be seen in multiple sequence alignments and in dot plots of proteins such as HglK. The central position in the pentapeptide repeat is usually a leucine and has been designated position i. The two preceding positions are known as i-1 and i-2. Position i-2 is usually an alanine. The two subsequent positions are denoted i+1 and i+2. The side chains of positions i-2 and i point into the hydrophobic interior of the protein, while the side chains of positions i-1, i+1 and i+2 are exposed on the surface of the protein.
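
As a rough illustration of the A(D/N)LXX description above, the following Python sketch scans a sequence for windows that match the approximate consensus and reports the longest run of matches spaced exactly five residues apart. The function names and the toy sequence are hypothetical and chosen only for this example. Real pentapeptide repeats are degenerate, so a strict pattern like this would likely miss many genuine repeats; profile-based methods such as those behind the Pfam family are the usual way to detect them.

```python
import re

# Approximate consensus from the text: A(D/N)LXX, where X is any residue.
# A lookahead is used so that overlapping candidate windows are all reported.
MOTIF = re.compile(r"(?=(A[DN]L..))")


def find_pentapeptide_candidates(sequence):
    """Return (0-based start, matched 5-mer) for each window matching A(D/N)LXX."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


def longest_tandem_run(starts, period=5):
    """Return the longest run of hits whose start positions are `period` apart."""
    start_set = set(starts)
    best = 0
    for s in start_set:
        if s - period in start_set:
            continue  # s is not the first hit of its run
        length = 1
        while s + length * period in start_set:
            length += 1
        best = max(best, length)
    return best


# Made-up toy sequence (not a real protein): four consensus-like repeats with flanks.
toy = "MK" + "ADLSG" + "ANLRE" + "ADLTQ" + "ANLSY" + "GGK"
hits = find_pentapeptide_candidates(toy)
for start, pentamer in hits:
    # Position i (the central residue, usually leucine) is at start + 2.
    print(start, pentamer, "central residue:", pentamer[2])
print("longest tandem run:", longest_tandem_run([s for s, _ in hits]))
```

In this sketch the first matched residue corresponds to position i-2 and the third to position i, following the numbering used above.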

Structure

Pentapeptide repeats were initially predicted from sequence to form a right-handed beta helix with three sides. [3] The first crystal structure of a pentapeptide repeat protein was that of MfpA, solved by Hegde and colleagues. It showed that pentapeptide repeat proteins (PRPs) instead possess a four-sided beta helix structure. [5] Four repeats make up one turn of a solenoid-like structure. The structures of eight different proteins have been solved to date.

Protein                              PDB code     Length (residues)  Number of repeats  Reference
Mycobacterium tuberculosis MfpA      2BM4         183                30                 [5]
Nostoc sp. PCC 7120 HetL             3DU1         237                40                 [1]
Enterococcus faecalis EfsQnr         2W7Z         211                (not given)        [6]
Nostoc punctiforme Np275             2J8I         98                 17                 [7]
Nostoc punctiforme Np276             2J8K         75                 12                 [7]
Cyanothece sp. Rfr32                 2F3L, 2G0Y   167                21                 [8]
Cyanothece sp. Rfr23                 2O6W         174                23                 [9]
Arabidopsis thaliana At2g44920       3N90         224                25                 [10]
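
The statement that four repeats make up one turn allows a back-of-the-envelope estimate of how many helical turns each solved protein could contain. The Python sketch below is only an arithmetic illustration: it assumes that the two numbers listed for each protein are the chain length in residues and the number of repeats, that every repeat is exactly five residues long, and that all repeats take part in the helix. The names used are hypothetical, and only a few table entries are included.

```python
REPEATS_PER_TURN = 4     # from the text: four repeats make up one turn
RESIDUES_PER_REPEAT = 5  # a pentapeptide repeat is five residues long

# (length in residues, number of repeats), read from the table above.
# Treating the first number as chain length in residues is an assumption.
SOLVED = {
    "MfpA": (183, 30),
    "HetL": (237, 40),
    "Np275": (98, 17),
    "Rfr23": (174, 23),
}


def approx_turns(n_repeats):
    """Estimated number of helical turns if every repeat lies in the helix."""
    return n_repeats / REPEATS_PER_TURN


def repeat_fraction(length, n_repeats):
    """Fraction of the chain accounted for by the listed repeats."""
    return n_repeats * RESIDUES_PER_REPEAT / length


for name, (length, n_repeats) in SOLVED.items():
    print(f"{name}: ~{approx_turns(n_repeats):.2f} turns, "
          f"{repeat_fraction(length, n_repeats):.0%} of residues in repeats")
```

These figures are estimates only; the actual number of turns in each crystal structure should be taken from the cited references.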

References

  1. Ni S, Sheldrick GM, Benning MM, Kennedy MA (January 2009). "The 2 Å resolution crystal structure of HetL, a pentapeptide repeat protein involved in regulation of heterocyst differentiation in the cyanobacterium Nostoc sp. strain PCC 7120". J. Struct. Biol. 165 (1): 47–52. doi:10.1016/j.jsb.2008.09.010. PMID 18952182.
  2. Vetting MW, Hegde SS, Fajardo JE, et al. (January 2006). "Pentapeptide repeat proteins". Biochemistry. 45 (1): 1–10. doi:10.1021/bi052130w. PMC 2566302. PMID 16388575.
  3. Bateman A, Murzin AG, Teichmann SA (June 1998). "Structure and distribution of pentapeptide repeats in bacteria". Protein Sci. 7 (6): 1477–80. doi:10.1002/pro.5560070625. PMC 2144021. PMID 9655353.
  4. Black K, Buikema WJ, Haselkorn R (November 1995). "The hglK gene is required for localization of heterocyst-specific glycolipids in the cyanobacterium Anabaena sp. strain PCC 7120". J. Bacteriol. 177 (22): 6440–8. doi:10.1128/jb.177.22.6440-6448.1995. PMC 177493. PMID 7592418.
  5. Hegde SS, Vetting MW, Roderick SL, et al. (June 2005). "A fluoroquinolone resistance protein from Mycobacterium tuberculosis that mimics DNA". Science. 308 (5727): 1480–3. Bibcode:2005Sci...308.1480H. doi:10.1126/science.1110699. PMID 15933203. S2CID 20194294.
  6. Vetting MW, Hegde SS, Blanchard JS (May 2009). "Crystallization of a pentapeptide-repeat protein by reductive cyclic pentylation of free amines with glutaraldehyde". Acta Crystallogr. D. 65 (Pt 5): 462–9. doi:10.1107/S0907444909008324. PMC 2672816. PMID 19390151.
  7. Vetting MW, Hegde SS, Hazleton KZ, Blanchard JS (April 2007). "Structural characterization of the fusion of two pentapeptide repeat proteins, Np275 and Np276, from Nostoc punctiforme: resurrection of an ancestral protein". Protein Sci. 16 (4): 755–60. doi:10.1110/ps.062637707. PMC 2203339. PMID 17384236.
  8. Buchko GW, Ni S, Robinson H, Welsh EA, Pakrasi HB, Kennedy MA (November 2006). "Characterization of two potentially universal turn motifs that shape the repeated five-residues fold--crystal structure of a lumenal pentapeptide repeat protein from Cyanothece 51142". Protein Sci. 15 (11): 2579–95. doi:10.1110/ps.062407506. PMC 2242410. PMID 17075135.
  9. Buchko GW, Robinson H, Pakrasi HB, Kennedy MA (April 2008). "Insights into the structural variation between pentapeptide repeat proteins--crystal structure of Rfr23 from Cyanothece 51142". J. Struct. Biol. 162 (1): 184–92. doi:10.1016/j.jsb.2007.11.008. PMID 18158251.
  10. Ni S, McGookey ME, Tinch SL, et al. (December 2011). "The 1.7 Å resolution structure of At2g44920, a pentapeptide-repeat protein in the thylakoid lumen of Arabidopsis thaliana". Acta Crystallographica Section F. 67 (Pt 12): 1480–4. doi:10.1107/S1744309111037432. PMC 3232121. PMID 22139148.