Pentapeptide repeat

Available protein structures:
Pfam	structures / ECOD
PDB	RCSB PDB; PDBe; PDBj
PDBsum	structure summary
PDB	2BM4, 2BM5, 2BM6, 2BM7, 2W7Z, 3DU1, 2O6W, 2J8K, 2J8I

Pentapeptide repeat
	Structure of the pentapeptide repeat protein HetL.
Identifiers
Symbol	Pentapeptide
Pfam	PF00805
InterPro	IPR001646

Last updated December 10, 2020

Pentapeptide repeats are a family of sequence motifs found in multiple tandem copies in protein molecules.^[2]^[3] Pentapeptide repeat proteins are found in all species, but they occur in particularly many copies in cyanobacterial genomes. The repeats were first identified by Black and colleagues in the HglK protein.^[4] Bateman et al. later showed that a large family of related pentapeptide repeat proteins exists.^[3] The function of these repeats is uncertain in most proteins. However, for MfpA, a DNA gyrase inhibitor, it has been suggested that the pentapeptide repeat structure mimics the structure of DNA.^[5] The repeats form a regular right-handed, four-sided beta helix structure known as the Rfr-fold.

Sequence features

The pentapeptide repeat is a feature of protein sequence. It can be approximately described using the one-letter amino acid code as A(D/N)LXX, where X can be any amino acid. This repeating sequence can be seen in multiple sequence alignments and dot plots of proteins such as HglK. The central position of the repeat is usually a leucine and is designated position i. The two preceding positions are known as i-1 and i-2, and position i-2 is usually an alanine. The two following positions are denoted i+1 and i+2. The side chains of positions i-2 and i point into the hydrophobic interior of the protein, while the side chains of positions i-1, i+1 and i+2 are exposed on the protein surface.
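The approximate consensus above can be turned into a simple pattern search. The following is a minimal sketch only: the function names and the toy sequence are hypothetical and not taken from any cited work, and a strict pattern like this will miss the many degenerate repeats that real proteins contain.

```python
import re

# Approximate consensus A(D/N)LXX, written as a regular expression.
# Real repeats are degenerate, so this strict pattern is illustrative only.
MOTIF = re.compile(r"A[DN]L[A-Z]{2}")

# Position names used in the text, in sequence order.
POSITION_LABELS = ["i-2", "i-1", "i", "i+1", "i+2"]

# Positions whose side chains point into the hydrophobic interior.
INTERIOR = {"i-2", "i"}


def find_repeats(seq):
    """Return start indices of non-overlapping matches to the consensus."""
    return [m.start() for m in MOTIF.finditer(seq)]


def label_positions(pentapeptide):
    """Map each residue of one five-residue repeat to its position label."""
    if len(pentapeptide) != 5:
        raise ValueError("a pentapeptide repeat has exactly five residues")
    return dict(zip(POSITION_LABELS, pentapeptide))


if __name__ == "__main__":
    toy = "ADLSGANLRGADLTE"  # hypothetical sequence, not a real protein
    for start in find_repeats(toy):
        labels = label_positions(toy[start:start + 5])
        buried = [labels[p] for p in POSITION_LABELS if p in INTERIOR]
        print(start, labels, "interior:", buried)
```

In practice, repeat families of this kind are usually detected with profile-based methods rather than a fixed pattern, since most positions tolerate substitutions.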

Structure

Pentapeptide repeats were initially predicted from sequence to form a right-handed beta helix with three sides.^[3] The first crystal structure of a pentapeptide repeat protein, that of MfpA, was solved by Hegde and colleagues. It showed that pentapeptide repeat proteins (PRPs) instead possess a four-sided beta helix structure.^[5] Four repeats make up one turn of the solenoid-like structure. The structures of eight different proteins have been solved to date.

Protein	PDB code	Length	Number of repeats	Reference
Mycobacterium tuberculosis MfpA	PDB: 2BM4	183	30	^[5]
Nostoc sp. PCC 7120 HetL	PDB: 3DU1	237	40	^[1]
Enterococcus faecalis EfsQnr	PDB: 2W7Z	211		^[6]
Nostoc punctiforme Np275	PDB: 2J8I	98	17	^[7]
Nostoc punctiforme Np276	PDB: 2J8K	75	12	^[7]
Cyanothece sp. Rfr32	PDB: 2F3L, 2G0Y	167	21	^[8]
Cyanothece sp. Rfr23	PDB: 2O6W	174	23	^[9]
Arabidopsis thaliana At2g44920	PDB: 3N90	224	25	^[10]
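
Because four repeats make up one turn of the helix, the repeat counts in the table give a rough estimate of how many turns each protein contains. The sketch below is an idealized calculation only: it assumes every repeat contributes to a regular turn, which real structures need not satisfy, and the function name is hypothetical. EfsQnr is omitted because the table gives no repeat count for it.

```python
# Four pentapeptide repeats form one turn of the four-sided beta helix.
REPEATS_PER_TURN = 4
RESIDUES_PER_REPEAT = 5

# Repeat counts copied from the table above (EfsQnr has no count listed).
REPEAT_COUNTS = {
    "MfpA": 30,
    "HetL": 40,
    "Np275": 17,
    "Np276": 12,
    "Rfr32": 21,
    "Rfr23": 23,
    "At2g44920": 25,
}


def approx_turns(n_repeats):
    """Idealized number of helical turns for a given number of repeats."""
    return n_repeats / REPEATS_PER_TURN


if __name__ == "__main__":
    residues_per_turn = REPEATS_PER_TURN * RESIDUES_PER_REPEAT
    print("residues per idealized turn:", residues_per_turn)
    for name, count in REPEAT_COUNTS.items():
        print(f"{name}: {count} repeats, about {approx_turns(count):.2f} turns")
```

For MfpA, for example, 30 repeats correspond to 30 / 4 = 7.5 idealized turns of 20 residues each.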

References

1. Ni S, Sheldrick GM, Benning MM, Kennedy MA (January 2009). "The 2 Å resolution crystal structure of HetL, a pentapeptide repeat protein involved in regulation of heterocyst differentiation in the cyanobacterium Nostoc sp. strain PCC 7120". J. Struct. Biol. 165 (1): 47–52. doi:10.1016/j.jsb.2008.09.010. PMID 18952182.
2. Vetting MW, Hegde SS, Fajardo JE, et al. (January 2006). "Pentapeptide repeat proteins". Biochemistry. 45 (1): 1–10. doi:10.1021/bi052130w. PMC 2566302. PMID 16388575.
3. Bateman A, Murzin AG, Teichmann SA (June 1998). "Structure and distribution of pentapeptide repeats in bacteria". Protein Sci. 7 (6): 1477–80. doi:10.1002/pro.5560070625. PMC 2144021. PMID 9655353.
4. Black K, Buikema WJ, Haselkorn R (November 1995). "The hglK gene is required for localization of heterocyst-specific glycolipids in the cyanobacterium Anabaena sp. strain PCC 7120". J. Bacteriol. 177 (22): 6440–8. doi:10.1128/jb.177.22.6440-6448.1995. PMC 177493. PMID 7592418.
5. Hegde SS, Vetting MW, Roderick SL, et al. (June 2005). "A fluoroquinolone resistance protein from Mycobacterium tuberculosis that mimics DNA". Science. 308 (5727): 1480–3. Bibcode:2005Sci...308.1480H. doi:10.1126/science.1110699. PMID 15933203. S2CID 20194294.
6. Vetting MW, Hegde SS, Blanchard JS (May 2009). "Crystallization of a pentapeptide-repeat protein by reductive cyclic pentylation of free amines with glutaraldehyde". Acta Crystallogr. D. 65 (Pt 5): 462–9. doi:10.1107/S0907444909008324. PMC 2672816. PMID 19390151.
7. Vetting MW, Hegde SS, Hazleton KZ, Blanchard JS (April 2007). "Structural characterization of the fusion of two pentapeptide repeat proteins, Np275 and Np276, from Nostoc punctiforme: resurrection of an ancestral protein". Protein Sci. 16 (4): 755–60. doi:10.1110/ps.062637707. PMC 2203339. PMID 17384236.
8. Buchko GW, Ni S, Robinson H, Welsh EA, Pakrasi HB, Kennedy MA (November 2006). "Characterization of two potentially universal turn motifs that shape the repeated five-residues fold--crystal structure of a lumenal pentapeptide repeat protein from Cyanothece 51142". Protein Sci. 15 (11): 2579–95. doi:10.1110/ps.062407506. PMC 2242410. PMID 17075135.
9. Buchko GW, Robinson H, Pakrasi HB, Kennedy MA (April 2008). "Insights into the structural variation between pentapeptide repeat proteins--crystal structure of Rfr23 from Cyanothece 51142". J. Struct. Biol. 162 (1): 184–92. doi:10.1016/j.jsb.2007.11.008. PMID 18158251.
10. Ni S, McGookey ME, Tinch SL, et al. (December 2011). "The 1.7 Å resolution structure of At2g44920, a pentapeptide-repeat protein in the thylakoid lumen of Arabidopsis thaliana". Acta Crystallographica Section F. 67 (Pt 12): 1480–4. doi:10.1107/S1744309111037432. PMC 3232121. PMID 22139148.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
