Circular permutation in proteins

Schematic representation of a circular permutation in two proteins. The first protein (outer circle) has the sequence a-b-c. After the permutation the second protein (inner circle) has the sequence c-a-b. The letters N and C indicate the location of the amino- and carboxy-termini of the protein sequences and how their positions change relative to each other.

A circular permutation is a relationship between proteins whereby the proteins have a changed order of amino acids in their peptide sequence. The result is a protein structure with different connectivity, but overall similar three-dimensional (3D) shape. In 1979, the first pair of circularly permuted proteins, concanavalin A and the lectin favin, was discovered; over 2000 such proteins are now known.
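The relationship described above can be illustrated on plain strings. The following is a minimal sketch, assuming each letter stands for a sequence segment as in the schematic (a-b-c becoming c-a-b); the function names are hypothetical and chosen only for illustration.

```python
def circularly_permute(sequence: str, cut: int) -> str:
    """Return the sequence with its first `cut` residues moved to the C-terminal end."""
    if not sequence:
        return sequence
    cut %= len(sequence)
    return sequence[cut:] + sequence[:cut]


def is_circular_permutation(first: str, second: str) -> bool:
    """Check whether `second` is an exact rotation of `first`.

    Every rotation of a string appears as a substring of the string
    concatenated with itself, so one substring search is enough.
    """
    return len(first) == len(second) and second in first + first


# The schematic's example: a-b-c permuted to c-a-b.
print(circularly_permute("abc", 2))
print(is_circular_permutation("abc", "cab"))
```

Real protein pairs are rarely identical after rotation, so this exact test only conveys the definition; it is not a detection method.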

Circular permutation can occur as the result of evolutionary events, posttranslational modifications, or artificially engineered mutations. The two main models proposed to explain the evolution of circularly permuted proteins are permutation by duplication and fission and fusion. Permutation by duplication occurs when a gene undergoes duplication to form a tandem repeat, before redundant sections of the protein are removed; this relationship is found between saposin and swaposin. Fission and fusion occurs when partial proteins fuse to form a single polypeptide, such as in nicotinamide nucleotide transhydrogenases.

Circular permutations are routinely engineered in the laboratory to improve the catalytic activity or thermostability of proteins, or to investigate properties of the original protein.

Traditional algorithms for sequence alignment and structure alignment are not able to detect circular permutations between proteins. New non-linear approaches have been developed that overcome this and are able to detect topology-independent similarities.

History

Two proteins that are related by a circular permutation. Concanavalin A (left), from the Protein Data Bank (PDB: 3cna), and peanut lectin (right), from PDB: 2pel, which is homologous to favin. The termini of the proteins are highlighted by blue and green spheres, and the sequence of residues is indicated by the gradient from blue (N-terminus) to green (C-terminus). The 3D fold of the two proteins is highly similar; however, the N- and C-termini are located at different positions in the protein.

In 1979, Bruce Cunningham and his colleagues discovered the first instance of a circularly permuted protein in nature. [1] After determining the peptide sequence of the lectin protein favin, they noticed its similarity to a known protein, concanavalin A, except that the ends were circularly permuted. Later work confirmed the circular permutation between the pair [2] and showed that concanavalin A is permuted post-translationally [3] through cleavage and an unusual protein ligation. [4]

After the discovery of a natural circularly permuted protein, researchers looked for a way to emulate this process. In 1983, David Goldenberg and Thomas Creighton were able to create a circularly permuted version of a protein by chemically ligating the termini to create a cyclic protein, then introducing new termini elsewhere using trypsin. [5] In 1989, Karolin Luger and her colleagues introduced a genetic method for making circular permutations by carefully fragmenting and ligating DNA. [6] This method allowed for permutations to be introduced at arbitrary sites. [6]

Despite the early discovery of post-translational circular permutations and the suggestion of a possible genetic mechanism for evolving circular permutants, it was not until 1995 that the first circularly permuted pair of genes was discovered. Saposins are a class of proteins involved in sphingolipid catabolism and antigen presentation of lipids in humans. Chris Ponting and Robert Russell identified a circularly permuted version of a saposin inserted into plant aspartic proteinase, which they nicknamed swaposin. [7] Saposin and swaposin were the first known case of two natural genes related by a circular permutation. [7]

Hundreds of examples of protein pairs related by a circular permutation were subsequently discovered in nature or produced in the laboratory. As of February 2012, the Circular Permutation Database [8] contains 2,238 circularly permuted protein pairs with known structures, and many more are known without structures. [9] The CyBase database collects proteins that are cyclic, some of which are permuted variants of cyclic wild-type proteins. [10] SISYPHUS is a database of hand-curated alignments of proteins with non-trivial relationships, several of which have circular permutations. [11]

Evolution

Two main models are currently used to explain the evolution of circularly permuted proteins: permutation by duplication, and fission and fusion. Both models have compelling examples supporting them, but the relative contribution of each to evolution is still under debate. [12] Other, less common mechanisms have been proposed, such as "cut and paste" [13] or "exon shuffling". [14]

Permutation by duplication

The permutation by duplication mechanism for producing a circular permutation. First, a gene 1-2-3 is duplicated to form 1-2-3-1-2-3. Next, a start codon is introduced before the first domain 2 and a stop codon after the second domain 1, removing redundant sections and resulting in a circularly permuted gene 2-3-1.

The earliest model proposed for the evolution of circular permutations is the permutation by duplication mechanism. [1] In this model, a precursor gene first undergoes a duplication and fusion to form a large tandem repeat. Next, start and stop codons are introduced at corresponding locations in the duplicated gene, removing redundant sections of the protein.

One surprising prediction of the permutation by duplication mechanism is that intermediate permutations can occur. For instance, the duplicated version of the protein should still be functional, since otherwise evolution would quickly select against such proteins. Likewise, partially duplicated intermediates where only one terminus was truncated should be functional. Such intermediates have been extensively documented in protein families such as DNA methyltransferases. [15]
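The duplication and truncation steps can be sketched on a list of domain labels. This is an illustrative model only, assuming domains behave as indivisible units; the function names and the `new_start` parameter (the index of the domain that receives the new start codon) are hypothetical.

```python
from typing import List, Sequence, Tuple


def tandem_duplicate(domains: Sequence[str]) -> List[str]:
    """Model the duplication and fusion step: 1-2-3 becomes 1-2-3-1-2-3."""
    return list(domains) + list(domains)


def permute_by_duplication(domains: Sequence[str], new_start: int) -> List[str]:
    """Place a start codon before domain `new_start` of the first copy and a
    stop codon one full gene length later, dropping the redundant sections."""
    if not 0 <= new_start < len(domains):
        raise ValueError("new_start must index a domain of the original gene")
    tandem = tandem_duplicate(domains)
    return tandem[new_start:new_start + len(domains)]


def one_sided_intermediates(
    domains: Sequence[str], new_start: int
) -> Tuple[List[str], List[str]]:
    """Return the two partially truncated intermediates: one with only the
    new start codon, and one with only the new stop codon."""
    tandem = tandem_duplicate(domains)
    only_start_moved = tandem[new_start:]
    only_stop_moved = tandem[:new_start + len(domains)]
    return only_start_moved, only_stop_moved


gene = ["1", "2", "3"]
print("-".join(permute_by_duplication(gene, 1)))
print(["-".join(x) for x in one_sided_intermediates(gene, 1)])
```

With the gene 1-2-3 and the new start before domain 2, the fully truncated product is 2-3-1, and the two one-sided intermediates are 2-3-1-2-3 and 1-2-3-1.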

Saposin and swaposin

Suggested relationship between saposin and swaposin. They could have evolved from a similar gene. Both consist of four alpha helices with the order of helices being permuted relative to each other.

An example of permutation by duplication is the relationship between saposin and swaposin. Saposins are highly conserved glycoproteins, approximately 80 amino acid residues long, that form a four-alpha-helix structure. They have a nearly identical placement of cysteine residues and glycosylation sites. The cDNA sequence that codes for saposin is called prosaposin. It is a precursor for four cleavage products, the saposins A, B, C, and D. The four saposin domains most likely arose from two tandem duplications of an ancestral gene. [16] This repeat suggests a mechanism for the evolution of the relationship with the plant-specific insert (PSI). The PSI is a domain found exclusively in plants, consisting of approximately 100 residues and located in plant aspartic proteases. [17] It belongs to the saposin-like protein family (SAPLIP) and has the N- and C-termini "swapped", such that the order of helices is 3-4-1-2 compared with saposin, thus leading to the name "swaposin". [7] [18]

Fission and fusion

The fission and fusion mechanism of circular permutation. Two separate genes arise (potentially from the fission of a single gene). If the genes fuse together in different orders in two orthologues, a circular permutation occurs.

Another model for the evolution of circular permutations is the fission and fusion model. The process starts with two partial proteins. These may represent two independent polypeptides (such as two parts of a heterodimer), or may have originally been halves of a single protein that underwent a fission event to become two polypeptides.

The two proteins can later fuse together to form a single polypeptide. Regardless of which protein comes first, this fusion protein may show similar function. Thus, if a fusion between two proteins occurs twice in evolution (either between paralogues within the same species or between orthologues in different species) but in a different order, the resulting fusion proteins will be related by a circular permutation.
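The claim that fusing the same two partial proteins in opposite orders yields a circular permutation can be checked with a small sketch. The sequences below are invented placeholders, and `fission` and `fuse` are hypothetical helper names, not part of any library.

```python
from typing import Tuple


def fission(protein: str, site: int) -> Tuple[str, str]:
    """Split one polypeptide into two partial proteins at `site`."""
    return protein[:site], protein[site:]


def fuse(first: str, second: str) -> str:
    """Join two partial proteins into a single polypeptide."""
    return first + second


def is_rotation(first: str, second: str) -> bool:
    """Exact circular permutation test using the doubled-string trick."""
    return len(first) == len(second) and second in first + first


# Two made-up partial proteins; real halves would be whole domains.
half_a, half_b = fission("MKTLLQV", 3)

fusion_ab = fuse(half_a, half_b)  # order seen in one lineage
fusion_ba = fuse(half_b, half_a)  # opposite order in another lineage

print(fusion_ab, fusion_ba, is_rotation(fusion_ab, fusion_ba))
```

In this model the cut point of the permutation is simply the length of the first partial protein, which is why the boundary between the fused halves is where the termini of one fusion fall inside the other.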

Evidence for a particular protein having evolved by a fission and fusion mechanism can be provided by observing the halves of the permutation as independent polypeptides in related species, or by demonstrating experimentally that the two halves can function as separate polypeptides. [19]

Transhydrogenases

Transhydrogenases in various organisms can be found in three different domain arrangements. In cattle, the three domains are arranged sequentially. In the bacteria E. coli, Rb. capsulatus, and R. rubrum, the transhydrogenase consists of two or three subunits. Finally, transhydrogenase from the protist E. tenella consists of a single subunit that is circularly permuted relative to cattle transhydrogenase.

An example of the fission and fusion mechanism can be found in nicotinamide nucleotide transhydrogenases. [20] These are membrane-bound enzymes that catalyze the transfer of a hydride ion between NAD(H) and NADP(H) in a reaction that is coupled to transmembrane proton translocation. They consist of three major functional units (I, II, and III) that can be found in different arrangements in bacteria, protozoa, and higher eukaryotes. Phylogenetic analysis suggests that the three groups of domain arrangements were acquired and fused independently. [12]

Other processes that can lead to circular permutations

Post-translational modification

The two evolutionary models mentioned above describe ways in which genes may be circularly permuted, resulting in a circularly permuted mRNA after transcription. Proteins can also be circularly permuted via post-translational modification, without permuting the underlying gene. Circular permutations can happen spontaneously through autocatalysis, as in the case of concanavalin A. [4] Alternatively, permutation may require external ligation and cleavage steps, such as chemical ligation of the termini followed by cleavage with a protease. [5]

Role in protein engineering

Many proteins have their termini located close together in 3D space. [21] [22] Because of this, it is often possible to design circular permutations of proteins. Today, circular permutations are generated routinely in the lab using standard genetics techniques. [6] Although some permutation sites prevent the protein from folding correctly, many permutants have been created with nearly identical structure and function to the original protein.

The motivation for creating a circular permutant of a protein can vary. Scientists may want to improve some property of the protein, such as its catalytic activity or thermostability.

Alternatively, scientists may be interested in investigating properties of the original protein.

Algorithmic detection

Many sequence alignment and protein structure alignment algorithms have been developed assuming linear data representations and as such are not able to detect circular permutations between proteins. [34] Two examples of frequently used methods that have problems correctly aligning proteins related by circular permutation are dynamic programming and many hidden Markov models. [34] As an alternative to these, a number of algorithms are built on top of non-linear approaches and are able to detect topology-independent similarities, or employ modifications allowing them to circumvent the limitations of dynamic programming. [34] [35] The table below is a collection of such methods.

The algorithms are classified according to the type of input they require. Sequence-based algorithms require only the sequence of two proteins in order to create an alignment. [36] Sequence methods are generally fast and suitable for searching whole genomes for circularly permuted pairs of proteins. [36] Structure-based methods require 3D structures of both proteins being considered. [37] They are often slower than sequence-based methods, but are able to detect circular permutations between distantly related proteins with low sequence similarity. [37] Some structural methods are topology independent, meaning that they are also able to detect more complex rearrangements than circular permutation. [38]
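To make the sequence-based idea concrete, the sketch below scores every possible cut point of one sequence against another and reports the best one. It is a deliberately naive illustration, not an implementation of any published method listed in this article: it assumes two sequences of equal length, allows substitutions but no gaps, and uses plain fractional identity as the score. The function name `best_cut_point` is hypothetical.

```python
from typing import Tuple


def best_cut_point(reference: str, candidate: str) -> Tuple[int, float]:
    """Find the rotation of `reference` that best matches `candidate`.

    Returns (cut, identity), where `cut` is the number of residues moved
    from the N-terminal end of `reference` to its C-terminal end, and
    `identity` is the fraction of matching positions at that rotation.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")

    length = len(reference)
    # Doubling the reference makes every rotation a contiguous window.
    doubled = reference + reference

    best_cut, best_identity = 0, -1.0
    for cut in range(length):
        window = doubled[cut:cut + length]
        matches = sum(a == b for a, b in zip(window, candidate))
        identity = matches / length
        if identity > best_identity:
            best_cut, best_identity = cut, identity
    return best_cut, best_identity


# A permuted copy with one substitution (X) is still assigned cut point 4.
print(best_cut_point("ABCDEFGH", "EFGXABCD"))
```

A cut of 0 with high identity indicates an ordinary linear relationship, while a non-zero cut with high identity suggests a circular permutation. This scan takes time quadratic in the sequence length and cannot handle insertions or deletions, which is one reason practical tools rely on modified alignment algorithms or structural comparison instead.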

| Name | Type | Description | Author | Year | Availability | Reference |
| --- | --- | --- | --- | --- | --- | --- |
| FBPLOT | Sequence | Draws dot plots of suboptimal sequence alignments | Zuker | 1991 | | [39] |
| Bachar et al. | Structure, topology independent | Uses geometric hashing for the topology independent comparison of proteins | Bachar et al. | 1993 | | [35] |
| Uliel et al. | Sequence | First suggestion of how a sequence comparison algorithm for the detection of circular permutations can work | Uliel et al. | 1999 | | [36] |
| SHEBA | Structure | Uses SHEBA algorithm to create structural alignments for various permutation points, while iteratively improving the cut point. | Jung & Lee | 2001 | | [14] |
| Multiprot | Structure, topology independent | Calculates a sequence order independent multiple protein structure alignment | Shatsky | 2004 | server, download | [38] |
| RASPODOM | Sequence | Modified Needleman & Wunsch sequence comparison algorithm | Weiner et al. | 2005 | download | [34] |
| CPSARST | Structure | Describes protein structures as one-dimensional text strings by using a Ramachandran sequential transformation (RST) algorithm. Detects circular permutations through a duplication of the sequence representation and "double filter-and-refine" strategy. | Lo, Lyu | 2008 | server | [40] |
| GANGSTA+ | Structure | Works in two stages: Stage one identifies coarse alignments based on secondary structure elements. Stage two refines the alignment on residue level and extends into loop regions. | Schmidt-Goenner et al. | 2009 | server, download | [41] |
| SANA | Structure | Detect initial aligned fragment pairs (AFPs). Build network of possible AFPs. Use random-mate algorithm to connect components to a graph. | Wang et al. | 2010 | download | [42] |
| CE-CP | Structure | Built on top of the combinatorial extension algorithm. Duplicates atoms before alignment, truncates results after alignment | Bliven et al. | 2015 | server, download | [43] |
| TopMatch | Structure | Has option to calculate topology-independent protein structure alignment | Sippl & Wiederstein | 2012 | server, download | [44] |

References

This article was adapted from the following source under a CC BY 4.0 license (2012) (reviewer reports): Spencer Bliven; Andreas Prlić (2012). "Circular permutation in proteins". PLOS Computational Biology. 8 (3): e1002445. doi:10.1371/JOURNAL.PCBI.1002445. ISSN 1553-734X. PMC 3320104. PMID 22496628. Wikidata Q5121672.

  1. 1 2 3 Cunningham BA, Hemperly JJ, Hopp TP, Edelman GM (July 1979). "Favin versus concanavalin A: Circularly permuted amino acid sequences". Proceedings of the National Academy of Sciences of the United States of America. 76 (7): 3218–22. Bibcode:1979PNAS...76.3218C. doi: 10.1073/pnas.76.7.3218 . PMC   383795 . PMID   16592676.
  2. Einspahr H, Parks EH, Suguna K, Subramanian E, Suddath FL (December 1986). "The crystal structure of pea lectin at 3.0-A resolution". The Journal of Biological Chemistry. 261 (35): 16518–27. doi: 10.1016/S0021-9258(18)66597-4 . PMID   3782132.
  3. Carrington DM, Auffret A, Hanke DE (1985). "Polypeptide ligation occurs during post-translational modification of concanavalin A". Nature. 313 (5997): 64–7. Bibcode:1985Natur.313...64C. doi:10.1038/313064a0. PMID   3965973. S2CID   4359482.
  4. 1 2 Bowles DJ, Pappin DJ (February 1988). "Traffic and assembly of concanavalin A". Trends in Biochemical Sciences. 13 (2): 60–4. doi:10.1016/0968-0004(88)90030-8. PMID   3070848.
  5. 1 2 Goldenberg DP, Creighton TE (April 1983). "Circular and circularly permuted forms of bovine pancreatic trypsin inhibitor". Journal of Molecular Biology. 165 (2): 407–13. doi:10.1016/S0022-2836(83)80265-4. PMID   6188846.
  6. 1 2 3 Luger K, Hommel U, Herold M, Hofsteenge J, Kirschner K (January 1989). "Correct folding of circularly permuted variants of a beta alpha barrel enzyme in vivo". Science. 243 (4888): 206–10. Bibcode:1989Sci...243..206L. doi:10.1126/science.2643160. PMID   2643160.
  7. 1 2 3 4 Ponting CP, Russell RB (May 1995). "Swaposins: circular permutations within genes encoding saposin homologues". Trends in Biochemical Sciences. 20 (5): 179–80. doi:10.1016/S0968-0004(00)89003-9. PMID   7610480.
  8. Lo WC, Lee CC, Lee CY, Lyu PC. "Circular Permutation Database". Institute of Bioinformatics and Structural Biology, National Tsing Hua University . Retrieved 16 February 2012.
  9. Lo WC, Lee CC, Lee CY, Lyu PC (January 2009). "CPDB: a database of circular permutation in proteins". Nucleic Acids Research. 37 (Database issue): D328–32. doi:10.1093/nar/gkn679. PMC   2686539 . PMID   18842637.
  10. Kaas Q, Craik DJ (2010). "Analysis and classification of circular proteins in CyBase". Biopolymers. 94 (5): 584–91. doi: 10.1002/bip.21424 . PMID   20564021.
  11. Andreeva A, Prlić A, Hubbard TJ, Murzin AG (January 2007). "SISYPHUS--structural alignments for proteins with non-trivial relationships". Nucleic Acids Research. 35 (Database issue): D253–9. doi:10.1093/nar/gkl746. PMC   1635320 . PMID   17068077.
  12. 1 2 Weiner J, Bornberg-Bauer E (April 2006). "Evolution of circular permutations in multidomain proteins". Molecular Biology and Evolution. 23 (4): 734–43. doi: 10.1093/molbev/msj091 . PMID   16431849.
  13. Bujnicki JM (March 2002). "Sequence permutations in the molecular evolution of DNA methyltransferases". BMC Evolutionary Biology. 2 (1): 3. doi: 10.1186/1471-2148-2-3 . PMC   102321 . PMID   11914127.
  14. 1 2 Jung J, Lee B (September 2001). "Circularly permuted proteins in the protein structure database". Protein Science. 10 (9): 1881–6. doi:10.1110/ps.05801. PMC   2253204 . PMID   11514678.
  15. Jeltsch A (July 1999). "Circular permutations in the molecular evolution of DNA methyltransferases". Journal of Molecular Evolution. 49 (1): 161–4. Bibcode:1999JMolE..49..161J. doi:10.1007/pl00006529. PMID   10368444. S2CID   24116226.
  16. Hazkani-Covo E, Altman N, Horowitz M, Graur D (January 2002). "The evolutionary history of prosaposin: two successive tandem-duplication events gave rise to the four saposin domains in vertebrates". Journal of Molecular Evolution. 54 (1): 30–4. Bibcode:2002JMolE..54...30H. doi:10.1007/s00239-001-0014-0. PMID   11734895. S2CID   7402721.
  17. Guruprasad K, Törmäkangas K, Kervinen J, Blundell TL (September 1994). "Comparative modelling of barley-grain aspartic proteinase: a structural rationale for observed hydrolytic specificity". FEBS Letters. 352 (2): 131–6. Bibcode:1994FEBSL.352..131G. doi: 10.1016/0014-5793(94)00935-X . PMID   7925961. S2CID   32524531.
  18. Bruhn H (July 2005). "A short guided tour through functional and structural features of saposin-like proteins". The Biochemical Journal. 389 (Pt 2): 249–57. doi:10.1042/BJ20050051. PMC   1175101 . PMID   15992358.
  19. Lee J, Blaber M (January 2011). "Experimental support for the evolution of symmetric protein architecture from a simple peptide motif". Proceedings of the National Academy of Sciences of the United States of America. 108 (1): 126–30. Bibcode:2011PNAS..108..126L. doi: 10.1073/pnas.1015032108 . PMC   3017207 . PMID   21173271.
  20. 1 2 Hatefi Y, Yamaguchi M (March 1996). "Nicotinamide nucleotide transhydrogenase: a model for utilization of substrate binding energy for proton translocation". FASEB Journal. 10 (4): 444–52. doi: 10.1096/fasebj.10.4.8647343 . PMID   8647343. S2CID   21898930.
  21. Thornton JM, Sibanda BL (June 1983). "Amino and carboxy-terminal regions in globular proteins". Journal of Molecular Biology. 167 (2): 443–60. doi:10.1016/S0022-2836(83)80344-1. PMID   6864804.
  22. Yu Y, Lutz S (January 2011). "Circular permutation: a different way to engineer enzyme structure and function". Trends in Biotechnology. 29 (1): 18–25. doi:10.1016/j.tibtech.2010.10.004. PMID   21087800.
  23. Whitehead TA, Bergeron LM, Clark DS (October 2009). "Tying up the loose ends: circular permutation decreases the proteolytic susceptibility of recombinant proteins". Protein Engineering, Design & Selection. 22 (10): 607–13. doi: 10.1093/protein/gzp034 . PMID   19622546.
  24. 1 2 Cheltsov AV, Barber MJ, Ferreira GC (June 2001). "Circular permutation of 5-aminolevulinate synthase. Mapping the polypeptide chain to its function". The Journal of Biological Chemistry. 276 (22): 19141–9. doi: 10.1074/jbc.M100329200 . PMC   4547487 . PMID   11279050.
  25. Qian Z, Lutz S (October 2005). "Improving the catalytic activity of Candida antarctica lipase B by circular permutation". Journal of the American Chemical Society. 127 (39): 13466–7. doi:10.1021/ja053932h. PMID   16190688. (primary source)
  26. Topell S, Hennecke J, Glockshuber R (August 1999). "Circularly permuted variants of the green fluorescent protein". FEBS Letters. 457 (2): 283–9. Bibcode:1999FEBSL.457..283T. doi:10.1016/S0014-5793(99)01044-3. PMID   10471794. S2CID   43085373. (primary source)
  27. Viguera AR, Serrano L, Wilmanns M (October 1996). "Different folding transition states may result in the same native structure". Nature Structural Biology. 3 (10): 874–80. doi:10.1038/nsb1096-874. PMID   8836105. S2CID   11542397. (primary source)
  28. Capraro DT, Roy M, Onuchic JN, Jennings PA (September 2008). "Backtracking on the folding landscape of the beta-trefoil protein interleukin-1beta?". Proceedings of the National Academy of Sciences of the United States of America. 105 (39): 14844–8. Bibcode:2008PNAS..10514844C. doi: 10.1073/pnas.0807812105 . PMC   2567455 . PMID   18806223.
  29. Zhang P, Schachman HK (July 1996). "In vivo formation of allosteric aspartate transcarbamoylase containing circularly permuted catalytic polypeptide chains: implications for protein folding and assembly". Protein Science. 5 (7): 1290–300. doi:10.1002/pro.5560050708. PMC   2143468 . PMID   8819162. (primary source)
  30. Huang YM, Nayak S, Bystroff C (November 2011). "Quantitative in vivo solubility and reconstitution of truncated circular permutants of green fluorescent protein". Protein Science. 20 (11): 1775–80. doi:10.1002/pro.735. PMC   3267941 . PMID   21910151. (primary source)
  31. Beernink PT, Yang YR, Graf R, King DS, Shah SS, Schachman HK (March 2001). "Random circular permutation leading to chain disruption within and near alpha helices in the catalytic chains of aspartate transcarbamoylase: effects on assembly, stability, and function". Protein Science. 10 (3): 528–37. doi:10.1110/ps.39001. PMC   2374132 . PMID   11344321.
  32. 1 2 Baird GS, Zacharias DA, Tsien RY (September 1999). "Circular permutation and receptor insertion within green fluorescent proteins". Proceedings of the National Academy of Sciences of the United States of America. 96 (20): 11241–6. Bibcode:1999PNAS...9611241B. doi: 10.1073/pnas.96.20.11241 . PMC   18018 . PMID   10500161.
  33. Turner NJ (August 2009). "Directed evolution drives the next generation of biocatalysts". Nature Chemical Biology. 5 (8): 567–73. doi:10.1038/nchembio.203. PMID   19620998.
  34. 1 2 3 4 Weiner J, Thomas G, Bornberg-Bauer E (April 2005). "Rapid motif-based prediction of circular permutations in multi-domain proteins". Bioinformatics. 21 (7): 932–7. doi: 10.1093/bioinformatics/bti085 . PMID   15788783.
  35. 1 2 Bachar O, Fischer D, Nussinov R, Wolfson H (April 1993). "A computer vision based technique for 3-D sequence-independent structural comparison of proteins". Protein Engineering. 6 (3): 279–88. doi:10.1093/protein/6.3.279. PMID   8506262.
  36. 1 2 3 Uliel S, Fliess A, Amir A, Unger R (November 1999). "A simple algorithm for detecting circular permutations in proteins". Bioinformatics. 15 (11): 930–6. doi: 10.1093/bioinformatics/15.11.930 . PMID   10743559.
  37. 1 2 Prlic A, Bliven S, Rose PW, Bluhm WF, Bizon C, Godzik A, Bourne PE (December 2010). "Pre-calculated protein structure alignments at the RCSB PDB website". Bioinformatics. 26 (23): 2983–5. doi:10.1093/bioinformatics/btq572. PMC   3003546 . PMID   20937596.
  38. 1 2 Shatsky M, Nussinov R, Wolfson HJ (July 2004). "A method for simultaneous alignment of multiple protein structures". Proteins. 56 (1): 143–56. doi:10.1002/prot.10628. PMID   15162494. S2CID   14665486.
  39. Zuker M (September 1991). "Suboptimal sequence alignment in molecular biology. Alignment with error analysis". Journal of Molecular Biology. 221 (2): 403–20. doi:10.1016/0022-2836(91)80062-Y. PMID   1920426.
  40. Lo WC, Lyu PC (January 2008). "CPSARST: an efficient circular permutation search tool applied to the detection of novel protein structural relationships". Genome Biology. 9 (1): R11. doi: 10.1186/gb-2008-9-1-r11 . PMC   2395249 . PMID   18201387.
  41. Schmidt-Goenner T, Guerler A, Kolbeck B, Knapp EW (May 2010). "Circular permuted proteins in the universe of protein folds". Proteins. 78 (7): 1618–30. doi:10.1002/prot.22678. PMID   20112421. S2CID   20673981.
  42. Wang L, Wu LY, Wang Y, Zhang XS, Chen L (July 2010). "SANA: an algorithm for sequential and non-sequential protein structure alignment". Amino Acids. 39 (2): 417–25. doi:10.1007/s00726-009-0457-y. PMID   20127263. S2CID   2292831.
  43. Bliven SE, Bourne PE, Prlić A (April 2015). "Detection of circular permutations within protein structures using CE-CP". Bioinformatics. 31 (8): 1316–8. doi:10.1093/bioinformatics/btu823. PMC   4393524 . PMID   25505094.
  44. Sippl MJ, Wiederstein M (April 2012). "Detection of spatial correlations in protein structures and molecular complexes". Structure. 20 (4): 718–28. doi:10.1016/j.str.2012.01.024. PMC   3320710 . PMID   22483118.

Further reading