List of protein tandem repeat annotation software

Computational methods use different properties of protein sequences and structures to find, characterize and annotate protein tandem repeats.

Sequence-based annotation methods

| Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ard2 | 2013 | web | annotated sequence | Neural network | no | alpha-solenoid | [1] |
| DECIPHER | 2021 | downloadable | | Detection of tandem and/or interspersed repeats by orthology (DetectRepeats function in R package) | yes | no | [2] |
| TRUST | 2004 | downloadable / web | unit position, multiple sequence alignment | Ab initio determination of internal repeats in proteins; exploits transitivity of alignments | ? | no | [3] |
| T-REKS | 2009 | downloadable / web | repeat unit | Clustering of lengths between identical short strings using a K-means algorithm | yes | no | [4] |
| HHRepID | 2008 | downloadable / web | | Identification of repeats in protein sequences via HMM-HMM comparison, exploiting evolutionary information in the form of multiple sequence alignments of homologs | | no | [5] |
| RADAR | 2018 | downloadable / web | unit position, multiple sequence alignment | Identifies short composition-biased and gapped approximate repeats, as well as complex repeat architectures involving many different types of repeats in a query sequence | yes | no | [6] [7] |
| XSTREAM | 2007 | web | unit position, different periods, multiple sequence alignment | Data-mining tool designed to efficiently identify tandem repeat (TR) patterns in biological sequence data; uses a seed-extension strategy coupled with several post-processing algorithms to analyze FASTA-formatted protein or nucleotide sequences | no | no | [8] |
| TRED | 2007 | downloadable | | Defines tandem repeats over the edit distance and provides an efficient, deterministic algorithm for finding them | no | no | |
| TRAL | 2015 | downloadable | | Detects tandem repeats with both de novo software and sequence profile HMMs; performs statistical significance analysis of putative tandem repeats and filters redundant predictions | yes | | [9] |
| DOTTER | 1995 | downloadable | | Graphical dotplot program for detailed comparison of two sequences | | | [10] |
| 0J.PY | | | | | | | [11] |
| PTRStalker | 2012 | downloadable | unit position, multiple sequence alignment | Ab initio detection of fuzzy tandem repeats in protein amino acid sequences | | no | [12] |
| TRDistiller | 2015 | | | Rapid sorting of tandem repeat (TR)-containing and non-TR-containing sequences | | | [13] |
| REPRO | 2000 | web | | Repeat detection based on a variation of the Smith-Waterman local alignment strategy, followed by a graph-based iterative clustering procedure | no | no | [14] |
| REP | 2000 | web | | | no | yes | |
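
Several of the sequence-based methods listed above start from the spacing between repeated short strings; T-REKS, for instance, is described as clustering the lengths between identical short strings. The Python sketch below illustrates only that starting idea. It is not the T-REKS implementation: the K-means step is replaced by a plain count of the most frequent spacing, and the function name `candidate_period`, the parameter `k` and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def candidate_period(sequence, k=3):
    """Return the most frequent distance between consecutive occurrences
    of identical k-mers, or None if no k-mer occurs twice."""
    last_seen = {}          # k-mer -> start index of its latest occurrence
    distances = Counter()   # spacing -> number of times it was observed
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in last_seen:
            distances[start - last_seen[kmer]] += 1
        last_seen[kmer] = start
    if not distances:
        return None
    # Most frequent spacing; ties are broken in favour of the shorter one.
    return min(distances, key=lambda d: (-distances[d], d))


# Hypothetical toy sequence: four copies of a seven-residue unit
# placed between two short flanking segments.
toy = "MK" + "GSAPTLE" * 4 + "QR"
print(candidate_period(toy))
```

On real proteins the repeat units are usually imperfect, which is why the listed tools add clustering, alignment or profile comparison on top of this kind of exact-match signal.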

Structure-based annotation methods

| Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TAPO | 2016 | web | unit position | Uses periodicities of atomic coordinates and other types of structural representation, including strings generated by conformational alphabets, residue contact maps, and arrangements of vectors of secondary structure elements | no | no | [15] |
| SYMD | 2014 | galaxy | repeat geometry | Detects internally symmetric protein structures through an "alignment scan" procedure in which a protein structure is aligned to itself after circularly permuting the second copy by every possible number of residues | no | no | [16] |
| RAPHAEL | 2012 | web | repeat probability | Reduces the three-dimensional structure to a wave function, then determines periodicity information | no | no | [17] |
| CE-SYMM | 2021 | | | | | | |
| ProSTRIP | 2010 | | | | | | |
| DAVROS | 2004 | | | | | | |
| RQA | 2009 | | | | | | |
| OPAAS | 2006 | | | | | | |
| Gplus | 2009 | | | | | | |
| REUPRED | 2016 | | | | | | |
| ConSole | 2015 | | | | | | |
| RepeatsDB-Lite | 2017 | | | | | | |
| PRIGSA | 2014 | | | | | | |
| Swelfe | 2008 | | | | | | |
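
Some of the structure-based methods listed above rely on periodicities in atomic coordinates or on comparing a structure with a shifted copy of itself. The Python sketch below shows one very simplified way such a periodicity could be estimated from a list of 3D points. It is not the algorithm of TAPO, SymD or any other listed tool: the function name `structural_period`, the `stride` parameter, the local-distance profile and the toy coordinates are all assumptions chosen for illustration.

```python
import math


def structural_period(coords, stride=2):
    """Estimate a repeat period (in residues) from a list of 3D points."""
    # Local geometry profile: distance from residue i to residue i + stride.
    profile = [math.dist(coords[i], coords[i + stride])
               for i in range(len(coords) - stride)]
    best_shift, best_score = None, math.inf
    # Compare the profile with a shifted copy of itself for every shift.
    for shift in range(1, len(profile) // 2 + 1):
        diffs = [abs(a - b) for a, b in zip(profile, profile[shift:])]
        score = sum(diffs) / len(diffs)
        if score < best_score:  # strict "<" keeps the smallest best shift
            best_shift, best_score = shift, score
    return best_shift


# Hypothetical toy structure: a five-point unit copied four times,
# each copy translated by 10 units along the x axis.
unit = [(0, 0, 0), (3, 1, 0), (5, 4, 1), (6, 2, 5), (8, 0, 3)]
toy_coords = [(x + 10 * copy, y, z) for copy in range(4) for (x, y, z) in unit]
print(structural_period(toy_coords))
```

Because the profile is built from intra-chain distances, it is unchanged by rotating or translating the whole structure; real repeat units also vary in length and conformation, which this sketch does not handle.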

References

  1. Fournier D, Palidwor GA, Shcherbinin S, Szengel A, Schaefer MH, Perez-Iratxeta C, Andrade-Navarro MA (21 November 2013). "Functional and genomic analyses of alpha-solenoid proteins". PLOS ONE. 8 (11): e79894. Bibcode:2013PLoSO...879894F. doi:10.1371/journal.pone.0079894. PMC 3837014. PMID 24278209.
  2. Wright ES (2015). "Using DECIPHER v2.0 to Analyze Big Biological Sequence Data in R". The R Journal. 8 (1): 352–359. doi:10.1186/s12859-015-0749-z. PMC 4595117. PMID 26445311.
  3. Szklarczyk, Radek; Heringa, Jaap (2004-08-04). "Tracking repeats using significance and transitivity". Bioinformatics. 20 (Suppl 1): i311–317. doi:10.1093/bioinformatics/bth911. ISSN 1367-4811. PMID 15262814.
  4. Jorda, Julien; Kajava, Andrey V. (2009-10-15). "T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm". Bioinformatics. 25 (20): 2632–2638. doi:10.1093/bioinformatics/btp482. ISSN 1367-4811. PMID 19671691.
  5. Zimmermann, Lukas; Stephens, Andrew; Nam, Seung-Zin; Rau, David; Kübler, Jonas; Lozajic, Marko; Gabler, Felix; Söding, Johannes; Lupas, Andrei N. (2018-07-20). "A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core". Journal of Molecular Biology. Computation Resources for Molecular Biology. 430 (15): 2237–2243. doi:10.1016/j.jmb.2017.12.007. ISSN 0022-2836. PMID 29258817. S2CID 22415932.
  6. Heger, Andreas; Holm, Liisa (2000). "Rapid automatic detection and alignment of repeats in protein sequences". Proteins: Structure, Function, and Genetics. 41 (2): 224–237. doi:10.1002/1097-0134(20001101)41:2<224::aid-prot70>3.0.co;2-z. ISSN 0887-3585. PMID 10966575. S2CID 21757391.
  7. Lopez, Rodrigo; Paern, Juri; Squizzato, Silvano; Valentin, Franck; Li, Weizhong; McWilliam, Hamish; Goujon, Mickael (2010-07-01). "A new bioinformatics analysis tools framework at EMBL–EBI". Nucleic Acids Research. 38 (suppl_2): W695–W699. doi:10.1093/nar/gkq313. ISSN 0305-1048. PMC 2896090. PMID 20439314.
  8. Newman, Aaron M.; Cooper, James B. (2007-10-11). "XSTREAM: A practical algorithm for identification and architecture modeling of tandem repeats in protein sequences". BMC Bioinformatics. 8 (1): 382. doi:10.1186/1471-2105-8-382. ISSN 1471-2105. PMC 2233649. PMID 17931424.
  9. Anisimova, Maria; Xenarios, Ioannis; Zoller, Stefan; Stockinger, Heinz; Murri, Riccardo; Messina, Antonio; Pečerska, Jūlija; Korsunsky, Alexander; Schaper, Elke (2015-09-15). "TRAL: tandem repeat annotation library". Bioinformatics. 31 (18): 3051–3053. doi:10.1093/bioinformatics/btv306. hdl:20.500.11850/103876. ISSN 1367-4803. PMID 25987568.
  10. Sonnhammer, E. L.; Durbin, R. (1995-12-29). "A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis". Gene. 167 (1–2): GC1–10. doi:10.1016/0378-1119(95)00714-8. ISSN 0378-1119. PMID 8566757.
  11. Wise, M. J. (2001). "0j.py: a software tool for low complexity proteins and protein domains". Bioinformatics. 17 (Suppl 1): S288–295. doi:10.1093/bioinformatics/17.suppl_1.s288. ISSN 1367-4803. PMID 11473020.
  12. Pellegrini, Marco; Renda, Maria Elena; Vecchio, Alessio (2012-03-21). "Ab initio detection of fuzzy amino acid tandem repeats in protein sequences". BMC Bioinformatics. 13 (3): S8. doi:10.1186/1471-2105-13-S3-S8. ISSN 1471-2105. PMC 3402919. PMID 22536906.
  13. Richard, François D.; Kajava, Andrey V. (2014-06-01). "TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats". Journal of Structural Biology. 186 (3): 386–391. doi:10.1016/j.jsb.2014.03.013. ISSN 1047-8477. PMID 24681324.
  14. George, Richard A.; Heringa, Jaap (October 2000). "The REPRO server: finding protein internal sequence repeats through the Web". Trends in Biochemical Sciences. 25 (10): 515–517. doi:10.1016/s0968-0004(00)01643-1. ISSN 0968-0004. PMID 11203383.
  15. Do Viet, Phuong; Roche, Daniel B.; Kajava, Andrey V. (2015-09-14). "TAPO: A combined method for the identification of tandem repeats in protein structures". FEBS Letters. 589 (19 Pt A): 2611–2619. doi:10.1016/j.febslet.2015.08.025. ISSN 1873-3468. PMID 26320412.
  16. Tai, Chin-Hsien; Paul, Rohit; KC, Dukka; Shilling, Jeffery D.; Lee, Byungkook (2014-07-01). "SymD webserver: a platform for detecting internally symmetric protein structures". Nucleic Acids Research. 42 (Web Server issue): W296–W300. doi:10.1093/nar/gku364. ISSN 0305-1048. PMC 4086132. PMID 24799435.
  17. Walsh, Ian; Sirocco, Francesco G.; Minervini, Giovanni; Di Domenico, Tomás; Ferrari, Carlo; Tosatto, Silvio C. E. (2012-09-08). "RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures". Bioinformatics. 28 (24): 3257–3264. doi:10.1093/bioinformatics/bts550. ISSN 1460-2059. PMID 22962341.