List of gene prediction software

Last updated May 23, 2025

This is a list of software tools and web portals used for gene prediction.

Name	Description	Species	References
FINDER	Automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences	Eukaryotes	^[1]
FragGeneScan	Predicts genes in complete genomes and sequencing reads	Prokaryotes, Metagenomes	^[2]
ATGpr	Identifies translational initiation sites in cDNA sequences	Human	^[3]
Prodigal	Its name stands for Prokaryotic Dynamic Programming Genefinding Algorithm. It is based on log-likelihood functions and does not use Hidden or Interpolated Markov Models.	Prokaryotes, Metagenomes (metaProdigal)	^[4]
AUGUSTUS	Eukaryote gene predictor	Eukaryotes	^[5]
BGF	Hidden Markov model (HMM) and dynamic programming based ab initio gene prediction program		^[6]
DIOGENES	Fast detection of coding regions in short genome sequences
Dragon Promoter Finder	Program to recognize vertebrate RNA polymerase II promoters	Vertebrates	^[7]
EasyGene	The gene finder is based on a hidden Markov model (HMM) that is automatically estimated for a new genome.	Prokaryotes	^[8]^[9]
EuGene	Integrative gene finding	Prokaryotes, Eukaryotes	^[10]^[11]
FGENESH	HMM-based gene structure prediction: multiple genes, on both strands	Eukaryotes	^[12]
FrameD	Finds genes and frameshifts in G+C-rich prokaryote sequences	Prokaryotes, Eukaryotes	^[13]
GeMoMa	Homology-based gene prediction based on amino acid and intron position conservation as well as RNA-Seq data		^[14]^[15]
GENIUS II	Links ORFs in complete genomes to protein 3D structures	Prokaryotes, Eukaryotes	^[16]
geneid	Program to predict genes, exons, splice sites, and other signals along DNA sequences	Eukaryotes	^[17]
GeneParser	Parse DNA sequences into introns and exons	Eukaryotes	^[18]
GeneMark	Family of self-training gene prediction programs	Prokaryotes, Eukaryotes, Metagenomes	^[19]^[20]^[21]^[22]
GeneTack	Predicts genes with frameshifts in prokaryote genomes	Prokaryotes	^[23]
GenomeScan	Predicts the locations and exon-intron structures of genes in genome sequences from a variety of organisms; successor to the GENSCAN server	Vertebrates, Arabidopsis, Maize	^[24]
GENSCAN	Predicts the locations and exon-intron structures of genes in genome sequences from a variety of organisms	Vertebrates, Arabidopsis, Maize	^[25]^[26]^[27]
GLIMMER	Finds genes in microbial DNA	Prokaryotes	^[28]^[29]^[30]
GLIMMERHMM	Eukaryotic gene-finding system	Eukaryotes	^[31]
GrailEXP	Predicts exons, genes, promoters, poly(A) sites, CpG islands, EST similarities, and repeat elements in DNA sequence	Human, Mus musculus, Arabidopsis thaliana, Drosophila melanogaster	^[32]^[33]
mGene	Support-vector machine (SVM) based system to find genes	Eukaryotes	^[34]
mGene.ngs	SVM-based system to find genes using heterogeneous information such as RNA-seq and tiling arrays	Eukaryotes	^[35]
MORGAN	Decision tree system to find genes in vertebrate DNA	Eukaryotes	^[36]
BioNIX	Web tool to combine results from different programs: GRAIL, FEX, HEXON, MZEF, GENEMARK, GENEFINDER, FGENE, BLAST, POLYAH, REPEATMASKER, TRNASCAN	Prokaryotes, Eukaryotes	^[37]
NNPP	Neural network promoter prediction	Prokaryotes, Eukaryotes	^[38]
NNSPLICE	Neural network splice site prediction	Drosophila, Human	^[39]
ORFfinder	Graphical analysis tool to find all open reading frames	Prokaryotes, Eukaryotes	^[40]
Regulatory Sequence Analysis Tools	Series of modular computer programs to detect regulatory signals in non-coding sequences	Fungi, Prokaryotes, Metazoa, Protist, Plants	^[41]^[42]
PHANOTATE	A tool to annotate phage genomes.	Phages	^[43]
SplicePredictor	Method to identify potential splice sites in (plant) pre-mRNA by sequence inspection using Bayesian statistical models	Eukaryotes	^[44]
VEIL	Hidden Markov model to find genes in vertebrate DNA	Eukaryotes	^[45]
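
Several entries above, such as ORFfinder, start from open reading frames (ORFs). As a rough illustration of that basic task only, the following Python sketch scans all six reading frames for ATG-to-stop stretches. It is not the algorithm of any tool in this list; the function names, the `min_codons` parameter, and the use of the standard stop codons with ATG as the only start codon are assumptions made for the example.

```python
# Naive six-frame ORF scan: an illustration only, not the method of any listed tool.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard genetic code
START_CODON = "ATG"                  # assumed: ATG is the only start codon

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Yield (strand, frame, start, end, orf) for each ATG...stop ORF.

    start and end are 0-based, end-exclusive offsets on the scanned strand.
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # offset of the first ATG seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3, s[start:i + 3]
                    start = None


if __name__ == "__main__":
    for hit in find_orfs("CCATGAAATAGCC"):
        print(hit)
```

The gene finders listed here go well beyond this: they score candidate regions with statistical models (for example hidden Markov models or support-vector machines) and, for eukaryotes, model exon-intron structure, which a plain ORF scan cannot do.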

References

[1] Banerjee S, Bhandary P, Woodhouse M, Sen TZ, Wise RP, Andorf CM (Apr 2021). "FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences". BMC Bioinformatics. 22 (1): 205. doi:10.1186/s12859-021-04120-9. PMC 8056616. PMID 33879057.
[2] Rho M, Tang H, Ye Y (November 2010). "FragGeneScan: predicting genes in short and error-prone reads". Nucleic Acids Research. 38 (20): e191. doi:10.1093/nar/gkq747. PMC 2978382. PMID 20805240.
[3] Nishikawa, Tetsuo; Ota, Toshio; Isogai, Takao (2000-11-01). "Prediction whether a human cDNA sequence contains initiation codon by combining statistical information and similarity with protein sequences". Bioinformatics. 16 (11): 960–967. doi:10.1093/bioinformatics/16.11.960. ISSN 1367-4803. PMID 11159307.
[4] Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ (March 2010). "Prodigal: prokaryotic gene recognition and translation initiation site identification". BMC Bioinformatics. 11: 119. doi:10.1186/1471-2105-11-119. PMC 2848648. PMID 20211023.
[5] Keller O, Kollmar M, Stanke M, Waack S (March 2011). "A novel hybrid gene prediction method employing protein multiple sequence alignments". Bioinformatics. 27 (6): 757–63. doi:10.1093/bioinformatics/btr010. hdl:11858/00-001M-0000-0011-F244-D. PMID 21216780.
[6] Li, Heng; Liu, Jin-Song; Xu, Zhao; Jin, Jiao; Fang, Lin; Gao, Lei; Li, Yu-Dong; Xing, Zi-Xing; Gao, Shao-Gen; Liu, Tao; Li, Hai-Hong (2005-07-01). "Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome". Journal of Computer Science and Technology. 20 (4): 446–453. doi:10.1007/s11390-005-0446-x. ISSN 1860-4749. S2CID 13497894.
[7] Bajic, Vladimir B.; Seah, Seng Hong; Chong, Allen; Zhang, Guanglan; Koh, Judice L. Y.; Brusic, Vladimir (2002-01-01). "Dragon Promoter Finder: recognition of vertebrate RNA polymerase II promoters". Bioinformatics. 18 (1): 198–199. doi:10.1093/bioinformatics/18.1.198. ISSN 1367-4803. PMID 11836231.
[8] Nielsen, P.; Krogh, A. (2005-12-15). "Large-scale prokaryotic gene prediction and comparison to genome annotation". Bioinformatics. 21 (24): 4322–4329. doi:10.1093/bioinformatics/bti701. ISSN 1367-4803. PMID 16249266.
[9] Larsen, Thomas Schou; Krogh, Anders (2003-06-03). "EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance". BMC Bioinformatics. 4 (1): 21. doi:10.1186/1471-2105-4-21. ISSN 1471-2105. PMC 521197. PMID 12783628.
[10] Foissac S, Gouzy J, Rombauts S, Mathé C, Amselem J, Sterck L, de Peer YV, Rouzé P, Schiex T (May 2008). "Genome annotation in plants and fungi: EuGene as a model platform". Current Bioinformatics. 3 (2): 87–97. doi:10.2174/157489308784340702.
[11] Sallet, Erika; Gouzy, Jérôme; Schiex, Thomas (2019), Kollmar, Martin (ed.), "EuGene: An Automated Integrative Gene Finder for Eukaryotes and Prokaryotes", Gene Prediction: Methods and Protocols, Methods in Molecular Biology, vol. 1962, New York, NY: Springer, pp. 97–120, doi:10.1007/978-1-4939-9173-0_6, ISBN 978-1-4939-9173-0, PMID 31020556, S2CID 131776381, retrieved 2021-11-24
[12] Salamov AA, Solovyev VV (April 2000). "Ab initio gene finding in Drosophila genomic DNA". Genome Research. 10 (4): 516–22. doi:10.1101/gr.10.4.516. PMC 310882. PMID 10779491.
[13] Schiex T, Gouzy J, Moisan A, de Oliveira Y (July 2003). "FrameD: A flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences". Nucleic Acids Research. 31 (13): 3738–41. doi:10.1093/nar/gkg610. PMC 169016. PMID 12824407.
[14] Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F (May 2016). "Using intron position conservation for homology-based gene prediction". Nucleic Acids Research. 44 (9): e89. doi:10.1093/nar/gkw092. PMC 4872089. PMID 26893356.
[15] Keilwagen J, Hartung F, Paulini M, Twardziok SO, Grau J (May 2018). "Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi". BMC Bioinformatics. 19 (1): 189. doi:10.1186/s12859-018-2203-5. PMC 5975413. PMID 29843602.
[16] Yabuki, Yukimitsu; Mukai, Yuri; Swindells, Mark B.; Suwa, Makiko (2004-03-01). "GENIUS II: a high-throughput database system for linking ORFs in complete genomes to known protein three-dimensional structures". Bioinformatics. 20 (4): 596–598. doi:10.1093/bioinformatics/btg478. ISSN 1367-4803. PMID 14751990.
[17] Blanco, Enrique; Parra, Genís; Guigó, Roderic (June 2007), "Using geneid to Identify Genes", Current Protocols in Bioinformatics, Chapter 4, John Wiley & Sons, Inc.: 4.3.1–4.3.28, doi:10.1002/0471250953.bi0403s18, ISBN 978-0471250951, PMID 18428791
[18] Snyder, Eric E.; Stormo, Gary D. (1995-04-21). "Identification of Protein Coding Regions In Genomic DNA". Journal of Molecular Biology. 248 (1): 1–18. doi:10.1006/jmbi.1995.0198. ISSN 0022-2836. PMID 7731036.
[19] Lukashin AV, Borodovsky M (February 1998). "GeneMark.hmm: new solutions for gene finding". Nucleic Acids Research. 26 (4): 1107–15. doi:10.1093/nar/26.4.1107. PMC 147337. PMID 9461475.
[20] Besemer J, Lomsadze A, Borodovsky M (June 2001). "GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions". Nucleic Acids Research. 29 (12): 2607–18. doi:10.1093/nar/29.12.2607. PMC 55746. PMID 11410670.
[21] Lomsadze A, Burns PD, Borodovsky M (September 2014). "Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm". Nucleic Acids Research. 42 (15): e119. doi:10.1093/nar/gku557. PMC 4150757. PMID 24990371.
[22] Zhu W, Lomsadze A, Borodovsky M (July 2010). "Ab initio gene identification in metagenomic sequences". Nucleic Acids Research. 38 (12): e132. doi:10.1093/nar/gkq275. PMC 2896542. PMID 20403810.
[23] Antonov I, Borodovsky M (June 2010). "Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm". Journal of Bioinformatics and Computational Biology. 8 (3): 535–51. doi:10.1142/S0219720010004847. PMID 20556861.
[24] Yeh, Ru-Fang; Lim, Lee P.; Burge, Christopher B. (2001-05-01). "Computational Inference of Homologous Gene Structures in the Human Genome". Genome Research. 11 (5): 803–816. doi:10.1101/gr.175701. ISSN 1088-9051. PMC 311055. PMID 11337476.
[25] Burge, Chris; Karlin, Samuel (1997-04-25). "Prediction of complete gene structures in human genomic DNA". Journal of Molecular Biology. 268 (1): 78–94. doi:10.1006/jmbi.1997.0951. ISSN 0022-2836. PMID 9149143.
[26] Burge, Christopher B. (1998-01-01), Salzberg, Steven L.; Searls, David B.; Kasif, Simon (eds.), "Chapter 8 - Modeling dependencies in pre-mRNA splicing signals", New Comprehensive Biochemistry, Computational Methods in Molecular Biology, vol. 32, Elsevier, pp. 129–164, doi:10.1016/S0167-7306(08)60465-2, ISBN 978-0-444-82875-0, retrieved 2021-11-24
[27] Burge, Christopher B; Karlin, Samuel (1998-06-01). "Finding the genes in genomic DNA". Current Opinion in Structural Biology. 8 (3): 346–354. doi:10.1016/S0959-440X(98)80069-9. ISSN 0959-440X. PMID 9666331.
[28] Delcher, Arthur L.; Bratke, Kirsten A.; Powers, Edwin C.; Salzberg, Steven L. (2007-01-19). "Identifying bacterial genes and endosymbiont DNA with Glimmer". Bioinformatics. 23 (6): 673–679. doi:10.1093/bioinformatics/btm009. ISSN 1460-2059. PMC 2387122. PMID 17237039.
[29] Delcher, A. (1999-12-01). "Improved microbial gene identification with GLIMMER". Nucleic Acids Research. 27 (23): 4636–4641. doi:10.1093/nar/27.23.4636. ISSN 1362-4962. PMC 148753. PMID 10556321.
[30] Salzberg, S. L.; Delcher, A. L.; Kasif, S.; White, O. (1998-01-01). "Microbial gene identification using interpolated Markov models". Nucleic Acids Research. 26 (2): 544–548. doi:10.1093/nar/26.2.544. ISSN 0305-1048. PMC 147303. PMID 9421513.
[31] Majoros WH, Pertea M, Salzberg SL (November 2004). "TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders". Bioinformatics. 20 (16): 2878–9. doi:10.1093/bioinformatics/bth315. PMID 15145805.
[32] Uberbacher, Edward C.; Hyatt, Doug; Shah, Manesh (2004). "GrailEXP and Genome Analysis Pipeline for Genome Annotation". Current Protocols in Bioinformatics. 8 (1): 4.9.1–4.9.15. doi:10.1002/0471250953.bi0409s04. ISSN 1934-340X. PMID 18428726.
[33] Uberbacher, Edward C.; Hyatt, Doug; Shah, Manesh (2003). "GrailEXP and Genome Analysis Pipeline for Genome Annotation". Current Protocols in Human Genetics. 39 (1): 6.5.1–6.5.15. doi:10.1002/0471142905.hg0605s39. ISSN 1934-8258. PMID 18428363. S2CID 21431978.
[34] Schweikert G, Zien A, Zeller G, Behr J, Dieterich C, Ong CS, et al. (November 2009). "mGene: accurate SVM-based gene finding with an application to nematode genomes". Genome Research. 19 (11): 2133–43. doi:10.1101/gr.090597.108. PMC 2775605. PMID 19564452.
[35] Gan X, Stegle O, Behr J, Steffen JG, Drewe P, Hildebrand KL, et al. (August 2011). "Multiple reference genomes and transcriptomes for Arabidopsis thaliana". Nature. 477 (7365): 419–23. Bibcode:2011Natur.477..419G. doi:10.1038/nature10414. PMC 4856438. PMID 21874022.
[36] "MORGAN". sites.stat.washington.edu. Retrieved 2021-11-24.
[37] Bedő, Justin; Di Stefano, Leon; Papenfuss, Anthony T (November 2020). "Unifying package managers, workflow engines, and containers: Computational reproducibility with BioNix". GigaScience. 9 (11). doi:10.1093/gigascience/giaa121. ISSN 2047-217X. PMC 7672450. PMID 33205815.
[38] Reese, Martin G (2001-12-01). "Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome". Computers & Chemistry. 26 (1): 51–56. doi:10.1016/S0097-8485(01)00099-7. ISSN 0097-8485. PMID 11765852.
[39] Reese, Martin G.; Eeckman, Frank H.; Kulp, David; Haussler, David (1997-01-01). "Improved Splice Site Detection in Genie". Journal of Computational Biology. 4 (3): 311–323. doi:10.1089/cmb.1997.4.311. PMID 9278062.
[40] "Home - ORFfinder - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-11-24.
[41] Santana-Garcia, Walter; Rocha-Acevedo, Maria; Ramirez-Navarro, Lucia; Mbouamboua, Yvon; Thieffry, Denis; Thomas-Chollier, Morgane; Contreras-Moreira, Bruno; van Helden, Jacques; Medina-Rivera, Alejandra (2019-01-01). "RSAT variation-tools: An accessible and flexible framework to predict the impact of regulatory variants on transcription factor binding". Computational and Structural Biotechnology Journal. 17: 1415–1428. doi:10.1016/j.csbj.2019.09.009. ISSN 2001-0370. PMC 6906655. PMID 31871587.
[42] Nguyen, Nga Thi Thuy; Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Santana-Garcia, Walter; Ossio, Raul; Robles-Espinoza, Carla Daniela; Bahin, Mathieu; Collombet, Samuel; Vincens, Pierre; Thieffry, Denis; van Helden, Jacques (2018-05-02). "RSAT 2018: regulatory sequence analysis tools 20th anniversary". Nucleic Acids Research. 46 (W1): W209–W214. doi:10.1093/nar/gky317. ISSN 0305-1048. PMC 6030903. PMID 29722874.
[43] McNair, Katelyn; Zhou, Carol; Dinsdale, Elizabeth A.; Souza, Brian; Edwards, Robert A. (2019-11-01). "PHANOTATE: a novel approach to gene identification in phage genomes". Bioinformatics. 35 (22): 4537–4542. doi:10.1093/bioinformatics/btz265. ISSN 1367-4803. PMC 6853651. PMID 31329826.
[44] Brendel, V.; Xing, L.; Zhu, W. (2004-02-05). "Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus". Bioinformatics. 20 (7): 1157–1169. doi:10.1093/bioinformatics/bth058. ISSN 1367-4803. PMID 14764557.
[45] Henderson, John; Salzberg, Steven; Fasman, Kenneth H. (1997-01-01). "Finding Genes in DNA with a Hidden Markov Model". Journal of Computational Biology. 4 (2): 127–141. doi:10.1089/cmb.1997.4.127. hdl:1903/8004. PMID 9228612.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
