List of phylogenetics software

This list of phylogenetics software is a compilation of computational phylogenetics software used to produce phylogenetic trees. Such tools are commonly used in comparative genomics, cladistics, and bioinformatics. Methods for estimating phylogenies include neighbor-joining, maximum parsimony (also simply referred to as parsimony), unweighted pair group method with arithmetic mean (UPGMA), Bayesian phylogenetic inference, maximum likelihood, and distance matrix methods.

List

| Name | Description | Methods | Author |
| --- | --- | --- | --- |
| ADMIXTOOLS [1] | R software package that contains the qpGraph, qpAdm, qpWave, and qpDstat programs | | Nick Patterson, David Reich |
| AncesTree [2] | An algorithm for clonal tree reconstruction from multi-sample cancer sequencing data | Maximum likelihood, integer linear programming (ILP) | M. El-Kebir, L. Oesper, H. Acheson-Field, B. J. Raphael |
| AliGROOVE [3] | Visualisation of heterogeneous sequence divergence within multiple sequence alignments and detection of inflated branch support | Identification of single taxa which show predominately randomized sequence similarity in comparison with other taxa in a multiple sequence alignment and evaluation of the reliability of node support in a given topology | Patrick Kück, Sandra A. Meid, Christian Groß, Bernhard Misof, Johann Wolfgang Wägele |
| ape [4] | R-Project package for analysis of phylogenetics and evolution | Provides a large variety of phylogenetics functions | Maintainer: Emmanuel Paradis |
| Armadillo Workflow Platform [5] | Workflow platform dedicated to phylogenetic and general bioinformatic analysis | Inference of phylogenetic trees using distance, maximum likelihood, maximum parsimony, Bayesian methods and related workflows | E. Lord, M. Leclercq, A. Boc, A.B. Diallo and V. Makarenkov |
| BAli-Phy [6] | Simultaneous Bayesian inference of alignment and phylogeny | Bayesian inference, alignment as well as tree search | M.A. Suchard, B.D. Redelings |
| BATWING [7] | Bayesian Analysis of Trees With Internal Node Generation | Bayesian inference, demographic history, population splits | I.J. Wilson, M.E. Weale, D.J. Balding |
| BayesPhylogenies [8] | Bayesian inference of trees using Markov chain Monte Carlo methods | Bayesian inference, multiple models, mixture model (auto-partitioning) | M. Pagel, A. Meade |
| BayesTraits [9] | Analyses trait evolution among groups of species for which a phylogeny or sample of phylogenies is available | Trait analysis | M. Pagel, A. Meade |
| BEAST [10] | Bayesian Evolutionary Analysis Sampling Trees | Bayesian inference, relaxed molecular clock, demographic history | A.J. Drummond, M.A. Suchard, D. Xie & A. Rambaut |
| BioNumerics | Universal platform for the management, storage and analysis of all types of biological data, including tree and network inference of sequence data | Neighbor-joining, maximum parsimony, UPGMA, maximum likelihood, distance matrix methods, ... Calculation of the reliability of trees/branches using bootstrapping, permutation resampling or error resampling | L. Vauterin & P. Vauterin |
| Bosque | Integrated graphical software to perform phylogenetic analyses, from the importing of sequences to the plotting and graphical editing of trees and alignments | Distance and maximum likelihood methods (through PhyML, PHYLIP, Tree-Puzzle) | S. Ramirez, E. Rodriguez |
| BUCKy | Bayesian concordance of gene trees | Bayesian concordance using modified greedy consensus of unrooted quartets | C. Ané, B. Larget, D.A. Baum, S.D. Smith, A. Rokas and B. Larget, S.K. Kotha, C.N. Dewey, C. Ané |
| Canopy [11] | Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing | Maximum likelihood, Markov chain Monte Carlo (MCMC) methods | Y. Jiang, Y. Qiu, A.J. Minn, and N.R. Zhang |
| CGRphylo [12] | CGR method for accurate classification and tracking of rapidly evolving viruses | Chaos Game Representation (CGR) method, based on concepts of statistical physics | Amarinder Singh Thind, Somdatta Sinha |
| CITUP | Clonality Inference in Tumors Using Phylogeny | Exhaustive search, quadratic integer programming (QIP) | S. Malikic, A.W. McPherson, N. Donmez, C.S. Sahinalp |
| ClustalW | Progressive multiple sequence alignment | Distance matrix/nearest neighbor | Thompson et al. [13] |
| CoalEvol | Simulation of DNA and protein evolution along phylogenetic trees (that can also be simulated with the coalescent) | Simulation of multiple sequence alignments of DNA or protein sequences | M. Arenas, D. Posada |
| CodABC | Coestimation of substitution, recombination and dN/dS in protein sequences | Approximate Bayesian computation | M. Arenas, J.S. Lopes, M.A. Beaumont, D. Posada |
| Dendroscope [14] | Tool for visualizing rooted trees and calculating rooted networks | Rooted trees, tanglegrams, consensus networks, hybridization networks | Daniel Huson et al. |
| EXACT [15] [16] | Based on the perfect phylogeny model; uses a very fast homotopy algorithm to evaluate the fitness of different trees, then brute-forces the tree search using GPUs or multiple CPUs, on the same or on different machines | Brute force search and homotopy algorithm | B. Jia, S. Ray, S. Safavi, J. Bento |
| EzEditor [17] | Java-based sequence alignment editor for rRNA and protein-coding genes. It allows manipulation of both DNA and protein sequence alignments for phylogenetic analysis | Neighbor-joining | Y.S. Jeon et al. |
| fastDNAml | Optimized maximum likelihood (nucleotides only) | Maximum likelihood | G.J. Olsen |
| FastTree 2 [18] | Fast phylogenetic inference for alignments with up to hundreds of thousands of sequences | Approximate maximum likelihood | M.N. Price, P.S. Dehal, A.P. Arkin |
| fitmodel | Fits branch-site codon models without the need of prior knowledge of clades undergoing positive selection | Maximum likelihood | S. Guindon |
| Geneious | Genome and proteome research tools | Neighbor-joining, UPGMA, MrBayes plugin, PhyML plugin, RAxML plugin, FastTree plugin, GARLi plugin, PAUP* plugin | A.J. Drummond, M. Suchard, V. Lefort et al. |
| HyPhy | Hypothesis testing using phylogenies | Maximum likelihood, neighbor-joining, clustering techniques, distance matrices | S.L. Kosakovsky Pond, S.D.W. Frost, S.V. Muse |
| INDELible [19] | Simulation of DNA/protein sequence evolution | Simulation | W. Fletcher, Z. Yang |
| IQPNNI | Iterative ML tree search with stopping rule | Maximum likelihood, neighbor-joining | L.S. Vinh, A. von Haeseler, B.Q. Minh |
| IQ-Tree [20] | An efficient phylogenomic software by maximum likelihood, as successor of IQPNNI and Tree-Puzzle | Maximum likelihood, model selection, partitioning scheme finding, AIC, AICc, BIC, ultrafast bootstrapping, [21] branch tests, tree topology tests, likelihood mapping | Lam-Tung Nguyen, O. Chernomor, H.A. Schmidt, A. von Haeseler, B.Q. Minh |
| jModelTest 2 | A high-performance computing program to carry out statistical selection of best-fit models of nucleotide substitution | Maximum likelihood, AIC, BIC, DT, hLRT, dLRT | D. Darriba, G.L. Taboada, R. Doallo, D. Posada |
| JolyTree [22] [23] | An alignment-free bioinformatics procedure to infer distance-based phylogenetic trees from genome assemblies, specifically designed to quickly infer trees from genomes belonging to the same genus | MinHash-based pairwise genome distance, Balanced Minimum Evolution (BME), ratchet-based BME tree search, Rate of Elementary Quartets | A. Criscuolo |
| LisBeth | Three-item analysis for phylogenetics and biogeography | Three-item analysis | J. Ducasse, N. Cao & R. Zaragüeta-Bagils |
| MEGA | Molecular Evolutionary Genetics Analysis | Distance, parsimony and maximum composite likelihood methods | K. Tamura, J. Dudley, M. Nei & S. Kumar |
| MegAlign Pro | Part of DNASTAR's Lasergene Molecular Biology package. Performs multiple and pairwise sequence alignments, provides alignment editing, and generates phylogenetic trees | Maximum likelihood (RAxML) and neighbor-joining | DNASTAR |
| Mesquite | Software for evolutionary biology, designed to help biologists analyze comparative data about organisms. Its emphasis is on phylogenetic analysis, but some of its modules concern comparative analyses or population genetics, while others do non-phylogenetic multivariate analysis. With some optional modules, it can also be used to build timetrees incorporating a geological timescale | Maximum parsimony, distance matrix, maximum likelihood | Wayne Maddison and D.R. Maddison |
| MetaPIGA2 | Maximum likelihood phylogeny inference multi-core program for DNA and protein sequences, and morphological data. Analyses can be performed using an extensive and user-friendly graphical interface or by using batch files. It also implements tree visualization tools, ancestral sequences, and automated selection of best substitution model and parameters | Maximum likelihood, stochastic heuristics (genetic algorithm, metapopulation genetic algorithm, simulated annealing, etc.), discrete Gamma rate heterogeneity, ancestral state reconstruction, model testing | Michel C. Milinkovitch and Raphaël Helaers |
| MicrobeTrace | Free, browser-based web application | 2D and 3D network visualization tool, neighbor-joining tree visualization, Gantt charts, bubble charts, networks visualized on maps, flow diagrams, aggregate tables, epi curves, histograms, alignment viewer, and others | Ellsworth M. Campbell, Anthony Boyles, Anupama Shankar, Jay Kim, Sergey Knyazev, Roxana Cintron, William M. Switzer [24] |
| MNHN-Tree-Tools | Open-source phylogenetic inference software working on nucleic and protein sequences | Clustering of DNA or protein sequences and phylogenetic tree inference from a set of sequences. At the core it employs a distance-density based approach | Thomas Haschka, Loïc Ponger, Christophe Escudé and Julien Mozziconacci [25] |
| Modelgenerator | Model selection (protein or nucleotide) | Maximum likelihood | Thomas Keane |
| MOLPHY | Molecular phylogenetics (protein or nucleotide) | Maximum likelihood | J. Adachi and M. Hasegawa |
| MorphoBank | Web application to organize trait data (morphological characters) for tree building | For use with maximum parsimony (via the CIPRES portal), maximum likelihood, and Bayesian analysis | M.A. O'Leary and S. Kaufman, [26] also K. Alphonse |
| MrBayes | Posterior probability estimation | Bayesian inference | J. Huelsenbeck et al. [27] |
| Network | Free phylogenetic network software | Median joining, reduced median, Steiner network | A. Roehl |
| Nona | Phylogenetic inference | Maximum parsimony, implied weighting, ratchet | P. Goloboff |
| PAML | Phylogenetic analysis by maximum likelihood | Maximum likelihood and Bayesian inference | Z. Yang |
| ParaPhylo [28] | Computation of gene and species trees based on event-relations (orthology, paralogy) | Cograph-Editing and Triple-Inference | Hellmuth |
| PartitionFinder | Combined selection of models of molecular evolution and partitioning schemes for DNA and protein alignments | Maximum likelihood, AIC, AICc, BIC | R. Lanfear, B. Calcott, S.Y.W. Ho, S. Guindon |
| PASTIS | R package for phylogenetic assembly | R, two-stage Bayesian inference using MrBayes 3.2 | Thomas et al. 2013 [29] |
| PAUP* | Phylogenetic analysis using parsimony (*and other methods) | Maximum parsimony, distance matrix, maximum likelihood | D. Swofford |
| phangorn [30] | Phylogenetic analysis in R | ML, MP, distance matrix, bootstrap, phylogenetic networks, model selection, SH-test, SOWH-test | Maintainer: K. Schliep |
| Phybase [31] | An R package for species tree analysis | Phylogenetics functions, STAR, NJst, STEAC, maxtree, etc. | L. Liu & L. Yu |
| phyclust | Phylogenetic Clustering (Phyloclustering) | Maximum likelihood of finite mixture models | Wei-Chen Chen |
| PHYLIP | PHYLogeny Inference Package | Maximum parsimony, distance matrix, maximum likelihood | J. Felsenstein |
| phyloT | Generates phylogenetic trees in various formats, based on NCBI taxonomy | none | I. Letunic |
| PhyloQuart | Quartet implementation (uses sequences or distances) | Quartet method | V. Berry |
| PhyloWGS | Reconstructing subclonal composition and evolution from whole-genome sequencing of tumors | MCMC | A.G. Deshwar, S. Vembu, C.K. Yung, G.H. Jang, L. Stein, and Q. Morris |
| PhyML [32] | Fast and accurate estimation of phylogenies using maximum likelihood | Maximum likelihood | S. Guindon & O. Gascuel |
| phyx [33] | Unix/Linux command line phylogenetic tools | Explore, manipulate, analyze, and simulate phylogenetic objects (alignments, trees, and MCMC logs) | J.W. Brown, J.F. Walker, and S.A. Smith |
| POY | A phylogenetic analysis program that supports multiple kinds of data and can perform alignment and phylogeny inference. A variety of heuristic algorithms have been developed for this purpose | Maximum parsimony, maximum likelihood, chromosome rearrangement, discrete characters, continuous characters, alignment | A. Varon, N. Lucaroni, L. Hong, W. Wheeler |
| ProtASR2 [34] | Ancestral reconstruction of protein sequences accounting for folding stability | Maximum likelihood, substitution models | M. Arenas, U. Bastolla |
| ProtEvol | Simulation of protein sequences under structurally constrained substitution models | Simulating sequences, substitution models | M. Arenas, A. Sanchez-Cobos, U. Bastolla |
| ProteinEvolver | Simulation of protein sequences along phylogenies under empirical and structurally constrained substitution models of protein evolution | Simulating sequences forward in time, substitution models | M. Arenas, H.G. Dos Santos, D. Posada, U. Bastolla |
| ProteinEvolverABC [35] | Coestimation of recombination and substitution rates in protein sequences | Approximate Bayesian computation | M. Arenas |
| ProteinModelerABC [36] | Selection among site-dependent structurally constrained substitution models of protein evolution | Approximate Bayesian computation | D. Ferreiro et al. |
| ProtTest3 | A high-performance computing program for selecting the model of protein evolution that best fits a given set of aligned sequences | Maximum likelihood, AIC, BIC, DT | D. Darriba, G.L. Taboada, R. Doallo, D. Posada |
| PyCogent | Software library for genomic biology | Simulating sequences, alignment, controlling third party applications, workflows, querying databases, generating graphics and phylogenetic trees | Knight et al. |
| QuickTree | Tree construction optimized for efficiency | Neighbor-joining | K. Howe, A. Bateman, R. Durbin |
| RAxML-HPC | Randomized Axelerated Maximum Likelihood for High Performance Computing (nucleotides and amino acids) | Maximum likelihood, simple maximum parsimony | A. Stamatakis |
| RAxML-NG [37] | Randomized Axelerated Maximum Likelihood for High Performance Computing (nucleotides and amino acids), Next Generation | Maximum likelihood, simple maximum parsimony | A. Kozlov, D. Darriba, T. Flouri, B. Morel, A. Stamatakis |
| SEMPHY | Tree reconstruction using the combined strengths of maximum likelihood (accuracy) and neighbor-joining (speed). SEMPHY has become outdated. The authors now refer users to RAxML, which is superior in accuracy and speed | A hybrid maximum-likelihood/neighbor-joining method | M. Ninio, E. Privman, T. Pupko, N. Friedman |
| SGWE | Simulation of genome-wide evolution along phylogenetic trees | Simulating genome-wide sequences forward in time | M. Arenas, D. Posada |
| SimPlot++ [38] | Sequence similarity plots (SimPlots [39]), detection of intragenic and intergenic recombination events, bootscan analysis [40] and sequence similarity networks | SimPlot using different nucleotide/protein distance models; Phi, χ2 and NSS recombination tests; sequence similarity network analysis | S. Samson, E. Lord, V. Makarenkov |
| sowhat [41] | Hypothesis testing | SOWH test | Church, Ryan, Dunn |
| Splatche3 [42] | Simulation of genetic data under diverse spatially explicit evolutionary scenarios | Coalescent, molecular evolution, DNA sequences, SNPs, STRs, RFLPs | M. Currat et al. |
| SplitsTree [43] | Tree and network program | Computation, visualization and exploration of phylogenetic trees and networks | D.H. Huson and D. Bryant |
| TNT | Phylogenetic inference | Parsimony, weighting, ratchet, tree drift, tree fusing, sectorial searches | P. Goloboff et al. |
| TOPALi | Phylogenetic inference | Phylogenetic model selection, Bayesian analysis and maximum likelihood phylogenetic tree estimation, detection of sites under positive selection, and recombination breakpoint location analysis | Iain Milne, Dominik Lindner et al. |
| TreeGen | Tree construction given precomputed distance data | Distance matrix | ETH Zurich |
| TreeAlign | Efficient hybrid method | Distance matrix and approximate parsimony | J. Hein |
| TreeLine | Tree construction algorithm within the DECIPHER package for R | Maximum likelihood, maximum parsimony, and distance | E. Wright |
| Treefinder [44] | Fast ML tree reconstruction, bootstrap analysis, model selection, hypothesis testing, tree calibration, tree manipulation and visualization, computation of sitewise rates, sequence simulation, many models of evolution (DNA, protein, rRNA, mixed protein, user-definable), GUI and scripting language | Maximum likelihood, distances, and others | G. Jobb, A. von Haeseler, K. Strimmer |
| Tree-Puzzle [45] [46] | Maximum likelihood and statistical analysis | Maximum likelihood | Makarenkov |
| T-REX (Webserver) [47] | Tree inference and visualization, horizontal gene transfer detection, multiple sequence alignment | Distance (neighbor-joining), parsimony and maximum likelihood (PhyML, RAxML) tree inference, MUSCLE, MAFFT and ClustalW sequence alignments and related applications | A. Boc, A.B. Diallo, V. Makarenkov |
| UShER [48] | Phylogenetic placement using maximum parsimony for viral genomes | Maximum parsimony | Y. Turakhia, B. Thornlow, A.S. Hinrichs, N. De Maio, L. Gozashti, R. Lanfear, D. Haussler and R. Corbett-Detig |
| UGENE | Fast and free multiplatform tree editor | GUI with PHYLIP 3.6 and IQTree algorithms | Unipro |
| VeryFastTree [49] | A highly tuned tool that uses parallelizing and vectorizing strategies to speed inference of phylogenies for huge alignments | Approximate maximum likelihood | César Piñeiro, José M. Abuín and Juan C. Pichel |
| Winclada | GUI and tree editor (requires Nona) | Maximum parsimony, ratchet | K. Nixon |
| Xrate | Phylo-grammar engine | Rate estimation, branch length estimation, alignment annotation | I. Holmes |

Related Research Articles

In biology, phylogenetics is the study of the evolutionary history of life using genetics; the process of reconstructing that history is known as phylogenetic inference. It establishes the relationships between organisms from empirical data on observed heritable traits: DNA sequences, protein amino acid sequences, and morphology. The result is a phylogenetic tree, a diagram depicting the hypothesized relationships between organisms and their evolutionary history.

A phylogenetic tree, phylogeny or evolutionary tree is a graphical representation which shows the evolutionary history between a set of species or taxa during a specific time. In other words, it is a branching diagram or a tree showing the evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical or genetic characteristics. In evolutionary biology, all life on Earth is theoretically part of a single phylogenetic tree, indicating common ancestry. Phylogenetics is the study of phylogenetic trees. The main challenge is to find a phylogenetic tree representing optimal evolutionary ancestry between a set of species or taxa. Computational phylogenetics focuses on the algorithms involved in finding the optimal phylogenetic tree in the phylogenetic landscape.

The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA or RNA, or amino acid sequences for proteins.
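The arithmetic behind a strict molecular clock can be sketched in a few lines. This is a minimal illustration, assuming a constant substitution rate on both lineages; the function name and the example numbers are hypothetical and do not come from any package in the list.

```python
def divergence_time(distance: float, rate: float) -> float:
    """Estimate the time since two lineages split under a strict clock.

    distance: substitutions per site separating the two sequences.
    rate: substitutions per site per year along a single lineage (assumed constant).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate changes independently after the split,
    # so the observed distance corresponds to 2 * rate * time.
    return distance / (2.0 * rate)


# Hypothetical inputs: 2% sequence divergence, 1e-9 substitutions/site/year.
years = divergence_time(0.02, 1e-9)
```

With these made-up inputs the estimate comes out at roughly ten million years. Relaxed-clock methods, such as those listed for BEAST, drop the constant-rate assumption.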

Substitution model: model of changes in a sequence over evolutionary time

In biology, substitution models, also called models of sequence evolution, are Markov models that describe changes over evolutionary time. These models describe evolutionary changes in macromolecules, such as DNA sequences or protein sequences, that can be represented as a sequence of symbols. Substitution models are used to calculate the likelihood of phylogenetic trees using multiple sequence alignment data. Thus, substitution models are central to maximum likelihood estimation of phylogeny as well as Bayesian inference in phylogeny. Estimates of evolutionary distances are typically calculated using substitution models. Substitution models are also central to phylogenetic invariants because they are necessary to predict site pattern frequencies given a tree topology. Substitution models are also necessary to simulate sequence data for a group of organisms related by a specific tree.
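As an illustration of how a substitution model turns observed differences into an evolutionary distance, the sketch below uses the Jukes-Cantor (JC69) model, chosen here only because it is the simplest nucleotide model. The helper names and the toy sequences are assumptions made for this example.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p: float) -> float:
    """JC69 corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment: one mismatch in eight sites.
p = p_distance("ACGTACGT", "ACGTACGA")
d = jc69_distance(p)
```

The corrected distance is always at least as large as the raw proportion of differences, because the model accounts for multiple substitutions at the same site.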

Phylogenomics is the intersection of the fields of evolution and genomics. The term has been used in multiple ways to refer to analysis that involves genome data and evolutionary reconstructions. It is a group of techniques within the larger fields of phylogenetics and genomics. Phylogenomics draws information by comparing entire genomes, or at least large portions of genomes. Phylogenetics compares and analyzes the sequences of single genes, or a small number of genes, as well as many other types of data. Four major areas fall under phylogenomics:

A phylogenetic network is any graph used to visualize evolutionary relationships between nucleotide sequences, genes, chromosomes, genomes, or species. They are employed when reticulation events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved. They differ from phylogenetic trees by the explicit modeling of richly linked networks, by means of the addition of hybrid nodes instead of only tree nodes. Phylogenetic trees are a subset of phylogenetic networks. Phylogenetic networks can be inferred and visualised with software such as SplitsTree, the R package phangorn, and, more recently, Dendroscope. A standard format for representing phylogenetic networks is a variant of Newick format which is extended to support networks as well as trees.

Computational phylogenetics, phylogeny inference, or phylogenetic inference focuses on computational and optimization algorithms, heuristics, and approaches involved in phylogenetic analyses. The goal is to find a phylogenetic tree representing optimal evolutionary ancestry between a set of genes, species, or taxa. Maximum likelihood, parsimony, Bayesian, and minimum evolution are typical optimality criteria used to assess how well a phylogenetic tree topology describes the sequence data. Nearest Neighbour Interchange (NNI), Subtree Prune and Regraft (SPR), and Tree Bisection and Reconnection (TBR), known as tree rearrangements, are deterministic algorithms used to search for the optimal phylogenetic tree. The space searched for the optimal phylogenetic tree is known as the phylogeny search space.
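To give a sense of why heuristic rearrangements are needed, the sketch below counts the unrooted, fully resolved tree topologies for a given number of taxa using the standard formula (2n − 5)!!. The function name is illustrative only.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted binary tree topologies for n_taxa labelled tips.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5), valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

Four taxa give 3 topologies and ten taxa already give 2,027,025, so exhaustive evaluation is only practical for very small data sets.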

PHYLogeny Inference Package (PHYLIP) is a free computational phylogenetics package of programs for inferring evolutionary trees (phylogenies). It consists of 65 portable programs, i.e., the source code is written in the programming language C. As of version 3.696, it is licensed as open-source software; versions 3.695 and older were proprietary freeware. Releases occur as source code, and as precompiled executables for many operating systems including Windows, Mac OS 8, Mac OS 9, OS X, Linux, and FreeBSD (from FreeBSD.org). Full documentation is written for all the programs in the package and is included therein. The programs in the PHYLIP package were written by Professor Joseph Felsenstein, of the Department of Genome Sciences and the Department of Biology, University of Washington, Seattle.

Ancestral reconstruction is the extrapolation back in time from measured characteristics of individuals, populations, or species to their common ancestors. It is an important application of phylogenetics, the reconstruction and study of the evolutionary relationships among individuals, populations or species to their ancestors. In the context of evolutionary biology, ancestral reconstruction can be used to recover different kinds of ancestral character states of organisms that lived millions of years ago. These states include the genetic sequence, the amino acid sequence of a protein, the composition of a genome, a measurable characteristic of an organism (phenotype), and the geographic range of an ancestral population or species. This is desirable because it allows us to examine parts of phylogenetic trees corresponding to the distant past, clarifying the evolutionary history of the species in the tree. Since modern genetic sequences are essentially a variation of ancient ones, access to ancient sequences may identify other variations and organisms which could have arisen from those sequences. In addition to genetic sequences, one might attempt to track the changing of one character trait to another, such as fins turning to legs.

Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li at the University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.
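The combination of prior and likelihood can be illustrated with a toy calculation over three candidate topologies. This is only a sketch: the log-likelihood values are invented, the function name is hypothetical, and real programs approximate the posterior by sampling (for example with Markov chain Monte Carlo) because the trees cannot be enumerated.

```python
import math


def tree_posteriors(log_likelihoods: dict, priors: dict) -> dict:
    """Posterior probability of each tree: prior * likelihood, normalized."""
    # Work in log space, since phylogenetic likelihoods are extremely small.
    log_joint = {t: log_likelihoods[t] + math.log(priors[t])
                 for t in log_likelihoods}
    # Subtract the maximum before exponentiating to avoid underflow.
    shift = max(log_joint.values())
    norm = sum(math.exp(v - shift) for v in log_joint.values())
    return {t: math.exp(v - shift) / norm for t, v in log_joint.items()}


# Invented log-likelihoods for the three rooted topologies on taxa A, B, C.
log_lik = {"((A,B),C)": -1000.0, "((A,C),B)": -1002.0, "((B,C),A)": -1003.0}
prior = {tree: 1.0 / 3.0 for tree in log_lik}  # flat prior over topologies
posterior = tree_posteriors(log_lik, prior)
```

With a flat prior the ranking follows the likelihoods, and the posterior probabilities sum to one across the candidate trees.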

TREE-PUZZLE is a computer program used to construct phylogenetic trees from sequence data by maximum likelihood analysis. Branch lengths can be calculated with and without the molecular clock hypothesis.

A supertree is a single phylogenetic tree assembled from a combination of smaller phylogenetic trees, which may have been assembled using different datasets or a different selection of taxa. Supertree algorithms can highlight areas where additional data would most usefully resolve any ambiguities. The input trees of a supertree should behave as samples from the larger tree.

Ziheng Yang FRS is a Chinese biologist. He holds the R.A. Fisher Chair of Statistical Genetics at University College London, and is the Director of R.A. Fisher Centre for Computational Biology at UCL. He was elected a Fellow of the Royal Society in 2006.

T-REX is a freely available web server, developed at the department of Computer Science of the Université du Québec à Montréal, dedicated to the inference, validation and visualization of phylogenetic trees and phylogenetic networks. The T-REX web server allows the users to perform several popular methods of phylogenetic analysis as well as some new phylogenetic applications for inferring, drawing and validating phylogenetic trees and networks.

Bacterial phylodynamics is the study of immunology, epidemiology, and phylogenetics of bacterial pathogens to better understand the evolutionary role of these pathogens. Phylodynamic analysis includes analyzing genetic diversity, natural selection, and population dynamics of infectious disease pathogen phylogenies during pandemics and studying intra-host evolution of viruses. Phylodynamics combines phylogenetic analysis with the study of ecological and evolutionary processes to better understand the mechanisms that drive spatiotemporal incidence and phylogenetic patterns of bacterial pathogens. Bacterial phylodynamics uses genome-wide single-nucleotide polymorphisms (SNPs) to better understand the evolutionary mechanisms of bacterial pathogens. Many phylodynamic studies have been performed on viruses, specifically RNA viruses, which have high mutation rates. The field of bacterial phylodynamics has grown substantially due to the advancement of next-generation sequencing and the amount of data available.

The multispecies coalescent process is a stochastic process model that describes the genealogical relationships for a sample of DNA sequences taken from several species. It represents the application of coalescent theory to the case of multiple species. Under the multispecies coalescent, the relationships among species for an individual gene can differ from the broader history of the species. It has important implications for the theory and practice of phylogenetics and for understanding genome evolution.
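How often a gene tree disagrees with the species history can be made concrete for the simplest case. The sketch below assumes three species with one sampled sequence each and an internal species-tree branch of length T measured in coalescent units; under those assumptions the standard result is that the gene tree matches the species tree with probability 1 − (2/3)e^(−T). The function name is hypothetical.

```python
import math


def concordance_probability(t_coalescent_units: float) -> float:
    """P(gene tree topology matches the species tree) for three species.

    t_coalescent_units: length of the internal species-tree branch,
    in coalescent units (assumed setup: one sampled sequence per species).
    """
    if t_coalescent_units < 0:
        raise ValueError("branch length must be non-negative")
    # With probability exp(-T) the two sister lineages fail to coalesce on the
    # internal branch; then all three topologies are equally likely, and two of
    # those three disagree with the species tree.
    return 1.0 - (2.0 / 3.0) * math.exp(-t_coalescent_units)


for t in (0.0, 0.5, 1.0, 5.0):
    print(t, round(concordance_probability(t), 3))
```

A very short internal branch gives a probability close to one third, meaning the gene tree is barely more informative than chance, while a long branch pushes it towards one.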

Arndt von Haeseler is a German bioinformatician and evolutionary biologist. He is the scientific director of the Max F. Perutz Laboratories at the Vienna Biocenter and a professor of bioinformatics at the University of Vienna and the Medical University of Vienna.

Minimum evolution is a distance method employed in phylogenetics modeling. It shares with maximum parsimony the aspect of searching for the phylogeny that has the shortest total sum of branch lengths.

Phylogenetic reconciliation: technique in evolutionary study

In phylogenetics, reconciliation is an approach to connect the history of two or more coevolving biological entities. The general idea of reconciliation is that a phylogenetic tree representing the evolution of an entity can be drawn within another phylogenetic tree representing an encompassing entity to reveal their interdependence and the evolutionary events that have marked their shared history. The development of reconciliation approaches started in the 1980s, mainly to depict the coevolution of a gene and a genome, and of a host and a symbiont, which can be mutualist, commensalist or parasitic. It has also been used for example to detect horizontal gene transfer, or understand the dynamics of genome evolution.

References

  1. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D (November 2012). "Ancient admixture in human history". Genetics. 192 (3): 1065–93. doi:10.1534/genetics.112.145037. PMC   3522152 . PMID   22960212.
  2. El-Kebir M, Oesper L, Acheson-Field H, Raphael BJ (June 2015). "Reconstruction of clonal trees and tumor composition from multi-sample sequencing data". Bioinformatics. 31 (12): i62-70. doi:10.1093/bioinformatics/btv261. PMC   4542783 . PMID   26072510.
  3. Kück P, Meid SA, Groß C, Wägele JW, Misof B (August 2014). "AliGROOVE--visualization of heterogeneous sequence divergence within multiple sequence alignments and detection of inflated branch support". BMC Bioinformatics. 15 (1): 294. doi: 10.1186/1471-2105-15-294 . PMC   4167143 . PMID   25176556.
  4. Paradis E, Claude J, Strimmer K (January 2004). "APE: Analyses of Phylogenetics and Evolution in R language". Bioinformatics. 20 (2). Oxford, England: 289–90. doi: 10.1093/bioinformatics/btg412 . PMID   14734327.
  5. Lord E, Leclercq M, Boc A, Diallo AB, Makarenkov V (2012). "Armadillo 1.1: an original workflow platform for designing and conducting phylogenetic analysis and simulations". PLOS One . 7 (1): e29903. Bibcode:2012PLoSO...729903L. doi: 10.1371/journal.pone.0029903 . PMC   3256230 . PMID   22253821.
  6. Suchard MA, Redelings BD (August 2006). "BAli-Phy: simultaneous Bayesian inference of alignment and phylogeny". Bioinformatics. 22 (16): 2047–8. doi: 10.1093/bioinformatics/btl175 . PMID   16679334.
  7. Wilson IJ, Weale ME, Balding DJ (June 2003). "Inferences from DNA data: population histories, evolutionary processes and forensic match probabilities". Journal of the Royal Statistical Society, Series A (Statistics in Society). 166 (2): 155–88. doi: 10.1111/1467-985X.00264 .
  8. Pagel M, Meade A (2007), BayesPhylogenies 1.0. Software distributed by the authors.
  9. Pagel M, Meade A (2007). "BayesTraits. Computer program and documentation". pp. 1216–23.[ permanent dead link ]
  10. Drummond A, Suchard MA, Xie D, Rambaut A (2012). "Bayesian phylogenetics with BEAUti and the BEAST 1.7". Molecular Biology and Evolution. 29 (8): 1969–1973. doi:10.1093/molbev/mss075. PMC   3408070 . PMID   22367748.
  11. Jiang Y, Qiu Y, Minn AJ, Zhang NR (September 2016). "Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing". Proceedings of the National Academy of Sciences of the United States of America. 113 (37): E5528-37. Bibcode:2016PNAS..113E5528J. doi: 10.1073/pnas.1522203113 . PMC   5027458 . PMID   27573852.
  12. Thind, Amarinder Singh; Sinha, Somdatta (2023). "Using Chaos-Game-Representation for Analysing the SARS-CoV-2 Lineages, Newly Emerging Strains and Recombinants". Current Genomics. 24 (3): 187–195. doi:10.2174/0113892029264990231013112156. PMC   10761335 . PMID   38178984. S2CID   264500732.
  13. Thompson, Julie D.; Gibson, Toby J.; Higgins, Des G. (August 2002). "Multiple sequence alignment using ClustalW and ClustalX". Current Protocols in Bioinformatics. Chapter 2: 2.3.1–2.3.22. doi:10.1002/0471250953.bi0203s00. ISSN   1934-340X. PMID   18792934. S2CID   34156490.
  14. Huson DH, Scornavacca C (December 2012). "Dendroscope 3: an interactive tool for rooted phylogenetic trees and networks". Systematic Biology. 61 (6): 1061–7. doi: 10.1093/sysbio/sys062 . PMID   22780991.
  15. Jia B, Ray S, Safavi S, Bento J (2018). "Efficient Projection onto the Perfect Phylogeny Model". In Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds.). Advances in Neural Information Processing Systems 31 (NeurIPS 2018). pp. 4108–4118.
  16. Ray S, Jia B, Safavi S, Opijnen T, Isberg R, Rosch J, Bento J (2019). "Exact inference under the perfect phylogeny model". arXiv: 1908.08623 .
  17. Jeon YS, Lee K, Park SC, Kim BS, Cho YJ, Ha SM, Chun J (February 2014). "EzEditor: a versatile sequence alignment editor for both rRNA- and protein-coding genes". International Journal of Systematic and Evolutionary Microbiology. 64 (Pt 2): 689–91. doi:10.1099/ijs.0.059360-0. PMID   24425826.
  18. Price MN, Dehal PS, Arkin AP (March 2010). "FastTree 2 – approximately maximum-likelihood trees for large alignments". PLOS One . 5 (3): e9490. Bibcode:2010PLoSO...5.9490P. doi: 10.1371/journal.pone.0009490 . PMC   2835736 . PMID   20224823.
  19. Fletcher, William; Yang, Ziheng (2009-08-01). "INDELible: A Flexible Simulator of Biological Sequence Evolution". Molecular Biology and Evolution. 26 (8): 1879–1888. doi:10.1093/molbev/msp098. ISSN   0737-4038. PMC   2712615 . PMID   19423664.
  20. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ (January 2015). "IQ-Tree: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies". Molecular Biology and Evolution. 32 (1): 268–74. doi:10.1093/molbev/msu300. PMC   4271533 . PMID   25371430.
  21. Minh BQ, Nguyen MA, von Haeseler A (May 2013). "Ultrafast approximation for phylogenetic bootstrap". Molecular Biology and Evolution. 30 (5): 1188–95. doi:10.1093/molbev/mst024. PMC   3670741 . PMID   23418397.
  22. Criscuolo A (June 2019). "A fast alignment-free bioinformatics procedure to infer accurate distance-based phylogenetic trees from genome assemblies". Research Ideas and Outcomes. 5: e36178. doi: 10.3897/rio.5.e36178 . S2CID   196180156.
  23. Criscuolo A (November 2020). "On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference". F1000Research. 9: 1309. doi: 10.12688/f1000research.26930.1 . PMC   7713896 . PMID   33335719.
  24. Campbell, Ellsworth M.; Boyles, Anthony; Shankar, Anupama; Kim, Jay; Knyazev, Sergey; Cintron, Roxana; Switzer, William M. (2021-09-07). "MicrobeTrace: Retooling molecular epidemiology for rapid public health response". PLOS Computational Biology. 17 (9): e1009300. Bibcode:2021PLSCB..17E9300C. doi: 10.1371/journal.pcbi.1009300 . ISSN   1553-7358. PMC   8491948 . PMID   34492010.
  25. Haschka, Thomas; Ponger, Loic; Escudé, Christophe; Mozziconacci, Julien (2021-06-08). "MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences". Bioinformatics. 37 (21): 3947–3949. doi:10.1093/bioinformatics/btab430. ISSN   1367-4803. PMID   34100911.
  26. O'Leary, Maureen A.; Kaufman, Seth (October 2011). "MorphoBank: phylophenomics in the "cloud"". Cladistics. 27 (5): 529–537. doi: 10.1111/j.1096-0031.2011.00355.x . PMID   34875801. S2CID   76652345.
  27. Huelsenbeck, J. P.; Ronquist, F. (August 2001). "MRBAYES: Bayesian inference of phylogenetic trees". Bioinformatics. 17 (8): 754–755. doi: 10.1093/bioinformatics/17.8.754 . ISSN   1367-4803. PMID   11524383.
  28. Hellmuth M, Wieseke N, Lechner M, Lenhof HP, Middendorf M, Stadler PF (February 2015). "Phylogenomics with paralogs". Proceedings of the National Academy of Sciences of the United States of America. 112 (7): 2058–63. arXiv: 1712.06442 . Bibcode:2015PNAS..112.2058H. doi: 10.1073/pnas.1412770112 . PMC   4343152 . PMID   25646426.
  29. Thomas, Gavin H.; Hartmann, Klaas; Jetz, Walter; Joy, Jeffrey B.; Mimoto, Aki; Mooers, Arne O. (2013). "PASTIS: an R package to facilitate phylogenetic assembly with soft taxonomic inferences". Methods in Ecology and Evolution. 4 (11): 1011–1017. Bibcode:2013MEcEv...4.1011T. doi:10.1111/2041-210X.12117. ISSN   2041-210X. S2CID   86694418.
  30. Schliep KP (February 2011). "phangorn: phylogenetic analysis in R". Bioinformatics. 27 (4): 592–3. doi:10.1093/bioinformatics/btq706. PMC   3035803 . PMID   21169378.
  31. Liu L, Yu L (April 2010). "Phybase: an R package for species tree analysis". Bioinformatics. 26 (7): 962–3. doi: 10.1093/bioinformatics/btq062 . PMID   20156990.
  32. Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent; Anisimova, Maria; Hordijk, Wim; Gascuel, Olivier (2010-03-29). "New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0". Systematic Biology. 59 (3): 307–321. doi:10.1093/sysbio/syq010. hdl: 20.500.11850/25281 . ISSN   1076-836X. PMID   20525638.
  33. Brown JW, Walker JF, Smith SA (June 2017). "Phyx: phylogenetic tools for unix". Bioinformatics. 33 (12): 1886–1888. doi:10.1093/bioinformatics/btx063. PMC   5870855 . PMID   28174903.
  34. Arenas, Miguel; Bastolla, Ugo (2020). "ProtASR2: Ancestral reconstruction of protein sequences accounting for folding stability". Methods in Ecology and Evolution. 11 (2): 248–257. Bibcode:2020MEcEv..11..248A. doi:10.1111/2041-210X.13341. ISSN   2041-210X.
  35. Arenas, Miguel (2021-08-27). "ProteinEvolverABC: coestimation of recombination and substitution rates in protein sequences by approximate Bayesian computation". Bioinformatics. 38 (1): 58–64. doi:10.1093/bioinformatics/btab617. ISSN   1367-4803. PMC   8696103 . PMID   34450622.
  36. Ferreiro, David; Branco, Catarina; Arenas, Miguel (2024). "Selection among site-dependent structurally constrained substitution models of protein evolution by approximate Bayesian computation". Bioinformatics. 40 (3): btae096. doi:10.1093/bioinformatics/btae096. ISSN   1367-4811. PMC   10914458 . PMID   38374231.
  37. Kozlov AM, Darriba D, Flouri T, Morel B, Stamatakis A (May 2019). "RAxML-NG: A fast, scalable, and user-friendly tool for maximum likelihood phylogenetic inference". Bioinformatics. 35 (21): 4453–4455. doi:10.1093/bioinformatics/btz305. PMC   6821337 . PMID   31070718.
  38. Samson, Stéphane; Lord, Étienne; Makarenkov, Vladimir (26 May 2022). "SimPlot++: a Python application for representing sequence similarity and detecting recombination". Bioinformatics. 38 (11): 3118–3120. arXiv: 2112.09755 . doi:10.1093/bioinformatics/btac287. PMID   35451456.
  39. Lole, Kavita S.; Bollinger, Robert C.; Paranjape, Ramesh S.; Gadkari, Deepak; Kulkarni, Smita S.; Novak, Nicole G.; Ingersoll, Roxann; Sheppard, Haynes W.; Ray, Stuart C. (January 1999). "Full-Length Human Immunodeficiency Virus Type 1 Genomes from Subtype C-Infected Seroconverters in India, with Evidence of Intersubtype Recombination". Journal of Virology. 73 (1): 152–160. doi:10.1128/JVI.73.1.152-160.1999. PMC   103818 . PMID   9847317.
  40. Salminen, Mika O.; Carr, Jean K.; Burke, Donald S.; McCutchan, Francine E. (November 1995). "Identification of Breakpoints in Intergenotypic Recombinants of HIV Type 1 by Bootscanning". AIDS Research and Human Retroviruses. 11 (11): 1423–1425. doi:10.1089/aid.1995.11.1423. PMID   8573403.
  41. Church SH, Ryan JF, Dunn CW (November 2015). "Automation and Evaluation of the SOWH Test with SOWHAT". Systematic Biology. 64 (6): 1048–58. doi:10.1093/sysbio/syv055. PMC   4604836 . PMID   26231182.
  42. Currat, Mathias; Arenas, Miguel; Quilodràn, Claudio S; Excoffier, Laurent; Ray, Nicolas (2019-05-11). "SPLATCHE3: simulation of serial genetic data under spatially explicit evolutionary scenarios including long-distance dispersal". Bioinformatics. 35 (21): 4480–4483. doi:10.1093/bioinformatics/btz311. ISSN   1367-4803. PMC   6821363 . PMID   31077292.
  43. Huson DH, Bryant D (February 2006). "Application of phylogenetic networks in evolutionary studies". Molecular Biology and Evolution. 23 (2): 254–67. doi: 10.1093/molbev/msj030 . PMID   16221896.
  44. Jobb G, von Haeseler A, Strimmer K (June 2004). "Treefinder: a powerful graphical analysis environment for molecular phylogenetics". BMC Evolutionary Biology. 4: 18. doi: 10.1186/1471-2148-4-18 . PMC   459214 . PMID   15222900. (Retracted, see doi:10.1186/s12862-015-0513-z, PMID   26542699,  Retraction Watch.)
  45. Makarenkov V (July 2001). "T-REX: reconstructing and visualizing phylogenetic trees and reticulation networks". Bioinformatics. 17 (7): 664–8. doi: 10.1093/bioinformatics/17.7.664 . PMID   11448889.
  46. Schmidt HA, Strimmer K, Vingron M, von Haeseler A (March 2002). "Tree-Puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing". Bioinformatics. 18 (3): 502–4. doi: 10.1093/bioinformatics/18.3.502 . PMID   11934758.
  47. Boc A, Diallo AB, Makarenkov V (July 2012). "T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks". Nucleic Acids Research. 40 (Web Server issue): W573–9. doi:10.1093/nar/gks485. PMC   3394261 . PMID   22675075.
  48. Turakhia Y, Thornlow B, Hinrichs AS, De Maio N, Gozashti L, Lanfear R, Haussler D, Corbett-Detig R (June 2021). "Ultrafast Sample Placement on Existing Trees (UShER) Empowers Real-Time Phylogenetics for the SARS-CoV-2 Pandemic". Nature Genetics. 53 (6): 809–816. doi: 10.1038/s41588-021-00862-7 . PMC   9248294 . PMID   33972780.
  49. Piñeiro, César; Abuín, José M; Pichel, Juan C (2020-11-01). Ponty, Yann (ed.). "Very Fast Tree: speeding up the estimation of phylogenies for large alignments through parallelization and vectorization strategies". Bioinformatics. 36 (17): 4658–4659. doi: 10.1093/bioinformatics/btaa582 . ISSN   1367-4803. PMID   32573652.