Tip dating

Last updated

Tip dating is a technique used in molecular dating that allows the inference of time-calibrated phylogenetic trees. Its defining feature is that it uses the ages of the samples to provide time information for the analysis, in contrast with traditional 'node dating' methods that require age constraints to be applied to the internal nodes of the evolutionary tree.

In tip dating, morphological data and molecular data are typically analysed together to estimate the evolutionary relationships (tree topology) and the divergence times among lineages (node times); this approach is also known as 'total-evidence dating'. However, tip dating can also be used to analyse data sets that only comprise morphological characters or that only comprise molecular characters (e.g., data sets that include samples of ancient DNA or of serially sampled viruses).

Tip dating has been implemented in Bayesian phylogenetic software and typically draws on the fossilised birth-death model for evolution. This is a model of diversification that allows speciation, extinction, and sampling of fossil and extant taxa.

This promising method is not yet fully mature, and there are a number of possible biases or undesirable behaviour that must be taken into account when interpreting its results. [1]

Related Research Articles

Cladistics A method of biological systematics in evolutionary biology

Cladistics is an approach to biological classification in which organisms are categorized in groups ("clades") based on the most recent common ancestor. Hypothesized relationships are typically based on shared derived characteristics (synapomorphies) that can be traced to the most recent common ancestor and are not present in more distant groups and ancestors. A key feature of a clade is that a common ancestor and all its descendants are part of the clade. Importantly, all descendants stay in their overarching ancestral clade. For example, if within a strict cladistic framework the terms animals, bilateria/worms, fishes/vertebrata, or monkeys/anthropoidea were used, these terms would include humans. Many of these terms are normally used paraphyletically, outside of cladistics, e.g. as a 'grade'. Radiation results in the generation of new subclades by bifurcation, but in practice sexual hybridization may blur very closely related groupings.

Phylogenetics Study of evolutionary relationships between organisms

In biology, phylogenetics is a part of systematics that addresses the inference of the evolutionary history and relationships among or within groups of organisms. These relationships are hypothesized by phylogenetic inference methods that evaluate observed heritable traits, such as DNA sequences or morphology, often under a specified model of evolution of these traits. The result of such an analysis is a phylogeny —a diagrammatic hypothesis of relationships that reflects the evolutionary history of a group of organisms. The tips of a phylogenetic tree can be living taxa or fossils, and represent the 'end', or the present, in an evolutionary lineage. A phylogenetic diagram can be rooted or unrooted. A rooted tree diagram indicates the hypothetical common ancestor, or ancestral lineage, of the tree. An unrooted tree diagram makes no assumption about the ancestral line, and does not show the origin or "root" of the taxa in question or the direction of inferred evolutionary transformations. In addition to their proper use for inferring phylogenetic patterns among taxa, phylogenetic analyses are often employed to represent relationships among gene copies or individual organisms. Such uses have become central to understanding biodiversity, evolution, ecology, and genomes.

Cladogram A diagram used to show relations among groups of organisms with common origins

A cladogram is a diagram used in cladistics to show relations among organisms. A cladogram is not, however, an evolutionary tree because it does not show how ancestors are related to descendants, nor does it show how much they have changed, so many differing evolutionary trees can be consistent with the same cladogram. A cladogram uses lines that branch off in different directions ending at a clade, a group of organisms with a last common ancestor. There are many shapes of cladograms but they all have lines that branch off from other lines. The lines can be traced back to where they branch off. These branching off points represent a hypothetical ancestor which can be inferred to exhibit the traits shared among the terminal taxa above it. This hypothetical ancestor might then provide clues about the order of evolution of various features, adaptation, and other evolutionary narratives about ancestors. Although traditionally such cladograms were generated largely on the basis of morphological characters, DNA and RNA sequencing data and computational phylogenetics are now very commonly used in the generation of cladograms, either on their own or in combination with morphology.

Phylogenetic tree Branching diagram of evolutionary relationships between organisms

A phylogenetic tree or evolutionary tree is a branching diagram or "tree" showing the evolutionary relationships among various biological species or other entities—their phylogeny —based upon similarities and differences in their physical or genetic characteristics. All life on Earth is part of a single phylogenetic tree, indicating common ancestry.

Molecular phylogenetics The branch of phylogeny that analyzes genetic, hereditary molecular differences

Molecular phylogenetics is the branch of phylogeny that analyzes genetic, hereditary molecular differences, predominately in DNA sequences, to gain information on an organism's evolutionary relationships. From these analyses, it is possible to determine the processes by which diversity among species has been achieved. The result of a molecular phylogenetic analysis is expressed in a phylogenetic tree. Molecular phylogenetics is one aspect of molecular systematics, a broader term that also includes the use of molecular data in taxonomy and biogeography.

Caprimulgiformes Order of birds

The Caprimulgiformes is an order of birds that includes a number of birds with global distribution. They are generally insectivorous and nocturnal. The order gets its name from the Latin for "goat-milker", an old name based on an erroneous view of the European nightjar's feeding habits.

Molecular clock Technique to deduce the time in prehistory when two or more life forms diverged

The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA, RNA, or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation. It is sometimes called a gene clock or an evolutionary clock.

Phylogenesis

Phylogenesis is the biological process by which a taxon appears. The science that studies these processes is called phylogenetics.

Evolutionary taxonomy, evolutionary systematics or Darwinian classification is a branch of biological classification that seeks to classify organisms using a combination of phylogenetic relationship, progenitor-descendant relationship, and degree of evolutionary change. This type of taxonomy may consider whole taxa rather than single species, so that groups of species can be inferred as giving rise to new groups. The concept found its most well-known form in the modern evolutionary synthesis of the early 1940s.

In phylogenetics, maximum parsimony is an optimality criterion under which the phylogenetic tree that minimizes the total number of character-state changes is to be preferred. Under the maximum-parsimony criterion, the optimal tree will minimize the amount of homoplasy. In other words, under this criterion, the shortest possible tree that explains the data is considered best. The principle is akin to Occam's razor, which states that—all else being equal—the simplest hypothesis that explains the data should be selected. Some of the basic ideas behind maximum parsimony were presented by James S. Farris in 1970 and Walter M. Fitch in 1971.

Substitution model Description of the process by which states in sequences change into each other and back

In biology, a substitution model, also called models of DNA sequence evolution, are Markov models that describe changes over evolutionary time. These models describe evolutionary changes in macromolecules represented as sequence of symbols. Substitution models is used to calculate the likelihood of phylogenetic trees using multiple sequence alignment data. Thus, substitution models are central to maximum likelihood estimation of phylogeny as well as Bayesian inference in phylogeny. Estimates of evolutionary distances are typically calculated using substitution models. Substitution models are also central to phylogenetic invariants since they can be used predict the frequencies of site pattern frequencies given a tree topology. Substitution models are necessary to simulate sequence data for a group of organisms related by a specific tree.

In phylogenetics, long branch attraction (LBA) is a form of systematic error whereby distantly related lineages are incorrectly inferred to be closely related. LBA arises when the amount of molecular or morphological change accumulated within a lineage is sufficient to cause that lineage to appear similar to another long-branched lineage, solely because they have both undergone a large amount of change, rather than because they are related by descent. Such bias is more common when the overall divergence of some taxa results in long branches within a phylogeny. Long branches are often attracted to the base of a phylogenetic tree, because the lineage included to represent an outgroup is often also long-branched. The frequency of true LBA is unclear and often debated, and some authors view it as untestable and therefore irrelevant to empirical phylogenetic inference. Although often viewed as a failing of parsimony-based methodology, LBA could in principle result from a variety of scenarios and be inferred under multiple analytical paradigms.

Polytomy

An internal node of a phylogenetic tree is described as a polytomy or multifurcation if (i) it is in a rooted tree and is linked to three or more child subtrees or (ii) it is in an unrooted tree and is attached to four or more branches. A tree that contains any multifurcations can be described as a multifurcating tree.

Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing a hypothesis about the evolutionary ancestry of a set of genes, species, or other taxa. For example, these techniques have been used to explore the family tree of hominid species and the relationships between specific genes shared by many types of organisms.

Ancestral reconstruction is the extrapolation back in time from measured characteristics of individuals to their common ancestors. It is an important application of phylogenetics, the reconstruction and study of the evolutionary relationships among individuals, populations or species to their ancestors. In the context of evolutionary biology, ancestral reconstruction can be used to recover different kinds of ancestral character states of organisms that lived millions of years ago. These states include the genetic sequence, the amino acid sequence of a protein, the composition of a genome, a measurable characteristic of an organism (phenotype), and the geographic range of an ancestral population or species. This is desirable because it allows us to examine parts of phylogenetic trees corresponding to the distant past, clarifying the evolutionary history of the species in the tree. Since modern genetic sequences are essentially a variation of ancient ones, access to ancient sequences may identify other variations and organisms which could have arisen from those sequences. In addition to genetic sequences, one might attempt to track the changing of one character trait to another, such as fins turning to legs.

Bayesian inference of phylogeny uses a likelihood function to create a quantity called the posterior probability of trees using a model of evolution, based on some prior probabilities, producing the most likely phylogenetic tree for the given data. The Bayesian approach has become popular due to advances in computing speeds and the integration of Markov chain Monte Carlo (MCMC) algorithms. Bayesian inference has a number of applications in molecular phylogenetics and systematics.

Phylogenetic comparative methods (PCMs) use information on the historical relationships of lineages (phylogenies) to test evolutionary hypotheses. The comparative method has a long history in evolutionary biology; indeed, Charles Darwin used differences and similarities between species as a major source of evidence in The Origin of Species. However, the fact that closely related lineages share many traits and trait combinations as a result of the process of descent with modification means that lineages are not independent. This realization inspired the development of explicitly phylogenetic comparative methods. Initially, these methods were primarily developed to control for phylogenetic history when testing for adaptation; however, in recent years the use of the term has broadened to include any use of phylogenies in statistical tests. Although most studies that employ PCMs focus on extant organisms, many methods can also be applied to extinct taxa and can incorporate information from the fossil record.

Distance matrices are used in phylogeny as non-parametric distance methods and were originally applied to phenetic data using a matrix of pairwise distances. These distances are then reconciled to produce a tree. The distance matrix can come from a number of different sources, including measured distance or morphometric analysis, various pairwise distance formulae applied to discrete morphological characters, or genetic distance from sequence, restriction fragment, or allozyme data. For phylogenetic character data, raw distance values can be calculated by simply counting the number of pairwise differences in character states.

Quantitative comparative linguistics is the use of quantitative analysis as applied to comparative linguistics.

Bacterial phylodynamics is the study of immunology, epidemiology, and phylogenetics of bacterial pathogens to better understand the evolutionary role of these pathogens. Phylodynamic analysis includes analyzing genetic diversity, natural selection, and population dynamics of infectious disease pathogen phylogenies during pandemics and studying intra-host evolution of viruses. Phylodynamics combines the study of phylogenetic analysis, ecological, and evolutionary processes to better understand of the mechanisms that drive spatiotemportal incidence and phylogenetic patterns of bacterial pathogens. Bacterial phylodynamics uses genome-wide single-nucleotide polymorphisms (SNP) in order to better understand the evolutionary mechanism of bacterial pathogens. Many phylodynamic studies have been performed on viruses, specifically RNA viruses which have high mutation rates. The field of bacterial phylodynamics has increased substantially due to the advancement of next-generation sequencing and the amount of data available.

References

  1. Luo, Arong; Duchêne, David A.; Zhang, Chi; Zhu, Chao-Dong; Ho, Simon (2019). "A Simulation-Based Evaluation of Tip-Dating Under the Fossilized Birth-Death Process". Systematic Biology. doi: 10.1093/sysbio/syz038 .