|Part of a series on|
The molecular clock is figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation. It is sometimes called a gene clock or an evolutionary clock.
The notion of the existence of a so-called "molecular clock" was first attributed to Émile Zuckerkandl and Linus Pauling who, in 1962, noticed that the number of amino acid differences in hemoglobin between different lineages changes roughly linearly with time, as estimated from fossil evidence.They generalized this observation to assert that the rate of evolutionary change of any specified protein was approximately constant over time and over different lineages (known as the molecular clock hypothesis).
The genetic equidistance phenomenon was first noted in 1963 by Emanuel Margoliash, who wrote: "It appears that the number of residue differences between cytochrome c of any two species is mostly conditioned by the time elapsed since the lines of evolution leading to these two species originally diverged. If this is correct, the cytochrome c of all mammals should be equally different from the cytochrome c of all birds. Since fish diverges from the main stem of vertebrate evolution earlier than either birds or mammals, the cytochrome c of both mammals and birds should be equally different from the cytochrome c of fish. Similarly, all vertebrate cytochrome c should be equally different from the yeast protein."For example, the difference between the cytochrome c of a carp and a frog, turtle, chicken, rabbit, and horse is a very constant 13% to 14%. Similarly, the difference between the cytochrome c of a bacterium and yeast, wheat, moth, tuna, pigeon, and horse ranges from 64% to 69%. Together with the work of Emile Zuckerkandl and Linus Pauling, the genetic equidistance result directly led to the formal postulation of the molecular clock hypothesis in the early 1960s.
Similarly, Vincent Sarich and Allan Wilson in 1967 demonstrated that molecular differences among modern Primates in albumin proteins showed that approximately constant rates of change had occurred in all the lineages they assessed.The basic logic of their analysis involved recognizing that if one species lineage had evolved more quickly than a sister species lineage since their common ancestor, then the molecular differences between an outgroup (more distantly related) species and the faster-evolving species should be larger (since more molecular changes would have accumulated on that lineage) than the molecular differences between the outgroup species and the slower-evolving species. This method is known as the relative rate test. Sarich and Wilson's paper reported, for example, that human ( Homo sapiens ) and chimpanzee ( Pan troglodytes ) albumin immunological cross-reactions suggested they were about equally different from Ceboidea (New World Monkey) species (within experimental error). This meant that they had both accumulated approximately equal changes in albumin since their shared common ancestor. This pattern was also found for all the primate comparisons they tested. When calibrated with the few well-documented fossil branch points (such as no Primate fossils of modern aspect found before the K-T boundary), this led Sarich and Wilson to argue that the human-chimp divergence probably occurred only ~4–6 million years ago.
The observation of a clock-like rate of molecular change was originally purely phenomenological. Later, the work of Motoo Kimura . The probability that this new mutation will become fixed in the population is then 1/N, since each copy of the gene is as good as any other. Every generation, each individual can have new mutations, so there are N new neutral mutations in the population as a whole. That means that each generation, new neutral mutations will become fixed. If most changes seen during molecular evolution are neutral, then fixations in a population will accumulate at a clock-rate that is equal to the rate of neutral mutations in an individual.developed the neutral theory of molecular evolution, which predicted a molecular clock. Let there be N individuals, and to keep this calculation simple, let the individuals be haploid (i.e. have one copy of each gene). Let the rate of neutral mutations (i.e. mutations with no effect on fitness) in a new individual be
The molecular clock alone can only say that one time period is twice as long as another: it cannot assign concrete dates. For viral phylogenetics and ancient DNA studies—two areas of evolutionary biology where it is possible to sample sequences over an evolutionary timescale—the dates of the intermediate samples can be used to more precisely calibrate the molecular clock. However, most phylogenies require that the molecular clock be calibrated against independent evidence about dates, such as the fossil record.There are two general methods for calibrating the molecular clock using fossil data: node calibration and tip calibration.
Sometimes referred to as node dating, node calibration is a method for phylogeny calibration that is done by placing fossil constraints at nodes. A node calibration fossil is the oldest discovered representative of that clade, which is used to constrain its minimum age. Due to the fragmentary nature of the fossil record, the true most recent common ancestor of a clade will likely never be found.In order to account for this in node calibration analyses, a maximum clade age must be estimated. Determining the maximum clade age is challenging because it relies on negative evidence—the absence of older fossils in that clade. There are a number of methods for deriving the maximum clade age using birth-death models, fossil stratigraphic distribution analyses, or taphonomic controls. Alternatively, instead of a maximum and a minimum, a prior probability of the divergence time can be established and used to calibrate the clock. There are several prior probability distributions including normal, lognormal, exponential, gamma, uniform, etc.) that can be used to express the probability of the true age of divergence relative to the age of the fossil; however, there are very few methods for estimating the shape and parameters of the probability distribution empirically. The placement of calibration nodes on the tree informs the placement of the unconstrained nodes, giving divergence date estimates across the phylogeny. Historical methods of clock calibration could only make use of a single fossil constraint (non-parametric rate smoothing), while modern analyses (BEAST and r8s ) allow for the use of multiple fossils to calibrate the molecular clock. Simulation studies have shown that increasing the number of fossil constraints increases the accuracy of divergence time estimation.
Sometimes referred to as tip dating, tip calibration is a method of molecular clock calibration in which fossils are treated as taxa and placed on the tips of the tree. This is achieved by creating a matrix that includes a molecular dataset for the extant taxa along with a morphological dataset for both the extinct and the extant taxa.Unlike node calibration, this method reconstructs the tree topology and places the fossils simultaneously. Molecular and morphological models work together simultaneously, allowing morphology to inform the placement of fossils. Tip calibration makes use of all relevant fossil taxa during clock calibration, rather than relying on only the oldest fossil of each clade. This method does not rely on the interpretation of negative evidence to infer maximum clade ages.
This approach to tip calibration goes a step further by simultaneously estimating fossil placement, topology, and the evolutionary timescale. In this method, the age of a fossil can inform its phylogenetic position in addition to morphology. By allowing all aspects of tree reconstruction to occur simultaneously, the risk of biased results is decreased.This approach has been improved upon by pairing it with different models. One current method of molecular clock calibration is total evidence dating paired with the fossilized birth-death (FBD) model and a model of morphological evolution. The FBD model is novel in that it allows for “sampled ancestors,” which are fossil taxa that are the direct ancestor of a living taxon or lineage. This allows fossils to be placed on a branch above an extant organism, rather than being confined to the tips.
Bayesian methods can provide more appropriate estimates of divergence times, especially if large datasets—such as those yielded by phylogenomics—are employed.
Sometimes only a single divergence date can be estimated from fossils, with all other dates inferred from that. Other sets of species have abundant fossils available, allowing the hypothesis of constant divergence rates to be tested. DNA sequences experiencing low levels of negative selection showed divergence rates of 0.7–0.8% per Myr in bacteria, mammals, invertebrates, and plants. In the same study, genomic regions experiencing very high negative or purifying selection (encoding rRNA) were considerably slower (1% per 50 Myr).
In addition to such variation in rate with genomic position, since the early 1990s variation among taxa has proven fertile ground for research too,even over comparatively short periods of evolutionary time (for example mockingbirds ). Tube-nosed seabirds have molecular clocks that on average run at half speed of many other birds, possibly due to long generation times, and many turtles have a molecular clock running at one-eighth the speed it does in small mammals, or even slower. Effects of small population size are also likely to confound molecular clock analyses. Researchers such as Francisco Ayala have more fundamentally challenged the molecular clock hypothesis. According to Ayala's 1999 study, five factors combine to limit the application of molecular clock models:
Molecular clock users have developed workaround solutions using a number of statistical approaches including maximum likelihood techniques and later Bayesian modeling. In particular, models that take into account rate variation across lineages have been proposed in order to obtain better estimates of divergence times. These models are called relaxed molecular clocksbecause they represent an intermediate position between the 'strict' molecular clock hypothesis and Joseph Felsenstein's many-rates model and are made possible through MCMC techniques that explore a weighted range of tree topologies and simultaneously estimate parameters of the chosen substitution model. It must be remembered that divergence dates inferred using a molecular clock are based on statistical inference and not on direct evidence.
The molecular clock runs into particular challenges at very short and very long timescales. At long timescales, the problem is saturation. When enough time has passed, many sites have undergone more than one change, but it is impossible to detect more than one. This means that the observed number of changes is no longer linear with time, but instead flattens out. Even at intermediate genetic distances, with phylogenetic data still sufficient to estimate topology, signal for the overall scale of the tree can be weak under complex likelihood models, leading to highly uncertain molecular clock estimates.
At very short time scales, many differences between samples do not represent fixation of different sequences in the different populations. Instead, they represent alternative alleles that were both present as part of a polymorphism in the common ancestor. The inclusion of differences that have not yet become fixed leads to a potentially dramatic inflation of the apparent rate of the molecular clock at very short timescales.
The molecular clock technique is an important tool in molecular systematics, the use of molecular genetics information to determine the correct scientific classification of organisms or to study variation in selective forces. Knowledge of approximately constant rate of molecular evolution in particular sets of lineages also facilitates establishing the dates of phylogenetic events, including those not documented by fossils, such as the divergence of living taxa and the formation of the phylogenetic tree. In these cases—especially over long stretches of time—the limitations of the molecular clock hypothesis (above) must be considered; such estimates may be off by 50% or more.
Molecular phylogenetics is the branch of phylogeny that analyzes genetic, hereditary molecular differences, predominately in DNA sequences, to gain information on an organism's evolutionary relationships. From these analyses, it is possible to determine the processes by which diversity among species has been achieved. The result of a molecular phylogenetic analysis is expressed in a phylogenetic tree. Molecular phylogenetics is one aspect of molecular systematics, a broader term that also includes the use of molecular data in taxonomy and biogeography.
Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. The field of molecular evolution uses principles of evolutionary biology and population genetics to explain patterns in these changes. Major topics in molecular evolution concern the rates and impacts of single nucleotide changes, neutral evolution vs. natural selection, origins of new genes, the genetic nature of complex traits, the genetic basis of speciation, evolution of development, and ways that evolutionary forces influence genomic and phenotypic changes.
The neutral theory of molecular evolution holds that most evolutionary changes at the molecular level, and most of the variation within and between species are due to random genetic drift of mutant alleles that are selectively neutral. The theory applies only for evolution at the molecular level, and is compatible with phenotypic evolution being shaped by natural selection as postulated by Charles Darwin. The neutral theory allows for the possibility that most mutations are deleterious, but holds that because these are rapidly removed by natural selection, they do not make significant contributions to variation within and between species at the molecular level. A neutral mutation is one that does not affect an organism's ability to survive and reproduce. The neutral theory assumes that most mutations that are not deleterious are neutral rather than beneficial. Because only a fraction of gametes are sampled in each generation of a species, the neutral theory suggests that a mutant allele can arise within a population and reach fixation by chance, rather than by selective advantage.
The Batrachia are a clade of amphibians that includes frogs and salamanders, as well as the extinct allocaudates, but not caecilians. The name Batrachia was first used by French zoologist Pierre André Latreille in 1800 to refer to frogs, but has more recently been defined in a phylogenetic sense as a node-based taxon that includes the last common ancestor of frogs and salamanders and all of its descendants. The idea that frogs and salamanders are more closely related to each other than either is to caecilians is strongly supported by morphological and molecular evidence, they are for instance the only vertebrates able to raise and lower their eyes, but an alternative hypothesis exists in which salamanders and caecilians are each other's closest relatives as part of a clade called the Procera, with frogs positioned as the sister taxon of this group.
Cladogenesis is an evolutionary splitting of a parent species into two distinct species, forming a clade.
Émile Zuckerkandl was an Austrian-born French biologist considered one of the founders of the field of molecular evolution. He is best known for introducing, with Linus Pauling, the concept of the "molecular clock", which enabled the neutral theory of molecular evolution.
In evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids or proteins across species, or within a genome, or between donor and receptor taxa. Conservation indicates that a sequence has been maintained by natural selection.
Molecular anthropology is a field of anthropology in which molecular analysis is used to determine evolutionary links between ancient and modern human populations, as well as between contemporary species. Generally, comparisons are made between sequences, either DNA or protein sequences; however, early studies used comparative serology.
Michael Lynch is the Director of the Biodesign Institute for Mechanisms of Evolution at Arizona State University, Tempe, Arizona. He held a Distinguished Professorship of Evolution, Population Genetics and Genomics at Indiana University, Bloomington, Indiana. Besides over 250 papers, especially in population genetics, he has written a two volume textbook with Bruce Walsh, widely considered the "Bible" of quantitative genetics. Alongside this textbook he has also published two other books. He has been a major force in promoting neutral theories to explain genomic architecture based on the effects of population sizes in different lineages; he presented this point of view comprehensively in his 2007 book "The Origins of Genome Architecture". In 2009, he was elected to the National Academy of Sciences. Lynch was a Biology undergraduate at St. Bonaventure University and received a B.S. in Biology in 1973. He obtained his PhD from the University of Minnesota in 1977.
Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing a hypothesis about the evolutionary ancestry of a set of genes, species, or other taxa. For example, these techniques have been used to explore the family tree of hominid species and the relationships between specific genes shared by many types of organisms. Traditional phylogenetics relies on morphological data obtained by measuring and quantifying the phenotypic properties of representative organisms, while the more recent field of molecular phylogenetics uses nucleotide sequences encoding genes or amino acid sequences encoding proteins as the basis for classification. Many forms of molecular phylogenetics are closely related to and make extensive use of sequence alignment in constructing and refining phylogenetic trees, which are used to classify the evolutionary relationships between homologous genes represented in the genomes of divergent species. The phylogenetic trees constructed by computational methods are unlikely to perfectly reproduce the evolutionary tree that represents the historical relationships between the species being analyzed. The historical species tree may also differ from the historical tree of an individual homologous gene shared by those species.
Neutral mutations are changes in DNA sequence that are neither beneficial nor detrimental to the ability of an organism to survive and reproduce. In population genetics, mutations in which natural selection does not affect the spread of the mutation in a species are termed neutral mutations. Neutral mutations that are inheritable and not linked to any genes under selection will either be lost or will replace all other alleles of the gene. This loss or fixation of the gene proceeds based on random sampling known as genetic drift. A neutral mutation that is in linkage disequilibrium with other alleles that are under selection may proceed to loss or fixation via genetic hitchhiking and/or background selection.
Ancestral reconstruction is the extrapolation back in time from measured characteristics of individuals to their common ancestors. It is an important application of phylogenetics, the reconstruction and study of the evolutionary relationships among individuals, populations or species to their ancestors. In the context of evolutionary biology, ancestral reconstruction can be used to recover different kinds of ancestral character states of organisms that lived millions of years ago. These states include the genetic sequence, the amino acid sequence of a protein, the composition of a genome, a measurable characteristic of an organism (phenotype), and the geographic range of an ancestral population or species. This is desirable because it allows us to examine parts of phylogenetic trees corresponding to the distant past, clarifying the evolutionary history of the species in the tree. Since modern genetic sequences are essentially a variation of ancient ones, access to ancient sequences may identify other variations and organisms which could have arisen from those sequences. In addition to genetic sequences, one might attempt to track the changing of one character trait to another, such as fins turning to legs.
Human evolutionary genetics studies how one human genome differs from another human genome, the evolutionary past that gave rise to the human genome, and its current effects. Differences between genomes have anthropological, medical, historical and forensic implications and applications. Genetic data can provide important insights into human evolution.
The history of molecular evolution starts in the early 20th century with "comparative biochemistry", but the field of molecular evolution came into its own in the 1960s and 1970s, following the rise of molecular biology. The advent of protein sequencing allowed molecular biologists to create phylogenies based on sequence comparison, and to use the differences between homologous sequences as a molecular clock to estimate the time since the last common ancestor. In the late 1960s, the neutral theory of molecular evolution provided a theoretical basis for the molecular clock, though both the clock and the neutral theory were controversial, since most evolutionary biologists held strongly to panselectionism, with natural selection as the only important cause of evolutionary change. After the 1970s, nucleic acid sequencing allowed molecular evolution to reach beyond proteins to highly conserved ribosomal RNA sequences, the foundation of a reconceptualization of the early history of life.
Planulozoa is a proposed basal ParaHoxozoa clade, sometimes as sister of the Placozoa, but originally included Placozoa. In the latter case Planulozoa may be a senior synonym to Parahoxozoa. It contains the Cnidaria and the Bilateria. The Placozoa may be a sister of Cnidaria or more conventionally of Planulozoa. The clade excludes basal animals such as the Ctenophora, and Porifera (sponges).
The chimpanzee–human last common ancestor, or CHLCA, is the last common ancestor shared by the extant Homo (human) and Pan genera of Hominini. Due to complex hybrid speciation, it is not possible to give a precise estimate on the age of this ancestral population. While "original divergence" between populations may have occurred as early as 13 million years ago (Miocene), hybridization may have been ongoing until as recently as 4 million years ago (Pliocene).
Genetic saturation is the result of multiple substitutions at the same site in a sequence, or identical substitutions in different sequence, such that the apparent sequence divergence rate is lower than the actual divergence that has occurred. In phylogenetics, saturation effects result in long branch attraction, where the most distant lineages have misleadingly short branch lengths. It also decreases phylogenetic information contained in the sequences.
The human mitochondrial molecular clock is the rate at which mutations have been accumulating in the mitochondrial genome of hominids during the course of human evolution. The archeological record of human activity from early periods in human prehistory is relatively limited and its interpretation has been controversial. Because of the uncertainties from the archeological record, scientists have turned to molecular dating techniques in order to refine the timeline of human evolution. A major goal of scientists in the field is to develop an accurate hominid mitochondrial molecular clock which could then be used to confidently date events that occurred during the course of human evolution.
The relative rate test is a genetic comparative test between two ingroups and an outgroup or “reference species” to compare mutation and evolutionary rates between the species. Each ingroup species is compared independently to the outgroup to determine how closely related the two species are without knowing the exact time of divergence from their closest common ancestor. If more change has occurred on one lineage relative to another lineage since their shared common ancestor, then the outgroup species will be more different from the faster-evolving lineage's species than it is from the slower-evolving lineage's species. This is because the faster-evolving lineage will, by definition, have accumulated more differences since the common ancestor than the slower-evolving lineage. This method can be applied to averaged data, or individual molecules. It is possible for individual molecules to show evidence of approximately constant rates of change in different lineages even while the rates differ between different molecules. The relative rate test is a direct internal test of the molecular clock, for a given molecule and a given set of species, and shows that the molecular clock does not need to be assumed: It can be directly assessed from the data itself. Note that the logic can also be applied to any kind of data for which a distance measure can be defined.
Horizontal or lateral gene transfer is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages.