Molecular anthropology, also known as genetic anthropology, is the study of how molecular biology has contributed to the understanding of human evolution. [1] This field of anthropology examines evolutionary links between ancient and modern human populations, as well as between contemporary species. Generally, comparisons are made between sequences, either DNA or protein sequences; however, early studies used comparative serology.
By examining DNA sequences in different populations, scientists can determine the closeness of relationships between populations (or within populations). Certain similarities in genetic makeup let molecular anthropologists determine whether or not different groups of people belong to the same haplogroup, and thus if they share a common geographical origin. This is significant because it allows anthropologists to trace patterns of migration and settlement, which gives helpful insight as to how contemporary populations have formed and progressed over time. [2]
Molecular anthropology has been extremely useful in establishing the evolutionary tree of humans and other primates, including closely related species such as chimpanzees and gorillas. While there are clearly many morphological similarities between humans and chimpanzees, for example, certain studies have also concluded that roughly 98 percent of the DNA of the two species is shared. However, more recent studies have revised that figure from 98 percent to 94 percent, showing that the genetic gap between humans and chimpanzees is larger than originally thought. [3] Such information is useful in searching for common ancestors and coming to a better understanding of how humans evolved.
There are two continuous linkage groups in humans that are carried by a single sex. The first is the Y chromosome, which is passed from father to son. Anatomical females carry a Y chromosome only rarely, as a result of a genetic defect. The other linkage group is the mitochondrial DNA (mtDNA). MtDNA is almost always passed to the next generation only by females, but under highly exceptional circumstances it can be passed through males. Under normal circumstances, the non-recombinant portion of the Y chromosome and the mtDNA do not undergo productive recombination. Part of the Y chromosome can recombine with the X chromosome, and within ape history the boundary of that region has changed. Such recombinant changes in the non-recombinant region of Y are extremely rare.
Mitochondrial DNA became an area of research in phylogenetics in the late 1970s. Unlike genomic DNA, it offered the advantage that it did not undergo recombination. Recombination, if frequent enough, corrupts the ability to create parsimonious trees because it introduces stretches of nucleotide substitutions (SNPs). When comparing distantly related species, recombination is less of a problem, since recombination between branches from common ancestors is prevented after true speciation occurs. When examining closely related species, or branching within species, recombination creates a large number of 'irrelevant SNPs' for cladistic analysis. MtDNA, through the process of organelle division, became clonal over time; very little, and often none, of the paternal mtDNA is passed on. While recombination may occur in mtDNA, there is little risk that it will be passed to the next generation. As a result, mtDNA molecules become clonal copies of each other, except when a new mutation arises, and mtDNA does not have the pitfalls of autosomal loci when studied in interbreeding groups. Another advantage of mtDNA is that the hyper-variable regions evolve very quickly; this shows that certain regions of mitochondrial DNA approach neutrality. This allowed the use of mitochondrial DNA to determine that the relative age of the human population was small, having gone through a recent constriction about 150,000 years ago (see Causes of errors).
Mitochondrial DNA has also been used to verify the proximity of chimpanzees to humans relative to gorillas, and to verify the relationship of these three species relative to the orangutans.
More recently, the mtDNA genome has been used to estimate branching patterns in peoples around the world, such as when and how the New World was settled. The problem with these studies has been that they rely heavily on mutations in the coding region. Researchers have increasingly discovered that, as humans moved from Africa's south-eastern regions, more mutations accumulated in the coding region than expected. In the passage to the New World, some groups are believed to have passed from the Asian tropics to Siberia, then to an ancient land region called Beringia, and then to have quickly migrated to South America. Many of the mtDNA lineages have far more mutations, and at rarely mutated coding sites, than would be expected under neutral mutation.
Mitochondrial DNA offers another advantage over autosomal DNA. There are generally 2 to 4 copies of each chromosome in each cell (1 to 2 from each parent chromosome). For mtDNA there can be dozens to hundreds of copies in each cell. This increases the amount of each mtDNA locus by at least an order of magnitude. For ancient DNA, in which the DNA is highly degraded, the number of copies is helpful in extending and bridging short fragments together, and it decreases the amount of bone that must be extracted from highly valuable fossil or ancient remains. Unlike the Y chromosome, mtDNA is carried by both male and female remains in roughly equal quantities.
The Y chromosome is found in the nucleus of normal cells (nuclear DNA). Unlike mtDNA, it has mutations in the non-recombinant portion (NRY) of the chromosome that are spaced widely apart, so far apart that finding the mutations on new Y chromosomes is labor-intensive compared with mtDNA. Many studies rely on tandem repeats; however, tandem repeats can expand and retract rapidly and in some predictable patterns. The Y chromosome only tracks male lines and is not found in females, whereas mtDNA can be traced in males even though they fail to pass on mtDNA. In addition, it has been estimated that effective populations in the prehistoric period typically had two females per male, and recent studies show that cultural hegemony plays a large role in the passage of Y. This has created discordance between males and females for the time to the most recent common ancestor (TMRCA). The estimates for Y TMRCA range from 1/4 to less than 1/2 that of mtDNA TMRCA. It is unclear whether this is due to high male-to-female ratios in the past coupled with repeat migrations from Africa, to a change in mutational rate, or, as some have even proposed, to females of the LCA between chimps and humans continuing to pass DNA millions of years after males ceased to pass DNA. At present the best evidence suggests that in migration the male-to-female ratio in humans may have declined, causing a trimming of Y diversity on multiple occasions within and outside of Africa.
For short-range molecular phylogenetics and molecular clocking, the Y chromosome is highly effective and provides a second perspective. One argument that arose was that the Maori appear by mtDNA to have migrated from Eastern China or Taiwan, but by Y chromosome from the Papua New Guinea region. When HLA haplotypes were used to evaluate the two hypotheses, it was found that both were right and that the Maori were an admixed population. Such admixtures appear to be common in the human population, and thus the use of a single haploid locus can give a biased perspective.
The X chromosome is also a form of nuclear DNA. Since it is found as 1 copy in males and as 2 non-identical chromosomes in females, it has a ploidy of 1.5. However, in humans the effective ploidy is somewhat higher, ~1.7, as females in the breeding population have tended to outnumber males by 2:1 during a large portion of human prehistory. Like mtDNA, X-linked DNA tends to overemphasize female population history much more than male. There have been several studies of loci on the X chromosome; in total, 20 sites have been examined. These include PDHA1, Xq21.3, Xq13.3, Zfx, Fix, Il2rg, Plp, Gk, Ids, Alas2, Rrm2p4, AmeIX, Tnfsf5, Licam, and Msn. The time to most recent common ancestor (TMRCA) ranges from fixed to ~1.8 million years, with a median around 700 ky. These studies roughly plot to the expected fixation distribution of alleles, given linkage disequilibrium between adjacent sites. For some alleles the point of origin is elusive; for others, it points toward Sub-Saharan Africa (SSA). There are some distinctions within SSA that suggest a smaller region, but sample size and coverage are not adequate to define a place of most recent common ancestor. The TMRCA is consistent with and extends the bottleneck implied by mtDNA, confidently to about 500,000 years.
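The ploidy figures quoted above can be reproduced with a simple weighted average. The sketch below assumes that "ploidy" here means the mean number of X copies per breeding individual; the function name and its parameters are illustrative and do not come from the cited studies.

```python
def mean_copies_per_individual(females, males, copies_in_female, copies_in_male):
    """Average number of copies of a locus per individual in a breeding pool.

    Assumption: this is a plain head-count average, not a full
    population-genetic effective-size calculation.
    """
    total_copies = females * copies_in_female + males * copies_in_male
    return total_copies / (females + males)


# X chromosome: 2 copies in females, 1 copy in males.
x_equal_sex_ratio = mean_copies_per_individual(1, 1, 2, 1)   # one female per male
x_two_females_per_male = mean_copies_per_individual(2, 1, 2, 1)  # 2:1 ratio

print(x_equal_sex_ratio)                 # 1.5
print(round(x_two_females_per_male, 1))  # 1.7
```

With one female per male the average is 1.5 copies; with two females per male it is 5/3, which rounds to the ~1.7 given in the text.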
Neandertal mtDNA has been sequenced by Krings and colleagues, and sequence similarity indicates an equally recent origin from a small population on the Neanderthal branch of late hominids. The MC1R gene has also been sequenced, but the results are controversial, with one study claiming that contamination cannot be distinguished from genuine similarities between humans and Neandertals. Critically, however, no DNA sequence has been obtained from Homo erectus, Homo floresiensis, or any of the other late hominids. Some of the ancient sequences obtained have highly probable errors, and proper controls are needed to avoid contamination.
Molecular phylogenetics is based on quantifying substitutions and then comparing sequences with those of other species, and there are several points in the process that create errors. The first and greatest challenge is finding "anchors" that allow the researcher to calibrate the system. In this example, there are 10 mutations between chimpanzee and human, but the researcher has no known fossils that are agreed to be ancestral to both but not ancestral to the next species in the tree, the gorilla. However, there are fossils believed to be ancestral to orangutans and humans, from about 14 million years ago. The researcher can therefore use the orangutan and human comparison, which gives a difference of 24. Using this, the mutation rate can be estimated as 24/(14 × 2) = 0.857 per million years for the stretch of sequence, where the "2" accounts for the branch to human (14 My) and the branch to orangutan (14 My) from their last common ancestor (LCA). Mutation rates are given, however, as a rate per nucleotide (nt) site, so if the sequence were, say, 100 nt in length, the rate would be 0.00857/nt per million years. The chimpanzee and human split is then estimated as 10 mutations/(100 nt × 0.00857 × 2) = 5.8 million years.
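The calibration arithmetic in this example can be written out as a short sketch. The function names are illustrative, and the 100 nt sequence length is the hypothetical value used in the text, not a measured one.

```python
def rate_per_site_per_my(differences, anchor_age_my, sequence_length_nt):
    """Substitution rate per nucleotide site per million years.

    The factor of 2 counts both branches leading away from the
    last common ancestor (one to each living species).
    """
    total_branch_length_my = 2 * anchor_age_my
    return differences / (total_branch_length_my * sequence_length_nt)


def divergence_time_my(differences, rate, sequence_length_nt):
    """Time since two species split, given a calibrated per-site rate."""
    return differences / (2 * rate * sequence_length_nt)


# Anchor: 24 differences between orangutan and human, fossil at 14 My.
rate = rate_per_site_per_my(differences=24, anchor_age_my=14, sequence_length_nt=100)

# Target: 10 differences between chimpanzee and human.
split = divergence_time_my(differences=10, rate=rate, sequence_length_nt=100)

print(f"{rate:.5f}")   # 0.00857
print(f"{split:.1f}")  # 5.8
```

Because the same sequence length appears in both steps, it cancels out: the estimate reduces to 14 × 10/24, so the result depends only on the anchor age and the ratio of differences.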
There are several problems not seen in the above. First, mutations occur as random events. Second, the chance that any site in the genome varies differs from one site to the next. A very good example is the codons for amino acids: the first two nt in a codon may mutate at 1 per billion years, but the third nt may mutate at 1 per million years. Unless scientists study the sequences of a great many animals, particularly those close to the branch being examined, they generally do not know the rate of mutation for a given site. Mutations do occur at the 1st and 2nd positions of codons, but in most cases these mutations are under negative selection and so are removed from the population over short periods of time. In defining the rate of evolution in the anchor, one faces the problem that random mutation creates. For example, a rate of 0.005 or 0.010 can also explain 24 mutations according to the binomial probability distribution. Some of the mutations that did occur between the two may have reverted, hiding an initially higher rate. Selection may play into this: a rare mutation may be selectively favored at point X in time, but later the climate may change or the species may migrate, so that it is no longer favored, and pressure is exerted in favor of new mutations that revert the change. The greater the distance between two species, the more likely such reversion of a nt is to occur. In addition, both species may randomly mutate a site inherited from the ancestral species to the same nucleotide. Many times this can be resolved by obtaining DNA samples from species in the branches and creating a parsimonious tree in which the order of mutation can be deduced, yielding a branch-length diagram. This diagram will then produce a more accurate estimate of mutations between two species. Statistically, one can assign variance based on the problems of randomness, back mutations, and parallel mutations (homoplasies) in creating an error range.
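The remark about the binomial distribution can be illustrated with a minimal sketch. It assumes a deliberately simple model: each nucleotide site in each million years counts as one independent trial, so the calibration example (100 nt over two 14 My branches) gives 2800 trials, and multiple hits at the same site are ignored. The names below are illustrative only.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Assumption: 100 nt observed along 2 x 14 My of branch = 2800 site-My trials.
TRIALS = 100 * 2 * 14
OBSERVED = 24

# Compare the point estimate (24/2800) with the two alternative rates
# mentioned in the text.
likelihoods = {
    rate: binomial_pmf(OBSERVED, TRIALS, rate)
    for rate in (0.005, OBSERVED / TRIALS, 0.010)
}

for rate, prob in likelihoods.items():
    print(f"rate {rate:.5f}/nt/My -> P(24 differences) = {prob:.4f}")
```

Under these assumptions the probability of seeing exactly 24 differences is highest at the point estimate of 24/2800, but it remains non-zero at the other two rates, which is why a single anchor comparison leaves a wide range of plausible rates.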
There is another problem in calibration, however, that has defied statistical analysis: the designation of a fossil as a last common ancestor is treated as true or false. In reality, the odds of having the last common ancestor of two extant species as an anchor are low. Often that fossil already lies in one branch (underestimating the age), lies in a third branch (underestimating the age), or, in the case of being within the LCA species, may be millions of years older than the branch. To date, the only way to assess this variance is to apply molecular phylogenetics to species claimed to be branch points. This, however, only identifies the 'outlying' anchor points. And since the more abundant fossils are more likely to be younger than the branch point, the outlying fossil may simply be a rare older representative. These unknowns create uncertainty that is difficult to quantify, and quantification is often not attempted.
Recent papers have been able to estimate the variance, roughly. The general trend, as new fossils are discovered, is that the older fossils underestimated the age of the branch point. In addition, the dating of fossils has had a history of errors, and there have been many revised datings. The ages assigned by researchers to some major branch points have almost doubled over the last 30 years. An excellent example of this is the debate over LM3 (Lake Mungo 3) in Australia. Originally it was dated to around 30 ky by carbon dating. Carbon dating has problems, however, for samples over 20 ky in age, and severe problems for samples around 30 ky in age. Another study looked at the fossil and estimated its age to be 62 ky.
Once one has an estimate of the mutation rate, then given the above there are two sources of variance that need to be cross-multiplied to generate an overall variance. This is infrequently done in the literature.
Time to most recent common ancestor (TMRCA) combines the errors in calibration with errors in determining the age of a local branch.
With DNA newly discovered as the genetic material, protein sequencing was beginning to take off in the early 1960s. [4] Protein sequencing began on cytochrome c and hemoglobin. Gerhard Braunitzer sequenced hemoglobin and myoglobin; in total, hundreds of sequences from wide-ranging species were determined. In 1967 A.C. Wilson began to promote the idea of a "molecular clock". By 1969 molecular clocking was applied to anthropoid evolution, and V. Sarich and A.C. Wilson found that albumin and hemoglobin have comparable rates of evolution, indicating that chimps and humans split about 4 to 5 million years ago. [5] In 1970, Louis Leakey countered this conclusion by arguing that the molecular clocks were improperly calibrated. [6] By 1975 protein sequencing and comparative serology were used in combination to propose that humans' closest living relative (as a species) was the chimpanzee. [7] In hindsight, the last common ancestor (LCA) of humans and chimps appears to be older than the Sarich and Wilson estimate, but not as old as Leakey claimed. However, Leakey was correct about the divergence of Old and New World monkeys; the value Sarich and Wilson used was a significant underestimate. This error in prediction capability highlights a common theme (see Causes of errors).
In 1979, W.M. Brown and Wilson began looking at the evolution of mitochondrial DNA in animals and found that it was evolving rapidly. [8] The technique they used was restriction fragment length polymorphism (RFLP), which was more affordable at the time than sequencing. In 1980, W.M. Brown, looking at the relative variation between human and other species, recognized that there had been a recent constriction (180,000 years ago) in the human population. [9] A year later Brown and Wilson were looking at RFLP fragments and determined that the human population had expanded more recently than other ape populations. [10] In 1984 the first DNA sequence from an extinct animal was obtained. [11] Sibley and Ahlquist applied DNA-DNA hybridization technology to anthropoid phylogeny and found the Pan/human split to be closer than the gorilla/Pan or gorilla/human split, a highly controversial claim. [12] [13] However, in 1987 they were able to support their claim. [14] In 1987, Cann, Stoneking and Wilson suggested, by RFLP analysis of human mitochondrial DNA, that humans evolved from a constriction in Africa, descending from a single female in a small population of ~10,000 individuals, 200,000 years ago. [15]
In 1987, PCR amplification of mtDNA was first used to determine sequences. [16] In 1991 Vigilant et al. published the seminal work on mtDNA phylogeny implicating sub-Saharan Africa as the place of the most recent common ancestors of all human mtDNAs. [17] The war between out-of-Africa and multiregionalism, already simmering with the critiques of Alan Templeton, soon escalated as paleoanthropologists such as Milford Wolpoff got involved. [18] [19] [20] In 1995, F. Ayala published his critical Science article "The Myth of Eve", which relied on HLA-DR sequence. [21] At the time, however, Ayala was not aware of the rapid evolution of HLA loci via recombinatory processes. In 1996, Parham and Ohta published their findings on the rapid evolution of HLA by short-distance recombination ('gene conversion' or 'abortive recombination'), weakening Ayala's claim (Parham had actually written a review a year earlier, but this had gone unnoticed). [22] [23] A stream of papers followed from both sides, many with highly flawed methods and sampling. One of the more interesting was Harris and Hey (1998), which showed that the TMRCA (time to most recent common ancestor) for the PDHA1 gene was well in excess of 1 million years. Given a ploidy at this locus of 1.5 (3-fold higher than mtDNA), the TMRCA was more than double the expectation. While this falls within the 'fixation curve' of 1.5 ploidy (averaging 2 females and 1 male), the suggested age of 1.8 My is close to a significantly deviant p-value for the population size, possibly indicating that the human population shrank or split off from another population. [24] Oddly, the next X-linked locus they examined, Factor IX, showed a TMRCA of less than 300,000 years. [25]
Ancient DNA sequencing had been conducted on a limited scale up to the late 1990s, when the staff at the Max Planck Institute shocked the anthropology world by sequencing DNA from an estimated 40,000-year-old Neanderthal. [26] [27] [28] The result of that experiment was that, judging from the differences from humans living in Europe, many of whom carry mtDNA derived from haplogroup H (CRS), Neandertals branched from humans more than 300,000 years before haplogroup H reached Europe. While the mtDNA and other studies continued to support a unique recent African origin, this new study basically answered critiques from the Neandertal side.
Significant progress has been made in genomic sequencing since Ingman and colleagues published their findings on the mitochondrial genome. [29] Several papers on genomic mtDNA have been published; there is considerable variability in the rate of evolution, and rate variation and selection are evident at many sites. In 2007, Gonder et al. proposed that a core population of humans, with the greatest level of diversity and lowest selection, once lived in the region of Tanzania and proximal parts of southern Africa, and that since humans left this part of Africa, mitochondria have been selectively evolving to new regions. [30]
Critical in the history of molecular anthropology:
Loci that have been used in molecular phylogenetics:
X-linked loci: PDHA1, Xq21.3, Xq13.3, Zfx, Fix, Il2rg, Plp, Gk, Ids, Alas2, Rrm2p4, AmeIX, Tnfsf5, Licam, and Msn
Autosomal: numerous.
The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.
In human genetics, the Mitochondrial Eve is the matrilineal most recent common ancestor (MRCA) of all living humans. In other words, she is defined as the most recent woman from whom all living humans descend in an unbroken line purely through their mothers and through the mothers of those mothers, back until all lines converge on one woman.
Mitochondrial DNA is the DNA located in mitochondria, cellular organelles within eukaryotic cells that convert chemical energy from food into a form that cells can use, such as adenosine triphosphate (ATP). Mitochondrial DNA is only a small portion of the DNA in a eukaryotic cell; most of the DNA can be found in the cell nucleus and, in plants and algae, also in plastids such as chloroplasts.
Genetic recombination is the exchange of genetic material between different organisms which leads to production of offspring with combinations of traits that differ from those found in either parent. In eukaryotes, genetic recombination during meiosis can lead to a novel set of genetic information that can be further passed on from parents to offspring. Most recombination occurs naturally and can be classified into two types: (1) interchromosomal recombination, occurring through independent assortment of alleles whose loci are on different but homologous chromosomes; and (2) intrachromosomal recombination, occurring through crossing over.
Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. The field of molecular evolution uses principles of evolutionary biology and population genetics to explain patterns in these changes. Major topics in molecular evolution concern the rates and impacts of single nucleotide changes, neutral evolution vs. natural selection, origins of new genes, the genetic nature of complex traits, the genetic basis of speciation, the evolution of development, and ways that evolutionary forces influence genomic and phenotypic changes.
In human genetics, the Y-chromosomal most recent common ancestor is the patrilineal most recent common ancestor (MRCA) from whom all currently living humans are descended. He is the most recent male from whom all living humans are descended through an unbroken line of their male ancestors. The term Y-MRCA reflects the fact that the Y chromosomes of all currently living human males are directly derived from the Y chromosome of this remote ancestor. The analogous concept of the matrilineal most recent common ancestor is known as "Mitochondrial Eve", the most recent woman from whom all living humans are descended matrilineally. As with "Mitochondrial Eve", the title of "Y-chromosomal Adam" is not permanently fixed to a single individual, but can advance over the course of human history as paternal lineages become extinct.
The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA, RNA, or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation. It is sometimes called a gene clock or an evolutionary clock.
Archaeogenetics is the study of ancient DNA using various molecular genetic methods and DNA resources. This form of genetic analysis can be applied to human, animal, and plant specimens. Ancient DNA can be extracted from various fossilized specimens including bones, eggshells, and artificially preserved tissues in human and animal specimens. In plants, ancient DNA can be extracted from seeds and tissue. Archaeogenetics provides us with genetic evidence of ancient population group migrations, domestication events, and plant and animal evolution. The ancient DNA cross referenced with the DNA of relative modern genetic populations allows researchers to run comparison studies that provide a more complete analysis when ancient DNA is compromised.
Ribosomal DNA (rDNA) is a DNA sequence that codes for ribosomal RNA. These sequences regulate transcription initiation and amplification, and contain both transcribed and non-transcribed spacer segments.
In biology and genetic genealogy, the most recent common ancestor (MRCA), also known as the last common ancestor (LCA), of a set of organisms is the most recent individual from which all the organisms of the set are descended. The term is also used in reference to the ancestry of groups of genes (haplotypes) rather than organisms.
A haplotype is a group of alleles in an organism that are inherited together from a single parent, and a haplogroup is a group of similar haplotypes that share a common ancestor with a single-nucleotide polymorphism mutation. More specifically, a haplotype is a combination of alleles at different chromosomal regions that are closely linked and that tend to be inherited together. As a haplogroup consists of similar haplotypes, it is usually possible to predict a haplogroup from haplotypes. Haplogroups pertain to a single line of descent. As such, membership of a haplogroup, by any individual, relies on a relatively small proportion of the genetic material possessed by that individual.
Haplogroup N is a human mitochondrial DNA (mtDNA) clade. A macrohaplogroup, its descendant lineages are distributed across many continents. Like its sibling macrohaplogroup M, macrohaplogroup N is a descendant of the haplogroup L3.
In human genetics, a human mitochondrial DNA haplogroup is a haplogroup defined by differences in human mitochondrial DNA. Haplogroups are used to represent the major branch points on the mitochondrial phylogenetic tree. Understanding the evolutionary path of the female lineage has helped population geneticists trace the matrilineal inheritance of modern humans back to human origins in Africa and the subsequent spread around the globe.
Human evolutionary genetics studies how one human genome differs from another human genome, the evolutionary past that gave rise to the human genome, and its current effects. Differences between genomes have anthropological, medical, historical and forensic implications and applications. Genetic data can provide important insights into human evolution.
In human mitochondrial genetics, Haplogroup Y is a human mitochondrial DNA (mtDNA) haplogroup.
NUMT, pronounced "new might", is an acronym for "nuclear mitochondrial DNA" segment or genetic locus coined by evolutionary geneticist, Jose V. Lopez, which describes a transposition of any type of cytoplasmic mitochondrial DNA into the nuclear genome of eukaryotic organisms.
The human mitochondrial molecular clock is the rate at which mutations have been accumulating in the mitochondrial genome of hominids during the course of human evolution. The archeological record of human activity from early periods in human prehistory is relatively limited and its interpretation has been controversial. Because of the uncertainties from the archeological record, scientists have turned to molecular dating techniques in order to refine the timeline of human evolution. A major goal of scientists in the field is to develop an accurate hominid mitochondrial molecular clock which could then be used to confidently date events that occurred during the course of human evolution.
The multiregional hypothesis, multiregional evolution (MRE), or polycentric hypothesis, is a scientific model that provides an alternative explanation to the more widely accepted "Out of Africa" model of monogenesis for the pattern of human evolution.
Molecular paleontology refers to the recovery and analysis of DNA, proteins, carbohydrates, or lipids, and their diagenetic products from ancient human, animal, and plant remains. The field of molecular paleontology has yielded important insights into evolutionary events, species' diasporas, and the discovery and characterization of extinct species. In shallow time, advancements in the field of molecular paleontology have allowed scientists to pursue evolutionary questions on a genetic level rather than relying on phenotypic variation alone. By applying molecular analytical techniques to DNA in Recent animal remains, one can quantify the level of relatedness between any two organisms for which DNA has been recovered. Using various biotechnological techniques such as DNA isolation, amplification, and sequencing, scientists have been able to gain new insights into the divergence and evolutionary history of countless recently extinct organisms. In February 2021, scientists reported, for the first time, the sequencing of DNA from animal remains over a million years old, a mammoth in this instance, the oldest DNA sequenced to date.