Mutation bias is a pattern in which some type of mutation occurs more often than expected under uniformity. The types are most often defined by the molecular nature of the mutational change, but sometimes they are based on downstream effects, e.g., Ostrow, et al. [1]
The concept of mutation bias appears in several scientific contexts, most commonly in molecular studies of evolution, where mutation biases may be invoked to account for such phenomena as systematic differences in codon usage or genome composition between species. [2] The short tandem repeat (STR) loci used in forensic identification may show biased patterns of gain and loss of repeats. [3] In cancer research, some types of tumors have distinctive mutational signatures that reflect differences in the contributions of mutational pathways. Mutational signatures have proved useful in both detection and treatment.
Recent studies of the emergence of resistance to anti-microbials and anti-cancer drugs show that mutation biases are an important determinant of the prevalence of different types of resistant strains or tumors. [4] [5] Thus, knowledge of mutation bias can be used to design more evolution-resistant therapies. [4]
When mutation bias is invoked as a possible cause of some pattern in evolution, this is generally an application of the theory of arrival biases, and the alternative hypotheses may include selection, biased gene conversion, and demographic factors.
In the past, due to the technical difficulty of detecting rare mutations, most attempts to characterize the mutation spectrum were based on reporter gene systems, or based on patterns of presumptively neutral change in pseudogenes. More recently, there has been an effort to use the MA (mutation accumulation) method and high-throughput sequencing (e.g., [6] ).
Cases of mutation bias are cited by mutationism advocates of the extended evolutionary synthesis who have argued that mutation bias is an entirely novel evolutionary principle. This viewpoint has been criticized by Erik Svensson. [7] A 2019 review by Svensson and David Berger concluded that "we find little support for mutation bias as an independent force in adaptive evolution, although it can interact with selection under conditions of small population size and when standing genetic variation is limited, entirely consistent with standard evolutionary theory." [8] In contrast to Svensson and Berger, a 2023 review by Arlin Stoltzfus and colleagues concluded that strong empirical evidence and theoretical arguments indicate that mutation bias has predictable effects on the genetic changes fixed in adaptation. [9]
The canonical DNA nucleotides include 2 purines (A and G) and 2 pyrimidines (T and C). In the molecular evolution literature, the term transition is used for nucleotide changes within a chemical class, and transversion for changes from one chemical class to the other. Each nucleotide is subject to one transition (e.g., T to C) and 2 transversions (e.g., T to A or T to G).
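The classification described above can be sketched in a few lines of Python. The function and constant names here are illustrative assumptions, not part of any standard library; the sketch only encodes the purine/pyrimidine rule and confirms that each nucleotide has one transition path and two transversion paths.

```python
# Illustrative sketch: names below are hypothetical, chosen for this example.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-nucleotide change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # A transition stays within a chemical class; a transversion crosses classes.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def count_paths() -> dict:
    """For each nucleotide, count its (transition, transversion) paths."""
    counts = {}
    for ref in sorted(BASES):
        kinds = [classify_substitution(ref, alt) for alt in sorted(BASES) if alt != ref]
        counts[ref] = (kinds.count("transition"), kinds.count("transversion"))
    return counts
```

Under this rule, `count_paths()` reports one transition and two transversions for every nucleotide, which is the 1:2 path ratio used in the rate calculations that follow.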
Because a site (or a sequence) is subject to twice as many transversions as transitions, the total rate of transversions for a sequence may be higher even when the rate of transitions is higher on a per-path basis. In the molecular evolution literature, the per-path rate bias is typically denoted by κ (kappa), so that, if the rate of each transversion is u, the rate of each transition is κu. Then, the aggregate rate ratio (transitions to transversions) is R = (1 * κu) / (2 * u) = κ / 2. For instance, in yeast, κ ~ 1.2, [10] therefore the aggregate bias is R = 1.2 / 2 = 0.6, whereas in E. coli, κ ~ 4 so that R ~ 2.
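The relationship between the per-path bias κ and the aggregate ratio R can be checked numerically. The following is a minimal sketch with hypothetical function names; it assumes, as in the text, one transition path at rate κu and two transversion paths at rate u each. The second function, the expected share of transitions among all mutations, is a derived quantity added here for illustration.

```python
def aggregate_ratio(kappa: float) -> float:
    """Aggregate transition:transversion rate ratio R = (1 * kappa * u) / (2 * u)."""
    return kappa / 2.0


def transition_fraction(kappa: float) -> float:
    """Expected share of mutations that are transitions: kappa*u / (kappa*u + 2*u)."""
    return kappa / (kappa + 2.0)


# Values of kappa quoted in the text for yeast and E. coli.
for name, kappa in [("yeast", 1.2), ("E. coli", 4.0), ("uniform", 1.0)]:
    print(name, aggregate_ratio(kappa), transition_fraction(kappa))
```

With κ = 1 (no per-path bias), transitions make up one third of mutations, which is the uniformity baseline against which a transition bias is measured.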
In a variety of organisms, transition mutations occur several-fold more frequently than expected under uniformity. [11] The bias in animal viruses is sometimes much more extreme, e.g., 31 of 34 nucleotide mutations in a recent study in HIV were transitions. [12] As noted above, the bias toward transitions is weak in yeast, and appears to be absent in the grasshopper Podisma pedestris. [13]
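To give a sense of how far the HIV count of 31 transitions out of 34 mutations departs from uniformity, one can compute an exact binomial tail probability. This is only a rough sketch: it assumes the 34 mutations are independent draws and that, under uniformity, each is a transition with probability 1/3 (one transition path out of three). The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of observing at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_uniform = 1.0 / 3.0        # assumed transition probability with no bias
observed_fraction = 31 / 34  # counts quoted in the text for HIV
tail = binomial_upper_tail(31, 34, p_uniform)
print(f"observed fraction {observed_fraction:.2f}, tail probability {tail:.3g}")
```

The tail probability is vanishingly small, consistent with the statement that the bias in this case is extreme.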
Male mutation bias is also called "male-driven evolution". The rate of germline mutation is generally higher in males than in females. [14] Male mutation bias has been observed in many species. [15]
In 1935, the British-Indian scientist J.B.S. Haldane found that in hemophilia, an X-linked blood clotting disorder, the causal mutations arise in the fathers' germline. [16] He then proposed the hypothesis that the male germline contributes disproportionately more mutations to succeeding generations than the female germline. [17]
In 1987, Takashi Miyata et al. designed an approach to test Haldane's hypothesis. [18] If α is the ratio of the male mutation rate to the female mutation rate, and Y and X denote the mutation rates of Y-linked and X-linked sequences, respectively, they concluded that the ratio of the Y-linked to the X-linked sequence mutation rate is:
Y/X = 3α / (2 + α)
The mean Y/X ratio is 2.25 in higher primates. [19] Using the equation, the ratio of male to female mutation rates can be estimated as α ≈ 6. In some organisms with a shorter generation time than humans, the mutation rate in males is also higher than that in females, but the excess of germ cell divisions in males is usually smaller: the ratio of the number of germ cell divisions per generation in males to that in females is lower than in humans. [20] [21] [22]
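The estimate α ≈ 6 can be reproduced with a short calculation. The sketch below assumes the relation Y/X = 3α / (2 + α), which follows from a Y-linked sequence always residing in males while an X-linked sequence spends one third of its time in males and two thirds in females; solving for α gives α = 2(Y/X) / (3 − Y/X). The function names are hypothetical.

```python
def y_over_x(alpha: float) -> float:
    """Expected Y-linked to X-linked mutation rate ratio for a given alpha."""
    return 3.0 * alpha / (2.0 + alpha)


def alpha_from_y_over_x(ratio: float) -> float:
    """Invert Y/X = 3*alpha / (2 + alpha) to recover alpha."""
    # The ratio approaches 3 as alpha grows without bound, so it must lie below 3.
    if not 0.0 < ratio < 3.0:
        raise ValueError("Y/X must lie strictly between 0 and 3 under this model")
    return 2.0 * ratio / (3.0 - ratio)


# Mean Y/X ratio quoted in the text for higher primates.
print(alpha_from_y_over_x(2.25))
```

Because Y/X saturates at 3 under this relation, estimates of α become very sensitive to small errors in Y/X as the observed ratio approaches that limit.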
Other hypotheses have also been proposed to explain a higher mutation rate in Y-linked than in X-linked sequences. One is that the male germline genome is heavily methylated and therefore more inclined to mutate than the female germline genome. Another is that X chromosomes experience more purifying selection against mutations when they are hemizygous. [23] To test these hypotheses, researchers have studied mutation rates in birds. [24] [25] In contrast to humans, male birds are homogametic (ZZ) and females are heterogametic (ZW). The male-to-female ratio in mutation rates in birds was found to range from 4 to 7. [26] This result indicates that the mutation bias results mostly from a higher rate of mutation in the male germline than in the female germline.
A mutation is a heritable variation in the genetic information of a short region of DNA sequences. Mutations can be categorized into replication-dependent mutations and replication-independent mutations. Therefore, there are two kinds of mutation mechanisms to explain the phenomenon of male mutation bias.
The number of germ cell divisions in females is constant and is much smaller than that in males. In females, most primary oocytes are formed by birth, and the number of cell divisions that occur in the production of a mature ovum is constant. In males, more cell divisions are required during spermatogenesis. Moreover, the cycle of spermatogenesis is continuous: spermatogonia continue to divide throughout the reproductive life of the male. The number of male germline cell divisions is therefore not only higher than the number of female germline cell divisions but also increases with the age of the male. [27]
One might expect the male-to-female mutation rate ratio to be similar to the male-to-female ratio in the number of germline cell divisions. However, only a few species conform to this expectation. [22] Even in these species, the male-to-female mutation rate ratio is lower than the male-to-female ratio in the number of germline cell divisions. [28]
The discrepancy between estimates of the male-to-female mutation rate ratio and the ratio of germline cell divisions points to another important mechanism that influences male mutation bias. Mutations at CpG sites result in C-to-T transitions. [29] These C-to-T substitutions occur 10-50 times faster than substitutions at other sites in DNA sequences, and are especially likely to appear in both the male and female germlines. [30] Because CpG mutation is independent of replication, it shows hardly any sex bias, and it effectively lowers the male-to-female mutation rate ratio. [31] In addition, neighbor-dependent mutations can also cause biases in mutation rate, and may have no relevance to DNA replication. For example, mutations that originate from the effect of mutagens, such as exposure to UV light, show weak male mutation bias. [32]
A GC-AT bias is a bias with a net effect on GC content. For instance, if G and C sites are simply more mutable than A and T sites, other things being equal, this would result in a net downward pressure on GC content. Mutation-accumulation studies indicate a strong many-fold bias toward AT in mitochondria of D. melanogaster, [33] and a more modest 2-fold bias toward AT in yeast. [10]
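The "net downward pressure on GC content" can be made concrete with a simple two-state sketch. This is an illustration under stated assumptions, not a model taken from the cited studies: each GC site mutates to AT at per-site rate u, each AT site mutates to GC at per-site rate v, and selection, drift, and gene conversion are ignored. Under those assumptions the GC fraction f changes by v(1 − f) − uf per generation and settles at v / (u + v). The function names and the rate values are hypothetical.

```python
def equilibrium_gc(u_gc_to_at: float, v_at_to_gc: float) -> float:
    """Equilibrium GC fraction under mutation alone in a two-state model."""
    return v_at_to_gc / (u_gc_to_at + v_at_to_gc)


def iterate_gc(f0: float, u: float, v: float, generations: int) -> float:
    """Iterate f <- f + v*(1 - f) - u*f and return the final GC fraction."""
    f = f0
    for _ in range(generations):
        f = f + v * (1.0 - f) - u * f
    return f


# Hypothetical rates with a 2-fold per-site bias toward AT (u = 2v).
u, v = 2e-3, 1e-3
print(equilibrium_gc(u, v))          # analytical equilibrium
print(iterate_gc(0.5, u, v, 10000))  # numerical check starting from 50% GC
```

If a 2-fold bias toward AT is read as u = 2v, mutation alone would drive GC content toward one third in this sketch. Real genomes can deviate from such an expectation because of the other factors discussed in this article.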
A common idea in the literature of molecular evolution is that codon usage and genome composition reflect the effects of mutation bias, e.g., codon usage has been treated with a mutation-selection-drift model combining mutation biases, selection for translationally preferred codons, and drift. To the extent that mutation bias prevails under this model, mutation bias toward GC is responsible for genomes with high GC content, and likewise the opposite bias is responsible for genomes with low GC content. [34]
Starting in the 1990s, it became clear that GC-biased gene conversion was a major factor—previously unanticipated—in affecting GC content in diploid organisms such as mammals. [35]
Similarly, although it may be the case that bacterial genome composition strongly reflects GC and AT biases, the proposed mutational biases have not been demonstrated to exist. Indeed, Hershberg and Petrov suggest that mutation in most bacterial genomes is biased toward AT, even when the genome is not AT-rich. [2]
The concept of mutation bias, as defined above, does not imply foresight, design, or even a specially evolved tendency, e.g., the bias may emerge simply as a side-effect of DNA repair processes. Currently there is no established terminology for mutation-generating systems that tend to produce useful mutations. The term "directed mutation" or adaptive mutation is sometimes used with the implication of a process of mutation that senses and responds to conditions directly. When the sense is simply that the mutation system is tuned to enhance the production of helpful mutations under certain conditions, the terminology of "mutation strategies" [38] or "natural genetic engineering" [39] has been suggested, but these terms are not widely used. As argued in Ch. 5 of Stoltzfus 2021, [40] various mechanisms of mutation in pathogenic microbes, e.g., mechanisms for phase variation and antigenic variation, appear to have evolved so as to enhance lineage survival, and these mechanisms are routinely described as strategies or adaptations in the microbial genetics literature, such as by Foley 2015. [41]