Mutation bias refers to a predictable or systematic difference in rates for different types of mutation. The types are most often defined by the molecular nature of the mutational change, but sometimes they are based on downstream effects, e.g., Ostrow, et al. [1] refer to "mutational bias for body size".
The concept of mutation bias appears in several scientific contexts, most commonly in molecular studies of evolution, where mutation biases may be invoked to account for such phenomena as systematic differences in codon usage or genome composition between species. [2] The short tandem repeat (STR) loci used in forensic identification may show biased patterns of gain and loss of repeats. [3] In cancer research, some types of tumors have distinctive mutational signatures that reflect differences in the contributions of mutational pathways. Mutational signatures have proved useful in both detection and treatment.
Recent studies of the emergence of resistance to anti-microbials and anti-cancer drugs show that mutation biases are an important determinant of the prevalence of different types of resistant strains or tumors. [4] [5] Thus, a knowledge of mutation bias can be used to design more evolution-resistant therapies. [4]
When mutation bias is invoked as a possible cause of some pattern in evolution, this is generally an application of the theory of arrival biases, and the alternative hypotheses may include selection, biased gene conversion, and demographic factors. Evidence for an evolutionary impact of mutation biases on changes involved in adaptation is summarized in the Arrival Bias article. Note, however, that a 2019 critique [6] argued that this line of argument is flawed and that apparently mutation-biased patterns of change are better explained by selection.
In the past, due to the technical difficulty of detecting rare mutations, most attempts to characterize the mutation spectrum were based on reporter gene systems, or based on patterns of presumptively neutral change in pseudogenes. More recently, there has been an effort to use the MA (mutation accumulation) method and high-throughput sequencing (e.g., [7] ).
The canonical DNA nucleotides include 2 purines (A and G) and 2 pyrimidines (T and C). In the molecular evolution literature, the term transition is used for nucleotide changes within a chemical class, and transversion for changes from one chemical class to the other. Each nucleotide is subject to one transition (e.g., T to C) and 2 transversions (e.g., T to A or T to G).
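The classification just described can be sketched in a few lines of Python. This is an illustrative helper only; the function name `classify_substitution` and the constants are assumptions introduced here, not part of any cited method.

```python
# Chemical classes of the canonical DNA nucleotides.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-nucleotide change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("expected canonical DNA nucleotides A, C, G, or T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # A change within a chemical class is a transition; otherwise a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Enumerate the three possible changes from T: one transition, two transversions.
changes_from_t = {alt: classify_substitution("T", alt) for alt in "ACG"}
```

Enumerating the changes from any one nucleotide this way always yields one transition and two transversions, which is the 1:2 path count used in the rate calculation.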
Because a site (or a sequence) is subject to twice as many transversions as transitions, the total rate of transversions for a sequence may be higher even when the rate of transitions is higher on a per-path basis. In the molecular evolution literature, the per-path rate bias is typically denoted by κ (kappa), so that, if the rate of each transversion is u, the rate of each transition is κu. Then, the aggregate rate ratio (transitions to transversions) is R = (1 * κu) / (2 * u) = κ / 2. For instance, in yeast, κ ~ 1.2, [8] therefore the aggregate bias is R = 1.2 / 2 = 0.6, whereas in E. coli, κ ~ 4 so that R ~ 2.
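As a small worked sketch of the arithmetic above, the following Python functions convert a per-path bias κ into the aggregate ratio R = κ / 2 and into the expected fraction of mutations that are transitions, κ / (κ + 2). The function names are illustrative assumptions; the κ values are the approximate figures quoted in the text.

```python
def aggregate_ts_tv_ratio(kappa: float) -> float:
    """Aggregate transition:transversion rate ratio, R = (1 * kappa * u) / (2 * u)."""
    if kappa < 0:
        raise ValueError("kappa must be non-negative")
    return kappa / 2.0


def expected_transition_fraction(kappa: float) -> float:
    """Fraction of all nucleotide mutations expected to be transitions."""
    if kappa < 0:
        raise ValueError("kappa must be non-negative")
    # One transition path at rate kappa*u versus two transversion paths at rate u.
    return kappa / (kappa + 2.0)


r_yeast = aggregate_ts_tv_ratio(1.2)   # kappa ~ 1.2, as quoted for yeast
r_ecoli = aggregate_ts_tv_ratio(4.0)   # kappa ~ 4, as quoted for E. coli
uniform = expected_transition_fraction(1.0)  # no per-path bias
```

With no per-path bias (κ = 1), one third of mutations are expected to be transitions, which is the uniform baseline against which a transition bias is judged.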
In a variety of organisms, transition mutations occur several-fold more frequently than expected under uniformity. [9] The bias in animal viruses is sometimes much more extreme, e.g., 31 of 34 nucleotide mutations in a recent study in HIV were transitions. [10] As noted above, the bias toward transitions is weak in yeast, and appears to be absent in the grasshopper Podisma pedestris. [11]
Male mutation bias is also called "Male-Driven Evolution". The rate of germline mutation is generally higher in males than in females. [12] Male mutation bias has been observed in many species. [13]
In 1935, the British-Indian scientist J.B.S. Haldane found that in hemophilia, a blood clotting disorder caused by mutations on the X chromosome, new mutations arise mainly in the fathers' germline. [14] He then proposed the hypothesis that the male germline contributes disproportionately more mutations to succeeding generations than the female germline. [15]
In 1987, Takashi Miyata et al. designed an approach to test Haldane's hypothesis. [16] If α is the ratio of the male mutation rate to the female mutation rate, and Y and X denote the mutation rates of Y-linked and X-linked sequences respectively, they concluded that the ratio of the Y-linked to the X-linked mutation rate is: Y/X = 3α / (2 + α)
The mean Y/X ratio is 2.25 in higher primates. [17] Using this equation, the ratio of male to female mutation rates can be estimated as α ≈ 6. In some organisms with a shorter generation time than humans, the mutation rate in males is also higher than in females, but the difference is smaller, because the ratio of the number of germ cell divisions per generation in males to that in females is lower than in humans. [18] [19] [20]
Other hypotheses have also been proposed to explain male mutation bias. According to these, the mutation rate may be higher in Y-linked sequences than in X-linked sequences for other reasons: the male germline genome is heavily methylated and may be more prone to mutation than the female germline genome, and X chromosomes experience stronger purifying selection against mutations because they are hemizygous in males. [21] To test these hypotheses, researchers have studied mutation rates in birds. [22] [23] In contrast to humans, male birds are the homogametic sex (ZZ) and females are the heterogametic sex (ZW). The male-to-female ratio of mutation rates in birds was found to range from 4 to 7. [24] This supports the view that the bias results mostly from more mutations arising in the male germline than in the female germline.
A mutation is a heritable change in the genetic information of a short region of DNA sequence. Mutations can be categorized as replication-dependent or replication-independent. Accordingly, two kinds of mutational mechanism have been invoked to explain male mutation bias.
The number of germ cell divisions in females is constant and much lower than in males. In females, most primary oocytes are formed by birth, so the number of cell divisions that occur in the production of a mature ovum is constant. In males, more cell divisions are required during spermatogenesis. Moreover, spermatogenesis is continuous: spermatogonia continue to divide throughout the reproductive life of the male. The number of male germline cell divisions is therefore not only higher than the number of female germline cell divisions but also increases with the age of the male. [25]
One might expect the male-to-female mutation rate ratio to be similar to the male-to-female ratio of germline cell divisions. However, only a few species conform to this expectation. [20] Even in these species, the male-to-female mutation rate ratio is lower than the male-to-female ratio of the number of germline cell divisions. [26]
These skewed estimates of the male-to-female mutation rate ratio point to another important mechanism that strongly influences male mutation bias. Mutations at CpG sites typically result in a C-to-T transition. [27] These C-to-T substitutions occur 10-50 times faster than substitutions at other sites in DNA sequences, and arise in both the male and female germlines. [28] Because CpG mutation is largely independent of replication, it shows little sex bias and effectively lowers the male-to-female mutation rate ratio. [29] In addition, neighbor-dependent mutations can also cause biases in mutation rate and may be unrelated to DNA replication. For example, mutations induced by mutagens, such as exposure to UV light, show weak male mutation bias. [30]
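The estimate α ≈ 6 can be reproduced with a short calculation, sketched below under the assumption that the relation between the Y/X ratio and α is Miyata's formula Y/X = 3α / (2 + α). That form follows from a Y-linked sequence spending every generation in males, while an X-linked sequence spends one third of its time in males and two thirds in females. The function names are illustrative.

```python
def yx_ratio(alpha: float) -> float:
    """Expected Y/X mutation rate ratio for a male-to-female rate ratio alpha."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    # Y-linked rate is proportional to alpha; X-linked rate to (alpha + 2) / 3.
    return 3.0 * alpha / (2.0 + alpha)


def alpha_from_yx(yx: float) -> float:
    """Invert Y/X = 3*alpha / (2 + alpha) to get alpha = 2*(Y/X) / (3 - Y/X)."""
    if not 0.0 <= yx < 3.0:
        raise ValueError("Y/X must lie in [0, 3) under this model")
    return 2.0 * yx / (3.0 - yx)


alpha_primates = alpha_from_yx(2.25)  # mean Y/X quoted for higher primates
```

Under this relation the Y/X ratio can never exceed 3, however large α is, so estimates of α become very sensitive to small errors in Y/X as the ratio approaches that limit.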
A GC-AT bias is a bias with a net effect on GC content. For instance, if G and C sites are simply more mutable than A and T sites, other things being equal, this would result in a net downward pressure on GC content. Mutation-accumulation studies indicate a strong many-fold bias toward AT in mitochondria of D. melanogaster, [31] and a more modest 2-fold bias toward AT in yeast. [8]
A common idea in the literature of molecular evolution is that codon usage and genome composition reflect the effects of mutation bias, e.g., codon usage has been treated with a mutation-selection-drift model combining mutation biases, selection for translationally preferred codons, and drift. To the extent that mutation bias prevails under this model, mutation bias toward GC is responsible for genomes with high GC content, and likewise the opposite bias is responsible for genomes with low GC content. [32]
Starting in the 1990s, it became clear that GC-biased gene conversion was a major factor—previously unanticipated—in affecting GC content in diploid organisms such as mammals. [33]
Similarly, although it may be the case that bacterial genome composition strongly reflects GC and AT biases, the proposed mutational biases have not been demonstrated to exist. Indeed, Hershberg and Petrov suggest that mutation in most bacterial genomes is biased toward AT, even when the genome is not AT-rich. [2]
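To make the mutation-only expectation concrete, here is a minimal sketch under a simple two-state model. It assumes that an "AT bias" of m means each G or C site mutates toward A or T at m times the per-site rate of the reverse change, and that selection, drift, and gene conversion are absent. Under those assumptions the equilibrium GC fraction is 1 / (1 + m). The function name is illustrative.

```python
def equilibrium_gc(at_bias: float) -> float:
    """Equilibrium GC fraction under mutation alone in a two-state model.

    at_bias is the assumed ratio u / v, where u is the per-site GC->AT rate
    and v is the per-site AT->GC rate. At equilibrium f * u = (1 - f) * v,
    so f = v / (u + v) = 1 / (1 + at_bias).
    """
    if at_bias <= 0:
        raise ValueError("at_bias must be positive")
    return 1.0 / (1.0 + at_bias)


gc_two_fold = equilibrium_gc(2.0)   # a 2-fold bias toward AT
gc_unbiased = equilibrium_gc(1.0)   # no bias
```

An observed GC content well above the value predicted this way is the kind of discrepancy that points to other factors, such as selection or GC-biased gene conversion.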
Mutation biases are not constant, but vary taxonomically, as shown in the table below (from [39]), and with conditions such as nutritional state. [40]
Group | Species | AT Bias | Ts:Tv Bias | Nonsyn:Syn Ratio | Ins:Del Ratio |
---|---|---|---|---|---|
Prokaryotes | Bacillus subtilis NCIB3610 | 0.60 | 6:1 | 3:1 | — |
Prokaryotes | Burkholderia cenocepacia | 0.83 | 2:1 | 3:1 | 0.94 |
Prokaryotes | Deinococcus radiodurans | 0.49 | 3:1 | 3:1 | 1.11 |
Prokaryotes | Escherichia coli K12 substr. MG1655 | 1.24 | 3:1 | 2:1 | 0.40 |
Prokaryotes | Escherichia coli ED1a | 2.09 | 3:1 | 3:1 | 0.19 |
Prokaryotes | Escherichia coli IAI1 | 2.04 | 2:1 | 2:1 | 0.19 |
Prokaryotes | Mesoplasma florum L1 | 15.97 | 3:1 | 6:1 | 0.98 |
Prokaryotes | Mycobacterium smegmatis | 0.73 | 3:1 | 2:1 | 2.14 |
Prokaryotes | Vibrio cholerae 2740–80 | 2.71 | 3:1 | 2:1 | 0.29 |
Prokaryotes | Vibrio fischeri ES114 | 4.26 | 2:1 | 5:1 | 0.58 |
Unicell. euk. | Bathycoccus prasinos | 2.89 | 1:1 | 2:1 | 1.00 |
Unicell. euk. | Chlamydomonas reinhardtii | 1.10 | 1:1 | — | 1.60 |
Unicell. euk. | Chlamydomonas reinhardtii | 2.88 | 2:1 | 2:1 | 0.84 |
Unicell. euk. | Micromonas pusilla | 1.00 | 2:1 | 3:1 | 0.17 |
Unicell. euk. | Ostreococcus mediterraneus | 1.31 | 3:1 | 4:1 | 0.38 |
Unicell. euk. | Ostreococcus tauri | 1.74 | 7:1 | 2:1 | 0.63 |
Unicell. euk. | Paramecium tetraurelia | 12.86 | 1:1 | 2:1 | — (5:0) |
Unicell. euk. | Saccharomyces cerevisiae | 3.96 | 1:1 | 3:1 | — (0:1) |
Unicell. euk. | Saccharomyces cerevisiae | 2.23 | 2:1 | 3:1 | 0.45 |
Unicell. euk. | Schizosaccharomyces pombe | 2.65 | 2:1 | 3:1 | 6.00 |
Unicell. euk. | Schizosaccharomyces pombe | 2.97 | 1:1 | 2:1 | 6.13 |
Unicell. euk. | Tetrahymena thermophila | 10.04 | 3:1 | 2:1 | — |
Multicell. euk. | Arabidopsis thaliana | 6.09 | 5:1 | 3:1 | 0.50 |
Multicell. euk. | Caenorhabditis elegans | 2.24 | 1:1 | 2:1 | — |
Multicell. euk. | Daphnia pulex | 2.69 | 3:1 | — | — |
Multicell. euk. | Drosophila melanogaster | 2.08 | 2:1 | 2:1 | 0.17 |
Multicell. euk. | Drosophila melanogaster | 4.33 | 6:1 | 9:1 | 0.20 |
Multicell. euk. | Drosophila melanogaster | 2.85 | 2:1 | 3:1 | 0.33 |
Multicell. euk. | Drosophila melanogaster | 3.84 | 2:1 | 3:1 | 0.32 |
Multicell. euk. | Drosophila melanogaster | 3.12 | 2:1 | — | — |
Multicell. euk. | Pristionchus pacificus | 5.16 | 2:1 | 3:1 | — |
The concept of mutation bias, as defined above, does not imply foresight, design, or even a specially evolved tendency, e.g., the bias may emerge simply as a side-effect of DNA repair processes. Currently there is no established terminology for mutation-generating systems that tend to produce useful mutations. The term "directed mutation" or adaptive mutation is sometimes used with the implication of a process of mutation that senses and responds to conditions directly. When the sense is simply that the mutation system is tuned to enhance the production of helpful mutations under certain conditions, the terminology of "mutation strategies" [41] or "natural genetic engineering" [42] has been suggested, but these terms are not widely used. As argued in Ch. 5 of Stoltzfus 2021, [43] various mechanisms of mutation in pathogenic microbes, e.g., mechanisms for phase variation and antigenic variation, appear to have evolved so as to enhance lineage survival, and these mechanisms are routinely described as strategies or adaptations in the microbial genetics literature, such as by Foley 2015. [44]