Quasispecies model

The quasispecies model is a description of the process of the Darwinian evolution of certain self-replicating entities within the framework of physical chemistry. A quasispecies is a large group or "cloud" of related genotypes that exist in an environment of high mutation rate (at stationary state [1] ), where a large fraction of offspring are expected to contain one or more mutations relative to the parent. This is in contrast to a species, which from an evolutionary perspective is a more-or-less stable single genotype, most of the offspring of which will be genetically accurate copies. [2]

It is useful mainly in providing a qualitative understanding of the evolutionary processes of self-replicating macromolecules such as RNA or DNA or simple asexual organisms such as bacteria or viruses (see also viral quasispecies), and is helpful in explaining something of the early stages of the origin of life. Quantitative predictions based on this model are difficult because the parameters that serve as its input are impossible to obtain from actual biological systems. The quasispecies model was put forward by Manfred Eigen and Peter Schuster [3] based on initial work done by Eigen. [4]

Simplified explanation

When evolutionary biologists describe competition between species, they generally assume that each species is a single genotype whose descendants are mostly accurate copies. (Such genotypes are said to have a high reproductive fidelity.) In evolutionary terms, we are interested in the behavior and fitness of that one species or genotype over time. [5]

Some organisms or genotypes, however, may exist in circumstances of low fidelity, where most descendants contain one or more mutations. A group of such genotypes is constantly changing, so discussions of which single genotype is the most fit become meaningless. Importantly, if many closely related genotypes are only one mutation away from each other, then genotypes in the group can mutate back and forth into each other. For example, with one mutation per generation, a child of the sequence AGGT could be AGTT, and a grandchild could be AGGT again. Thus we can envision a "cloud" of related genotypes that is rapidly mutating, with sequences going back and forth among different points in the cloud. Though the proper definition is mathematical, that cloud, roughly speaking, is a quasispecies. [6]

Quasispecies behavior arises in large populations whose mutation rates fall within a certain (high) range. [7]

Quasispecies, fitness, and evolutionary selection

In a species, though reproduction may be mostly accurate, periodic mutations will give rise to one or more competing genotypes. If a mutation results in greater replication and survival, the mutant genotype may out-compete the parent genotype and come to dominate the species. Thus, the individual genotypes (or species) may be seen as the units on which selection acts and biologists will often speak of a single genotype's fitness. [8]

In a quasispecies, however, mutations are ubiquitous and so the fitness of an individual genotype becomes meaningless: if one particular mutation generates a boost in reproductive success, it can't amount to much because that genotype's offspring are unlikely to be accurate copies with the same properties. Instead, what matters is the connectedness of the cloud. For example, the sequence AGGT has 12 (3+3+3+3) possible single point mutants AGGA, AGGG, and so on. If 10 of those mutants are viable genotypes that may reproduce (and some of whose offspring or grandchildren may mutate back into AGGT again), we would consider that sequence a well-connected node in the cloud. If instead only two of those mutants are viable, the rest being lethal mutations, then that sequence is poorly connected and most of its descendants will not reproduce. The analog of fitness for a quasispecies is the tendency of nearby relatives within the cloud to be well-connected, meaning that more of the mutant descendants will be viable and give rise to further descendants within the cloud. [9]
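The count of 12 single point mutants and the idea of connectedness can be illustrated with a minimal Python sketch. The function names and the set of viable sequences below are hypothetical, chosen only for illustration; real viability would have to come from experiment.

```python
BASES = "ACGT"

def single_point_mutants(seq):
    """Return every sequence that differs from seq at exactly one position."""
    mutants = []
    for pos, original in enumerate(seq):
        for base in BASES:
            if base != original:
                mutants.append(seq[:pos] + base + seq[pos + 1:])
    return mutants

def connectedness(seq, viable):
    """Count how many single point mutants of seq are in the viable set."""
    return sum(1 for mutant in single_point_mutants(seq) if mutant in viable)

mutants = single_point_mutants("AGGT")
print(len(mutants))  # 3 alternatives at each of 4 positions

# Hypothetical viable set: only two neighbours of AGGT can reproduce.
viable = {"AGGA", "AGGG"}
print(connectedness("AGGT", viable))
```

A sequence with a high count is a well-connected node in the cloud; a low count means most of its mutant offspring are dead ends.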

When the fitness of a single genotype becomes meaningless because of the high rate of mutations, the cloud as a whole or quasispecies becomes the natural unit of selection.

Application to biological research

The quasispecies model has been used to describe the evolution of high-mutation-rate viruses such as HIV and, in some cases, of single genes or molecules within the genomes of other organisms. [10] [11] [12] Quasispecies models have also been proposed by Jose Fontanari and Emmanuel David Tannenbaum to model the evolution of sexual reproduction. [13] Quasispecies behavior has also been shown in compositional replicators (based on the GARD model for abiogenesis) [14] and has been suggested as a description of cell replication, which among other things requires the maintenance and evolution of the internal composition of the parent and bud.

Formal background

The model rests on four assumptions: [15]

  1. The self-replicating entities can be represented as sequences composed of a small number of building blocks—for example, sequences of RNA consisting of the four bases adenine, guanine, cytosine, and uracil.
  2. New sequences enter the system solely as the result of a copy process, either correct or erroneous, of other sequences that are already present.
  3. The substrates, or raw materials, necessary for ongoing replication are always present in sufficient quantity. Excess sequences are washed away in an outgoing flux.
  4. Sequences may decay into their building blocks. The probability of decay does not depend on the sequences' age; old sequences are just as likely to decay as young sequences.

In the quasispecies model, mutations occur through errors made in the process of copying already existing sequences. Further, selection arises because different types of sequences tend to replicate at different rates, which leads to the suppression of sequences that replicate more slowly in favor of sequences that replicate faster. However, the quasispecies model does not predict the ultimate extinction of all but the fastest replicating sequence. Although the sequences that replicate more slowly cannot sustain their abundance level by themselves, they are constantly replenished as sequences that replicate faster mutate into them. At equilibrium, removal of slowly replicating sequences due to decay or outflow is balanced by replenishing, so that even relatively slowly replicating sequences can remain present in finite abundance. [16]

Due to the ongoing production of mutant sequences, selection does not act on single sequences, but on mutational "clouds" of closely related sequences, referred to as quasispecies. In other words, the evolutionary success of a particular sequence depends not only on its own replication rate, but also on the replication rates of the mutant sequences it produces, and on the replication rates of the sequences of which it is a mutant. As a consequence, the sequence that replicates fastest may even disappear completely in selection-mutation equilibrium, in favor of more slowly replicating sequences that are part of a quasispecies with a higher average growth rate. [17] Mutational clouds as predicted by the quasispecies model have been observed in RNA viruses and in in vitro RNA replication. [18] [19]

The mutation rate and the general fitness of the molecular sequences and their neighbors are crucial to the formation of a quasispecies. If the mutation rate is zero, there is no exchange by mutation, and each sequence is its own species. If the mutation rate is too high, exceeding what is known as the error threshold, the quasispecies will break down and be dispersed over the entire range of available sequences. [20]

Mathematical description

A simple mathematical model for a quasispecies is as follows: [21] let there be $S$ possible sequences and let there be $n_i$ organisms with sequence i. Let's say that each of these organisms asexually gives rise to $A_i$ offspring. Some are duplicates of their parent, having sequence i, but some are mutant and have some other sequence. Let the mutation rate $q_{ij}$ correspond to the probability that a j type parent will produce an i type organism. Then the expected number of i type offspring generated by each j type organism is

$$w_{ij} = A_j q_{ij},$$

where $\sum_i q_{ij} = 1$.

Then the total number of i-type organisms after the first round of reproduction, given as $n'_i$, is

$$n'_i = \sum_j w_{ij} n_j.$$

Sometimes a death rate term $D_i$ is included so that:

$$w_{ij} = A_j q_{ij} - D_i \delta_{ij},$$

where $\delta_{ij}$ is equal to 1 when i=j and is zero otherwise. Writing W for the matrix with entries $w_{ij}$, the n-th generation can be found by taking the n-th power of W and substituting it in place of W in the above formula.
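One round of reproduction can be sketched in Python as follows. Here `A[j]` is the offspring count of type j, `q[i][j]` is the probability that a type j parent yields a type i offspring, `D[i]` is an optional death rate, and `w` is the resulting matrix. The two-sequence numbers are hypothetical and serve only to make the arithmetic visible.

```python
def build_w(A, q, D=None):
    """Build w[i][j] = A[j] * q[i][j], minus D[i] on the diagonal if given."""
    size = len(A)
    w = [[A[j] * q[i][j] for j in range(size)] for i in range(size)]
    if D is not None:
        for i in range(size):
            w[i][i] -= D[i]
    return w

def next_generation(w, n):
    """Return n'[i] = sum over j of w[i][j] * n[j]."""
    return [sum(w_ij * n_j for w_ij, n_j in zip(row, n)) for row in w]

# Hypothetical system: type 0 replicates twice as fast as type 1.
A = [2.0, 1.0]
q = [[0.9, 0.2],   # each column sums to 1: every offspring has some type
     [0.1, 0.8]]
w = build_w(A, q)
n1 = next_generation(w, [100.0, 50.0])
print(n1)
```

Without a death term the total population after the step equals the total number of offspring, here 2 × 100 + 1 × 50, because each column of `q` sums to 1.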

This is just a system of linear equations. The usual way to solve such a system is to first diagonalize the W matrix. Its diagonal entries will be eigenvalues corresponding to certain linear combinations of certain subsets of sequences which will be eigenvectors of the W matrix. These subsets of sequences are the quasispecies. Assuming that the matrix W is a primitive matrix (irreducible and aperiodic), then after very many generations only the eigenvector with the largest eigenvalue will prevail, and it is this quasispecies that will eventually dominate. The components of this eigenvector give the relative abundance of each sequence at equilibrium. [22]
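As a rough illustration of how the dominant quasispecies emerges, the sketch below applies W repeatedly and renormalizes the population after every generation (power iteration). This is a swapped-in numerical shortcut rather than a full diagonalization, and it assumes W is primitive. The matrix values are hypothetical.

```python
def mat_vec(w, v):
    """Multiply matrix w (a list of rows) by the column vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]

def dominant_quasispecies(w, n0, generations=500):
    """Iterate the population, returning (growth factor, relative abundances)."""
    total = sum(n0)
    x = [v / total for v in n0]
    growth = 0.0
    for _ in range(generations):
        y = mat_vec(w, x)
        growth = sum(y)              # total growth per generation, since sum(x) == 1
        x = [v / growth for v in y]  # renormalize to relative abundances
    return growth, x

# Hypothetical primitive matrix (all entries positive).
w = [[1.8, 0.2],
     [0.2, 0.8]]
growth, x = dominant_quasispecies(w, [1.0, 1.0])
print(growth, x)
```

After many generations `growth` approximates the largest eigenvalue and `x` the corresponding eigenvector, scaled so that its components sum to 1.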

Note about primitive matrices

W being primitive means that for some integer $n > 0$, the $n$-th power of W is > 0, i.e. all of its entries are positive. If W is primitive then each type can, through a sequence of mutations (i.e. powers of W), mutate into all the other types after some number of generations. W is not primitive if it is periodic, where the population can perpetually cycle through different disjoint sets of compositions, or if it is reducible, where the dominant species (or quasispecies) that develops can depend on the initial population, as is the case in the simple example given below.
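Primitivity can be tested numerically by raising the matrix to successive powers. The sketch below assumes W has no negative entries (so no death term large enough to make a diagonal entry negative) and works on the zero/nonzero pattern only. It stops at the power $(S-1)^2 + 1$, using Wielandt's bound on the exponent of a primitive $S \times S$ matrix. The function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def to_pattern(m):
    """Replace every positive entry by 1 and everything else by 0."""
    return [[1 if v > 0 else 0 for v in row] for row in m]

def is_primitive(w):
    """True if some power of the nonnegative matrix w is strictly positive."""
    size = len(w)
    pattern = to_pattern(w)
    power = pattern
    max_power = (size - 1) ** 2 + 1  # Wielandt's bound
    for _ in range(max_power):
        if all(v > 0 for row in power for v in row):
            return True
        power = to_pattern(mat_mul(power, pattern))
    return False

print(is_primitive([[0.9, 0.1], [0.1, 0.9]]))  # every type reaches every type
print(is_primitive([[0, 1], [1, 0]]))          # periodic: the two types swap forever
print(is_primitive([[1, 0], [1, 1]]))          # reducible: type 1 never yields type 0
```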

Alternative formulations

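The quasispecies formulae may be expressed as a set of linear differential equations. If we consider the difference between the new state $n'_i$ and the old state $n_i$ to be the state change over one moment of time, then the time derivative of $n_i$ is given by this difference, $\dot{n}_i = n'_i - n_i$, and we can write:

$$\dot{n}_i = \sum_j w_{ij} n_j - n_i.$$

The quasispecies equations are usually expressed in terms of concentrations $x_i$ where

$$x_i = \frac{n_i}{\sum_j n_j}, \qquad x'_i = \frac{n'_i}{\sum_j n'_j}.$$

The above equations for the quasispecies then become for the discrete version:

$$x'_i = \frac{\sum_j w_{ij} x_j}{\sum_{m,j} w_{mj} x_j}$$

or, for the continuum version:

$$\dot{x}_i = \sum_j w_{ij} x_j - x_i \sum_{m,j} w_{mj} x_j.$$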
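The continuum version can be explored with a simple forward Euler integration. In the sketch below each concentration changes by its production through W minus its own share of the total production. The step size, step count, and matrix values are assumptions chosen for illustration, and a more careful integrator may be preferable in practice.

```python
def derivative(w, x):
    """Rate of change of each concentration x[i] under the continuum equation."""
    production = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]
    mean_growth = sum(production)  # the double sum over all entries of w times x
    return [p - x_i * mean_growth for p, x_i in zip(production, x)]

def integrate(w, x0, dt=0.01, steps=5000):
    """Advance the concentrations with fixed-step forward Euler."""
    x = list(x0)
    for _ in range(steps):
        dx = derivative(w, x)
        x = [x_i + dt * d_i for x_i, d_i in zip(x, dx)]
    return x

# Hypothetical two-sequence system, starting from equal concentrations.
w = [[1.8, 0.2],
     [0.2, 0.8]]
x = integrate(w, [0.5, 0.5])
print(x)
```

The subtracted term keeps the concentrations summing to 1, and the trajectory settles where the derivative vanishes, which is an eigenvector of W.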

Simple example

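The quasispecies concept can be illustrated by a simple system consisting of 4 sequences. Sequences [0,0], [0,1], [1,0], and [1,1] are numbered 1, 2, 3, and 4, respectively. Let's say the [0,0] sequence never mutates and always produces a single offspring. Let's say the other 3 sequences all produce, on average, $1-k$ replicas of themselves, and $k$ of each of the other two types, where $0 \le k \le 1$. The W matrix is then:

$$\mathbf{W} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1-k & k & k \\ 0 & k & 1-k & k \\ 0 & k & k & 1-k \end{bmatrix}.$$

The diagonalized matrix is:

$$\mathbf{W}' = \begin{bmatrix} 1-2k & 0 & 0 & 0 \\ 0 & 1-2k & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1+k \end{bmatrix}.$$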

And the eigenvectors corresponding to these eigenvalues are:

Eigenvalue    Eigenvector
1-2k          [0,-1,0,1]
1-2k          [0,-1,1,0]
1             [1,0,0,0]
1+k           [0,1,1,1]

Only the eigenvalue $1+k$ is more than unity. For the n-th generation, the corresponding eigenvalue will be $(1+k)^n$ and so will increase without bound as time goes by. This eigenvalue corresponds to the eigenvector [0,1,1,1], which represents the quasispecies consisting of sequences 2, 3, and 4, which will be present in equal numbers after a very long time. Since all population numbers must be positive, the first two quasispecies are not legitimate. The third quasispecies consists of only the non-mutating sequence 1. It's seen that even though sequence 1 is the most fit in the sense that it reproduces more of itself than any other sequence, the quasispecies consisting of the other three sequences will eventually dominate (assuming that the initial population was not homogeneous of the sequence 1 type).
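The example can be checked numerically with a short Python sketch. It takes each of sequences 2, 3, and 4 to produce 1-k copies of itself and k copies of each of the other two, which is consistent with the eigenvalues 1-2k and 1+k in the table. The value k = 0.25 and the starting populations are arbitrary choices for illustration.

```python
def example_w(k):
    """W matrix of the four-sequence example for a given k."""
    return [[1, 0, 0, 0],
            [0, 1 - k, k, k],
            [0, k, 1 - k, k],
            [0, k, k, 1 - k]]

def mat_vec(w, v):
    """Multiply matrix w (a list of rows) by the column vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]

def is_eigenpair(w, eigenvalue, vector, tol=1e-12):
    """Check that w applied to vector equals eigenvalue times vector."""
    image = mat_vec(w, vector)
    return all(abs(a - eigenvalue * b) <= tol for a, b in zip(image, vector))

def evolve(w, n, generations):
    """Apply w to the population vector once per generation."""
    for _ in range(generations):
        n = mat_vec(w, n)
    return n

k = 0.25
w = example_w(k)
table = [(1 - 2 * k, [0, -1, 0, 1]),
         (1 - 2 * k, [0, -1, 1, 0]),
         (1, [1, 0, 0, 0]),
         (1 + k, [0, 1, 1, 1])]
print([is_eigenpair(w, lam, vec) for lam, vec in table])

# Start with sequence 1 in a large majority and a single copy of sequence 2.
n = evolve(w, [1000.0, 1.0, 0.0, 0.0], 200)
fractions = [v / sum(n) for v in n]
print(fractions)

# A population made purely of sequence 1 never changes.
pure = evolve(w, [1000.0, 0.0, 0.0, 0.0], 200)
print(pure)
```

Sequence 1 keeps its absolute count of 1000, but its share of the population shrinks toward zero while sequences 2, 3, and 4 approach equal shares.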

References

  1. Eigen M, McCaskill J, Schuster P (1989). Molecular quasi-species. Vol. 92. John Wiley & Sons, Inc. pp. 6881–6891. doi:10.1021/j100335a010. hdl:11858/00-001M-0000-002C-84A7-C. ISBN 9780471622192.
  2. Biebricher, C.K, Eigen, M. (2006). "What is a Quasispecies". In Esteban Domingo (ed.). Quasispecies: Concept and Implications for Virology. Springer. p. 1. ISBN 978-3-540-26395-1.
  3. Eigen M, Schuster P (1979). The Hypercycle: A Principle of Natural Self-Organization. Berlin: Springer-Verlag. ISBN 978-0-387-09293-5.
  4. Eigen M (October 1971). "Selforganization of matter and the evolution of biological macromolecules". Die Naturwissenschaften. 58 (10): 465–523. Bibcode:1971NW.....58..465E. doi:10.1007/BF00623322. PMID 4942363. S2CID 38296619.
  5. Charlesworth B, Charlesworth D (November 2009). "Darwin and genetics". Genetics. 183 (3): 757–66. doi:10.1534/genetics.109.109991. PMC 2778973. PMID 19933231.
  6. Eigen M (2013). From Strange Simplicity to Complex Familiarity. Oxford University Press. pp. 404–406. ISBN 978-0-19-857021-9.
  7. Martinez, MA, Martus G, Capel E, Parera M, Franco S, Nevot M (2012) Quasispecies Dynamics of RNA Viruses. In: Viruses: Essential Agents of Life, Springer, Dordrecht, pp. 21-42.
  8. "Evolution and the tree of life | Biology | Science". Khan Academy. Retrieved 2019-02-20.
  9. Heylighen F (2023). "Complexity and Evolution" (PDF). Entropy. 25 (2): 286. doi:10.3390/e25020286. PMC 9955364. PMID 36832653. Lecture notes 2014-2015
  10. Holland JJ, De La Torre JC, Steinhauer DA (1992). "RNA Virus Populations as Quasispecies". Genetic Diversity of RNA Viruses. Current Topics in Microbiology and Immunology. Vol. 176. pp. 1–20. doi:10.1007/978-3-642-77011-1_1. ISBN 978-3-642-77013-5. PMID 1600747. S2CID 46530529.
  11. Shuman LJ, Wolfe H, Whetsell GW, Huber GA (September 1976). "Reimbursement alternatives for home health care". Inquiry: A Journal of Medical Care Organization, Provision and Financing. 13 (3): 277–87. PMID 135734.
  12. Wilke CO (August 2005). "Quasispecies theory in the context of population genetics". BMC Evolutionary Biology. 5: 44. doi:10.1186/1471-2148-5-44. PMC 1208876. PMID 16107214.
  13. Tannenbaum E, Fontanari JF (March 2008). "A quasispecies approach to the evolution of sexual replication in unicellular organisms". Theory in Biosciences. 127 (1): 53–65. doi:10.1007/s12064-008-0023-2. PMID 18286313. S2CID 8741998.
  14. Gross R, Fouxon I, Lancet D, Markovitch O (December 2014). "Quasispecies in population of compositional assemblies". BMC Evolutionary Biology. 14: 265. doi:10.1186/s12862-014-0265-1. PMC 4357159. PMID 25547629.
  15. Bull JJ, Meyers LA, Lachmann M (November 2005). "Quasispecies made simple". PLOS Computational Biology. 1 (6): e61. Bibcode:2005PLSCB...1...61B. doi:10.1371/journal.pcbi.0010061. PMC 1289388. PMID 16322763.
  16. Systems Biology: A Textbook. By Edda Klipp, Wolfram Liebermeister, Christoph Wierling, Axel Kowald.
  17. Schuster P, Swetina J (November 1988). "Stationary mutant distributions and evolutionary optimization". Bulletin of Mathematical Biology. 50 (6): 635–60. doi:10.1007/BF02460094. PMID 3219448. S2CID 189885782.
  18. Domingo E, Holland JJ (October 1997). "RNA virus mutations and fitness for survival". Annual Review of Microbiology. 51: 151–78. doi:10.1146/annurev.micro.51.1.151. PMID 9343347.
  19. Burch CL, Chao L (August 2000). "Evolvability of an RNA virus is determined by its mutational neighbourhood". Nature. 406 (6796): 625–8. Bibcode:2000Natur.406..625B. doi:10.1038/35020564. PMID 10949302. S2CID 1894386.
  20. Manrubia SC, Domingo E, Lázaro E (June 2010). "Pathways to extinction: beyond the error threshold". Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 365 (1548): 1943–52. doi:10.1098/rstb.2010.0076. PMC 2880120. PMID 20478889.
  21. Eigen M, McCaskill J, Schuster P (1989). Molecular quasi-species. Vol. 92. John Wiley & Sons, Inc. pp. 6881–6891. doi:10.1021/j100335a010. hdl:11858/00-001M-0000-002C-84A7-C. ISBN 9780471622192.
  22. S Tseng Z (2008). Phase Portraits of Linear Systems.

Further reading