Fitness landscape

In evolutionary biology, fitness landscapes or adaptive landscapes (types of evolutionary landscapes) are used to visualize the relationship between genotypes and reproductive success. It is assumed that every genotype has a well-defined replication rate (often referred to as fitness). This fitness is the "height" of the landscape. Genotypes which are similar are said to be "close" to each other, while those that are very different are "far" from each other. The set of all possible genotypes, their degree of similarity, and their related fitness values is then called a fitness landscape. The idea of a fitness landscape is a metaphor to help explain flawed forms in evolution by natural selection, including exploits and glitches in animals like their reactions to supernormal stimuli.

The idea of studying evolution by visualizing the distribution of fitness values as a kind of landscape was first introduced by Sewall Wright in 1932. [1]

In evolutionary optimization problems, fitness landscapes are evaluations of a fitness function for all candidate solutions (see below).

In biology

Sketch of a fitness landscape. The arrows indicate the preferred flow of a population on the landscape, and the points A and C are local optima. The red ball indicates a population that has moved from a very low fitness value to the top of a peak.

In all fitness landscapes, height represents fitness. There are three distinct ways of characterizing the other dimensions, though in each case distance represents degree of dissimilarity. [2]

Fitness landscapes are often conceived of as ranges of mountains. There exist local peaks (points from which all paths are downhill, i.e. to lower fitness) and valleys (regions from which many paths lead uphill). A fitness landscape with many local peaks surrounded by deep valleys is called rugged. If all genotypes have the same replication rate, on the other hand, a fitness landscape is said to be flat. An evolving population typically climbs uphill in the fitness landscape, by a series of small genetic changes, until – in the infinite time limit – a local optimum is reached.
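The notions of local peak and uphill climb can be made concrete with a small sketch. It assumes genotypes are short bit strings, that "close" means one bit flip apart, and that fitness values are drawn at random; the names `neighbors`, `is_local_peak` and `hill_climb` are illustrative only and do not come from any particular library.

```python
import random
from itertools import product

def neighbors(genotype):
    """All genotypes one point mutation (one bit flip) away."""
    return [genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            for i in range(len(genotype))]

def is_local_peak(genotype, fitness):
    """True if every single-mutation neighbor has strictly lower fitness."""
    return all(fitness[n] < fitness[genotype] for n in neighbors(genotype))

def hill_climb(start, fitness):
    """Move to the fittest neighbor until no neighbor is fitter."""
    current = start
    while True:
        best = max(neighbors(current), key=fitness.__getitem__)
        if fitness[best] <= fitness[current]:
            return current
        current = best

# Hypothetical toy landscape: a random fitness for every 4-locus genotype.
rng = random.Random(0)
genotypes = list(product((0, 1), repeat=4))
fitness = {g: rng.random() for g in genotypes}

peaks = [g for g in genotypes if is_local_peak(g, fitness)]
end = hill_climb((0, 0, 0, 0), fitness)
print(len(peaks), "local peaks; climb from 0000 ends at", end)
```

On a flat landscape, where all fitness values are equal, this climber stops immediately and no genotype counts as a peak under the strict definition used here.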

Note that a local optimum cannot always be found even in evolutionary time: if the local optimum can be found in a reasonable amount of time then the fitness landscape is called "easy" and if the time required is exponential then the fitness landscape is called "hard". [3] Hard landscapes are characterized by the maze-like property by which an allele that was once beneficial becomes deleterious, forcing evolution to backtrack. However, the presence of the maze-like property in biophysically inspired fitness landscapes may not be sufficient to generate a hard landscape. [4]

Visualization of two dimensions of an NK fitness landscape. The arrows represent various mutational paths that the population could follow while evolving on the fitness landscape.

Genotype to fitness landscapes

Wright visualized a genotype space as a hypercube. [1] No continuous genotype "dimension" is defined. Instead, a network of genotypes is connected via mutational paths.

Stuart Kauffman's NK model falls into this category of fitness landscape. Newer network analysis techniques such as selection-weighted attraction graphing (SWAG) also use a dimensionless genotype space. [5]
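A minimal sketch of an NK-style landscape may help make this category concrete. It assumes binary loci, uniformly random fitness contributions, and that each locus interacts with the K loci that follow it cyclically, which is only one of several wiring conventions. The function names `make_nk_landscape` and `count_local_peaks` are hypothetical.

```python
import random
from itertools import product

def make_nk_landscape(n, k, seed=0):
    """Return a fitness function for a random NK-style landscape.

    Assumption: locus i interacts with the k loci after it (cyclically).
    """
    rng = random.Random(seed)
    # One lookup table per locus: a random contribution for each joint
    # state of the locus and its k interaction partners.
    tables = [{states: rng.random() for states in product((0, 1), repeat=k + 1)}
              for _ in range(n)]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            states = tuple(genotype[(i + j) % n] for j in range(k + 1))
            total += tables[i][states]
        return total / n  # mean contribution per locus

    return fitness

def count_local_peaks(n, fitness):
    """Count genotypes that are fitter than all one-mutation neighbors."""
    peaks = 0
    for g in product((0, 1), repeat=n):
        one_step = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(n))
        if all(fitness(m) < fitness(g) for m in one_step):
            peaks += 1
    return peaks

for k in (0, 2, 5):
    f = make_nk_landscape(n=6, k=k, seed=1)
    print("K =", k, "local peaks:", count_local_peaks(6, f))
```

With K = 0 the loci contribute independently, so the landscape has a single peak; larger K couples the loci and typically produces a more rugged landscape with more local peaks.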

Allele frequency to fitness landscapes

Wright's mathematical work described fitness as a function of allele frequencies. [2] Here, each dimension describes an allele frequency at a different gene, and ranges from 0 to 1.
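As an illustration of a landscape over allele frequencies, the sketch below computes population mean fitness along a single dimension. It assumes one gene with two alleles, random mating (Hardy–Weinberg genotype proportions), and made-up genotype fitness values; none of these specifics come from the text above.

```python
def mean_fitness(p, w_AA, w_Aa, w_aa):
    """Mean fitness at allele frequency p, assuming Hardy-Weinberg proportions."""
    q = 1.0 - p
    return p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa

# Hypothetical case where the heterozygote is less fit than both homozygotes.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, mean_fitness(p, w_AA=1.0, w_Aa=0.5, w_aa=1.0))
```

With these illustrative values the one-dimensional landscape has peaks at both ends of the allele frequency axis (p = 0 and p = 1), separated by a valley at intermediate frequencies.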

Phenotype to fitness landscapes

In the third kind of fitness landscape, each dimension represents a different phenotypic trait. [2] Under the assumptions of quantitative genetics, these phenotypic dimensions can be mapped onto genotypes. See the visualizations below for examples of phenotype to fitness landscapes.

In evolutionary optimization

Apart from the field of evolutionary biology, the concept of a fitness landscape has also gained importance in evolutionary optimization methods such as genetic algorithms or evolution strategies. In evolutionary optimization, one tries to solve real-world problems (e.g., engineering or logistics problems) by imitating the dynamics of biological evolution. For example, a delivery truck with a number of destination addresses can take a large variety of different routes, but only very few will result in a short driving time.

In order to use many common forms of evolutionary optimization, one has to define, for every possible solution s to the problem of interest (i.e., every possible route in the case of the delivery truck), how 'good' it is. This is done by introducing a scalar-valued function f(s), which is called the fitness function. Scalar-valued means that f(s) is a simple number, such as 0.3, while s can be a more complicated object, for example a list of destination addresses in the case of the delivery truck.

A high f(s) implies that s is a good solution. In the case of the delivery truck, f(s) could be the number of deliveries per hour on route s. The best, or at least a very good, solution is then found in the following way: initially, a population of random solutions is created. Then, the solutions are mutated and selected for those with higher fitness, until a satisfying solution has been found.
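The procedure just described can be sketched for the delivery truck example. Everything specific here is an assumption made for illustration: the stop coordinates, the constant driving speed, the swap mutation, and the rule of keeping the fittest routes each generation. It is one simple evolutionary scheme among many, not a reference implementation.

```python
import math
import random

# Hypothetical delivery stops as (x, y) coordinates in km; depot at the origin.
STOPS = [(2, 3), (5, 1), (6, 6), (1, 7), (8, 3), (4, 8)]
DEPOT = (0, 0)
SPEED_KMH = 30.0  # assumed constant driving speed

def fitness(route):
    """f(s): deliveries per hour when visiting STOPS in the given order."""
    points = [DEPOT] + [STOPS[i] for i in route] + [DEPOT]
    km = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return len(route) / (km / SPEED_KMH)

def mutate(route, rng):
    """Return a copy of the route with two randomly chosen stops swapped."""
    i, j = rng.sample(range(len(route)), 2)
    child = list(route)
    child[i], child[j] = child[j], child[i]
    return child

def evolve(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Initial population of random routes (random orderings of the stops).
    population = [rng.sample(range(len(STOPS)), len(STOPS))
                  for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(r, rng) for r in population]
        # Selection: keep the pop_size fittest of parents plus offspring.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:pop_size]
    return population[0]

best = evolve()
print(best, round(fitness(best), 2))
```

Because parents compete with their offspring, the best fitness in the population never decreases from one generation to the next, although the search can still stall on a local peak.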

Evolutionary optimization techniques are particularly useful in situations in which it is easy to determine the quality of a single solution, but hard to go through all possible solutions one by one (it is easy to determine the driving time for a particular route of the delivery truck, but it is almost impossible to check all possible routes once the number of destinations grows to more than a handful).

Even in cases where a fitness function is hard to define, the concept of a fitness landscape can be useful. For example, if fitness evaluation is by stochastic sampling, then sampling is from a (usually unknown) distribution at each point; nevertheless, it can be useful to reason about the landscape formed by the expected fitness at each point. If fitness changes with time (dynamic optimisation) or with other species in the environment (co-evolution), it can still be useful to reason about the trajectories of the instantaneous fitness landscape. However, in some cases (for example, preference-based interactive evolutionary computation) the relevance is more limited, because there is no guarantee that human preferences are consistent with a single fitness assignment.

The concept of a scalar-valued fitness function f(s) also corresponds to the concept of a potential or energy function in physics. The two concepts differ only in that physicists traditionally think in terms of minimizing the potential function, while biologists prefer the notion that fitness is being maximized. Therefore, inverting a potential function (for example, taking its negative) turns it into a fitness function, and vice versa. [6]
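The correspondence can be checked numerically under one reading of the conversion, namely sign reversal. The double-well `potential` below is a made-up example, and the grid search is only a crude stand-in for a real optimizer.

```python
def potential(x):
    """Hypothetical double-well potential with minima at x = -1 and x = 1."""
    return (x * x - 1.0) ** 2

def fitness(x):
    """Fitness obtained by negating the potential (an assumed convention)."""
    return -potential(x)

# Search the same grid once as a minimization and once as a maximization.
grid = [i / 100 for i in range(-200, 201)]
lowest_energy = min(grid, key=potential)
highest_fitness = max(grid, key=fitness)
print(lowest_energy, highest_fitness)
```

Both searches pick out the same point, since every valley of the potential is a peak of the negated function.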

Caveats and limitations

Several important caveats exist. Since the human mind struggles to think in more than three dimensions, 3D topologies can mislead when discussing highly multi-dimensional fitness landscapes. [7] [8] In particular, it is not clear whether peaks in natural biological fitness landscapes are ever truly separated by fitness valleys in such multidimensional landscapes, or whether they are connected by vastly long neutral ridges. [9] [10] Additionally, the fitness landscape is not static in time but depends on the changing environment and on the evolution of other genes. [5] It is hence more of a seascape, [11] which further affects how separated adaptive peaks can actually be. A landscape is also, in general, a relative rather than an absolute function. [12] Finally, since it is common to use function as a proxy for fitness when discussing enzymes, any promiscuous activities exist as overlapping landscapes that together determine the ultimate fitness of the organism, implying a gap between different coexisting relative landscapes. [13]

With these limitations in mind, fitness landscapes can still be an instructive way of thinking about evolution. It is fundamentally possible to measure (even if not to visualise) some of the parameters of landscape ruggedness and of peak number, height, separation, and clustering. Simplified 3D landscapes can then be used relative to each other to visually represent the relevant features. Additionally, fitness landscapes of small subsets of evolutionary pathways may be experimentally constructed and visualized, potentially revealing features such as fitness peaks and valleys. [5] Fitness landscapes of evolutionary pathways indicate the probable evolutionary steps and endpoints among sets of individual mutations.

Visualization of a population evolving in a static fitness landscape.

Visualization of a population evolving in a dynamic fitness landscape.


References

  1. Wright, Sewall (1932). "The roles of mutation, inbreeding, crossbreeding, and selection in evolution". Proceedings of the Sixth International Congress on Genetics. 1 (8): 355–66.
  2. Provine, William B. (1986). Sewall Wright and Evolutionary Biology. University of Chicago Press.
  3. Kaznatcheev, Artem (2019). "Computational Complexity as an Ultimate Constraint on Evolution". Genetics. 212 (1): 245–265. doi:10.1534/genetics.119.302000. PMC 6499524. PMID 30833289.
  4. Bertram, Jason; Masel, Joanna (April 2020). "Evolution Rapidly Optimizes Stability and Aggregation in Lattice Proteins Despite Pervasive Landscape Valleys and Mazes". Genetics. 214 (4): 1047–1057. doi:10.1534/genetics.120.302815. PMC 7153934. PMID 32107278.
  5. Steinberg, B; Ostermeier, M (2016). "Environmental changes bridge evolutionary valleys". Science Advances. 2 (1): e1500921. Bibcode:2016SciA....2E0921S. doi:10.1126/sciadv.1500921. PMC 4737206. PMID 26844293.
  6. Kauffman, Stuart A. (1993). The origins of order, self-organization and selection in evolution (1st ed.). New York, Oxford: Oxford University Press. p. 43. ISBN 0-19-505811-9.
  7. McCandlish, David M (2011). "Visualizing Fitness Landscapes". Evolution. 65 (6): 1544–58. doi:10.1111/j.1558-5646.2011.01236.x. PMC 3668694. PMID 21644947.
  8. McGhee, George R. (2006). The Geometry of Evolution: Adaptive Landscapes and Theoretical Morphospaces. Cambridge University Press. ISBN 978-1-139-45995-2.
  9. Gavrilets, S. (2004). Fitness Landscapes and the Origin of Species. Princeton University Press. ISBN 978-0-691-11983-0.
  10. Kaplan, Jonathan (2008). "The end of the adaptive landscape metaphor?". Biology & Philosophy. 23 (5): 625–38. doi:10.1007/s10539-008-9116-z. S2CID 170649453.
  11. Mustonen, Ville; Lässig, Michael (2009). "From fitness landscapes to seascapes: Non-equilibrium dynamics of selection and adaptation". Trends in Genetics. 25 (3): 111–9. doi:10.1016/j.tig.2009.01.002. PMID 19232770.
  12. Woodcock, Glenn; Higgs, Paul G (1996). "Population Evolution on a Multiplicative Single-Peak Fitness Landscape". Journal of Theoretical Biology. 179 (1): 61–73. doi:10.1006/jtbi.1996.0049. PMID 8733432.
  13. Diaz Ochoa, Juan G (2017). "Elastic Multi-scale Mechanisms: Computation and Biological Evolution". Journal of Molecular Evolution. 86 (1): 47–57. Bibcode:2018JMolE..86...47D. doi:10.1007/s00239-017-9823-7. PMID 29248946. S2CID 22624633.