Selection (evolutionary algorithm)

Selection is a genetic operator in an evolutionary algorithm in which individual genomes are chosen from a population for later breeding (e.g., using the crossover operator). Selection mechanisms are also used to choose candidate solutions (individuals) for the next generation. Retaining the best individuals of a generation unchanged in the next generation is called elitism or elitist selection. It is a successful (slight) variant of the general process of constructing a new population.

A selection procedure for breeding used early on [1] may be implemented as follows:

  1. The fitness values that have been computed (fitness function) are normalized, such that the sum of all resulting fitness values equals 1.
  2. Accumulated normalized fitness values are computed: the accumulated fitness value of an individual is the sum of its own fitness value plus the fitness values of all the previous individuals; the accumulated fitness of the last individual should be 1, otherwise something went wrong in the normalization step.
  3. A random number R between 0 and 1 is chosen.
  4. The selected individual is the first one whose accumulated normalized value is greater than or equal to R.
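
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `roulette_wheel_select` and its `rng` parameter are assumptions made for this example, and fitness values are assumed to be non-negative with a positive sum.

```python
import random
from itertools import accumulate


def roulette_wheel_select(fitnesses, rng=random):
    """Return the index of one individual, chosen proportionally to fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must be non-negative with a positive sum")
    # Step 1: normalize the fitness values so that they sum to 1.
    normalized = [f / total for f in fitnesses]
    # Step 2: accumulated normalized fitness (a running sum).
    cumulative = list(accumulate(normalized))
    # Step 3: draw a random number R in [0, 1).
    r = rng.random()
    # Step 4: pick the first individual whose accumulated value is >= R.
    for index, value in enumerate(cumulative):
        if value >= r:
            return index
    # Guard against rounding errors that leave the last value slightly below 1.
    return len(fitnesses) - 1
```

Calling the function repeatedly, once per required parent, yields a mating pool in which the same index may appear more than once.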

For many problems the above algorithm might be computationally demanding. A simpler and faster alternative uses the so-called stochastic acceptance.
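
As a rough sketch of stochastic acceptance: an individual is picked uniformly at random and accepted with a probability equal to its fitness divided by the maximum fitness in the population, and the draw is repeated until an individual is accepted. The function name and the `rng` parameter below are illustrative assumptions, and fitness values are assumed to be non-negative.

```python
import random


def stochastic_acceptance_select(fitnesses, rng=random):
    """Return one index chosen proportionally to fitness via stochastic acceptance."""
    f_max = max(fitnesses)
    if f_max <= 0:
        raise ValueError("at least one fitness value must be positive")
    n = len(fitnesses)
    while True:
        # Pick a candidate uniformly at random.
        candidate = rng.randrange(n)
        # Accept it with probability f_candidate / f_max, otherwise retry.
        if rng.random() < fitnesses[candidate] / f_max:
            return candidate
```

No normalization or running sum is needed, which is why each draw is cheap when the fitness values are not too unevenly distributed.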

If the four-step procedure above is repeated until enough individuals have been selected, the method is called fitness proportionate selection or roulette-wheel selection. If, instead of a single pointer spun multiple times, multiple equally spaced pointers are placed on a wheel that is spun once, it is called stochastic universal sampling. Repeatedly selecting the best individual of a randomly chosen subset is tournament selection. Taking the best half, third or another proportion of the individuals is truncation selection.

There are other selection algorithms that do not consider all individuals for selection, but only those with a fitness value that is higher than a given (arbitrary) constant. Other algorithms select from a restricted pool where only a certain percentage of the individuals are allowed, based on fitness value.

Methods of selection

The listed methods differ mainly in the selection pressure, [2] [3] which can be set by a strategy parameter in the rank selection described below. The higher the selection pressure, the faster a population converges toward a certain solution, with the risk that the search space is not explored sufficiently. For more selection methods and further detail see [4] [5].

Roulette wheel selection

In roulette wheel selection, the probability of choosing an individual for breeding the next generation is proportional to its fitness: the better the fitness, the higher the chance that the individual is chosen. Choosing individuals can be pictured as spinning a roulette wheel that has as many pockets as there are individuals in the current generation, with pocket sizes proportional to their selection probabilities. The probability of choosing individual $i$ is equal to $p_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$, where $f_i$ is the fitness of $i$ and $N$ is the size of the current generation (note that in this method one individual can be drawn multiple times).

Stochastic universal sampling

Stochastic universal sampling is a development of roulette wheel selection with minimal spread and no bias.
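
A minimal Python sketch of the idea, using equally spaced pointers on a wheel that is spun once. The function name, the `n_select` parameter and the `rng` parameter are assumptions made for this illustration, and fitness values are assumed to be non-negative with a positive sum.

```python
import random


def stochastic_universal_sampling(fitnesses, n_select, rng=random):
    """Return n_select indices using one spin and equally spaced pointers."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must be non-negative with a positive sum")
    step = total / n_select
    # A single random offset in [0, step) positions all pointers at once.
    start = rng.random() * step
    pointers = [start + k * step for k in range(n_select)]

    selected = []
    index = 0
    cumulative = fitnesses[0]
    for pointer in pointers:
        # Advance to the individual whose fitness segment contains the pointer.
        while cumulative <= pointer and index < len(fitnesses) - 1:
            index += 1
            cumulative += fitnesses[index]
        selected.append(index)
    return selected
```

Because the pointers are evenly spaced, an individual whose expected number of copies is `e` receives either the integer just below or just above `e`, which is what keeps the spread minimal.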

Rank selection

In rank selection, the probability for selection does not depend directly on the fitness, but on the fitness rank of an individual within the population. [6] The exact fitness values themselves do not have to be available; a sorting of the individuals according to quality is sufficient.

In addition to the adjustable selection pressure, an advantage of rank-based selection can be seen in the fact that it also gives worse individuals a chance to reproduce and thus to improve. [7] This can be particularly helpful in applications with restrictions, since it facilitates the overcoming of a restriction in several intermediate steps, i.e. via a sequence of several individuals rated poorly due to restriction violations.

Linear rank selection

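Linear ranking, which goes back to Baker, [8] [9] is often used. It allows the selection pressure to be set by the parameter $sp$, which can take values between 1.0 (no selection pressure) and 2.0 (high selection pressure). The probability $P$ for $n$ rank positions $R_i$, with $i = 1$ denoting the best-ranked individual, is obtained as follows:

$$P(R_i) = \frac{1}{n}\left(sp - (2\,sp - 2)\,\frac{i-1}{n-1}\right), \qquad 1 \le i \le n, \quad 1 \le sp \le 2$$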
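
The following Python sketch assumes the usual form of linear ranking, in which the best individual (rank 1) receives probability sp/n, the worst (rank n) receives (2 - sp)/n, and the ranks in between are interpolated linearly. The function names and the `rng` parameter are illustrative assumptions, not part of the cited sources.

```python
import random


def linear_rank_probabilities(n, sp=1.5):
    """Selection probabilities for ranks 1..n (rank 1 = best) under linear ranking."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 1.0 <= sp <= 2.0:
        raise ValueError("sp must lie between 1.0 and 2.0")
    if n == 1:
        return [1.0]
    return [(sp - (2 * sp - 2) * (i - 1) / (n - 1)) / n for i in range(1, n + 1)]


def linear_rank_select(population, fitness, sp=1.5, rng=random):
    """Pick one individual according to its fitness rank, not its raw fitness."""
    # Only the ordering is needed; the fitness values are used for sorting alone.
    ranked = sorted(population, key=fitness, reverse=True)
    weights = linear_rank_probabilities(len(ranked), sp)
    return rng.choices(ranked, weights=weights, k=1)[0]
```

With `sp = 1.0` every rank is equally likely, while `sp = 2.0` gives the worst-ranked individual a probability of zero.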

Another definition for the probability $P$ for rank positions $i$, where $n$ is the number of individuals, is: [6]

$$P(i) = \frac{2\,(n - i + 1)}{n\,(n + 1)}$$

Exponential rank selection

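Exponential rank selection is defined as follows, where $n$ is the number of individuals and $i$ the rank position: [6]

$$P(i) = \frac{w^{n-i}}{\sum_{k=1}^{n} w^{n-k}}, \qquad 0 \le w \le 1$$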

Steady state selection

In every generation, a few chromosomes with high fitness are selected to create new offspring. Then some chromosomes with low fitness are removed and the new offspring take their place. The rest of the population survives to the next generation.
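
One possible reading of this scheme is sketched below in Python. The function name, the `n_replace` parameter and the `make_offspring` callback (which stands in for crossover and mutation) are hypothetical, and choosing parents only from the best-ranked individuals is an assumption of this example.

```python
import random


def steady_state_step(population, fitness, make_offspring, n_replace=2, rng=random):
    """Replace the n_replace worst individuals with offspring of good parents."""
    if not 0 < n_replace < len(population):
        raise ValueError("n_replace must be between 1 and len(population) - 1")
    ranked = sorted(population, key=fitness, reverse=True)
    # Everyone except the worst n_replace individuals survives unchanged.
    survivors = ranked[:-n_replace]
    # Parents are drawn from the best-ranked individuals (at least two of them).
    parent_pool = ranked[:max(2, n_replace)]
    offspring = [
        make_offspring(*rng.sample(parent_pool, 2)) for _ in range(n_replace)
    ]
    return survivors + offspring
```

The population size stays constant, and most individuals carry over from one generation to the next.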

Tournament selection

Tournament selection runs several tournaments, each among a few individuals chosen at random from the population. The winner of each tournament, i.e. the participant with the best fitness, is selected to perform crossover.
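
A single tournament can be sketched in Python as shown below. The function name, the `tournament_size` parameter and the `rng` parameter are illustrative assumptions; the sketch draws participants without replacement and always lets the fittest participant win.

```python
import random


def tournament_select(population, fitness, tournament_size=3, rng=random):
    """Return the fittest of tournament_size randomly drawn individuals."""
    # Draw the participants without replacement from the whole population.
    contestants = rng.sample(population, tournament_size)
    # The participant with the highest fitness wins the tournament.
    return max(contestants, key=fitness)
```

Running the function once per required parent fills the mating pool; a larger `tournament_size` makes it less likely that a weak individual wins.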

Truncation selection

For truncation selection, individuals are sorted according to their fitness and a portion (10% to 50%) of the top individuals is selected for the next generation. [6]
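
As a brief Python sketch, with the function name and the `fraction` parameter chosen for illustration only:

```python
def truncation_select(population, fitness, fraction=0.5):
    """Return the best `fraction` of the population, sorted from best to worst."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    # Keep at least one individual even for very small fractions.
    n_keep = max(1, int(len(population) * fraction))
    return sorted(population, key=fitness, reverse=True)[:n_keep]
```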

Elitist selection

Strategies with partial reproduction are often used to obtain better results. One of them is elitism, in which a small portion of the best individuals from the last generation is carried over (without any changes) to the next one.
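
Elitism can be layered on top of any breeding scheme, as in the Python sketch below. The function name, the `n_elite` parameter and the `breed` callback (assumed to return the requested number of new individuals) are hypothetical names introduced for this example.

```python
def next_generation_with_elitism(population, fitness, breed, n_elite=1):
    """Copy the n_elite best individuals unchanged and breed the remainder."""
    ranked = sorted(population, key=fitness, reverse=True)
    # The elites are carried over without crossover or mutation.
    elites = ranked[:n_elite]
    # breed(population, k) is assumed to return k newly created individuals.
    offspring = list(breed(population, len(population) - n_elite))
    return elites + offspring
```

Because the elites are never modified, the best fitness in the population cannot decrease from one generation to the next.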

Boltzmann selection

In Boltzmann selection, a continuously varying temperature controls the rate of selection according to a preset schedule. The temperature starts out high, which means that the selection pressure is low. The temperature is gradually lowered, which gradually increases the selection pressure, thereby allowing the genetic algorithm (GA) to narrow in more closely on the best part of the search space while maintaining the appropriate degree of diversity. [10]
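
The text above does not fix a particular formula, so the Python sketch below rests on two assumptions: selection probabilities proportional to exp(f / T), a common formulation, and a geometric cooling schedule. All function names and default values are illustrative.

```python
import math


def boltzmann_probabilities(fitnesses, temperature):
    """Selection probabilities proportional to exp(fitness / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtracting the maximum avoids overflow and leaves the ratios unchanged.
    f_max = max(fitnesses)
    weights = [math.exp((f - f_max) / temperature) for f in fitnesses]
    total = sum(weights)
    return [w / total for w in weights]


def geometric_temperature(generation, t_start=100.0, cooling=0.95, t_min=0.1):
    """Assumed cooling schedule: multiply by `cooling` each generation, floor at t_min."""
    return max(t_min, t_start * cooling ** generation)
```

At a high temperature the probabilities are nearly uniform (low selection pressure); as the temperature falls, almost all of the probability mass moves to the fittest individuals.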

References

  1. Holland, John H. (1992). Adaptation in natural and artificial systems. PhD thesis, The University of Michigan, 1975. Cambridge, Mass.: MIT Press. ISBN 0-585-03844-9. OCLC 42854623.
  2. Back, T. (1994). "Selective pressure in evolutionary algorithms: A characterization of selection mechanisms". Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence. Orlando, FL, USA: IEEE. pp. 57–62. doi:10.1109/ICEC.1994.350042. ISBN 978-0-7803-1899-1. S2CID 195867383.
  3. Goldberg, David E.; Deb, Kalyanmoy (1991). "A Comparative Analysis of Selection Schemes Used in Genetic Algorithms". Foundations of Genetic Algorithms. Vol. 1. Elsevier. pp. 69–93. CiteSeerX 10.1.1.101.9494. doi:10.1016/b978-0-08-050684-5.50008-2. ISBN 978-0-08-050684-5. S2CID 938257. Retrieved 2023-01-09.
  4. Eiben, A.E.; Smith, J.E. (2015). "Fitness, Selection, and Population Management". Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 79–98. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. S2CID 20912932.
  5. De Jong, Kenneth A. (2006). Evolutionary computation: a unified approach. Cambridge, Mass.: MIT Press. ISBN 978-0-262-25598-1. OCLC 69652176.
  6. Jannoud, Ismael; Jaradat, Yousef; Masoud, Mohammad Z.; Manasrah, Ahmad; Alia, Mohammad (22 December 2021). "The Role of Genetic Algorithm Selection Operators in Extending WSN Stability Period: A Comparative Study". Electronics. 11 (1): 28. doi:10.3390/electronics11010028.
  7. Whitley, Darrell (1989). Schaffer, J.D. (ed.). "The GENITOR Algorithm and Selection Pressure: Why Rank-Based Allocation of Reproductive Trials is Best". Proceedings of the Third International Conference on Genetic Algorithms (ICGA). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. pp. 116–121. ISBN 978-1-55860-066-9.
  8. Baker, James E. (1985). Grefenstette, John J. (ed.). "Adaptive Selection Methods for Genetic Algorithms". Conf. Proc. of the 1st Int. Conf. on Genetic Algorithms and Their Applications (ICGA). Hillsdale, New Jersey: L. Erlbaum Associates. pp. 101–111. ISBN 0-8058-0426-9.
  9. Baker, James E. (1987). Grefenstette, John J. (ed.). "Reducing Bias and Inefficiency in the Selection Algorithm". Conf. Proc. of the 2nd Int. Conf. on Genetic Algorithms and Their Applications (ICGA). Hillsdale, New Jersey: L. Erlbaum Associates. pp. 14–21. ISBN 0-8058-0158-8.
  10. Sivanandam, S. N. (2013). Principles of soft computing. Deepa, S. N. New Delhi: Wiley. ISBN 978-1-118-54680-2. OCLC 891566849.