Selection (genetic algorithm)

Selection is the stage of a genetic algorithm or, more generally, an evolutionary algorithm in which individual genomes are chosen from a population for later breeding (e.g., using the crossover operator). Selection mechanisms are also used to choose candidate solutions (individuals) for the next generation. Retaining the best individuals of one generation unchanged in the next is called elitism or elitist selection; it is a successful (slight) variant of the general process of constructing a new population.

A selection procedure for breeding used early on [1] may be implemented as follows; a code sketch is given after the list:

  1. The fitness values that have been computed (fitness function) are normalized, such that the sum of all resulting fitness values equals 1.
  2. Accumulated normalized fitness values are computed: the accumulated fitness value of an individual is the sum of its own fitness value plus the fitness values of all the previous individuals; the accumulated fitness of the last individual should be 1, otherwise something went wrong in the normalization step.
  3. A random number R between 0 and 1 is chosen.
  4. The selected individual is the first one whose accumulated normalized value is greater than or equal to R.
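
The four steps above translate directly into code. The following is a minimal Python sketch (the function and variable names are illustrative, and non-negative fitness values are assumed):

```python
import random

def select(population, fitnesses):
    """Select one individual via the cumulative-fitness procedure above."""
    total = sum(fitnesses)                     # assumes non-negative fitnesses
    # Step 1: normalize so the fitness values sum to 1.
    normalized = [f / total for f in fitnesses]
    # Step 2: accumulate; the last accumulated value should be 1.
    accumulated, running = [], 0.0
    for p in normalized:
        running += p
        accumulated.append(running)
    # Step 3: draw a random number R between 0 and 1.
    r = random.random()
    # Step 4: return the first individual whose accumulated value >= R.
    for individual, acc in zip(population, accumulated):
        if acc >= r:
            return individual
    return population[-1]  # guard against floating-point round-off
```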

For many problems the above algorithm can be computationally demanding. A simpler and faster alternative is so-called stochastic acceptance.
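
Stochastic acceptance avoids building the cumulative distribution altogether: an individual is drawn uniformly at random and accepted with probability proportional to its fitness. A minimal sketch, assuming a positive maximum fitness (the function name is illustrative):

```python
import random

def select_stochastic_acceptance(population, fitnesses):
    """Pick an individual uniformly at random and accept it with
    probability fitness / max_fitness; repeat until one is accepted."""
    f_max = max(fitnesses)  # assumed positive, otherwise the loop never ends
    while True:
        i = random.randrange(len(population))
        if random.random() < fitnesses[i] / f_max:
            return population[i]
```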

If this procedure is repeated until enough individuals have been selected, the method is called fitness proportionate selection or roulette-wheel selection. If, instead of a single pointer spun multiple times, a wheel with multiple equally spaced pointers is spun once, the method is called stochastic universal sampling. Repeatedly selecting the best individual from a randomly chosen subset is tournament selection. Taking the best half, third, or some other proportion of the individuals is truncation selection.
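
Stochastic universal sampling can be sketched as a single spin that places n equally spaced pointers on the wheel (the function name is illustrative; non-negative fitness values are assumed):

```python
import random

def select_sus(population, fitnesses, n):
    """Select n individuals with one spin of a wheel that carries
    n equally spaced pointers (stochastic universal sampling)."""
    total = sum(fitnesses)
    step = total / n                     # distance between pointers
    start = random.uniform(0, step)      # the single spin
    pointers = [start + i * step for i in range(n)]
    selected, cumulative, i = [], 0.0, 0
    for p in pointers:
        # Advance to the wheel segment containing this pointer.
        while cumulative + fitnesses[i] < p:
            cumulative += fitnesses[i]
            i += 1
        selected.append(population[i])
    return selected
```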

There are other selection algorithms that do not consider all individuals for selection, but only those with a fitness value that is higher than a given (arbitrary) constant. Other algorithms select from a restricted pool where only a certain percentage of the individuals are allowed, based on fitness value.

Methods of Selection (Evolutionary Algorithm)

The listed methods differ mainly in their selection pressure, [2] [3] which in the rank selection described below can be set by a strategy parameter. The higher the selection pressure, the faster a population converges toward a particular solution, at the risk of the search space not being explored sufficiently. For more selection methods and further detail, see [4] [5].

Roulette Wheel Selection

In roulette wheel selection, the probability of choosing an individual for breeding of the next generation is proportional to its fitness: the better the fitness, the higher the chance of that individual being chosen. Choosing individuals can be depicted as spinning a roulette wheel that has as many pockets as there are individuals in the current generation, with sizes proportional to their selection probabilities. The probability of choosing individual $a_i$ is $p_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$, where $f_i$ is the fitness of $a_i$ and $N$ is the size of the current generation (note that in this method one individual can be drawn multiple times).

Rank Selection

In rank selection, the selection probability does not depend directly on the fitness, but on the fitness rank of an individual within the population. This puts large fitness differences into perspective; moreover, the exact fitness values themselves do not have to be available, but only a sorting of the individuals according to quality.

Linear ranking, which goes back to Baker, [6] [7] is often used. It allows the selection pressure to be set by the parameter $SP$, which can take values between 1.0 (no selection pressure) and 2.0 (high selection pressure). The probability for rank position $R$ (with $R = 1$ being the best of $N$ individuals) is obtained as follows:

$P(R) = \frac{1}{N} \left( SP - (2 \cdot SP - 2) \cdot \frac{R - 1}{N - 1} \right), \quad R = 1, \dots, N$

For $SP = 1$ all ranks receive the same probability $1/N$, while for $SP = 2$ the best rank receives $2/N$ and the worst receives 0.
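
A minimal sketch of linear rank selection following the formula above (names are illustrative; a population of at least two individuals is assumed):

```python
import random

def select_linear_rank(population, fitnesses, sp=1.5):
    """Select one individual by Baker's linear ranking;
    only the ordering of the fitness values matters."""
    n = len(population)  # assumes n >= 2
    # Sort individuals best-first so that rank R = 1 is the best.
    ranked = sorted(zip(population, fitnesses), key=lambda t: t[1], reverse=True)
    # P(R) = (SP - (2*SP - 2) * (R - 1) / (N - 1)) / N for R = 1..N.
    probs = [(sp - (2 * sp - 2) * r / (n - 1)) / n for r in range(n)]
    r, acc = random.random(), 0.0
    for (individual, _), p in zip(ranked, probs):
        acc += p
        if acc >= r:
            return individual
    return ranked[-1][0]  # guard against floating-point round-off
```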

In addition to the adjustable selection pressure, an advantage of rank-based selection is that it also gives worse individuals a chance to reproduce and thus to improve. [8] This can be particularly helpful in applications with restrictions, since it facilitates overcoming a restriction in several intermediate steps, i.e. via a sequence of individuals rated poorly due to restriction violations.

Steady State Selection

In every generation, a few chromosomes with high fitness are selected to create new offspring. Then some chromosomes with low fitness are removed and replaced by the new offspring. The rest of the population survives into the new generation.
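
A minimal sketch of one steady-state step, assuming a user-supplied breeding function (make_offspring is a hypothetical callback that turns parents into the same number of children):

```python
def steady_state_step(population, fitnesses, make_offspring, n_replace=2):
    """Breed offspring from the fittest individuals and overwrite the
    least fit ones in place; everyone else survives unchanged."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    parents = [population[i] for i in order[-n_replace:]]   # highest fitness
    offspring = make_offspring(parents)                     # hypothetical breeding step
    for slot, child in zip(order[:n_replace], offspring):   # lowest-fitness slots
        population[slot] = child
    return population
```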

Tournament Selection

Tournament selection chooses an individual by running a "tournament" among a few individuals picked at random from the population. The winner of each tournament, i.e. the fittest of the contestants, is selected to perform crossover. The tournament size controls the selection pressure: the larger the tournament, the smaller the chance that weak individuals are selected.
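
A minimal sketch (the tournament size k and the function name are illustrative; k must not exceed the population size):

```python
import random

def select_tournament(population, fitnesses, k=3):
    """Return the fittest of k contestants drawn at random
    (without replacement) from the population."""
    contestants = random.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]
```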

Elitist Selection

Often to get better results, strategies with partial reproduction are used. One of them is elitism, in which a small portion of the best individuals from the last generation is carried over (without any changes) to the next one.
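
A minimal sketch of building the next generation with elitism (the helper name and elite size are illustrative):

```python
def next_generation(population, fitnesses, offspring, elite_size=2):
    """Carry the elite_size best individuals over unchanged and fill
    the remaining slots of the new generation with offspring."""
    by_fitness = sorted(zip(population, fitnesses),
                        key=lambda t: t[1], reverse=True)
    elite = [individual for individual, _ in by_fitness[:elite_size]]
    return elite + offspring[:len(population) - elite_size]
```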

Boltzmann Selection

In Boltzmann selection, a continuously varying temperature controls the rate of selection according to a preset schedule. The temperature starts out high, which means that the selection pressure is low. The temperature is gradually lowered, which gradually increases the selection pressure, thereby allowing the GA to narrow in more closely to the best part of the search space while maintaining the appropriate degree of diversity. [9]
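
The Boltzmann weighting is commonly formalized as a selection probability proportional to exp(f/T) for fitness f at temperature T. A minimal sketch under that assumption (the cooling schedule itself is left to the caller):

```python
import math
import random

def select_boltzmann(population, fitnesses, temperature):
    """Select one individual with probability proportional to
    exp(fitness / temperature). High temperature -> nearly uniform
    (low pressure); low temperature -> strongly favors the fittest."""
    weights = [math.exp(f / temperature) for f in fitnesses]
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for individual, w in zip(population, weights):
        acc += w
        if acc >= r:
            return individual
    return population[-1]  # guard against floating-point round-off
```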


References

  1. Holland, John H. (1992). Adaptation in Natural and Artificial Systems. PhD thesis, The University of Michigan, 1975. Cambridge, Mass.: MIT Press. ISBN 0-585-03844-9. OCLC 42854623.
  2. Back, T. (1994). "Selective pressure in evolutionary algorithms: A characterization of selection mechanisms". Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence. Orlando, FL, USA: IEEE. pp. 57–62. doi:10.1109/ICEC.1994.350042. ISBN 978-0-7803-1899-1. S2CID 195867383.
  3. Goldberg, David E.; Deb, Kalyanmoy (1991). "A Comparative Analysis of Selection Schemes Used in Genetic Algorithms". Foundations of Genetic Algorithms. Vol. 1. Elsevier. pp. 69–93. CiteSeerX 10.1.1.101.9494. doi:10.1016/b978-0-08-050684-5.50008-2. ISBN 978-0-08-050684-5. S2CID 938257.
  4. Eiben, A.E.; Smith, J.E. (2015). "Fitness, Selection, and Population Management". Introduction to Evolutionary Computing. Natural Computing Series. Berlin, Heidelberg: Springer. pp. 79–98. doi:10.1007/978-3-662-44874-8. ISBN 978-3-662-44873-1. S2CID 20912932.
  5. De Jong, Kenneth A. (2006). Evolutionary Computation: A Unified Approach. Cambridge, Mass.: MIT Press. ISBN 978-0-262-25598-1. OCLC 69652176.
  6. Baker, James E. (1985). "Adaptive Selection Methods for Genetic Algorithms". In Grefenstette, John J. (ed.), Conf. Proc. of the 1st Int. Conf. on Genetic Algorithms and Their Applications (ICGA). Hillsdale, New Jersey: L. Erlbaum Associates. pp. 101–111. ISBN 0-8058-0426-9.
  7. Baker, James E. (1987). "Reducing Bias and Inefficiency in the Selection Algorithm". In Grefenstette, John J. (ed.), Conf. Proc. of the 2nd Int. Conf. on Genetic Algorithms and Their Applications (ICGA). Hillsdale, New Jersey: L. Erlbaum Associates. pp. 14–21. ISBN 0-8058-0158-8.
  8. Whitley, Darrell (1989). "The GENITOR Algorithm and Selection Pressure: Why Rank-Based Allocation of Reproductive Trials is Best". In Schaffer, J.D. (ed.), Proceedings of the Third International Conference on Genetic Algorithms (ICGA). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. pp. 116–121. ISBN 978-1-55860-066-9.
  9. Sivanandam, S. N.; Deepa, S. N. (2013). Principles of Soft Computing. New Delhi: Wiley. ISBN 978-1-118-54680-2. OCLC 891566849.