Genetic algorithm

Last updated

The 2006 NASA ST5 spacecraft antenna. This complicated shape was found by an evolutionary computer design program to create the best radiation pattern. It is known as an evolved antenna. St 5-xband-antenna.jpg
The 2006 NASA ST5 spacecraft antenna. This complicated shape was found by an evolutionary computer design program to create the best radiation pattern. It is known as an evolved antenna.

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. [1] Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, [2] hyperparameter optimization, causal inference, [3] etc.

Contents

Methodology

Optimization problems

In a genetic algorithm, a population of candidate solutions (called individuals, creatures, organisms, or phenotypes) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its chromosomes or genotype) which can be mutated and altered; traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. [4]

The evolution usually starts from a population of randomly generated individuals, and is an iterative process, with the population in each iteration called a generation. In each generation, the fitness of every individual in the population is evaluated; the fitness is usually the value of the objective function in the optimization problem being solved. The more fit individuals are stochastically selected from the current population, and each individual's genome is modified (recombined and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population.

A typical genetic algorithm requires:

  1. a genetic representation of the solution domain,
  2. a fitness function to evaluate the solution domain.

A standard representation of each candidate solution is as an array of bits (also called bit set or bit string). [4] Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations. Variable length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in genetic programming and graph-form representations are explored in evolutionary programming; a mix of both linear chromosomes and trees is explored in gene expression programming.

Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators.

Initialization

The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Often, the initial population is generated randomly, allowing the entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found or the distribution of the sampling probability tuned to focus in those areas of greater interest. [5]

Selection

During each successive generation, a portion of the existing population is selected to reproduce for a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as the former process may be very time-consuming.

The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem-dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise.

In some problems, it is hard or even impossible to define the fitness expression; in these cases, a simulation may be used to determine the fitness function value of a phenotype (e.g. computational fluid dynamics is used to determine the air resistance of a vehicle whose shape is encoded as the phenotype), or even interactive genetic algorithms are used.

Genetic operators

The next step is to generate a second generation population of solutions from those selected, through a combination of genetic operators: crossover (also called recombination), and mutation.

For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. Although reproduction methods that are based on the use of two parents are more "biology inspired", some research [6] [7] suggests that more than two "parents" generate higher quality chromosomes.

These processes ultimately result in the next generation population of chromosomes that is different from the initial generation. Generally, the average fitness will have increased by this procedure for the population, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions. These less fit solutions ensure genetic diversity within the genetic pool of the parents and therefore ensure the genetic diversity of the subsequent generation of children.

Opinion is divided over the importance of crossover versus mutation. There are many references in Fogel (2006) that support the importance of mutation-based search.

Although crossover and mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms.[ citation needed ]

It is worth tuning parameters such as the mutation probability, crossover probability and population size to find reasonable settings for the problem class being worked on. A very small mutation rate may lead to genetic drift (which is non-ergodic in nature). A recombination rate that is too high may lead to premature convergence of the genetic algorithm. A mutation rate that is too high may lead to loss of good solutions, unless elitist selection is employed. An adequate population size ensures sufficient genetic diversity for the problem at hand, but can lead to a waste of computational resources if set to a value larger than required.

Heuristics

In addition to the main operators above, other heuristics may be employed to make the calculation faster or more robust. The speciation heuristic penalizes crossover between candidate solutions that are too similar; this encourages population diversity and helps prevent premature convergence to a less optimal solution. [8] [9]

Termination

This generational process is repeated until a termination condition has been reached. Common terminating conditions are:

  • A solution is found that satisfies minimum criteria
  • Fixed number of generations reached
  • Allocated budget (computation time/money) reached
  • The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
  • Manual inspection
  • Combinations of the above

The building block hypothesis

Genetic algorithms are simple to implement, but their behavior is difficult to understand. In particular, it is difficult to understand why these algorithms frequently succeed at generating solutions of high fitness when applied to practical problems. The building block hypothesis (BBH) consists of:

  1. A description of a heuristic that performs adaptation by identifying and recombining "building blocks", i.e. low order, low defining-length schemata with above average fitness.
  2. A hypothesis that a genetic algorithm performs adaptation by implicitly and efficiently implementing this heuristic.

Goldberg describes the heuristic as follows:

"Short, low order, and highly fit schemata are sampled, recombined [crossed over], and resampled to form strings of potentially higher fitness. In a way, by working with these particular schemata [the building blocks], we have reduced the complexity of our problem; instead of building high-performance strings by trying every conceivable combination, we construct better and better strings from the best partial solutions of past samplings.
"Because highly fit schemata of low defining length and low order play such an important role in the action of genetic algorithms, we have already given them a special name: building blocks. Just as a child creates magnificent fortresses through the arrangement of simple blocks of wood, so does a genetic algorithm seek near optimal performance through the juxtaposition of short, low-order, high-performance schemata, or building blocks." [10]

Despite the lack of consensus regarding the validity of the building-block hypothesis, it has been consistently evaluated and used as reference throughout the years. Many estimation of distribution algorithms, for example, have been proposed in an attempt to provide an environment in which the hypothesis would hold. [11] [12] Although good results have been reported for some classes of problems, skepticism concerning the generality and/or practicality of the building-block hypothesis as an explanation for GAs' efficiency still remains. Indeed, there is a reasonable amount of work that attempts to understand its limitations from the perspective of estimation of distribution algorithms. [13] [14] [15]

Limitations

The practical use of a genetic algorithm has limitations, especially as compared to alternative optimization algorithms:

Variants

Chromosome representation

The simplest algorithm represents each chromosome as a bit string. Typically, numeric parameters can be represented by integers, though it is possible to use floating point representations. The floating point representation is natural to evolution strategies and evolutionary programming. The notion of real-valued genetic algorithms has been offered but is really a misnomer because it does not really represent the building block theory that was proposed by John Henry Holland in the 1970s. This theory is not without support though, based on theoretical and experimental results (see below). The basic algorithm performs crossover and mutation at the bit level. Other variants treat the chromosome as a list of numbers which are indexes into an instruction table, nodes in a linked list, hashes, objects, or any other imaginable data structure. Crossover and mutation are performed so as to respect data element boundaries. For most data types, specific variation operators can be designed. Different chromosomal data types seem to work better or worse for different specific problem domains.

When bit-string representations of integers are used, Gray coding is often employed. In this way, small changes in the integer can be readily affected through mutations or crossovers. This has been found to help prevent premature convergence at so-called Hamming walls, in which too many simultaneous mutations (or crossover events) must occur in order to change the chromosome to a better solution.

Other approaches involve using arrays of real-valued numbers instead of bit strings to represent chromosomes. Results from the theory of schemata suggest that in general the smaller the alphabet, the better the performance, but it was initially surprising to researchers that good results were obtained from using real-valued chromosomes. This was explained as the set of real values in a finite population of chromosomes as forming a virtual alphabet (when selection and recombination are dominant) with a much lower cardinality than would be expected from a floating point representation. [18] [19]

An expansion of the Genetic Algorithm accessible problem domain can be obtained through more complex encoding of the solution pools by concatenating several types of heterogenously encoded genes into one chromosome. [20] This particular approach allows for solving optimization problems that require vastly disparate definition domains for the problem parameters. For instance, in problems of cascaded controller tuning, the internal loop controller structure can belong to a conventional regulator of three parameters, whereas the external loop could implement a linguistic controller (such as a fuzzy system) which has an inherently different description. This particular form of encoding requires a specialized crossover mechanism that recombines the chromosome by section, and it is a useful tool for the modelling and simulation of complex adaptive systems, especially evolution processes.

Elitism

A practical variant of the general process of constructing a new population is to allow the best organism(s) from the current generation to carry over to the next, unaltered. This strategy is known as elitist selection and guarantees that the solution quality obtained by the GA will not decrease from one generation to the next. [21]

Parallel implementations

Parallel implementations of genetic algorithms come in two flavors. Coarse-grained parallel genetic algorithms assume a population on each of the computer nodes and migration of individuals among the nodes. Fine-grained parallel genetic algorithms assume an individual on each processor node which acts with neighboring individuals for selection and reproduction. Other variants, like genetic algorithms for online optimization problems, introduce time-dependence or noise in the fitness function.

Adaptive GAs

Genetic algorithms with adaptive parameters (adaptive genetic algorithms, AGAs) is another significant and promising variant of genetic algorithms. The probabilities of crossover (pc) and mutation (pm) greatly determine the degree of solution accuracy and the convergence speed that genetic algorithms can obtain. Researchers have analyzed GA convergence analytically. [22] [23]

Instead of using fixed values of pc and pm, AGAs utilize the population information in each generation and adaptively adjust the pc and pm in order to maintain the population diversity as well as to sustain the convergence capacity. In AGA (adaptive genetic algorithm), [24] the adjustment of pc and pm depends on the fitness values of the solutions. There are more examples of AGA variants: Successive zooming method is an early example of improving convergence. [25] In CAGA (clustering-based adaptive genetic algorithm), [26] through the use of clustering analysis to judge the optimization states of the population, the adjustment of pc and pm depends on these optimization states. Recent approaches use more abstract variables for deciding pc and pm. Examples are dominance & co-dominance principles [27] and LIGA (levelized interpolative genetic algorithm), which combines a flexible GA with modified A* search to tackle search space anisotropicity. [28]

It can be quite effective to combine GA with other optimization methods. A GA tends to be quite good at finding generally good global solutions, but quite inefficient at finding the last few mutations to find the absolute optimum. Other techniques (such as simple hill climbing) are quite efficient at finding absolute optimum in a limited region. Alternating GA and hill climbing can improve the efficiency of GA [ citation needed ] while overcoming the lack of robustness of hill climbing.

This means that the rules of genetic variation may have a different meaning in the natural case. For instance provided that steps are stored in consecutive order crossing over may sum a number of steps from maternal DNA adding a number of steps from paternal DNA and so on. This is like adding vectors that more probably may follow a ridge in the phenotypic landscape. Thus, the efficiency of the process may be increased by many orders of magnitude. Moreover, the inversion operator has the opportunity to place steps in consecutive order or any other suitable order in favour of survival or efficiency. [29]

A variation, where the population as a whole is evolved rather than its individual members, is known as gene pool recombination.

A number of variations have been developed to attempt to improve performance of GAs on problems with a high degree of fitness epistasis, i.e. where the fitness of a solution consists of interacting subsets of its variables. Such algorithms aim to learn (before exploiting) these beneficial phenotypic interactions. As such, they are aligned with the Building Block Hypothesis in adaptively reducing disruptive recombination. Prominent examples of this approach include the mGA, [30] GEMGA [31] and LLGA. [32]

Problem domains

Problems which appear to be particularly appropriate for solution by genetic algorithms include timetabling and scheduling problems, and many scheduling software packages are based on GAs[ citation needed ]. GAs have also been applied to engineering. [33] Genetic algorithms are often applied as an approach to solve global optimization problems.

As a general rule of thumb genetic algorithms might be useful in problem domains that have a complex fitness landscape as mixing, i.e., mutation in combination with crossover, is designed to move the population away from local optima that a traditional hill climbing algorithm might get stuck in. Observe that commonly used crossover operators cannot change any uniform population. Mutation alone can provide ergodicity of the overall genetic algorithm process (seen as a Markov chain).

Examples of problems solved by genetic algorithms include: mirrors designed to funnel sunlight to a solar collector, [34] antennae designed to pick up radio signals in space, [35] walking methods for computer figures, [36] optimal design of aerodynamic bodies in complex flowfields [37]

In his Algorithm Design Manual, Skiena advises against genetic algorithms for any task:

[I]t is quite unnatural to model applications in terms of genetic operators like mutation and crossover on bit strings. The pseudobiology adds another level of complexity between you and your problem. Second, genetic algorithms take a very long time on nontrivial problems. [...] [T]he analogy with evolution—where significant progress require [sic] millions of years—can be quite appropriate.

[...]

I have never encountered any problem where genetic algorithms seemed to me the right way to attack it. Further, I have never seen any computational results reported using genetic algorithms that have favorably impressed me. Stick to simulated annealing for your heuristic search voodoo needs.

Steven Skiena [38] :267

History

In 1950, Alan Turing proposed a "learning machine" which would parallel the principles of evolution. [39] Computer simulation of evolution started as early as in 1954 with the work of Nils Aall Barricelli, who was using the computer at the Institute for Advanced Study in Princeton, New Jersey. [40] [41] His 1954 publication was not widely noticed. Starting in 1957, [42] the Australian quantitative geneticist Alex Fraser published a series of papers on simulation of artificial selection of organisms with multiple loci controlling a measurable trait. From these beginnings, computer simulation of evolution by biologists became more common in the early 1960s, and the methods were described in books by Fraser and Burnell (1970) [43] and Crosby (1973). [44] Fraser's simulations included all of the essential elements of modern genetic algorithms. In addition, Hans-Joachim Bremermann published a series of papers in the 1960s that also adopted a population of solution to optimization problems, undergoing recombination, mutation, and selection. Bremermann's research also included the elements of modern genetic algorithms. [45] Other noteworthy early pioneers include Richard Friedberg, George Friedman, and Michael Conrad. Many early papers are reprinted by Fogel (1998). [46]

Although Barricelli, in work he reported in 1963, had simulated the evolution of ability to play a simple game, [47] artificial evolution only became a widely recognized optimization method as a result of the work of Ingo Rechenberg and Hans-Paul Schwefel in the 1960s and early 1970s Rechenberg's group was able to solve complex engineering problems through evolution strategies. [48] [49] [50] [51] Another approach was the evolutionary programming technique of Lawrence J. Fogel, which was proposed for generating artificial intelligence. Evolutionary programming originally used finite state machines for predicting environments, and used variation and selection to optimize the predictive logics. Genetic algorithms in particular became popular through the work of John Holland in the early 1970s, and particularly his book Adaptation in Natural and Artificial Systems (1975). His work originated with studies of cellular automata, conducted by Holland and his students at the University of Michigan. Holland introduced a formalized framework for predicting the quality of the next generation, known as Holland's Schema Theorem. Research in GAs remained largely theoretical until the mid-1980s, when The First International Conference on Genetic Algorithms was held in Pittsburgh, Pennsylvania.

Commercial products

In the late 1980s, General Electric started selling the world's first genetic algorithm product, a mainframe-based toolkit designed for industrial processes. [52] [ circular reference ] In 1989, Axcelis, Inc. released Evolver, the world's first commercial GA product for desktop computers. The New York Times technology writer John Markoff wrote [53] about Evolver in 1990, and it remained the only interactive commercial genetic algorithm until 1995. [54] Evolver was sold to Palisade in 1997, translated into several languages, and is currently in its 6th version. [55] Since the 1990s, MATLAB has built in three derivative-free optimization heuristic algorithms (simulated annealing, particle swarm optimization, genetic algorithm) and two direct search algorithms (simplex search, pattern search). [56]

Parent fields

Genetic algorithms are a sub-field:

Evolutionary algorithms

Evolutionary algorithms is a sub-field of evolutionary computing.

  • Evolution strategies (ES, see Rechenberg, 1994) evolve individuals by means of mutation and intermediate or discrete recombination. ES algorithms are designed particularly to solve problems in the real-value domain. [57] They use self-adaptation to adjust control parameters of the search. De-randomization of self-adaptation has led to the contemporary Covariance Matrix Adaptation Evolution Strategy (CMA-ES).
  • Evolutionary programming (EP) involves populations of solutions with primarily mutation and selection and arbitrary representations. They use self-adaptation to adjust parameters, and can include other variation operations such as combining information from multiple parents.
  • Estimation of Distribution Algorithm (EDA) substitutes traditional reproduction operators by model-guided operators. Such models are learned from the population by employing machine learning techniques and represented as Probabilistic Graphical Models, from which new solutions can be sampled [58] [59] or generated from guided-crossover. [60]
  • Genetic programming (GP) is a related technique popularized by John Koza in which computer programs, rather than function parameters, are optimized. Genetic programming often uses tree-based internal data structures to represent the computer programs for adaptation instead of the list structures typical of genetic algorithms. There are many variants of Genetic Programming, including Cartesian genetic programming, Gene expression programming, [61] grammatical evolution, Linear genetic programming, Multi expression programming etc.
  • Grouping genetic algorithm (GGA) is an evolution of the GA where the focus is shifted from individual items, like in classical GAs, to groups or subset of items. [62] The idea behind this GA evolution proposed by Emanuel Falkenauer is that solving some complex problems, a.k.a. clustering or partitioning problems where a set of items must be split into disjoint group of items in an optimal way, would better be achieved by making characteristics of the groups of items equivalent to genes. These kind of problems include bin packing, line balancing, clustering with respect to a distance measure, equal piles, etc., on which classic GAs proved to perform poorly. Making genes equivalent to groups implies chromosomes that are in general of variable length, and special genetic operators that manipulate whole groups of items. For bin packing in particular, a GGA hybridized with the Dominance Criterion of Martello and Toth, is arguably the best technique to date.
  • Interactive evolutionary algorithms are evolutionary algorithms that use human evaluation. They are usually applied to domains where it is hard to design a computational fitness function, for example, evolving images, music, artistic designs and forms to fit users' aesthetic preference.

Swarm intelligence

Swarm intelligence is a sub-field of evolutionary computing.

  • Ant colony optimization (ACO) uses many ants (or agents) equipped with a pheromone model to traverse the solution space and find locally productive areas.
  • Although considered an Estimation of distribution algorithm, [63] Particle swarm optimization (PSO) is a computational method for multi-parameter optimization which also uses population-based approach. A population (swarm) of candidate solutions (particles) moves in the search space, and the movement of the particles is influenced both by their own best known position and swarm's global best known position. Like genetic algorithms, the PSO method depends on information sharing among population members. In some problems the PSO is often more computationally efficient than the GAs, especially in unconstrained problems with continuous variables. [64]

Other evolutionary computing algorithms

Evolutionary computation is a sub-field of the metaheuristic methods.

  • Memetic algorithm (MA), often called hybrid genetic algorithm among others, is a population-based method in which solutions are also subject to local improvement phases. The idea of memetic algorithms comes from memes, which unlike genes, can adapt themselves. In some problem areas they are shown to be more efficient than traditional evolutionary algorithms.
  • Bacteriologic algorithms (BA) inspired by evolutionary ecology and, more particularly, bacteriologic adaptation. Evolutionary ecology is the study of living organisms in the context of their environment, with the aim of discovering how they adapt. Its basic concept is that in a heterogeneous environment, there is not one individual that fits the whole environment. So, one needs to reason at the population level. It is also believed BAs could be successfully applied to complex positioning problems (antennas for cell phones, urban planning, and so on) or data mining. [65]
  • Cultural algorithm (CA) consists of the population component almost identical to that of the genetic algorithm and, in addition, a knowledge component called the belief space.
  • Differential evolution (DE) inspired by migration of superorganisms. [66]
  • Gaussian adaptation (normal or natural adaptation, abbreviated NA to avoid confusion with GA) is intended for the maximisation of manufacturing yield of signal processing systems. It may also be used for ordinary parametric optimisation. It relies on a certain theorem valid for all regions of acceptability and all Gaussian distributions. The efficiency of NA relies on information theory and a certain theorem of efficiency. Its efficiency is defined as information divided by the work needed to get the information. [67] Because NA maximises mean fitness rather than the fitness of the individual, the landscape is smoothed such that valleys between peaks may disappear. Therefore it has a certain "ambition" to avoid local peaks in the fitness landscape. NA is also good at climbing sharp crests by adaptation of the moment matrix, because NA may maximise the disorder (average information) of the Gaussian simultaneously keeping the mean fitness constant.

Other metaheuristic methods

Metaheuristic methods broadly fall within stochastic optimisation methods.

  • Simulated annealing (SA) is a related global optimization technique that traverses the search space by testing random mutations on an individual solution. A mutation that increases fitness is always accepted. A mutation that lowers fitness is accepted probabilistically based on the difference in fitness and a decreasing temperature parameter. In SA parlance, one speaks of seeking the lowest energy instead of the maximum fitness. SA can also be used within a standard GA algorithm by starting with a relatively high rate of mutation and decreasing it over time along a given schedule.
  • Tabu search (TS) is similar to simulated annealing in that both traverse the solution space by testing mutations of an individual solution. While simulated annealing generates only one mutated solution, tabu search generates many mutated solutions and moves to the solution with the lowest energy of those generated. In order to prevent cycling and encourage greater movement through the solution space, a tabu list is maintained of partial or complete solutions. It is forbidden to move to a solution that contains elements of the tabu list, which is updated as the solution traverses the solution space.
  • Extremal optimization (EO) Unlike GAs, which work with a population of candidate solutions, EO evolves a single solution and makes local modifications to the worst components. This requires that a suitable representation be selected which permits individual solution components to be assigned a quality measure ("fitness"). The governing principle behind this algorithm is that of emergent improvement through selectively removing low-quality components and replacing them with a randomly selected component. This is decidedly at odds with a GA that selects good solutions in an attempt to make better solutions.

Other stochastic optimisation methods

  • The cross-entropy (CE) method generates candidate solutions via a parameterized probability distribution. The parameters are updated via cross-entropy minimization, so as to generate better samples in the next iteration.
  • Reactive search optimization (RSO) advocates the integration of sub-symbolic machine learning techniques into search heuristics for solving complex optimization problems. The word reactive hints at a ready response to events during the search through an internal online feedback loop for the self-tuning of critical parameters. Methodologies of interest for Reactive Search include machine learning and statistics, in particular reinforcement learning, active or query learning, neural networks, and metaheuristics.

See also

Related Research Articles

In artificial intelligence, genetic programming (GP) is a technique of evolving programs, starting from a population of unfit programs, fit for a particular task by applying operations analogous to natural genetic processes to the population of programs.

In computational intelligence (CI), an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the quality of the solutions. Evolution of the population then takes place after the repeated application of the above operators.

<span class="mw-page-title-main">Evolutionary computation</span> Trial and error problem solvers with a metaheuristic or stochastic optimization character

In computer science, evolutionary computation is a family of algorithms for global optimization inspired by biological evolution, and the subfield of artificial intelligence and soft computing studying these algorithms. In technical terms, they are a family of population-based trial and error problem solvers with a metaheuristic or stochastic optimization character.

NeuroEvolution of Augmenting Topologies (NEAT) is a genetic algorithm (GA) for the generation of evolving artificial neural networks developed by Kenneth Stanley and Risto Miikkulainen in 2002 while at The University of Texas at Austin. It alters both the weighting parameters and structures of networks, attempting to find a balance between the fitness of evolved solutions and their diversity. It is based on applying three key techniques: tracking genes with history markers to allow crossover among topologies, applying speciation to preserve innovations, and developing topologies incrementally from simple initial structures ("complexifying").

A genetic operator is an operator used in genetic algorithms to guide the algorithm towards a solution to a given problem. There are three main types of operators, which must work in conjunction with one another in order for the algorithm to be successful. Genetic operators are used to create and maintain genetic diversity, combine existing solutions into new solutions (crossover) and select between solutions (selection). In his book discussing the use of genetic programming for the optimization of complex problems, computer scientist John Koza has also identified an 'inversion' or 'permutation' operator; however, the effectiveness of this operator has never been conclusively demonstrated and this operator is rarely discussed.

A fitness function is a particular type of objective function that is used to summarise, as a single figure of merit, how close a given design solution is to achieving the set aims. Fitness functions are used in evolutionary algorithms (EA), such as genetic programming and genetic algorithms to guide simulations towards optimal design solutions.

In genetic algorithms (GA), or more general, evolutionary algorithms (EA), a chromosome is a set of parameters which define a proposed solution of the problem that the evolutionary algorithm is trying to solve. The set of all solutions, also called individuals according to the biological model, is known as the population. The genome of an individual consists of one, more rarely of several, chromosomes and corresponds to the genetic representation of the task to be solved. A chromosome is composed of a set of genes, where a gene consists of one or more semantically connected parameters, which are often also called decision variables. They determine one or more phenotypic characteristics of the individual or at least have an influence on them. In the basic form of genetic algorithms, the chromosome is represented as a binary string, while in later variants and in EAs in general, a wide variety of other data structures are used.

In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, tune, or select a heuristic that may provide a sufficiently good solution to an optimization problem or a machine learning problem, especially with incomplete or imperfect information or limited computation capacity. Metaheuristics sample a subset of solutions which is otherwise too large to be completely enumerated or otherwise explored. Metaheuristics may make relatively few assumptions about the optimization problem being solved and so may be usable for a variety of problems.

Interactive evolutionary computation (IEC) or aesthetic selection is a general term for methods of evolutionary computation that use human evaluation. Usually human evaluation is necessary when the form of fitness function is not known or the result of optimization should fit a particular user preference.

In computer science, an evolution strategy (ES) is an optimization technique based on ideas of evolution. It belongs to the general class of evolutionary computation or artificial evolution methodologies.

In computer programming, gene expression programming (GEP) is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and composition, much like a living organism. And like living organisms, the computer programs of GEP are also encoded in simple linear chromosomes of fixed length. Thus, GEP is a genotype–phenotype system, benefiting from a simple genome to keep and transmit the genetic information and a complex phenotype to explore the environment and adapt to it.

Selection is the stage of a genetic algorithm or more general evolutionary algorithm in which individual genomes are chosen from a population for later breeding. Selection mechanisms are also used to choose candidate solutions (individuals) for the next generation. Retaining the best individuals in a generation unchanged in the next generation, is called elitism or elitist selection. It is a successful (slight) variant of the general process of constructing a new population.

In natural evolution and artificial evolution the fitness of a schema is rescaled to give its effective fitness which takes into account crossover and mutation.

<span class="mw-page-title-main">Mating pool</span>

A mating pool is a concept used in evolutionary computation, which refers to a family of algorithms used to solve optimization and search problems.

In evolutionary algorithms (EA), the term of premature convergence means that a population for an optimization problem converged too early, resulting in being suboptimal. In this context, the parental solutions, through the aid of genetic operators, are not able to generate offspring that are superior to, or outperform, their parents. Premature convergence is a common problem found in evolutionary algorithms in general and genetic algorithms in particular, as it leads to a loss, or convergence of, a large number of alleles, subsequently making it very difficult to search for a specific gene in which the alleles were present. An allele is considered lost if, in a population, a gene is present, where all individuals are sharing the same value for that particular gene. An allele is, as defined by De Jong, considered to be a converged allele, when 95% of a population share the same value for a certain gene.

<span class="mw-page-title-main">Differential evolution</span> Method of mathematical optimization

In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the optimized problem and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found.

In computer programming, genetic representation is a way of presenting solutions/individuals in evolutionary computation methods. The term encompasses both the concrete data structures and data types used to realize the genetic material of the candidate solutions in the form of a genome, and the relationships between search space and problem space. In the simplest case, the search space corresponds to the problem space. The choice of problem representation is tied to the choice of genetic operators, both of which have a decisive effect on the efficiency of the optimization. Genetic representation can encode appearance, behavior, physical qualities of individuals. Difference in genetic representations is one of the major criteria drawing a line between known classes of evolutionary computation.

A memetic algorithm (MA) in computer science and operations research, is an extension of the traditional genetic algorithm (GA) or more general evolutionary algorithm (EA). It may provide a sufficiently good solution to an optimization problem. It uses a suitable heuristic or local search technique to improve the quality of solutions generated by the EA and to reduce the likelihood of premature convergence.

Soft computing is an umbrella term used to describe types of algorithms that produce approximate solutions to unsolvable high-level problems in computer science. Typically, traditional hard-computing algorithms heavily rely on concrete data and mathematical models to produce solutions to problems. Soft computing was coined in the late 20th century. During this period, revolutionary research in three fields greatly impacted soft computing. Fuzzy logic is a computational paradigm that entertains the uncertainties in data by using levels of truth rather than rigid 0s and 1s in binary. Next, neural networks which are computational models influenced by human brain functions. Finally, evolutionary computation is a term to describe groups of algorithm that mimic natural processes such as evolution and natural selection.

References

  1. Mitchell 1996, p. 2.
  2. Gerges, Firas; Zouein, Germain; Azar, Danielle (12 March 2018). "Genetic Algorithms with Local Optima Handling to Solve Sudoku Puzzles". Proceedings of the 2018 International Conference on Computing and Artificial Intelligence. ICCAI 2018. New York, NY, USA: Association for Computing Machinery. pp. 19–22. doi:10.1145/3194452.3194463. ISBN   978-1-4503-6419-5. S2CID   44152535.
  3. Burkhart, Michael C.; Ruiz, Gabriel (2023). "Neuroevolutionary representations for learning heterogeneous treatment effects". Journal of Computational Science. 71: 102054. doi: 10.1016/j.jocs.2023.102054 . S2CID   258752823.
  4. 1 2 Whitley 1994, p. 66.
  5. Luque-Rodriguez, Maria; Molina-Baena, Jose; Jimenez-Vilchez, Alfonso; Arauzo-Azofra, Antonio (2022). "Initialization of Feature Selection Search for Classification (sec. 3)". Journal of Artificial Intelligence Research. 75: 953–983. doi: 10.1613/jair.1.14015 .
  6. Eiben, A. E. et al (1994). "Genetic algorithms with multi-parent recombination". PPSN III: Proceedings of the International Conference on Evolutionary Computation. The Third Conference on Parallel Problem Solving from Nature: 7887. ISBN   3-540-58484-6.
  7. Ting, Chuan-Kang (2005). "On the Mean Convergence Time of Multi-parent Genetic Algorithms Without Selection". Advances in Artificial Life: 403412. ISBN   978-3-540-28848-0.
  8. Deb, Kalyanmoy; Spears, William M. (1997). "C6.2: Speciation methods". Handbook of Evolutionary Computation. Institute of Physics Publishing. S2CID   3547258.
  9. Shir, Ofer M. (2012). "Niching in Evolutionary Algorithms". In Rozenberg, Grzegorz; Bäck, Thomas; Kok, Joost N. (eds.). Handbook of Natural Computing. Springer Berlin Heidelberg. pp. 1035–1069. doi:10.1007/978-3-540-92910-9_32. ISBN   9783540929093.
  10. Goldberg 1989, p. 41.
  11. Harik, Georges R.; Lobo, Fernando G.; Sastry, Kumara (1 January 2006). "Linkage Learning via Probabilistic Modeling in the Extended Compact Genetic Algorithm (ECGA)". Scalable Optimization via Probabilistic Modeling. Studies in Computational Intelligence. Vol. 33. pp. 39–61. doi:10.1007/978-3-540-34954-9_3. ISBN   978-3-540-34953-2.
  12. Pelikan, Martin; Goldberg, David E.; Cantú-Paz, Erick (1 January 1999). BOA: The Bayesian Optimization Algorithm. Gecco'99. pp. 525–532. ISBN   9781558606111.{{cite book}}: |journal= ignored (help)
  13. Coffin, David; Smith, Robert E. (1 January 2008). "Linkage Learning in Estimation of Distribution Algorithms". Linkage in Evolutionary Computation. Studies in Computational Intelligence. Vol. 157. pp. 141–156. doi:10.1007/978-3-540-85068-7_7. ISBN   978-3-540-85067-0.
  14. Echegoyen, Carlos; Mendiburu, Alexander; Santana, Roberto; Lozano, Jose A. (8 November 2012). "On the Taxonomy of Optimization Problems Under Estimation of Distribution Algorithms". Evolutionary Computation. 21 (3): 471–495. doi:10.1162/EVCO_a_00095. ISSN   1063-6560. PMID   23136917. S2CID   26585053.
  15. Sadowski, Krzysztof L.; Bosman, Peter A.N.; Thierens, Dirk (1 January 2013). "On the usefulness of linkage processing for solving MAX-SAT". Proceedings of the 15th annual conference on Genetic and evolutionary computation. Gecco '13. pp. 853–860. doi:10.1145/2463372.2463474. hdl:1874/290291. ISBN   9781450319638. S2CID   9986768.
  16. Taherdangkoo, Mohammad; Paziresh, Mahsa; Yazdi, Mehran; Bagheri, Mohammad Hadi (19 November 2012). "An efficient algorithm for function optimization: modified stem cells algorithm". Central European Journal of Engineering. 3 (1): 36–50. doi: 10.2478/s13531-012-0047-8 .
  17. Wolpert, D.H., Macready, W.G., 1995. No Free Lunch Theorems for Optimisation. Santa Fe Institute, SFI-TR-05-010, Santa Fe.
  18. Goldberg, David E. (1991). "The theory of virtual alphabets". Parallel Problem Solving from Nature. Lecture Notes in Computer Science. Vol. 496. pp. 13–22. doi:10.1007/BFb0029726. ISBN   978-3-540-54148-6.{{cite book}}: |journal= ignored (help)
  19. Janikow, C. Z.; Michalewicz, Z. (1991). "An Experimental Comparison of Binary and Floating Point Representations in Genetic Algorithms" (PDF). Proceedings of the Fourth International Conference on Genetic Algorithms: 31–36. Archived (PDF) from the original on 9 October 2022. Retrieved 2 July 2013.
  20. Patrascu, M.; Stancu, A.F.; Pop, F. (2014). "HELGA: a heterogeneous encoding lifelike genetic algorithm for population evolution modeling and simulation". Soft Computing. 18 (12): 2565–2576. doi:10.1007/s00500-014-1401-y. S2CID   29821873.
  21. Baluja, Shumeet; Caruana, Rich (1995). Removing the genetics from the standard genetic algorithm (PDF). ICML. Archived (PDF) from the original on 9 October 2022.
  22. Stannat, W. (2004). "On the convergence of genetic algorithms – a variational approach". Probab. Theory Relat. Fields. 129: 113–132. doi: 10.1007/s00440-003-0330-y . S2CID   121086772.
  23. Sharapov, R.R.; Lapshin, A.V. (2006). "Convergence of genetic algorithms". Pattern Recognit. Image Anal. 16 (3): 392–397. doi:10.1134/S1054661806030084. S2CID   22890010.
  24. Srinivas, M.; Patnaik, L. (1994). "Adaptive probabilities of crossover and mutation in genetic algorithms" (PDF). IEEE Transactions on Systems, Man, and Cybernetics. 24 (4): 656–667. doi:10.1109/21.286385. Archived (PDF) from the original on 9 October 2022.
  25. Kwon, Y.D.; Kwon, S.B.; Jin, S.B.; Kim, J.Y. (2003). "Convergence enhanced genetic algorithm with successive zooming method for solving continuous optimization problems". Computers & Structures. 81 (17): 1715–1725. doi:10.1016/S0045-7949(03)00183-4.
  26. Zhang, J.; Chung, H.; Lo, W. L. (2007). "Clustering-Based Adaptive Crossover and Mutation Probabilities for Genetic Algorithms". IEEE Transactions on Evolutionary Computation. 11 (3): 326–335. doi:10.1109/TEVC.2006.880727. S2CID   2625150.
  27. Pavai, G.; Geetha, T.V. (2019). "New crossover operators using dominance and co-dominance principles for faster convergence of genetic algorithms". Soft Comput. 23 (11): 3661–3686. doi:10.1007/s00500-018-3016-1. S2CID   254028984.
  28. Li, J.C.F.; Zimmerle, D.; Young, P. (2022). "Flexible networked rural electrification using levelized interpolative genetic algorithm". Energy & AI. 10: 100186. doi: 10.1016/j.egyai.2022.100186 . S2CID   250972466.
  29. See for instance Evolution-in-a-nutshell Archived 15 April 2016 at the Wayback Machine or example in travelling salesman problem, in particular the use of an edge recombination operator.
  30. Goldberg, D. E.; Korb, B.; Deb, K. (1989). "Messy Genetic Algorithms : Motivation Analysis, and First Results". Complex Systems. 5 (3): 493–530.
  31. Gene expression: The missing link in evolutionary computation
  32. Harik, G. (1997). Learning linkage to efficiently solve problems of bounded difficulty using genetic algorithms (PhD). Dept. Computer Science, University of Michigan, Ann Arbour.
  33. Tomoiagă B, Chindriş M, Sumper A, Sudria-Andreu A, Villafafila-Robles R. Pareto Optimal Reconfiguration of Power Distribution Systems Using a Genetic Algorithm Based on NSGA-II. Energies. 2013; 6(3):1439-1455.
  34. Gross, Bill (2 February 2009). "A solar energy system that tracks the sun". TED. Retrieved 20 November 2013.
  35. Hornby, G. S.; Linden, D. S.; Lohn, J. D., Automated Antenna Design with Evolutionary Algorithms (PDF)
  36. "Flexible Muscle-Based Locomotion for Bipedal Creatures".
  37. Evans, B.; Walton, S.P. (December 2017). "Aerodynamic optimisation of a hypersonic reentry vehicle based on solution of the Boltzmann–BGK equation and evolutionary optimisation". Applied Mathematical Modelling. 52: 215–240. doi: 10.1016/j.apm.2017.07.024 . ISSN   0307-904X.
  38. Skiena, Steven (2010). The Algorithm Design Manual (2nd ed.). Springer Science+Business Media. ISBN   978-1-849-96720-4.
  39. Turing, Alan M. (October 1950). "Computing machinery and intelligence". Mind. LIX (238): 433–460. doi:10.1093/mind/LIX.236.433.
  40. Barricelli, Nils Aall (1954). "Esempi numerici di processi di evoluzione". Methodos: 45–68.
  41. Barricelli, Nils Aall (1957). "Symbiogenetic evolution processes realized by artificial methods". Methodos: 143–182.
  42. Fraser, Alex (1957). "Simulation of genetic systems by automatic digital computers. I. Introduction". Aust. J. Biol. Sci. 10 (4): 484–491. doi: 10.1071/BI9570484 .
  43. Fraser, Alex; Burnell, Donald (1970). Computer Models in Genetics. New York: McGraw-Hill. ISBN   978-0-07-021904-5.
  44. Crosby, Jack L. (1973). Computer Simulation in Genetics. London: John Wiley & Sons. ISBN   978-0-471-18880-3.
  45. 02.27.96 - UC Berkeley's Hans Bremermann, professor emeritus and pioneer in mathematical biology, has died at 69
  46. Fogel, David B., ed. (1998). Evolutionary Computation: The Fossil Record. New York: IEEE Press. ISBN   978-0-7803-3481-6.
  47. Barricelli, Nils Aall (1963). "Numerical testing of evolution theories. Part II. Preliminary tests of performance, symbiogenesis and terrestrial life". Acta Biotheoretica. 16 (3–4): 99–126. doi:10.1007/BF01556602. S2CID   86717105.
  48. Rechenberg, Ingo (1973). Evolutionsstrategie. Stuttgart: Holzmann-Froboog. ISBN   978-3-7728-0373-4.
  49. Schwefel, Hans-Paul (1974). Numerische Optimierung von Computer-Modellen (PhD thesis).
  50. Schwefel, Hans-Paul (1977). Numerische Optimierung von Computor-Modellen mittels der Evolutionsstrategie : mit einer vergleichenden Einführung in die Hill-Climbing- und Zufallsstrategie. Basel; Stuttgart: Birkhäuser. ISBN   978-3-7643-0876-6.
  51. Schwefel, Hans-Paul (1981). Numerical optimization of computer models (Translation of 1977 Numerische Optimierung von Computor-Modellen mittels der Evolutionsstrategie. Chichester; New York: Wiley. ISBN   978-0-471-09988-8.
  52. Aldawoodi, Namir (2008). An Approach to Designing an Unmanned Helicopter Autopilot Using Genetic Algorithms and Simulated Annealing. p. 99. ISBN   978-0549773498 via Google Books.
  53. Markoff, John (29 August 1990). "What's the Best Answer? It's Survival of the Fittest". New York Times. Retrieved 13 July 2016.
  54. Ruggiero, Murray A.. (1 August 2009) Fifteen years and counting Archived 30 January 2016 at the Wayback Machine . Futuresmag.com. Retrieved on 2013-08-07.
  55. Evolver: Sophisticated Optimization for Spreadsheets. Palisade. Retrieved on 2013-08-07.
  56. Li, Lin; Saldivar, Alfredo Alan Flores; Bai, Yun; Chen, Yi; Liu, Qunfeng; Li, Yun (2019). "Benchmarks for Evaluating Optimization Algorithms and Benchmarking MATLAB Derivative-Free Optimizers for Practitioners' Rapid Access". IEEE Access. 7: 79657–79670. Bibcode:2019IEEEA...779657L. doi: 10.1109/ACCESS.2019.2923092 . S2CID   195774435.
  57. Cohoon, J; et al. (2002). Evolutionary algorithms for the physical design of VLSI circuits (PDF). Springer, pp. 683-712, 2003. ISBN   978-3-540-43330-9. Archived (PDF) from the original on 9 October 2022.{{cite book}}: |journal= ignored (help)
  58. Pelikan, Martin; Goldberg, David E.; Cantú-Paz, Erick (1 January 1999). BOA: The Bayesian Optimization Algorithm. Gecco'99. pp. 525–532. ISBN   9781558606111.{{cite book}}: |journal= ignored (help)
  59. Pelikan, Martin (2005). Hierarchical Bayesian optimization algorithm : toward a new generation of evolutionary algorithms (1st ed.). Berlin [u.a.]: Springer. ISBN   978-3-540-23774-7.
  60. Thierens, Dirk (11 September 2010). "The Linkage Tree Genetic Algorithm". Parallel Problem Solving from Nature, PPSN XI. pp. 264–273. doi:10.1007/978-3-642-15844-5_27. ISBN   978-3-642-15843-8.
  61. Ferreira, C (2001). "Gene Expression Programming: A New Adaptive Algorithm for Solving Problems" (PDF). Complex Systems. 13 (2): 87–129. arXiv: cs/0102027 . Bibcode:2001cs........2027F. Archived (PDF) from the original on 9 October 2022.
  62. Falkenauer, Emanuel (1997). Genetic Algorithms and Grouping Problems. Chichester, England: John Wiley & Sons Ltd. ISBN   978-0-471-97150-4.
  63. Zlochin, Mark; Birattari, Mauro; Meuleau, Nicolas; Dorigo, Marco (1 October 2004). "Model-Based Search for Combinatorial Optimization: A Critical Survey". Annals of Operations Research. 131 (1–4): 373–395. CiteSeerX   10.1.1.3.427 . doi:10.1023/B:ANOR.0000039526.52305.af. ISSN   0254-5330. S2CID   63137.
  64. Rania Hassan, Babak Cohanim, Olivier de Weck, Gerhard Vente r (2005) A comparison of particle swarm optimization and the genetic algorithm
  65. Baudry, Benoit; Franck Fleurey; Jean-Marc Jézéquel; Yves Le Traon (March–April 2005). "Automatic Test Case Optimization: A Bacteriologic Algorithm" (PDF). IEEE Software. 22 (2): 76–82. doi:10.1109/MS.2005.30. S2CID   3559602. Archived (PDF) from the original on 9 October 2022. Retrieved 9 August 2009.
  66. Civicioglu, P. (2012). "Transforming Geocentric Cartesian Coordinates to Geodetic Coordinates by Using Differential Search Algorithm". Computers &Geosciences. 46: 229–247. Bibcode:2012CG.....46..229C. doi:10.1016/j.cageo.2011.12.011.
  67. Kjellström, G. (December 1991). "On the Efficiency of Gaussian Adaptation". Journal of Optimization Theory and Applications. 71 (3): 589–597. doi:10.1007/BF00941405. S2CID   116847975.

Bibliography

Resources

Tutorials