In computational science, particle swarm optimization (PSO) [1] is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. It solves a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae over the particle's position and velocity. Each particle's movement is influenced by its local best known position, but is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions.
PSO is originally attributed to Kennedy, Eberhart and Shi [2] [3] and was first intended for simulating social behaviour, [4] as a stylized representation of the movement of organisms in a bird flock or fish school. The algorithm was simplified and it was observed to be performing optimization. The book by Kennedy and Eberhart [5] describes many philosophical aspects of PSO and swarm intelligence. An extensive survey of PSO applications is made by Poli. [6] [7] In 2017, a comprehensive review on theoretical and experimental works on PSO has been published by Bonyadi and Michalewicz. [1]
PSO is a metaheuristic as it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. Also, PSO does not use the gradient of the problem being optimized, which means PSO does not require that the optimization problem be differentiable as is required by classic optimization methods such as gradient descent and quasi-newton methods. However, metaheuristics such as PSO do not guarantee an optimal solution is ever found.
A basic variant of the PSO algorithm works by having a population (called a swarm) of candidate solutions (called particles). These particles are moved around in the search-space according to a few simple formulae. [8] The movements of the particles are guided by their own best-known position in the search-space as well as the entire swarm's best-known position. When improved positions are being discovered these will then come to guide the movements of the swarm. The process is repeated and by doing so it is hoped, but not guaranteed, that a satisfactory solution will eventually be discovered.
Formally, let f: ℝn → ℝ be the cost function which must be minimized. The function takes a candidate solution as an argument in the form of a vector of real numbers and produces a real number as output which indicates the objective function value of the given candidate solution. The gradient of f is not known. The goal is to find a solution a for which f(a) ≤ f(b) for all b in the search-space, which would mean a is the global minimum.
Let S be the number of particles in the swarm, each having a position xi ∈ ℝn in the search-space and a velocity vi ∈ ℝn. Let pi be the best known position of particle i and let g be the best known position of the entire swarm. A basic PSO algorithm to minimize the cost function is then: [9]
for each particle i = 1, ..., Sdo Initialize the particle's position with a uniformly distributed random vector: xi ~ U(blo, bup) Initialize the particle's best known position to its initial position: pi ← xiiff(pi) < f(g) then update the swarm's best known position: g ← pi Initialize the particle's velocity: vi ~ U(-|bup-blo|, |bup-blo|) while a termination criterion is not met do: for each particle i = 1, ..., Sdofor each dimension d = 1, ..., ndo Pick random numbers: rp, rg ~ U(0,1) Update the particle's velocity: vi,d ← w vi,d + φprp (pi,d-xi,d) + φgrg (gd-xi,d) Update the particle's position: xi ← xi + viiff(xi) < f(pi) then Update the particle's best known position: pi ← xiiff(pi) < f(g) then Update the swarm's best known position: g ← pi
The values blo and bup represent the lower and upper boundaries of the search-space respectively. The w parameter is the inertia weight. The parameters φp and φg are often called cognitive coefficient and social coefficient.
The termination criterion can be the number of iterations performed, or a solution where the adequate objective function value is found. [10] The parameters w, φp, and φg are selected by the practitioner and control the behaviour and efficacy of the PSO method (below).
The choice of PSO parameters can have a large impact on optimization performance. Selecting PSO parameters that yield good performance has therefore been the subject of much research. [11] [12] [13] [14] [15] [16] [17] [18] [19]
To prevent divergence ("explosion") the inertia weight must be smaller than 1. The two other parameters can be then derived thanks to the constriction approach, [16] or freely selected, but the analyses suggest convergence domains to constrain them. Typical values are in .
The PSO parameters can also be tuned by using another overlaying optimizer, a concept known as meta-optimization, [20] [21] [22] [23] or even fine-tuned during the optimization, e.g., by means of fuzzy logic. [24] [25]
Parameters have also been tuned for various optimization scenarios. [26] [27]
The topology of the swarm defines the subset of particles with which each particle can exchange information. [28] The basic version of the algorithm uses the global topology as the swarm communication structure. [10] This topology allows all particles to communicate with all the other particles, thus the whole swarm share the same best position g from a single particle. However, this approach might lead the swarm to be trapped into a local minimum, [29] thus different topologies have been used to control the flow of information among particles. For instance, in local topologies, particles only share information with a subset of particles. [10] This subset can be a geometrical one [30] – for example "the m nearest particles" – or, more often, a social one, i.e. a set of particles that is not depending on any distance. In such cases, the PSO variant is said to be local best (vs global best for the basic PSO).
A commonly used swarm topology is the ring, in which each particle has just two neighbours, but there are many others. [10] The topology is not necessarily static. In fact, since the topology is related to the diversity of communication of the particles, [31] some efforts have been done to create adaptive topologies (SPSO, [32] APSO, [33] stochastic star, [34] TRIBES, [35] Cyber Swarm, [36] and C-PSO [37] )
By using the ring topology, PSO can attain generation-level parallelism, significantly enhancing the evolutionary speed. [38]
There are several schools of thought as to why and how the PSO algorithm can perform optimization.
A common belief amongst researchers is that the swarm behaviour varies between exploratory behaviour, that is, searching a broader region of the search-space, and exploitative behaviour, that is, a locally oriented search so as to get closer to a (possibly local) optimum. This school of thought has been prevalent since the inception of PSO. [3] [4] [12] [16] This school of thought contends that the PSO algorithm and its parameters must be chosen so as to properly balance between exploration and exploitation to avoid premature convergence to a local optimum yet still ensure a good rate of convergence to the optimum. This belief is the precursor of many PSO variants, see below.
Another school of thought is that the behaviour of a PSO swarm is not well understood in terms of how it affects actual optimization performance, especially for higher-dimensional search-spaces and optimization problems that may be discontinuous, noisy, and time-varying. This school of thought merely tries to find PSO algorithms and parameters that cause good performance regardless of how the swarm behaviour can be interpreted in relation to e.g. exploration and exploitation. Such studies have led to the simplification of the PSO algorithm, see below.
In relation to PSO the word convergence typically refers to two different definitions:
Convergence of the sequence of solutions has been investigated for PSO. [15] [16] [17] These analyses have resulted in guidelines for selecting PSO parameters that are believed to cause convergence to a point and prevent divergence of the swarm's particles (particles do not move unboundedly and will converge to somewhere). However, the analyses were criticized by Pedersen [22] for being oversimplified as they assume the swarm has only one particle, that it does not use stochastic variables and that the points of attraction, that is, the particle's best known position p and the swarm's best known position g, remain constant throughout the optimization process. However, it was shown [39] that these simplifications do not affect the boundaries found by these studies for parameter where the swarm is convergent. Considerable effort has been made in recent years to weaken the modeling assumption utilized during the stability analysis of PSO, [40] with the most recent generalized result applying to numerous PSO variants and utilized what was shown to be the minimal necessary modeling assumptions. [41]
Convergence to a local optimum has been analyzed for PSO in [42] and. [43] It has been proven that PSO needs some modification to guarantee finding a local optimum.
This means that determining the convergence capabilities of different PSO algorithms and parameters still depends on empirical results. One attempt at addressing this issue is the development of an "orthogonal learning" strategy for an improved use of the information already existing in the relationship between p and g, so as to form a leading converging exemplar and to be effective with any PSO topology. The aims are to improve the performance of PSO overall, including faster global convergence, higher solution quality, and stronger robustness. [44] However, such studies do not provide theoretical evidence to actually prove their claims.
Without the need for a trade-off between convergence ('exploitation') and divergence ('exploration'), an adaptive mechanism can be introduced. Adaptive particle swarm optimization (APSO) [45] features better search efficiency than standard PSO. APSO can perform global search over the entire search space with a higher convergence speed. It enables automatic control of the inertia weight, acceleration coefficients, and other algorithmic parameters at the run time, thereby improving the search effectiveness and efficiency at the same time. Also, APSO can act on the globally best particle to jump out of the likely local optima. However, APSO will introduce new algorithm parameters, it does not introduce additional design or implementation complexity nonetheless.
Besides, through the utilization of a scale-adaptive fitness evaluation mechanism, PSO can efficiently address computationally expensive optimization problems. [46]
Numerous variants of even a basic PSO algorithm are possible. For example, there are different ways to initialize the particles and velocities (e.g. start with zero velocities instead), how to dampen the velocity, only update pi and g after the entire swarm has been updated, etc. Some of these choices and their possible performance impact have been discussed in the literature. [14]
A series of standard implementations have been created by leading researchers, "intended for use both as a baseline for performance testing of improvements to the technique, as well as to represent PSO to the wider optimization community. Having a well-known, strictly-defined standard algorithm provides a valuable point of comparison which can be used throughout the field of research to better test new advances." [10] The latest is Standard PSO 2011 (SPSO-2011). [47]
New and more sophisticated PSO variants are also continually being introduced in an attempt to improve optimization performance. There are certain trends in that research; one is to make a hybrid optimization method using PSO combined with other optimizers, [48] [49] [50] e.g., combined PSO with biogeography-based optimization, [51] and the incorporation of an effective learning method. [44]
Another research trend is to try to alleviate premature convergence (that is, optimization stagnation), e.g. by reversing or perturbing the movement of the PSO particles, [19] [52] [53] [54] another approach to deal with premature convergence is the use of multiple swarms [55] (multi-swarm optimization). The multi-swarm approach can also be used to implement multi-objective optimization. [56] Finally, there are developments in adapting the behavioural parameters of PSO during optimization. [45] [24]
Another school of thought is that PSO should be simplified as much as possible without impairing its performance; a general concept often referred to as Occam's razor. Simplifying PSO was originally suggested by Kennedy [4] and has been studied more extensively, [18] [21] [22] [57] where it appeared that optimization performance was improved, and the parameters were easier to tune and they performed more consistently across different optimization problems.
Another argument in favour of simplifying PSO is that metaheuristics can only have their efficacy demonstrated empirically by doing computational experiments on a finite number of optimization problems. This means a metaheuristic such as PSO cannot be proven correct and this increases the risk of making errors in its description and implementation. A good example of this [58] presented a promising variant of a genetic algorithm (another popular metaheuristic) but it was later found to be defective as it was strongly biased in its optimization search towards similar values for different dimensions in the search space, which happened to be the optimum of the benchmark problems considered. This bias was because of a programming error, and has now been fixed. [59]
Initialization of velocities may require extra inputs. The Bare Bones PSO variant [60] has been proposed in 2003 by James Kennedy, and does not need to use velocity at all.
In this variant of PSO one dispences with the velocity of the particles and instead updates the positions of the particles using the following simple rule,
where , are the position and the best position of the particle ; is the global best position; is the normal distribution with the mean and standard deviation ; and where signifies the norm of a vector.
Another simpler variant is the accelerated particle swarm optimization (APSO), [61] which also does not need to use velocity and can speed up the convergence in many applications. A simple demo code of APSO is available. [62]
In this variant of PSO one dispences with both the particle's velocity and the particle's best position. The particle position is updated according to the following rule,
where is a random uniformly distributed vector, is the typical length of the problem at hand, and and are the parameters of the method. As a refinement of the method one can decrease with each iteration, , where is the number of the iteration and is the decrease control parameter.
PSO has also been applied to multi-objective problems, [63] [64] [65] in which the objective function comparison takes Pareto dominance into account when moving the PSO particles and non-dominated solutions are stored so as to approximate the pareto front.
As the PSO equations given above work on real numbers, a commonly used method to solve discrete problems is to map the discrete search space to a continuous domain, to apply a classical PSO, and then to demap the result. Such a mapping can be very simple (for example by just using rounded values) or more sophisticated. [66]
However, it can be noted that the equations of movement make use of operators that perform four actions:
Usually a position and a velocity are represented by n real numbers, and these operators are simply -, *, +, and again +. But all these mathematical objects can be defined in a completely different way, in order to cope with binary problems (or more generally discrete ones), or even combinatorial ones. [67] [68] [69] [70] One approach is to redefine the operators based on sets. [71]
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems via biologically inspired operators such as selection, crossover, and mutation. Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, and causal inference.
An evolutionary algorithm (EA) in computational intelligence is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the quality of the solutions. Evolution of the population then takes place after the repeated application of the above operators.
In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. Artificial ants represent multi-agent methods inspired by the behavior of real ants. The pheromone-based communication of biological ants is often the predominant paradigm used. Combinations of artificial ants and local search algorithms have become a preferred method for numerous optimization tasks involving some sort of graph, e.g., vehicle routing and internet routing.
Swarm intelligence (SI) is the collective behavior of decentralized, self-organized systems, natural or artificial. The concept is employed in work on artificial intelligence. The expression was introduced by Gerardo Beni and Jing Wang in 1989, in the context of cellular robotic systems.
Metaheuristic in computer science and mathematical optimization is a higher-level procedure or heuristic designed to find, generate, tune, or select a heuristic that may provide a sufficiently good solution to an optimization problem or a machine learning problem, especially with incomplete or imperfect information or limited computation capacity. Metaheuristics sample a subset of solutions which is otherwise too large to be completely enumerated or otherwise explored. Metaheuristics may make relatively few assumptions about the optimization problem being solved and so may be usable for a variety of problems. Their use is always of interest when exact or other (approximate) methods are not available or are not expedient, either because the calculation time is too long or because, for example, the solution provided is too imprecise.
In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the optimized problem and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found.
A memetic algorithm (MA) in computer science and operations research, is an extension of the traditional genetic algorithm (GA) or more general evolutionary algorithm (EA). It may provide a sufficiently good solution to an optimization problem. It uses a suitable heuristic or local search technique to improve the quality of solutions generated by the EA and to reduce the likelihood of premature convergence.
James Kennedy is an American social psychologist, best known as an originator and researcher of particle swarm optimization. The first papers on the topic, by Kennedy and Russell C. Eberhart, were presented in 1995; since then tens of thousands of papers have been published on particle swarms. The Academic Press / Morgan Kaufmann book, Swarm Intelligence, by Kennedy and Eberhart with Yuhui Shi, was published in 2001.
In mathematical optimization, the firefly algorithm is a metaheuristic proposed by Xin-She Yang and inspired by the flashing behavior of fireflies.
Design Automation usually refers to electronic design automation, or Design Automation which is a Product Configurator. Extending Computer-Aided Design (CAD), automated design and Computer-Automated Design (CAutoD) are more concerned with a broader range of applications, such as automotive engineering, civil engineering, composite material design, control engineering, dynamic system identification and optimization, financial systems, industrial equipment, mechatronic systems, steel construction, structural optimisation, and the invention of novel systems.
In operations research, cuckoo search is an optimization algorithm developed by Xin-She Yang and Suash Deb in 2009. It has been shown to be a special case of the well-known -evolution strategy. It was inspired by the obligate brood parasitism of some cuckoo species by laying their eggs in the nests of host birds of other species. Some host birds can engage direct conflict with the intruding cuckoos. For example, if a host bird discovers the eggs are not their own, it will either throw these alien eggs away or simply abandon its nest and build a new nest elsewhere. Some cuckoo species such as the New World brood-parasitic Tapera have evolved in such a way that female parasitic cuckoos are often very specialized in the mimicry in colors and pattern of the eggs of a few chosen host species. Cuckoo search idealized such breeding behavior, and thus can be applied for various optimization problems.
Meta-optimization from numerical optimization is the use of one optimization method to tune another optimization method. Meta-optimization is reported to have been used as early as in the late 1970s by Mercer and Sampson for finding optimal parameter settings of a genetic algorithm.
In applied mathematics, multimodal optimization deals with optimization tasks that involve finding all or most of the multiple solutions of a problem, as opposed to a single best solution. Evolutionary multimodal optimization is a branch of evolutionary computation, which is closely related to machine learning. Wong provides a short survey, wherein the chapter of Shir and the book of Preuss cover the topic in more detail.
In computer science, imperialist competitive algorithms are a type of computational method used to solve optimization problems of different types. Like most of the methods in the area of evolutionary computation, ICA does not need the gradient of the function in its optimization process. From a specific point of view, ICA can be thought of as the social counterpart of genetic algorithms (GAs). ICA is the mathematical model and the computer simulation of human social evolution, while GAs are based on the biological evolution of species.
Multi-swarm optimization is a variant of particle swarm optimization (PSO) based on the use of multiple sub-swarms instead of one (standard) swarm. The general approach in multi-swarm optimization is that each sub-swarm focuses on a specific region while a specific diversification method decides where and when to launch the sub-swarms. The multi-swarm framework is especially fitted for the optimization on multi-modal problems, where multiple (local) optima exist.
Fish School Search (FSS), proposed by Bastos Filho and Lima Neto in 2008 is, in its basic version, an unimodal optimization algorithm inspired on the collective behavior of fish schools. The mechanisms of feeding and coordinated movement were used as inspiration to create the search operators. The core idea is to make the fishes “swim” toward the positive gradient in order to “eat” and “gain weight”. Collectively, the heavier fishes have more influence on the search process as a whole, what makes the barycenter of the fish school moves toward better places in the search space over the iterations.
The Fly Algorithm is a computational method within the field of evolutionary algorithms, designed for direct exploration of 3D spaces in applications such as computer stereo vision, robotics, and medical imaging. Unlike traditional image-based stereovision, which relies on matching features to construct 3D information, the Fly Algorithm operates by generating a 3D representation directly from random points, termed "flies." Each fly is a coordinate in 3D space, evaluated for its accuracy by comparing its projections in a scene. By iteratively refining the positions of flies based on fitness criteria, the algorithm can construct an optimized spatial representation. The Fly Algorithm has expanded into various fields, including applications in digital art, where it is used to generate complex visual patterns.
Dispersive flies optimisation (DFO) is a bare-bones swarm intelligence algorithm which is inspired by the swarming behaviour of flies hovering over food sources. DFO is a simple optimiser which works by iteratively trying to improve a candidate solution with regard to a numerical measure that is calculated by a fitness function. Each member of the population, a fly or an agent, holds a candidate solution whose suitability can be evaluated by their fitness value. Optimisation problems are often formulated as either minimisation or maximisation problems.
Maurice Clerc is a French mathematician.