Simulated annealing

Figure: Simulated annealing can be used to solve combinatorial problems. Here it is applied to the travelling salesman problem to minimize the length of a route that connects all 125 points.

Figure: Travelling salesman problem in 3D for 120 points, solved with simulated annealing.

Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. For problems with many local optima, SA can often find the global optimum. [1] It is often used when the search space is discrete (for example the traveling salesman problem, the boolean satisfiability problem, protein structure prediction, and job-shop scheduling). For problems where finding an approximate global optimum is more important than finding a precise local optimum in a fixed amount of time, simulated annealing may be preferable to alternatives such as gradient descent or branch and bound.

The name of the algorithm comes from annealing in metallurgy, a technique involving heating and controlled cooling of a material to alter its physical properties. Both the temperature and the thermodynamic free energy (or Gibbs energy) are attributes of the material, and heating and cooling affect them both. Simulated annealing can be used for very hard computational optimization problems where exact algorithms fail; even though it usually achieves only an approximate solution to the global minimum, this can be enough for many practical problems.

Problems solved by SA are typically formulated as the minimization of an objective function of many variables, subject to several constraints. In practice, the constraints can be penalized as part of the objective function.
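
As a hedged illustration of the penalty approach, a single energy function might fold a constraint into the objective as in the following sketch, where penalized_energy, objective, constraint_violation, and penalty_weight are hypothetical names rather than part of any standard formulation:

```python
# Hedged sketch (illustrative names): folding a constraint into the
# objective via a penalty term, so that standard SA machinery can
# minimize a single energy function.
def penalized_energy(x, objective, constraint_violation, penalty_weight=1000.0):
    # feasible states (violation == 0) keep their original objective value;
    # infeasible ones are pushed uphill in proportion to the violation
    return objective(x) + penalty_weight * constraint_violation(x)
```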

Similar techniques have been independently introduced on several occasions, including Pincus (1970), [2] Khachaturyan et al. (1979, [3] 1981 [4]), Kirkpatrick, Gelatt and Vecchi (1983), and Černý (1985). [5] In 1983, this approach was used by Kirkpatrick, Gelatt and Vecchi [6] to solve the traveling salesman problem. They also proposed its current name, simulated annealing.

This notion of slow cooling implemented in the simulated annealing algorithm is interpreted as a slow decrease in the probability of accepting worse solutions as the solution space is explored. Accepting worse solutions allows for a more extensive search for the global optimal solution. In general, simulated annealing algorithms work as follows. The temperature progressively decreases from an initial positive value to zero. At each time step, the algorithm randomly selects a solution close to the current one, measures its quality, and moves to it according to temperature-dependent probabilities: the probability of accepting a better solution remains at 1 (or positive), while the probability of accepting a worse solution decreases toward zero as the search proceeds.

The simulation can be performed either by a solution of kinetic equations for probability density functions, [7] [8] or by using a stochastic sampling method. [6] [9] The method is an adaptation of the Metropolis–Hastings algorithm, a Monte Carlo method to generate sample states of a thermodynamic system, published by N. Metropolis et al. in 1953. [10]

Overview

In simulated annealing, each configuration of the problem is treated as a state s of some physical system, and the function E(s) to be minimized is analogous to the internal energy of the system in that state. The goal is to bring the system, from an arbitrary initial state, to a state with the minimum possible energy.

Figure: Simulated annealing searching for a maximum. The objective here is to get to the highest point. In this example, it is not enough to use a simple hill climb algorithm, as there are many local maxima. By cooling the temperature slowly, the global maximum is found.

The basic iteration

At each step, the simulated annealing heuristic considers some neighboring state s* of the current state s, and probabilistically decides between moving the system to state s* or staying in state s. These probabilities ultimately lead the system to move to states of lower energy. Typically this step is repeated until the system reaches a state that is good enough for the application, or until a given computation budget has been exhausted.

The neighbors of a state

Optimization of a solution involves evaluating the neighbors of a state of the problem, which are new states produced through conservatively altering a given state. For example, in the traveling salesman problem each state is typically defined as a permutation of the cities to be visited, and the neighbors of any state are the set of permutations produced by swapping any two of these cities. The well-defined way in which the states are altered to produce neighboring states is called a "move", and different moves give different sets of neighboring states. These moves usually result in minimal alterations of the last state, in an attempt to progressively improve the solution through iteratively improving its parts (such as the city connections in the traveling salesman problem).
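
As a hedged sketch of such a move, assuming tours are represented as Python lists of city indices, the arbitrary-swap generator might look like this:

```python
import random

# Hedged sketch of the swap move described above for the traveling
# salesman problem: a state is a permutation of city indices, and a
# neighbour is obtained by exchanging two randomly chosen cities.
def neighbour(tour):
    s = list(tour)                          # copy; the current state is kept
    i, j = random.sample(range(len(s)), 2)  # two distinct positions
    s[i], s[j] = s[j], s[i]                 # conservative alteration
    return s
```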

Simple heuristics like hill climbing, which move by finding better neighbor after better neighbor and stop when they have reached a solution which has no neighbors that are better solutions, cannot guarantee to lead to any of the existing better solutions; their outcome may easily be just a local optimum, while the actual best solution would be a global optimum that could be different. Metaheuristics use the neighbors of a solution as a way to explore the solution space, and although they prefer better neighbors, they also accept worse neighbors in order to avoid getting stuck in local optima; they can find the global optimum if run for long enough.

Acceptance probabilities

The probability of making the transition from the current state s to a candidate new state s_new is specified by an acceptance probability function P(e, e_new, T), which depends on the energies e = E(s) and e_new = E(s_new) of the two states, and on a global time-varying parameter T called the temperature. States with a smaller energy are better than those with a greater energy. The probability function P must be positive even when e_new is greater than e. This feature prevents the method from becoming stuck at a local minimum that is worse than the global one.

When T tends to zero, the probability P(e, e_new, T) must tend to zero if e_new > e, and to a positive value otherwise. For sufficiently small values of T, the system will then increasingly favor moves that go "downhill" (i.e., to lower energy values), and avoid those that go "uphill." With T = 0 the procedure reduces to the greedy algorithm, which makes only the downhill transitions.

In the original description of simulated annealing, the probability P(e, e_new, T) was equal to 1 when e_new < e; that is, the procedure always moved downhill when it found a way to do so, irrespective of the temperature. Many descriptions and implementations of simulated annealing still take this condition as part of the method's definition. However, this condition is not essential for the method to work.

The function P is usually chosen so that the probability of accepting a move decreases as the difference e_new − e increases; that is, small uphill moves are more likely than large ones. However, this requirement is not strictly necessary, provided that the above requirements are met.

Given these properties, the temperature T plays a crucial role in controlling the evolution of the state s of the system with regard to its sensitivity to the variations of system energies. To be precise, for a large T, the evolution of s is sensitive to coarser energy variations, while it is sensitive to finer energy variations when T is small.
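
For concreteness, the standard exponential acceptance function due to Kirkpatrick et al. (discussed further below) can be sketched as follows; the function name and the guard for T <= 0 are our own illustrative choices:

```python
import math

# Hedged sketch of the classic acceptance function: downhill moves are
# always accepted; uphill moves are accepted with probability
# exp(-(e_new - e) / T), which tends to zero together with T.
def acceptance_probability(e, e_new, T):
    if e_new < e:
        return 1.0
    if T <= 0:
        return 0.0          # at T = 0 the rule reduces to greedy descent
    return math.exp(-(e_new - e) / T)
```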

The annealing schedule

Figure: Example illustrating the effect of the cooling schedule on the performance of simulated annealing. The problem is to rearrange the pixels of an image so as to minimize a certain potential energy function, which causes similar colors to attract at short range and repel at a slightly larger distance. The elementary moves swap two adjacent pixels. The images were obtained with a fast cooling schedule (left) and a slow cooling schedule (right), producing results similar to amorphous and crystalline solids, respectively.

The name and inspiration of the algorithm call for an interesting feature related to the temperature variation to be embedded in its operational characteristics: the temperature must be gradually reduced as the simulation proceeds. The algorithm starts with T set to a high value (or infinity), and T is then decreased at each step following some annealing schedule, which may be specified by the user but must end with T = 0 towards the end of the allotted time budget. In this way, the system is expected to wander initially towards a broad region of the search space containing good solutions, ignoring small features of the energy function; then drift towards low-energy regions that become narrower and narrower; and finally move downhill according to the steepest descent heuristic.
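
The article does not prescribe a particular schedule; as hedged examples, two common choices might be written as follows, with r the fraction of the time budget expended so far and all parameter values illustrative:

```python
# Two hedged examples of annealing schedules. Both start at T0 and fall
# to (or near) zero as r goes from 0 to 1.
def temperature_linear(r, T0=100.0):
    return T0 * (1.0 - r)

def temperature_geometric(r, T0=100.0, T_end=0.01):
    # smooth exponential interpolation from T0 down to T_end
    return T0 * (T_end / T0) ** r
```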

For any given finite problem, the probability that the simulated annealing algorithm terminates with a global optimal solution approaches 1 as the annealing schedule is extended. [11] This theoretical result, however, is not particularly helpful, since the time required to ensure a significant probability of success will usually exceed the time required for a complete search of the solution space. [12]

Pseudocode

The following pseudocode presents the simulated annealing heuristic as described above. It starts from a state s0 and continues until a maximum of kmax steps have been taken. In the process, the call neighbour(s) should generate a randomly chosen neighbour of a given state s; the call random(0, 1) should pick and return a value in the range [0, 1], uniformly at random. The annealing schedule is defined by the call temperature(r), which should yield the temperature to use, given the fraction r of the time budget that has been expended so far.

  • Let s = s0
  • For k = 0 through kmax (exclusive):
    • T ← temperature( (k+1)/kmax )
    • Pick a random neighbour, snew ← neighbour(s)
    • If P(E(s), E(snew), T) ≥ random(0, 1):
      • s ← snew
  • Output: the final state s
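
For concreteness, here is a hedged Python translation of the pseudocode above, assuming the standard exponential acceptance function in place of the abstract P:

```python
import math
import random

# Hedged translation of the pseudocode above. energy(), neighbour() and
# temperature() are the problem-specific callables described in
# "Selecting the parameters" below; here P is fixed to the standard
# exponential acceptance rule.
def simulated_annealing(s0, k_max, energy, neighbour, temperature):
    s = s0
    for k in range(k_max):
        T = temperature((k + 1) / k_max)   # fraction of budget expended
        s_new = neighbour(s)
        e, e_new = energy(s), energy(s_new)
        # always accept downhill moves; accept uphill moves with
        # probability exp(-(e_new - e) / T)
        if e_new < e or (T > 0 and random.random() < math.exp(-(e_new - e) / T)):
            s = s_new
    return s
```

Under the tour representation sketched earlier, a call such as simulated_annealing(initial_tour, 100000, tour_length, neighbour, temperature_geometric) would anneal a tour, where tour_length is a hypothetical user-supplied energy function returning the route length.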

Selecting the parameters

In order to apply the simulated annealing method to a specific problem, one must specify the following parameters: the state space, the energy (goal) function E(), the candidate generator procedure neighbour(), the acceptance probability function P(), the annealing schedule temperature(), and the initial temperature init_temp. These choices can have a significant impact on the method's effectiveness. Unfortunately, there are no choices of these parameters that will be good for all problems, and there is no general way to find the best choices for a given problem. The following sections give some general guidelines.

Sufficiently near neighbour

Simulated annealing may be modeled as a random walk on a search graph, whose vertices are all possible states, and whose edges are the candidate moves. An essential requirement for the neighbour() function is that it must provide a sufficiently short path on this graph from the initial state to any state which may be the global optimum; that is, the diameter of the search graph must be small. In the traveling salesman example above, for instance, the search space for n = 20 cities has n! = 2,432,902,008,176,640,000 (2.4 quintillion) states; yet the number of neighbors of each vertex is n(n − 1)/2 = 190 (one for each pair of cities that can be swapped), and the diameter of the graph is n − 1.

Transition probabilities

To investigate the behavior of simulated annealing on a particular problem, it can be useful to consider the transition probabilities that result from the various design choices made in the implementation of the algorithm. For each edge (s, s_new) of the search graph, the transition probability is defined as the probability that the simulated annealing algorithm will move to state s_new when its current state is s. This probability depends on the current temperature as specified by temperature(), on the order in which the candidate moves are generated by the neighbour() function, and on the acceptance probability function P(). (Note that the transition probability is not simply P(e, e_new, T), because the candidates are tested serially.)

Acceptance probabilities

The specification of neighbour(), P(), and temperature() is partially redundant. In practice, it is common to use the same acceptance function P() for many problems and to adjust the other two functions according to the specific problem.

In the formulation of the method by Kirkpatrick et al., the acceptance probability function P(e, e_new, T) was defined as 1 if e_new < e, and exp(−(e_new − e)/T) otherwise. This formula was superficially justified by analogy with the transitions of a physical system; it corresponds to the Metropolis–Hastings algorithm in the case where T = 1 and the proposal distribution of Metropolis–Hastings is symmetric. However, this acceptance probability is often used for simulated annealing even when the neighbour() function, which is analogous to the proposal distribution in Metropolis–Hastings, is not symmetric, or not probabilistic at all. As a result, the transition probabilities of the simulated annealing algorithm do not correspond to the transitions of the analogous physical system, and the long-term distribution of states at a constant temperature T need not bear any resemblance to the thermodynamic equilibrium distribution over states of that physical system, at any temperature. Nevertheless, most descriptions of simulated annealing assume the original acceptance function, which is probably hard-coded in many implementations of SA.

In 1990, Moscato and Fontanari, [13] and independently Dueck and Scheuer, [14] proposed that a deterministic update (i.e., one that is not based on the probabilistic acceptance rule) could speed up the optimization process without impacting the final quality. Moscato and Fontanari conclude, from observing the analogue of the "specific heat" curve of the "threshold updating" annealing studied in their work, that "the stochasticity of the Metropolis updating in the simulated annealing algorithm does not play a major role in the search of near-optimal minima". Instead, they proposed that "the smoothening of the cost function landscape at high temperature and the gradual definition of the minima during the cooling process are the fundamental ingredients for the success of simulated annealing." The method was subsequently popularized under Dueck and Scheuer's name for it, "threshold accepting". In 2001, Franz, Hoffmann and Salamon showed that the deterministic update strategy is indeed the optimal one within the large class of algorithms that simulate a random walk on the cost/energy landscape. [15]
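
As a hedged sketch, the deterministic rule contrasts with the Metropolis test as follows; the threshold parameter and its decreasing schedule are left to the user, as in Dueck and Scheuer's framework:

```python
# Hedged sketch of the deterministic "threshold accepting" rule: accept
# any candidate whose energy is worse by less than a threshold that is
# lowered over time; no random draw is involved.
def accept_threshold(e, e_new, threshold):
    return (e_new - e) < threshold
```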

Efficient candidate generation

When choosing the candidate generator neighbour(), one must consider that after a few iterations of the simulated annealing algorithm, the current state is expected to have much lower energy than a random state. Therefore, as a general rule, one should skew the generator towards candidate moves where the energy of the destination state is likely to be similar to that of the current state. This heuristic (which is the main principle of the Metropolis–Hastings algorithm) tends to exclude very good candidate moves as well as very bad ones; however, the former are usually much less common than the latter, so the heuristic is generally quite effective.

In the traveling salesman problem above, for example, swapping two consecutive cities in a low-energy tour is expected to have a modest effect on its energy (length); whereas swapping two arbitrary cities is far more likely to increase its length than to decrease it. Thus, the consecutive-swap neighbour generator is expected to perform better than the arbitrary-swap one, even though the latter could provide a somewhat shorter path to the optimum (with at most n − 1 swaps, instead of up to n(n − 1)/2).
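
A hedged sketch of the consecutive-swap generator, under the same list-of-cities representation as before:

```python
import random

# Hedged sketch of the consecutive-swap generator favoured above: only
# adjacent cities are exchanged, so the tour's length changes modestly.
def neighbour_consecutive_swap(tour):
    s = list(tour)
    i = random.randrange(len(s) - 1)   # position of the first of the pair
    s[i], s[i + 1] = s[i + 1], s[i]
    return s
```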

A more precise statement of the heuristic is that one should try first those candidate states s_new for which P(E(s), E(s_new), T) is large. For the "standard" acceptance function P above, this means that E(s_new) − E(s) is on the order of T or less. Thus, in the traveling salesman example above, one could use a neighbour() function that swaps two random cities, where the probability of choosing a city-pair vanishes as their distance increases beyond T.

Barrier avoidance

When choosing the candidate generator neighbour(), one must also try to reduce the number of "deep" local minima: states (or sets of connected states) that have much lower energy than all their neighboring states. Such "closed catchment basins" of the energy function may trap the simulated annealing algorithm with high probability (roughly proportional to the number of states in the basin) and for a very long time (roughly exponential in the energy difference between the surrounding states and the bottom of the basin).

As a rule, it is impossible to design a candidate generator that will satisfy this goal and also prioritize candidates with similar energy. On the other hand, one can often vastly improve the efficiency of simulated annealing by relatively simple changes to the generator. In the traveling salesman problem, for instance, it is not hard to exhibit two tours A and B, with nearly equal lengths, such that (1) A is optimal, (2) every sequence of city-pair swaps that converts A to B goes through tours that are much longer than both, and (3) A can be transformed into B by flipping (reversing the order of) a set of consecutive cities. In this example, A and B lie in different "deep basins" if the generator performs only random pair-swaps; but they will be in the same basin if the generator performs random segment-flips.
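
A hedged sketch of a segment-flip generator under the same tour representation (the function name is ours):

```python
import random

# Hedged sketch of the segment-flip generator: reversing a run of
# consecutive cities moves between tours like A and B above directly,
# without climbing through the much longer intermediate tours.
def neighbour_segment_flip(tour):
    s = list(tour)
    i, j = sorted(random.sample(range(len(s)), 2))
    s[i:j + 1] = reversed(s[i:j + 1])   # flip the segment between i and j
    return s
```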

Cooling schedule

The physical analogy that is used to justify simulated annealing assumes that the cooling rate is low enough for the probability distribution of the current state to be near thermodynamic equilibrium at all times. Unfortunately, the relaxation time (the time one must wait for the equilibrium to be restored after a change in temperature) strongly depends on the "topography" of the energy function and on the current temperature. In the simulated annealing algorithm, the relaxation time also depends on the candidate generator, in a very complicated way. Note that all these parameters are usually provided as black-box functions to the simulated annealing algorithm. Therefore, the ideal cooling rate cannot be determined beforehand and should be empirically adjusted for each problem. Adaptive simulated annealing algorithms address this problem by connecting the cooling schedule to the search progress. Other adaptive approaches, such as thermodynamic simulated annealing, [16] automatically adjust the temperature at each step based on the energy difference between the two states, according to the laws of thermodynamics.
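
As a purely illustrative sketch (not the published ASA or thermodynamic simulated annealing rules), one simple way to couple the schedule to search progress is to cool only while moves are still being accepted at a reasonable rate:

```python
# Illustrative only: cool while the recent acceptance rate stays healthy,
# and reheat slightly once the search appears frozen. All parameter
# values are arbitrary placeholders.
def adapt_temperature(T, recent_acceptance_rate, cool=0.99, reheat=1.05,
                      target_rate=0.2):
    return T * (cool if recent_acceptance_rate >= target_rate else reheat)
```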

Restarts

Sometimes it is better to move back to a solution that was significantly better rather than always moving from the current state. This process is called restarting of simulated annealing. To do this we set s and e to sbest and ebest and perhaps restart the annealing schedule. The decision to restart could be based on several criteria. Notable among these are restarting after a fixed number of steps, restarting when the current energy is too high compared to the best energy obtained so far, and restarting randomly.
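
A hedged sketch of such a restart criterion, with max_excess an illustrative tolerance:

```python
# Hedged sketch of the restart rule described above: remember the best
# state seen so far, and jump back to it when the current energy has
# drifted too far above the best energy.
def maybe_restart(s, e, s_best, e_best, max_excess=10.0):
    if e - e_best > max_excess:
        return s_best, e_best   # restart from the best solution found
    return s, e
```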

Related Research Articles

Genetic algorithm

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, causal inference, etc.

Mathematical optimization

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

In computer science, local search is a heuristic method for solving computationally hard optimization problems. Local search can be used on problems that can be formulated as finding a solution maximizing a criterion among a number of candidate solutions. Local search algorithms move from solution to solution in the space of candidate solutions by applying local changes, until a solution deemed optimal is found or a time bound is elapsed.

Automatic label placement, sometimes called text placement or name placement, comprises the computer methods of placing labels automatically on a map or chart. This is related to the typographic design of such labels.

Hill climbing

In numerical analysis, hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem, then attempts to find a better solution by making an incremental change to the solution. If the change produces a better solution, another incremental change is made to the new solution, and so on until no further improvements can be found.

Tabu search (TS) is a metaheuristic search method employing local search methods used for mathematical optimization. It was created by Fred W. Glover in 1986 and formalized in 1989.

An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints are linear.

Combinatorial optimization

Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combinatorial optimization problems are the travelling salesman problem ("TSP"), the minimum spanning tree problem ("MST"), and the knapsack problem. In many such problems, such as the ones previously mentioned, exhaustive search is not tractable, and so specialized algorithms that quickly rule out large parts of the search space or approximation algorithms must be resorted to instead.

Branch and bound is a method for solving optimization problems by breaking them down into smaller sub-problems and using a bounding function to eliminate sub-problems that cannot contain the optimal solution. It is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

Multi-disciplinary design optimization (MDO) is a field of engineering that uses optimization methods to solve design problems incorporating a number of disciplines. It is also known as multidisciplinary system design optimization (MSDO), and multidisciplinary design analysis and optimization (MDAO).

Global optimization is a branch of applied mathematics and numerical analysis that attempts to find the global minima or maxima of a function or a set of functions on a given set. It is usually described as a minimization problem, because the maximization of a real-valued function g(x) is equivalent to the minimization of the function f(x) = −g(x).

Ant colony optimization algorithms

In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. Artificial ants represent multi-agent methods inspired by the behavior of real ants. The pheromone-based communication of biological ants is often the predominant paradigm used. Combinations of artificial ants and local search algorithms have become a method of choice for numerous optimization tasks involving some sort of graph, e.g., vehicle routing and internet routing.

In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, tune, or select a heuristic that may provide a sufficiently good solution to an optimization problem or a machine learning problem, especially with incomplete or imperfect information or limited computation capacity. Metaheuristics sample a subset of solutions which is otherwise too large to be completely enumerated or otherwise explored. Metaheuristics may make relatively few assumptions about the optimization problem being solved and so may be usable for a variety of problems.

Differential evolution

In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the optimized problem and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found.

Adaptive simulated annealing (ASA) is a variant of the simulated annealing (SA) algorithm in which the algorithm parameters that control the temperature schedule and random step selection are automatically adjusted according to algorithm progress. This makes the algorithm more efficient and less sensitive to user-defined parameters than canonical SA. In the standard variant, these parameters are often selected on the basis of experience and experimentation, which represents a significant deficiency in practice.

Quantum annealing (QA) is an optimization process for finding the global minimum of a given objective function over a given set of candidate solutions, by a process using quantum fluctuations. Quantum annealing is used mainly for problems where the search space is discrete with many local minima, such as finding the ground state of a spin glass or solving the traveling salesman problem. The term "quantum annealing" was first proposed in 1988 by B. Apolloni, N. Cesa Bianchi and D. De Falco as a quantum-inspired classical algorithm. It was formulated in its present form by T. Kadowaki and H. Nishimori in 1998, though an imaginary-time variant without quantum coherence had been discussed by A. B. Finnila, M. A. Gomez, C. Sebenik and J. D. Doll in 1994.

The self-consistent mean field (SCMF) method is an adaptation of mean field theory used in protein structure prediction to determine the optimal amino acid side chain packing given a fixed protein backbone. It is faster but less accurate than dead-end elimination and is generally used in situations where the protein of interest is too large for the problem to be tractable by DEE.

The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective.

Stochastic optimization (SO) methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which involves random objective functions or random constraints. Stochastic optimization methods also include methods with random iterates. Some stochastic optimization methods use random iterates to solve stochastic problems, combining both meanings of stochastic optimization. Stochastic optimization methods generalize deterministic methods for deterministic problems.

As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems, such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, and many other computer vision problems that can be formulated in terms of energy minimization. Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph. Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph, the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization.

References

  1. "What is Simulated Annealing?". www.cs.cmu.edu. Retrieved 2023-05-13.
  2. Pincus, Martin (Nov–Dec 1970). "A Monte-Carlo Method for the Approximate Solution of Certain Types of Constrained Optimization Problems". Journal of the Operations Research Society of America. 18 (6): 967–1235. doi:10.1287/opre.18.6.1225.
  3. Khachaturyan, A.: Semenovskaya, S.: Vainshtein B., Armen (1979). "Statistical-Thermodynamic Approach to Determination of Structure Amplitude Phases". Soviet Physics, Crystallography. 24 (5): 519–524.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  4. Khachaturyan, A.; Semenovskaya, S.; Vainshtein, B. (1981). "The Thermodynamic Approach to the Structure Analysis of Crystals". Acta Crystallographica. A37 (5): 742–754. Bibcode:1981AcCrA..37..742K. doi:10.1107/S0567739481001630.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  5. Laarhoven, P. J. M. van (Peter J. M.) (1987). Simulated annealing : theory and applications. Aarts, E. H. L. (Emile H. L.). Dordrecht: D. Reidel. ISBN   90-277-2513-6. OCLC   15548651.
  6. 1 2 Kirkpatrick, S.; Gelatt Jr, C. D.; Vecchi, M. P. (1983). "Optimization by Simulated Annealing". Science. 220 (4598): 671–680. Bibcode:1983Sci...220..671K. CiteSeerX   10.1.1.123.7607 . doi:10.1126/science.220.4598.671. JSTOR   1690046. PMID   17813860. S2CID   205939.
  7. Khachaturyan, A.; Semenovskaya, S.; Vainshtein, B. (1979). "Statistical-Thermodynamic Approach to Determination of Structure Amplitude Phases". Sov.Phys. Crystallography. 24 (5): 519–524.
  8. Khachaturyan, A.; Semenovskaya, S.; Vainshtein, B. (1981). "The Thermodynamic Approach to the Structure Analysis of Crystals". Acta Crystallographica. 37 (A37): 742–754. Bibcode:1981AcCrA..37..742K. doi:10.1107/S0567739481001630.
  9. Černý, V. (1985). "Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm". Journal of Optimization Theory and Applications. 45: 41–51. doi:10.1007/BF00940812. S2CID   122729427.
  10. Metropolis, Nicholas; Rosenbluth, Arianna W.; Rosenbluth, Marshall N.; Teller, Augusta H.; Teller, Edward (1953). "Equation of State Calculations by Fast Computing Machines". The Journal of Chemical Physics. 21 (6): 1087. Bibcode:1953JChPh..21.1087M. doi:10.1063/1.1699114. OSTI   4390578. S2CID   1046577.
  11. Granville, V.; Krivanek, M.; Rasson, J.-P. (1994). "Simulated annealing: A proof of convergence". IEEE Transactions on Pattern Analysis and Machine Intelligence. 16 (6): 652–656. doi:10.1109/34.295910.
  12. Nolte, Andreas; Schrader, Rainer (1997), "A Note on the Finite Time Behaviour of Simulated Annealing", Operations Research Proceedings 1996, vol. 1996, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 175–180, doi:10.1007/978-3-642-60744-8_32, ISBN   978-3-540-62630-5 , retrieved 2023-02-06
  13. Moscato, P.; Fontanari, J.F. (1990), "Stochastic versus deterministic update in simulated annealing", Physics Letters A, 146 (4): 204–208, Bibcode:1990PhLA..146..204M, doi:10.1016/0375-9601(90)90166-L
  14. Dueck, G.; Scheuer, T. (1990), "Threshold accepting: A general purpose optimization algorithm appearing superior to simulated annealing", Journal of Computational Physics, 90 (1): 161–175, Bibcode:1990JCoPh..90..161D, doi:10.1016/0021-9991(90)90201-B, ISSN   0021-9991
  15. Franz, A.; Hoffmann, K.H.; Salamon, P (2001), "Best optimal strategy for finding ground states", Physical Review Letters, 86 (3): 5219–5222, doi:10.1103/PhysRevLett.86.5219, PMID   11384462
  16. De Vicente, Juan; Lanchares, Juan; Hermida, Román (2003). "Placement by thermodynamic simulated annealing". Physics Letters A. 317 (5–6): 415–423. Bibcode:2003PhLA..317..415D. doi:10.1016/j.physleta.2003.08.070.
  17. Del Moral, Pierre; Doucet, Arnaud; Jasra, Ajay (2006). "Sequential Monte Carlo samplers". Journal of the Royal Statistical Society, Series B. 68 (3): 411–436. arXiv: cond-mat/0212648 . doi:10.1111/j.1467-9868.2006.00553.x. S2CID   12074789.
  18. Moscato, Pablo (June 1993). "An introduction to population approaches for optimization and hierarchical objective functions: A discussion on the role of tabu search". Annals of Operations Research. 41 (2): 85–121. doi:10.1007/BF02022564. S2CID   35382644.
  19. Moscato, P. (1989). "On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms". Caltech Concurrent Computation Program (report 826).
  20. Deb, Bandyopadhyay (June 2008). "A Simulated Annealing-Based Multiobjective Optimization Algorithm: AMOSA". IEEE Transactions on Evolutionary Computation. 12 (3): 269–283. doi:10.1109/TEVC.2007.900837. S2CID   12107321.

Further reading