Pattern search (optimization)

Last updated
Example of convergence of a direct search method on the Broyden function. At each iteration, the pattern either moves to the point which best minimizes its objective function, or shrinks in size if no point is better than the current point, until the desired accuracy has been achieved, or the algorithm reaches a predetermined number of iterations. Direct search BROYDEN.gif
Example of convergence of a direct search method on the Broyden function. At each iteration, the pattern either moves to the point which best minimizes its objective function, or shrinks in size if no point is better than the current point, until the desired accuracy has been achieved, or the algorithm reaches a predetermined number of iterations.

Pattern search (also known as direct search, derivative-free search, or black-box search) is a family of numerical optimization methods that does not require a gradient. As a result, it can be used on functions that are not continuous or differentiable. One such pattern search method is "convergence" (see below), which is based on the theory of positive bases. Optimization attempts to find the best match (the solution that has the lowest error value) in a multidimensional analysis space of possibilities.

Contents

History

The name "pattern search" was coined by Hooke and Jeeves. [1] An early and simple variant is attributed to Fermi and Metropolis when they worked at the Los Alamos National Laboratory. It is described by Davidon, [2] as follows:

They varied one theoretical parameter at a time by steps of the same magnitude, and when no such increase or decrease in any one parameter further improved the fit to the experimental data, they halved the step size and repeated the process until the steps were deemed sufficiently small.

Convergence

Convergence is a pattern search method proposed by Yu, who proved that it converges using the theory of positive bases. [3] Later, Torczon, Lagarias and co-authors [4] [5] used positive-basis techniques to prove the convergence of another pattern-search method on specific classes of functions. Outside of such classes, pattern search is a heuristic that can provide useful approximate solutions for some issues, but can fail on others. Outside of such classes, pattern search is not an iterative method that converges to a solution; indeed, pattern-search methods can converge to non-stationary points on some relatively tame problems. [6] [7]

See also

Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

<span class="mw-page-title-main">Mathematical optimization</span> Study of mathematical algorithms for optimization problems

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

<span class="mw-page-title-main">Gradient descent</span> Optimization algorithm

In mathematics, gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.

In mathematical optimization, Dantzig's simplex algorithm is a popular algorithm for linear programming.

In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, tune, or select a heuristic that may provide a sufficiently good solution to an optimization problem or a machine learning problem, especially with incomplete or imperfect information or limited computation capacity. Metaheuristics sample a subset of solutions which is otherwise too large to be completely enumerated or otherwise explored. Metaheuristics may make relatively few assumptions about the optimization problem being solved and so may be usable for a variety of problems.

<span class="mw-page-title-main">Interior-point method</span> Algorithms for solving convex optimization problems

Interior-point methods are a certain class of algorithms that solve linear and nonlinear convex optimization problems.

In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems. Like the related Davidon–Fletcher–Powell method, BFGS determines the descent direction by preconditioning the gradient with curvature information. It does so by gradually improving an approximation to the Hessian matrix of the loss function, obtained only from gradient evaluations via a generalized secant method.

<span class="mw-page-title-main">Nelder–Mead method</span> Numerical optimization algorithm

The Nelder–Mead method is a numerical method used to find the minimum or maximum of an objective function in a multidimensional space. It is a direct search method and is often applied to nonlinear optimization problems for which derivatives may not be known. However, the Nelder–Mead technique is a heuristic search method that can converge to non-stationary points on problems that can be solved by alternative methods.

<span class="mw-page-title-main">Lloyd's algorithm</span>

In electrical engineering and computer science, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm named after Stuart P. Lloyd for finding evenly spaced sets of points in subsets of Euclidean spaces and partitions of these subsets into well-shaped and uniformly sized convex cells. Like the closely related k-means clustering algorithm, it repeatedly finds the centroid of each set in the partition and then re-partitions the input according to which of these centroids is closest. In this setting, the mean operation is an integral over a region of space, and the nearest centroid operation results in Voronoi diagrams.

Quasi-Newton methods are methods used to either find zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The "full" Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema.

Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuous optimization problems. They belong to the class of evolutionary algorithms and evolutionary computation. An evolutionary algorithm is broadly based on the principle of biological evolution, namely the repeated interplay of variation and selection: in each generation (iteration) new individuals are generated by variation, usually in a stochastic way, of the current parental individuals. Then, some individuals are selected to become the parents in the next generation based on their fitness or objective function value . Like this, over the generation sequence, individuals with better and better -values are generated.

In numerical optimization, the nonlinear conjugate gradient method generalizes the conjugate gradient method to nonlinear optimization. For a quadratic function

Random search (RS) is a family of numerical optimization methods that do not require the gradient of the problem to be optimized, and RS can hence be used on functions that are not continuous or differentiable. Such optimization methods are also known as direct-search, derivative-free, or black-box methods.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective; the difference is that the augmented Lagrangian method adds yet another term, designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with the method of Lagrange multipliers.

In computational engineering, Luus–Jaakola (LJ) denotes a heuristic for global optimization of a real-valued function. In engineering use, LJ is not an algorithm that terminates with an optimal solution; nor is it an iterative method that generates a sequence of points that converges to an optimal solution. However, when applied to a twice continuously differentiable function, the LJ heuristic is a proper iterative method, that generates a sequence that has a convergent subsequence; for this class of problems, Newton's method is recommended and enjoys a quadratic rate of convergence, while no convergence rate analysis has been given for the LJ heuristic. In practice, the LJ heuristic has been recommended for functions that need be neither convex nor differentiable nor locally Lipschitz: The LJ heuristic does not use a gradient or subgradient when one be available, which allows its application to non-differentiable and non-convex problems.

<span class="mw-page-title-main">Criss-cross algorithm</span> Method for mathematical optimization

In mathematical optimization, the criss-cross algorithm is any of a family of algorithms for linear programming. Variants of the criss-cross algorithm also solve more general problems with linear inequality constraints and nonlinear objective functions; there are criss-cross algorithms for linear-fractional programming problems, quadratic-programming problems, and linear complementarity problems.

Coordinate descent is an optimization algorithm that successively minimizes along coordinate directions to find the minimum of a function. At each iteration, the algorithm determines a coordinate or coordinate block via a coordinate selection rule, then exactly or inexactly minimizes over the corresponding coordinate hyperplane while fixing all other coordinates or coordinate blocks. A line search along the coordinate direction can be performed at the current iterate to determine the appropriate step size. Coordinate descent is applicable in both differentiable and derivative-free contexts.

Derivative-free optimization, is a discipline in mathematical optimization that does not use derivative information in the classical sense to find optimal solutions: Sometimes information about the derivative of the objective function f is unavailable, unreliable or impractical to obtain. For example, f might be non-smooth, or time-consuming to evaluate, or in some way noisy, so that methods that rely on derivatives or approximate them via finite differences are of little use. The problem to find optimal points in such situations is referred to as derivative-free optimization, algorithms that do not use derivatives or finite differences are called derivative-free algorithms.

Virginia Joanne Torczon is an American applied mathematician and computer scientist known for her research on nonlinear optimization methods including pattern search. She is dean of graduate studies and research, and chancellor professor of computer science, at the College of William & Mary.

References

  1. Hooke, R.; Jeeves, T.A. (1961). ""Direct search" solution of numerical and statistical problems". Journal of the ACM. 8 (2): 212–229. doi:10.1145/321062.321069. S2CID   10905054.
  2. Davidon, W.C. (1991). "Variable metric method for minimization". SIAM Journal on Optimization. 1 (1): 1–17. CiteSeerX   10.1.1.693.272 . doi:10.1137/0801001. S2CID   1819475.
  3. Torczon, V.J. (1997). "On the convergence of pattern search algorithms" (PDF). SIAM Journal on Optimization. 7 (1): 1–25. CiteSeerX   10.1.1.50.3173 . doi:10.1137/S1052623493250780.
  4. Dolan, E.D.; Lewis, R.M.; Torczon, V.J. (2003). "On the local convergence of pattern search" (PDF). SIAM Journal on Optimization. 14 (2): 567–583. CiteSeerX   10.1.1.78.2407 . doi:10.1137/S1052623400374495. hdl:2060/20000109966. S2CID   4226940.