Derivative-free optimization

Last updated March 02, 2024

Derivative-free optimization (sometimes referred to as blackbox optimization) is a discipline in mathematical optimization that does not use derivative information in the classical sense to find optimal solutions: Sometimes information about the derivative of the objective function f is unavailable, unreliable or impractical to obtain. For example, f might be non-smooth, or time-consuming to evaluate, or in some way noisy, so that methods that rely on derivatives or approximate them via finite differences are of little use. The problem to find optimal points in such situations is referred to as derivative-free optimization, algorithms that do not use derivatives or finite differences are called derivative-free algorithms.^[1]

Introduction

The problem to be solved is to numerically optimize an objective function $f\colon A\to \mathbb {R}$ for some set $A$ (usually $A\subset \mathbb {R} ^{n}$ ), i.e. find $x_{0}\in A$ such that without loss of generality $f(x_{0})\leq f(x)$ for all $x\in A$ .

When applicable, a common approach is to iteratively improve a parameter guess by local hill-climbing in the objective function landscape. Derivative-based algorithms use derivative information of $f$ to find a good search direction, since for example the gradient gives the direction of steepest ascent. Derivative-based optimization is efficient at finding local optima for continuous-domain smooth single-modal problems. However, they can have problems when e.g. $A$ is disconnected, or (mixed-)integer, or when $f$ is expensive to evaluate, or is non-smooth, or noisy, so that (numeric approximations of) derivatives do not provide useful information. A slightly different problem is when $f$ is multi-modal, in which case local derivative-based methods only give local optima, but might miss the global one.

In derivative-free optimization, various methods are employed to address these challenges using only function values of $f$ , but no derivatives. Some of these methods can be proved to discover optima, but some are rather metaheuristic since the problems are in general more difficult to solve compared to convex optimization. For these, the ambition is rather to efficiently find "good" parameter values which can be near-optimal given enough resources, but optimality guarantees can typically not be given. One should keep in mind that the challenges are diverse, so that one can usually not use one algorithm for all kinds of problems.

Algorithms

Notable derivative-free optimization algorithms include:

Bayesian optimization
Coordinate descent and adaptive coordinate descent
Differential evolution, including multi-objective variants
DONE
Evolution strategies, Natural evolution strategies (CMA-ES, xNES, SNES)
Genetic algorithms
MCS algorithm
Nelder-Mead method
Particle swarm optimization
Pattern search
Powell's methods based on interpolation, e.g., COBYLA (PRIMA)
Random search (including Luus–Jaakola)
Simulated annealing
Stochastic optimization
Subgradient method
various model-based algorithms like BOBYQA and ORBIT

Benchmarks

There exist benchmarks for blackbox optimization algorithms, see e.g. the bbob-biobj tests.^[2]

Related Research Articles

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

Gradient descent is a method for unconstrained mathematical optimization. It is a first-order iterative algorithm for finding a local minimum of a differentiable multivariate function.

Branch and bound is a method for solving optimization problems by breaking them down into smaller sub-problems and using a bounding function to eliminate sub-problems that cannot contain the optimal solution. It is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

Global optimization is a branch of applied mathematics and numerical analysis that attempts to find the global minima or maxima of a function or a set of functions on a given set. It is usually described as a minimization problem because the maximization of the real-valued function $is equivalent to the minimization of the function .$

In mathematics, nonlinear programming (NLP) is the process of solving an optimization problem where some of the constraints or the objective function are nonlinear. An optimization problem is one of calculation of the extrema of an objective function over a set of unknown real variables and conditional to the satisfaction of a system of equalities and inequalities, collectively termed constraints. It is the sub-field of mathematical optimization that deals with problems that are not linear.

Shape optimization is part of the field of optimal control theory. The typical problem is to find the shape which is optimal in that it minimizes a certain cost functional while satisfying given constraints. In many cases, the functional being solved depends on the solution of a given partial differential equation defined on the variable domain.

In optimization, line search is a basic iterative approach to find a local minimum $of an objective function . It first finds a descent direction along which the objective function will be reduced, and then computes a step size that determines how far should move along that direction. The descent direction can be computed by various methods, such as gradient descent or quasi-Newton method. The step size can be determined either exactly or inexactly.$

<span class="mw-page-title-main">Interior-point method</span> Algorithms for solving convex optimization problems

Interior-point methods are algorithms for solving linear and non-linear convex optimization problems. IPMs combine two advantages of previously-known algorithms:

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

<span class="mw-page-title-main">Nelder–Mead method</span> Numerical optimization algorithm

The Nelder–Mead method is a numerical method used to find the minimum or maximum of an objective function in a multidimensional space. It is a direct search method and is often applied to nonlinear optimization problems for which derivatives may not be known. However, the Nelder–Mead technique is a heuristic search method that can converge to non-stationary points on problems that can be solved by alternative methods.

In multi-objective optimization, the Pareto front is the set of all Pareto efficient solutions. The concept is widely used in engineering. It allows the designer to restrict attention to the set of efficient choices, and to make tradeoffs within this set, rather than considering the full range of every parameter.

<span class="mw-page-title-main">Differential evolution</span> Method of mathematical optimization

In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the optimized problem and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found.

The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective.

Multi-objective optimization or Pareto optimization is an area of multiple-criteria decision making that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. Multi-objective is a type of vector optimization that has been applied in many fields of science, including engineering, economics and logistics where optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives. Minimizing cost while maximizing comfort while buying a car, and maximizing performance whilst minimizing fuel consumption and emission of pollutants of a vehicle are examples of multi-objective optimization problems involving two and three objectives, respectively. In practical problems, there can be more than three objectives.

WORHP, also referred to as eNLP by ESA, is a mathematical software library for numerically solving large scale continuous nonlinear optimization problems.

Simulation-based optimization integrates optimization techniques into simulation modeling and analysis. Because of the complexity of the simulation, the objective function may become difficult and expensive to evaluate. Usually, the underlying simulation model is stochastic, so that the objective function must be estimated using statistical estimation techniques.

In mathematical optimization, oracle complexity is a standard theoretical framework to study the computational requirements for solving classes of optimization problems. It is suitable for analyzing iterative algorithms which proceed by computing local information about the objective function at various points. The framework has been used to provide tight worst-case guarantees on the number of required iterations, for several important classes of optimization problems.

Probabilistic numerics is an active field of study at the intersection of applied mathematics, statistics, and machine learning centering on the concept of uncertainty in computation. In probabilistic numerics, tasks in numerical analysis such as finding numerical solutions for integration, linear algebra, optimization and simulation and differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.

(Stochastic) variance reduction is an algorithmic approach to minimizing functions that can be decomposed into finite sums. By exploiting the finite sum structure, variance reduction techniques are able to achieve convergence rates that are impossible to achieve with methods that treat the objective as an infinite sum, as in the classical Stochastic approximation setting.

References

↑ Conn, A. R.; Scheinberg, K.; Vicente, L. N. (2009). Introduction to Derivative-Free Optimization. MPS-SIAM Book Series on Optimization. Philadelphia: SIAM. Retrieved 2014-01-18.
↑ Using Well-Understood Single-Objective Functions in Multiobjective Black-Box Optimization Test Suites, https://arxiv.org/abs/1604.00359, 2016

External links

Audet, Charles; Kokkolaras, Michael (2016). "Blackbox and derivative-free optimization: theory, algorithms and applications". Optimization and Engineering. 17: 1–2. doi: 10.1007/s11081-016-9307-4 .

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[CSV-1] Conn, A. R.; Scheinberg, K.; Vicente, L. N. (2009). Introduction to Derivative-Free Optimization. MPS-SIAM Book Series on Optimization. Philadelphia: SIAM. Retrieved 2014-01-18.

[2] Using Well-Understood Single-Objective Functions in Multiobjective Black-Box Optimization Test Suites, https://arxiv.org/abs/1604.00359, 2016

[1]

[2]