Robust optimization is a field of mathematical optimization theory that deals with optimization problems in which a certain measure of robustness is sought against uncertainty that can be represented as deterministic variability in the value of the parameters of the problem itself and/or its solution. It is related to, but often distinguished from, probabilistic optimization methods such as chance-constrained optimization. [1] [2]
The origins of robust optimization date back to the establishment of modern decision theory in the 1950s and the use of worst case analysis and Wald's maximin model as a tool for the treatment of severe uncertainty. It became a discipline of its own in the 1970s with parallel developments in several scientific and technological fields. Over the years, it has been applied in statistics, but also in operations research, [3] electrical engineering, [4] [5] [6] control theory, [7] finance, [8] portfolio management [9] logistics, [10] manufacturing engineering, [11] chemical engineering, [12] medicine, [13] and computer science. In engineering problems, these formulations often take the name of "Robust Design Optimization", RDO or "Reliability Based Design Optimization", RBDO.
Consider the following linear programming problem
where is a given subset of .
What makes this a 'robust optimization' problem is the clause in the constraints. Its implication is that for a pair to be admissible, the constraint must be satisfied by the worst pertaining to , namely the pair that maximizes the value of for the given value of .
If the parameter space is finite (consisting of finitely many elements), then this robust optimization problem itself is a linear programming problem: for each there is a linear constraint .
If is not a finite set, then this problem is a linear semi-infinite programming problem, namely a linear programming problem with finitely many (2) decision variables and infinitely many constraints.
There are a number of classification criteria for robust optimization problems/models. In particular, one can distinguish between problems dealing with local and global models of robustness; and between probabilistic and non-probabilistic models of robustness. Modern robust optimization deals primarily with non-probabilistic models of robustness that are worst case oriented and as such usually deploy Wald's maximin models.
There are cases where robustness is sought against small perturbations in a nominal value of a parameter. A very popular model of local robustness is the radius of stability model:
where denotes the nominal value of the parameter, denotes a ball of radius centered at and denotes the set of values of that satisfy given stability/performance conditions associated with decision .
In words, the robustness (radius of stability) of decision is the radius of the largest ball centered at all of whose elements satisfy the stability requirements imposed on . The picture is this:
where the rectangle represents the set of all the values associated with decision .
Consider the simple abstract robust optimization problem
where denotes the set of all possible values of under consideration.
This is a global robust optimization problem in the sense that the robustness constraint represents all the possible values of .
The difficulty is that such a "global" constraint can be too demanding in that there is no that satisfies this constraint. But even if such an exists, the constraint can be too "conservative" in that it yields a solution that generates a very small payoff that is not representative of the performance of other decisions in . For instance, there could be an that only slightly violates the robustness constraint but yields a very large payoff . In such cases it might be necessary to relax a bit the robustness constraint and/or modify the statement of the problem.
Consider the case where the objective is to satisfy a constraint . where denotes the decision variable and is a parameter whose set of possible values in . If there is no such that , then the following intuitive measure of robustness suggests itself:
where denotes an appropriate measure of the "size" of set . For example, if is a finite set, then could be defined as the cardinality of set .
In words, the robustness of decision is the size of the largest subset of for which the constraint is satisfied for each in this set. An optimal decision is then a decision whose robustness is the largest.
This yields the following robust optimization problem:
This intuitive notion of global robustness is not used often in practice because the robust optimization problems that it induces are usually (not always) very difficult to solve.
Consider the robust optimization problem
where is a real-valued function on , and assume that there is no feasible solution to this problem because the robustness constraint is too demanding.
To overcome this difficulty, let be a relatively small subset of representing "normal" values of and consider the following robust optimization problem:
Since is much smaller than , its optimal solution may not perform well on a large portion of and therefore may not be robust against the variability of over .
One way to fix this difficulty is to relax the constraint for values of outside the set in a controlled manner so that larger violations are allowed as the distance of from increases. For instance, consider the relaxed robustness constraint
where is a control parameter and denotes the distance of from . Thus, for the relaxed robustness constraint reduces back to the original robustness constraint. This yields the following (relaxed) robust optimization problem:
The function is defined in such a manner that
and
and therefore the optimal solution to the relaxed problem satisfies the original constraint for all values of in . It also satisfies the relaxed constraint
outside .
The dominating paradigm in this area of robust optimization is Wald's maximin model, namely
where the represents the decision maker, the represents Nature, namely uncertainty, represents the decision space and denotes the set of possible values of associated with decision . This is the classic format of the generic model, and is often referred to as minimax or maximin optimization problem. The non-probabilistic (deterministic) model has been and is being extensively used for robust optimization especially in the field of signal processing. [14] [15] [16]
The equivalent mathematical programming (MP) of the classic format above is
Constraints can be incorporated explicitly in these models. The generic constrained classic format is
The equivalent constrained MP format is defined as:
These models quantify the uncertainty in the "true" value of the parameter of interest by probability distribution functions. They have been traditionally classified as stochastic programming and stochastic optimization models. Recently, probabilistically robust optimization has gained popularity by the introduction of rigorous theories such as scenario optimization able to quantify the robustness level of solutions obtained by randomization. These methods are also relevant to data-driven optimization methods.
The solution method to many robust program involves creating a deterministic equivalent, called the robust counterpart. The practical difficulty of a robust program depends on if its robust counterpart is computationally tractable. [17] [18]
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints are linear.
In the field of mathematical optimization, stochastic programming is a framework for modeling optimization problems that involve uncertainty. A stochastic program is an optimization problem in which some or all problem parameters are uncertain, but follow known probability distributions. This framework contrasts with deterministic optimization, in which all problem parameters are assumed to be known exactly. The goal of stochastic programming is to find a decision which both optimizes some criteria chosen by the decision maker, and appropriately accounts for the uncertainty of the problem parameters. Because many real-world decisions involve uncertainty, stochastic programming has found applications in a broad range of areas ranging from finance to transportation to energy optimization.
A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman's “principle of optimality" prescribes. The equation applies to algebraic structures with a total ordering; for algebraic structures with a partial ordering, the generic Bellman's equation can be used.
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.
In mathematics, the stability radius of an object at a given nominal point is the radius of the largest ball, centered at the nominal point, all of whose elements satisfy pre-determined stability conditions. The picture of this intuitive notion is this:
In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then the dual is a maximization problem. Any feasible solution to the primal (minimization) problem is at least as large as any feasible solution to the dual (maximization) problem. Therefore, the solution to the primal is an upper bound to the solution of the dual, and the solution of the dual is a lower bound to the solution of the primal. This fact is called weak duality.
In decision theory, a scoring rule provides evaluation metrics for probabilistic predictions or forecasts. While "regular" loss functions assign a goodness-of-fit score to a predicted value and an observed value, scoring rules assign such a score to a predicted probability distribution and an observed value. On the other hand, a scoring function provides a summary measure for the evaluation of point predictions, i.e. one predicts a property or functional , like the expectation or the median.
In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions over convex sets. The ellipsoid method generates a sequence of ellipsoids whose volume uniformly decreases at every step, thus enclosing a minimizer of a convex function.
A second-order cone program (SOCP) is a convex optimization problem of the form
Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.
In geometry, the Chebyshev center of a bounded set having non-empty interior is the center of the minimal-radius ball enclosing the entire set , or alternatively the center of largest inscribed ball of .
The structured support-vector machine is a machine learning algorithm that generalizes the Support-Vector Machine (SVM) classifier. Whereas the SVM classifier supports binary classification, multiclass classification and regression, the structured SVM allows training of a classifier for general structured output labels.
In decision theory and game theory, Wald's maximin model is a non-probabilistic decision-making model according to which decisions are ranked on the basis of their worst-case outcomes – the optimal decision is one with the least bad worst outcome. It is one of the most important models in robust decision making in general and robust optimization in particular.
A sum-of-squares optimization program is an optimization problem with a linear cost function and a particular type of constraint on the decision variables. These constraints are of the form that when the decision variables are used as coefficients in certain polynomials, those polynomials should have the polynomial SOS property. When fixing the maximum degree of the polynomials involved, sum-of-squares optimization is also known as the Lasserre hierarchy of relaxations in semidefinite programming.
In mathematics, low-rank approximation refers to the process of approximating a given matrix by a matrix of lower rank. More precisely, it is a minimization problem, in which the cost function measures the fit between a given matrix and an approximating matrix, subject to a constraint that the approximating matrix has reduced rank. The problem is used for mathematical modeling and data compression. The rank constraint is related to a constraint on the complexity of a model that fits the data. In applications, often there are other constraints on the approximating matrix apart from the rank constraint, e.g., non-negativity and Hankel structure.
Variable neighborhood search (VNS), proposed by Mladenović & Hansen in 1997, is a metaheuristic method for solving a set of combinatorial optimization and global optimization problems. It explores distant neighborhoods of the current incumbent solution, and moves from there to a new one if and only if an improvement was made. The local search method is applied repeatedly to get from solutions in the neighborhood to local optima. VNS was designed for approximating solutions of discrete and continuous optimization problems and according to these, it is aimed for solving linear program problems, integer program problems, mixed integer program problems, nonlinear program problems, etc.
In financial mathematics and stochastic optimization, the concept of risk measure is used to quantify the risk involved in a random outcome or risk position. Many risk measures have hitherto been proposed, each having certain characteristics. The entropic value at risk (EVaR) is a coherent risk measure introduced by Ahmadi-Javid, which is an upper bound for the value at risk (VaR) and the conditional value at risk (CVaR), obtained from the Chernoff inequality. The EVaR can also be represented by using the concept of relative entropy. Because of its connection with the VaR and the relative entropy, this risk measure is called "entropic value at risk". The EVaR was developed to tackle some computational inefficiencies of the CVaR. Getting inspiration from the dual representation of the EVaR, Ahmadi-Javid developed a wide class of coherent risk measures, called g-entropic risk measures. Both the CVaR and the EVaR are members of this class.
In mathematical optimization, oracle complexity is a standard theoretical framework to study the computational requirements for solving classes of optimization problems. It is suitable for analyzing iterative algorithms which proceed by computing local information about the objective function at various points. The framework has been used to provide tight worst-case guarantees on the number of required iterations, for several important classes of optimization problems.