Nonlinear programming

Last updated August 16, 2024

In mathematics, nonlinear programming (NLP) is the process of solving an optimization problem where some of the constraints are not linear equalities or the objective function is not a linear function. An optimization problem is one of calculation of the extrema (maxima, minima or stationary points) of an objective function over a set of unknown real variables and conditional to the satisfaction of a system of equalities and inequalities, collectively termed constraints. It is the sub-field of mathematical optimization that deals with problems that are not linear.

Definition and discussion

Let n, m, and p be positive integers. Let X be a subset of Rⁿ (usually a box-constrained one), let f, g_i, and h_j be real-valued functions on X for each i in {1, ..., m} and each j in {1, ..., p}, with at least one of f, g_i, and h_j being nonlinear.

A nonlinear programming problem is an optimization problem of the form

{\begin{aligned}{\text{minimize }}&f(x)\\{\text{subject to }}&g_{i}(x)\leq 0{\text{ for each }}i\in \{1,\dotsc ,m\}\\&h_{j}(x)=0{\text{ for each }}j\in \{1,\dotsc ,p\}\\&x\in X.\end{aligned}}

Depending on the constraint set, there are several possibilities:

feasible problem is one for which there exists at least one set of values for the choice variables satisfying all the constraints.
an infeasible problem is one for which no set of values for the choice variables satisfies all the constraints. That is, the constraints are mutually contradictory, and no solution exists; the feasible set is the empty set.
unbounded problem is a feasible problem for which the objective function can be made to be better than any given finite value. Thus there is no optimal solution, because there is always a feasible solution that gives a better objective function value than does any given proposed solution.

Most realistic applications feature feasible problems, with infeasible or unbounded problems seen as a failure of an underlying model. In some cases, infeasible problems are handled by minimizing a sum of feasibility violations.

Some special cases of nonlinear programming have specialized solution methods:

If the objective function is concave (maximization problem), or convex (minimization problem) and the constraint set is convex, then the program is called convex and general methods from convex optimization can be used in most cases.
If the objective function is quadratic and the constraints are linear, quadratic programming techniques are used.
If the objective function is a ratio of a concave and a convex function (in the maximization case) and the constraints are convex, then the problem can be transformed to a convex optimization problem using fractional programming techniques.

Applicability

A typical non-convex problem is that of optimizing transportation costs by selection from a set of transportation methods, one or more of which exhibit economies of scale, with various connectivities and capacity constraints. An example would be petroleum product transport given a selection or combination of pipeline, rail tanker, road tanker, river barge, or coastal tankship. Owing to economic batch size the cost functions may have discontinuities in addition to smooth changes.

In experimental science, some simple data analysis (such as fitting a spectrum with a sum of peaks of known location and shape but unknown magnitude) can be done with linear methods, but in general these problems are also nonlinear. Typically, one has a theoretical model of the system under study with variable parameters in it and a model the experiment or experiments, which may also have unknown parameters. One tries to find a best fit numerically. In this case one often wants a measure of the precision of the result, as well as the best fit itself.

Methods for solving a general nonlinear program

Analytic methods

Under differentiability and constraint qualifications, the Karush–Kuhn–Tucker (KKT) conditions provide necessary conditions for a solution to be optimal. If some of the functions are non-differentiable, subdifferential versions of Karush–Kuhn–Tucker (KKT) conditions are available.^[1]

Under convexity, the KKT conditions are sufficient for a global optimum. Without convexity, these conditions are sufficient only for a local optimum. In some cases, the number of local optima is small, and one can find all of them analytically and find the one for which the objective value is smallest.^[2]

Numeric methods

In most realistic cases, it is very hard to solve the KKT conditions analytically, and so the problems are solved using numerical methods. These methods are iterative: they start with an initial point, and then proceed to points that are supposed to be closer to the optimal point, using some update rule. There are three kinds of update rules:^[2]^: 5.1.2

Zero-order routines - use only the values of the objective function and constraint functions at the current point;
First-order routines - use also the values of the gradients of these functions;
Second-order routines - use also the values of the Hessians of these functions.

Third-order routines (and higher) are theoretically possible, but not used in practice, due to the higher computational load and little theoretical benefit.

Branch and bound

Another method involves the use of branch and bound techniques, where the program is divided into subclasses to be solved with convex (minimization problem) or linear approximations that form a lower bound on the overall cost within the subdivision. With subsequent divisions, at some point an actual solution will be obtained whose cost is equal to the best lower bound obtained for any of the approximate solutions. This solution is optimal, although possibly not unique. The algorithm may also be stopped early, with the assurance that the best possible solution is within a tolerance from the best point found; such points are called ε-optimal. Terminating to ε-optimal points is typically necessary to ensure finite termination. This is especially useful for large, difficult problems and problems with uncertain costs or values where the uncertainty can be estimated with an appropriate reliability estimation.

Implementations

There exist numerous nonlinear programming solvers, including open source:

ALGLIB (C++, C#, Java, Python API) implements several first-order and derivative-free nonlinear programming solvers
NLopt (C/C++ implementation, with numerous interfaces including Julia, Python, R, MATLAB/Octave), includes various nonlinear programming solvers
SciPy (de facto standard for scientific Python) has scipy.optimize solver, which includes several nonlinear programming algorithms (zero-order, first order and second order ones).
IPOPT (C++ implementation, with numerous interfaces including C, Fortran, Java, AMPL, R, Python, etc.) is an interior point method solver (zero-order, and optionally first order and second order derivatives).

Numerical Examples

2-dimensional example

A simple problem (shown in the diagram) can be defined by the constraints ${\begin{aligned}x_{1}&\geq 0\\x_{2}&\geq 0\\x_{1}^{2}+x_{2}^{2}&\geq 1\\x_{1}^{2}+x_{2}^{2}&\leq 2\end{aligned}}$ with an objective function to be maximized $f(\mathbf {x} )=x_{1}+x_{2}$ where $x = (x 1, x 2)$ .

3-dimensional example

Another simple problem (see diagram) can be defined by the constraints ${\begin{aligned}x_{1}^{2}-x_{2}^{2}+x_{3}^{2}&\leq 2\\x_{1}^{2}+x_{2}^{2}+x_{3}^{2}&\leq 10\end{aligned}}$ with an objective function to be maximized $f(\mathbf {x} )=x_{1}x_{2}+x_{2}x_{3}$ where $x = (x 1, x 2, x 3)$ .

Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

Optimal control theory is a branch of control theory that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the Moon with minimum fuel expenditure. Or the dynamical system could be a nation's economy, with the objective to minimize unemployment; the controls in this case could be fiscal and monetary policy. A dynamical system may also be introduced to embed operations research problems within the framework of optimal control theory.

Multi-disciplinary design optimization (MDO) is a field of engineering that uses optimization methods to solve design problems incorporating a number of disciplines. It is also known as multidisciplinary system design optimization (MSDO), and multidisciplinary design analysis and optimization (MDAO).

In mathematical optimization, the active-set method is an algorithm used to identify the active constraints in a set of inequality constraints. The active constraints are then expressed as equality constraints, thereby transforming an inequality-constrained problem into a simpler equality-constrained subproblem.

In mathematical optimization, the cutting-plane method is any of a variety of optimization methods that iteratively refine a feasible set or objective function by means of linear inequalities, termed cuts. Such procedures are commonly used to find integer solutions to mixed integer linear programming (MILP) problems, as well as to solve general, not necessarily differentiable convex optimization problems. The use of cutting planes to solve MILP was introduced by Ralph E. Gomory.

<span class="mw-page-title-main">Interior-point method</span> Algorithms for solving convex optimization problems

Interior-point methods are algorithms for solving linear and non-linear convex optimization problems. IPMs combine two advantages of previously-known algorithms:

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

The Frank–Wolfe algorithm is an iterative first-order optimization algorithm for constrained convex optimization. Also known as the conditional gradient method, reduced gradient algorithm and the convex combination algorithm, the method was originally proposed by Marguerite Frank and Philip Wolfe in 1956. In each iteration, the Frank–Wolfe algorithm considers a linear approximation of the objective function, and moves towards a minimizer of this linear function.

In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then the dual is a maximization problem. Any feasible solution to the primal (minimization) problem is at least as large as any feasible solution to the dual (maximization) problem. Therefore, the solution to the primal is an upper bound to the solution of the dual, and the solution of the dual is a lower bound to the solution of the primal. This fact is called weak duality.

In mathematical optimization, constrained optimization is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized. Constraints can be either hard constraints, which set conditions for the variables that are required to be satisfied, or soft constraints, which have some variable values that are penalized in the objective function if, and based on the extent that, the conditions on the variables are not satisfied.

Semidefinite programming (SDP) is a subfield of mathematical programming concerned with the optimization of a linear objective function over the intersection of the cone of positive semidefinite matrices with an affine space, i.e., a spectrahedron.

Penalty methods are a certain class of algorithms for solving constrained optimization problems.

In mathematical optimization and computer science, a feasible region,feasible set, or solution space is the set of all possible points of an optimization problem that satisfy the problem's constraints, potentially including inequalities, equalities, and integer constraints. This is the initial set of candidate solutions to the problem, before the set of candidates has been narrowed down.

In mathematical optimization, a quadratically constrained quadratic program (QCQP) is an optimization problem in which both the objective function and the constraints are quadratic functions. It has the form

In operations research, the Big M method is a method of solving linear programming problems using the simplex algorithm. The Big M method extends the simplex algorithm to problems that contain "greater-than" constraints. It does so by associating the constraints with large negative constants which would not be part of any optimal solution, if it exists.

Lexicographic max-min optimization is a kind of multi-objective optimization. In general, multi-objective optimization deals with optimization problems with two or more objective functions to be optimized simultaneously. Lexmaxmin optimization presumes that the decision-maker would like the smallest objective value to be as high as possible; subject to this, the second-smallest objective should be as high as possible; and so on. In other words, the decision-maker ranks the possible solutions according to a leximin order of their objective function values.

References

↑ Ruszczyński, Andrzej (2006). Nonlinear Optimization. Princeton, NJ: Princeton University Press. pp. xii+454. ISBN 978-0691119151. MR 2199043.
1 2 Nemirovsky and Ben-Tal (2023). "Optimization III: Convex Optimization" (PDF).

External links

Mathematical Programming Glossary

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Ruszczyński, Andrzej (2006). Nonlinear Optimization. Princeton, NJ: Princeton University Press. pp. xii+454. ISBN 978-0691119151. MR 2199043.

[:0-2] 1 2 Nemirovsky and Ben-Tal (2023). "Optimization III: Convex Optimization" (PDF).

[1]

[2]