Lagrangian relaxation

Last updated November 27, 2023

In the field of mathematical optimization, Lagrangian relaxation is a relaxation method which approximates a difficult problem of constrained optimization by a simpler problem. A solution to the relaxed problem is an approximate solution to the original problem, and provides useful information.

The method penalizes violations of inequality constraints using a Lagrange multiplier, which imposes a cost on violations. These added costs are used instead of the strict inequality constraints in the optimization. In practice, this relaxed problem can often be solved more easily than the original problem.

The problem of maximizing the Lagrangian function of the dual variables (the Lagrangian multipliers) is the Lagrangian dual problem.

Mathematical description

Suppose we are given a linear programming problem, with $x\in \mathbb {R} ^{n}$ and $A\in \mathbb {R} ^{m,n}$ , of the following form:

max		$c^{T}x$
s.t.
		$Ax\leq b$

If we split the constraints in $A$ such that $A_{1}\in \mathbb {R} ^{m_{1},n}$ , $A_{2}\in \mathbb {R} ^{m_{2},n}$ and $m_{1}+m_{2}=m$ we may write the system:

max		$c^{T}x$
s.t.
(1)		$A_{1}x\leq b_{1}$
(2)		$A_{2}x\leq b_{2}$

We may introduce the constraint (2) into the objective:

max		$c^{T}x+\lambda ^{T}(b_{2}-A_{2}x)$
s.t.
(1)		$A_{1}x\leq b_{1}$

If we let $\lambda =(\lambda _{1},\ldots ,\lambda _{m_{2}})$ be nonnegative weights, we get penalized if we violate the constraint (2), and we are also rewarded if we satisfy the constraint strictly. The above system is called the Lagrangian relaxation of our original problem.

The LR solution as a bound

Of particular use is the property that for any fixed set of ${\tilde {\lambda }}\succeq 0$ values, the optimal result to the Lagrangian relaxation problem will be no smaller than the optimal result to the original problem. To see this, let ${\hat {x}}$ be the optimal solution to the original problem, and let ${\bar {x}}$ be the optimal solution to the Lagrangian relaxation. We can then see that

c^{T}{\hat {x}}\leq c^{T}{\hat {x}}+{\tilde {\lambda }}^{T}(b_{2}-A_{2}{\hat {x}})\leq c^{T}{\bar {x}}+{\tilde {\lambda }}^{T}(b_{2}-A_{2}{\bar {x}})

The first inequality is true because ${\hat {x}}$ is feasible in the original problem and the second inequality is true because ${\bar {x}}$ is the optimal solution to the Lagrangian relaxation.

Iterating towards a solution of the original problem

The above inequality tells us that if we minimize the maximum value we obtain from the relaxed problem, we obtain a tighter limit on the objective value of our original problem. Thus we can address the original problem by instead exploring the partially dualized problem

min

P(\lambda )

s.t.

\lambda \geq 0

where we define $P(\lambda )$ as

max		$c^{T}x+\lambda ^{T}(b_{2}-A_{2}x)$
s.t.
(1)		$A_{1}x\leq b_{1}$

A Lagrangian relaxation algorithm thus proceeds to explore the range of feasible $\lambda$ values while seeking to minimize the result returned by the inner $P$ problem. Each value returned by $P$ is a candidate upper bound to the problem, the smallest of which is kept as the best upper bound. If we additionally employ a heuristic, probably seeded by the ${\bar {x}}$ values returned by $P$ , to find feasible solutions to the original problem, then we can iterate until the best upper bound and the cost of the best feasible solution converge to a desired tolerance.

Related methods

The augmented Lagrangian method is quite similar in spirit to the Lagrangian relaxation method, but adds an extra term, and updates the dual parameters $\lambda$ in a more principled manner. It was introduced in the 1970s and has been used extensively.

The penalty method does not use dual variables but rather removes the constraints and instead penalizes deviations from the constraint. The method is conceptually simple but usually augmented Lagrangian methods are preferred in practice since the penalty method suffers from ill-conditioning issues.

Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements are represented by linear relationships. Linear programming is a special case of mathematical programming.

Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equation constraints. It is named after the mathematician Joseph-Louis Lagrange.

An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints are linear.

<span class="mw-page-title-main">Interior-point method</span> Algorithms for solving convex optimization problems

Interior-point methods are algorithms for solving convex optimization problems, both linear and non-linear. They combine two advantages of previously-known algorithms:

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In multi-objective optimization, the Pareto front is the set of all Pareto efficient solutions. The concept is widely used in engineering. It allows the designer to restrict attention to the set of efficient choices, and to make tradeoffs within this set, rather than considering the full range of every parameter.

In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then the dual is a maximization problem. Any feasible solution to the primal (minimization) problem is at least as large as any feasible solution to the dual (maximization) problem. Therefore, the solution to the primal is an upper bound to the solution of the dual, and the solution of the dual is a lower bound to the solution of the primal. This fact is called weak duality.

In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions. When specialized to solving feasible linear optimization problems with rational data, the ellipsoid method is an algorithm which finds an optimal solution in a number of steps that is polynomial in the input size.

Semidefinite programming (SDP) is a subfield of convex optimization concerned with the optimization of a linear objective function over the intersection of the cone of positive semidefinite matrices with an affine space, i.e., a spectrahedron.

In mathematical optimization and related fields, relaxation is a modeling strategy. A relaxation is an approximation of a difficult problem by a nearby problem that is easier to solve. A solution of the relaxed problem provides information about the original problem.

Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization which may be considered a quasi-Newton method. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable.

Subgradient methods are iterative methods for solving convex minimization problems. Originally developed by Naum Z. Shor and others in the 1960s and 1970s, subgradient methods are convergent when applied even to a non-differentiable objective function. When the objective function is differentiable, sub-gradient methods for unconstrained problems use the same search direction as the method of steepest descent.

In mathematical optimization, linear-fractional programming (LFP) is a generalization of linear programming (LP). Whereas the objective function in a linear program is a linear function, the objective function in a linear-fractional program is a ratio of two linear functions. A linear program can be regarded as a special case of a linear-fractional program in which the denominator is the constant function 1.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective, but the augmented Lagrangian method adds yet another term designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with, the method of Lagrange multipliers.

In optimization problems in applied mathematics, the duality gap is the difference between the primal and dual solutions. If $is the optimal dual value and is the optimal primal value then the duality gap is equal to . This value is always greater than or equal to 0. The duality gap is zero if and only if strong duality holds. Otherwise the gap is strictly positive and weak duality holds.$

Sparse dictionary learning is a representation learning method which aims at finding a sparse representation of the input data in the form of a linear combination of basic elements as well as those basic elements themselves. These elements are called atoms and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than the one of the signals being observed. The above two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal but also provide an improvement in sparsity and flexibility of the representation.

Quantum optimization algorithms are quantum algorithms that are used to solve optimization problems. Mathematical optimization deals with finding the best solution to a problem from a set of possible solutions. Mostly, the optimization problem is formulated as a minimization problem, where one tries to minimize an error which depends on the solution: the optimal solution has the minimal error. Different optimization techniques are applied in various fields such as mechanics, economics and engineering, and as the complexity and amount of data involved rise, more efficient ways of solving optimization problems are needed. Quantum computing may allow problems which are not practically feasible on classical computers to be solved, or suggest a considerable speed up with respect to the best known classical algorithm.

References

Books

Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin (1993). Network Flows: Theory, Algorithms and Applications. Prentice Hall. ISBN 0-13-617549-X.{{cite book}}: CS1 maint: multiple names: authors list (link)
Bertsekas, Dimitri P. (1999). Nonlinear Programming: 2nd Edition. Athena Scientific. ISBN 1-886529-00-0.
Bonnans, J. Frédéric; Gilbert, J. Charles; Lemaréchal, Claude; Sagastizábal, Claudia A. (2006). Numerical optimization: Theoretical and practical aspects. Universitext (Second revised ed. of translation of 1997 French ed.). Berlin: Springer-Verlag. pp. xiv+490. doi:10.1007/978-3-540-35447-5. ISBN 3-540-35445-X. MR 2265882.
Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (1993). Convex analysis and minimization algorithms, Volume I: Fundamentals. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Vol. 305. Berlin: Springer-Verlag. pp. xviii+417. ISBN 3-540-56850-6. MR 1261420.
Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (1993). "14 Duality for Practitioners". Convex analysis and minimization algorithms, Volume II: Advanced theory and bundle methods. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Vol. 306. Berlin: Springer-Verlag. pp. xviii+346. ISBN 3-540-56852-2.
Lasdon, Leon S. (2002). Optimization theory for large systems (reprint of the 1970 Macmillan ed.). Mineola, New York: Dover Publications, Inc. pp. xiii+523. MR 1888251.
Lemaréchal, Claude (2001). "Lagrangian relaxation". In Michael Jünger and Denis Naddef (ed.). Computational combinatorial optimization: Papers from the Spring School held in Schloß Dagstuhl, May 15–19, 2000. Lecture Notes in Computer Science. Vol. 2241. Berlin: Springer-Verlag. pp. 112–156. doi:10.1007/3-540-45586-8_4. ISBN 3-540-42877-1. MR 1900016. S2CID 9048698.
Minoux, M. (1986). Mathematical programming: Theory and algorithms. Egon Balas (foreword) (Translated by Steven Vajda from the (1983 Paris: Dunod) French ed.). Chichester: A Wiley-Interscience Publication. John Wiley & Sons, Ltd. pp. xxviii+489. ISBN 0-471-90170-9. MR 0868279. (2008 Second ed., in French: Programmation mathématique: Théorie et algorithmes. Editions Tec & Doc, Paris, 2008. xxx+711 pp. ).

Articles

Everett, Hugh III (1963). "Generalized Lagrange multiplier method for solving problems of optimum allocation of resources". Operations Research. 11 (3): 399–417. doi:10.1287/opre.11.3.399. JSTOR 168028. MR 0152360.
Kiwiel, Krzysztof C.; Larsson, Torbjörn; Lindberg, P. O. (August 2007). "Lagrangian relaxation via ballstep subgradient methods". Mathematics of Operations Research . 32 (3): 669–686. doi:10.1287/moor.1070.0261. MR 2348241.
Bragin, Mikhail A.; Luh, Peter B.; Yan, Joseph H.; Yu, Nanpeng and Stern, Gary A. (2015). "Convergence of the Surrogate Lagrangian Relaxation Method," Journal of Optimization Theory and Applications.164 (1): 173-201, doi : 10.1007/s10957-014-0561-3
Bragin, Mikhail A.; Tucker, Emily, L. (2022) "Surrogate "Level-Based" Lagrangian Relaxation for mixed-integer linear programming," Scientific Reports. 12: 22417, doi: 10.1038/s41598-022-26264-1
Neal Young, Lagrangian Relaxation Example, in AlgNotes Blog.
Neal Young, 2012, Toy Examples for Plotkin-Shmoys-Tardos and Arora-Kale solvers in CSTheory Stack Exchange.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.