Lagrange multipliers on Banach spaces

Last updated October 02, 2024

In the field of calculus of variations in mathematics, the method of Lagrange multipliers on Banach spaces can be used to solve certain infinite-dimensional constrained optimization problems. The method is a generalization of the classical method of Lagrange multipliers as used to find extrema of a function of finitely many variables.

The Lagrange multiplier theorem for Banach spaces

Let X and Y be real Banach spaces. Let U be an open subset of X and let f : U → R be a continuously differentiable function. Let g : U → Y be another continuously differentiable function, the constraint: the objective is to find the extremal points (maxima or minima) of f subject to the constraint that g is zero.

Suppose that u₀ is a constrained extremum of f, i.e. an extremum of f on

g^{-1}(0)=\{x\in U\mid g(x)=0\in Y\}\subseteq U.

Suppose also that the Fréchet derivative Dg(u₀) : X → Y of g at u₀ is a surjective linear map. Then there exists a Lagrange multiplierλ : Y → R in Y^∗, the dual space to Y, such that

\mathrm {D} f(u_{0})=\lambda \circ \mathrm {D} g(u_{0}).\quad {\mbox{(L)}}

Since Df(u₀) is an element of the dual space X^∗, equation (L) can also be written as

\mathrm {D} f(u_{0})=\left(\mathrm {D} g(u_{0})\right)^{*}(\lambda ),

where (Dg(u₀))^∗(λ) is the pullback of λ by Dg(u₀), i.e. the action of the adjoint map (Dg(u₀))^∗ on λ, as defined by

\left(\mathrm {D} g(u_{0})\right)^{*}(\lambda )=\lambda \circ \mathrm {D} g(u_{0}).

Connection to the finite-dimensional case

In the case that X and Y are both finite-dimensional (i.e. linearly isomorphic to R^m and Rⁿ for some natural numbers m and n) then writing out equation (L) in matrix form shows that λ is the usual Lagrange multiplier vector; in the case n = 1, λ is the usual Lagrange multiplier, a real number.

Application

In many optimization problems, one seeks to minimize a functional defined on an infinite-dimensional space such as a Banach space.

Consider, for example, the Sobolev space ${\textstyle X=H_{0}^{1}([-1,+1];\mathbb {R} )}$ and the functional ${\textstyle f:X\rightarrow \mathbb {R} }$ given by

f(u)=\int _{-1}^{+1}u'(x)^{2}\,\mathrm {d} x.

Without any constraint, the minimum value of f would be 0, attained by u₀(x) = 0 for all x between −1 and +1. One could also consider the constrained optimization problem, to minimize f among all those u ∈ X such that the mean value of u is +1. In terms of the above theorem, the constraint g would be given by

g(u)={\frac {1}{2}}\int _{-1}^{+1}u(x)\,\mathrm {d} x-1.

However this problem can be solved as in the finite dimensional case since the Lagrange multiplier $\lambda$ is only a scalar.

Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equation constraints. It is named after the mathematician Joseph-Louis Lagrange.

The calculus of variations is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.

In mathematics, the Hessian matrix, Hessian or Hesse matrix is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term "functional determinants". The Hessian is sometimes denoted by H or, ambiguously, by ∇².

In mathematics and its applications, a Sturm–Liouville problem is a second-order linear ordinary differential equation of the form $for given functions, and, together with some boundary conditions at extreme values of . The goals of a given Sturm-Liouville problem are:$

In functional analysis, a branch of mathematics, a compact operator is a linear operator $, where are normed vector spaces, with the property that maps bounded subsets of to relatively compact subsets of . Such an operator is necessarily a bounded operator, and so continuous. Some authors require that are Banach, but the definition can be extended to more general spaces.$

Pontryagin's maximum principle is used in optimal control theory to find the best possible control for taking a dynamical system from one state to another, especially in the presence of constraints for the state or input controls. It states that it is necessary for any optimal control along with the optimal state trajectory to solve the so-called Hamiltonian system, which is a two-point boundary value problem, plus a maximum condition of the control Hamiltonian. These necessary conditions become sufficient under certain convexity conditions on the objective and constraint functions.

In physics, a first-class constraint is a dynamical quantity in a constrained Hamiltonian system whose Poisson bracket with all the other constraints vanishes on the constraint surface in phase space. To calculate the first-class constraint, one assumes that there are no second-class constraints, or that they have been calculated previously, and their Dirac brackets generated.

The Nambu–Goto action is the simplest invariant action in bosonic string theory, and is also used in other theories that investigate string-like objects. It is the starting point of the analysis of zero-thickness string behaviour, using the principles of Lagrangian mechanics. Just as the action for a free point particle is proportional to its proper time – i.e., the "length" of its world-line – a relativistic string's action is proportional to the area of the sheet which the string traces as it travels through spacetime.

In mathematical optimization theory, the linear complementarity problem (LCP) arises frequently in computational mechanics and encompasses the well-known quadratic programming as a special case. It was proposed by Cottle and Dantzig in 1968.

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then the dual is a maximization problem. Any feasible solution to the primal (minimization) problem is at least as large as any feasible solution to the dual (maximization) problem. Therefore, the solution to the primal is an upper bound to the solution of the dual, and the solution of the dual is a lower bound to the solution of the primal. This fact is called weak duality.

In mathematical optimization, constrained optimization is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized. Constraints can be either hard constraints, which set conditions for the variables that are required to be satisfied, or soft constraints, which have some variable values that are penalized in the objective function if, and based on the extent that, the conditions on the variables are not satisfied.

Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization which may be considered a quasi-Newton method. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable, but not necessarily convex.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective, but the augmented Lagrangian method adds yet another term designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with, the method of Lagrange multipliers.

Sequential linear-quadratic programming (SLQP) is an iterative method for nonlinear optimization problems where objective function and constraints are twice continuously differentiable. Similarly to sequential quadratic programming (SQP), SLQP proceeds by solving a sequence of optimization subproblems. The difference between the two approaches is that:

In numerical partial differential equations, the Ladyzhenskaya–Babuška–Brezzi (LBB) condition is a sufficient condition for a saddle point problem to have a unique solution that depends continuously on the input data. Saddle point problems arise in the discretization of Stokes flow and in the mixed finite element discretization of Poisson's equation. For positive-definite problems, like the unmixed formulation of the Poisson equation, most discretization schemes will converge to the true solution in the limit as the mesh is refined. For saddle point problems, however, many discretizations are unstable, giving rise to artifacts such as spurious oscillations. The LBB condition gives criteria for when a discretization of a saddle point problem is stable.

References

Luenberger, David G. (1969). "Local Theory of Constrained Optimization". Optimization by Vector Space Methods. New York: John Wiley & Sons. pp. 239–270. ISBN 0-471-55359-X.
Zeidler, Eberhard (1995). Applied functional analysis: Variational Methods and Optimization. Applied Mathematical Sciences 109. Vol. 109. New York, NY: Springer-Verlag. doi:10.1007/978-1-4612-0821-1. ISBN 978-1-4612-0821-1. (See Section 4.14, pp.270–271.)

This article incorporates material from Lagrange multipliers on Banach spaces on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.