Lagrange multipliers on Banach spaces

Last updated

In the field of calculus of variations in mathematics, the method of Lagrange multipliers on Banach spaces can be used to solve certain infinite-dimensional constrained optimization problems. The method is a generalization of the classical method of Lagrange multipliers as used to find extrema of a function of finitely many variables.

Contents

The Lagrange multiplier theorem for Banach spaces

Let X and Y be real Banach spaces. Let U be an open subset of X and let f : UR be a continuously differentiable function. Let g : UY be another continuously differentiable function, the constraint: the objective is to find the extremal points (maxima or minima) of f subject to the constraint that g is zero.

Suppose that u0 is a constrained extremum of f, i.e. an extremum of f on

Suppose also that the Fréchet derivative Dg(u0) : XY of g at u0 is a surjective linear map. Then there exists a Lagrange multiplierλ : YR in Y, the dual space to Y, such that

Since Df(u0) is an element of the dual space X, equation (L) can also be written as

where (Dg(u0))(λ) is the pullback of λ by Dg(u0), i.e. the action of the adjoint map (Dg(u0)) on λ, as defined by

Connection to the finite-dimensional case

In the case that X and Y are both finite-dimensional (i.e. linearly isomorphic to Rm and Rn for some natural numbers m and n) then writing out equation (L) in matrix form shows that λ is the usual Lagrange multiplier vector; in the case n = 1, λ is the usual Lagrange multiplier, a real number.

Application

In many optimization problems, one seeks to minimize a functional defined on an infinite-dimensional space such as a Banach space.

Consider, for example, the Sobolev space and the functional given by

Without any constraint, the minimum value of f would be 0, attained by u0(x) = 0 for all x between 1 and +1. One could also consider the constrained optimization problem, to minimize f among all those uX such that the mean value of u is +1. In terms of the above theorem, the constraint g would be given by

However this problem can be solved as in the finite dimensional case since the Lagrange multiplier is only a scalar.

See also

Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints. It is named after the mathematician Joseph-Louis Lagrange. The basic idea is to convert a constrained problem into a form such that the derivative test of an unconstrained problem can still be applied. The relationship between the gradient of the function and gradients of the constraints rather naturally leads to a reformulation of the original problem, known as the Lagrangian function.

The calculus of variations is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.

In functional analysis, a branch of mathematics, a compact operator is a linear operator , where are normed vector spaces, with the property that maps bounded subsets of to relatively compact subsets of . Such an operator is necessarily a bounded operator, and so continuous. Some authors require that are Banach, but the definition can be extended to more general spaces.

Pontryagin's maximum principle is used in optimal control theory to find the best possible control for taking a dynamical system from one state to another, especially in the presence of constraints for the state or input controls. It states that it is necessary for any optimal control along with the optimal state trajectory to solve the so-called Hamiltonian system, which is a two-point boundary value problem, plus a maximum condition of the control Hamiltonian. These necessary conditions become sufficient under certain convexity conditions on the objective and constraint functions.

A first class constraint is a dynamical quantity in a constrained Hamiltonian system whose Poisson bracket with all the other constraints vanishes on the constraint surface in phase space. To calculate the first class constraint, one assumes that there are no second class constraints, or that they have been calculated previously, and their Dirac brackets generated.

In mathematical optimization theory, the linear complementarity problem (LCP) arises frequently in computational mechanics and encompasses the well-known quadratic programming as a special case. It was proposed by Cottle and Dantzig in 1968.

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then the dual is a maximization problem. Any feasible solution to the primal (minimization) problem is at least as large as any feasible solution to the dual (maximization) problem. Therefore, the solution to the primal is an upper bound to the solution of the dual, and the solution of the dual is a lower bound to the solution of the primal. This fact is called weak duality.

In mathematical optimization, constrained optimization is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized. Constraints can be either hard constraints, which set conditions for the variables that are required to be satisfied, or soft constraints, which have some variable values that are penalized in the objective function if, and based on the extent that, the conditions on the variables are not satisfied.

Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

<span class="mw-page-title-main">Hilbert space</span> Generalization of Euclidean space allowing infinite dimensions

In mathematics, Hilbert spaces allow generalizing the methods of linear algebra and calculus from (finite-dimensional) Euclidean vector spaces to spaces that may be infinite-dimensional. A Hilbert space is a vector space equipped with an inner product which defines a distance function for which it is a complete metric space. Hilbert spaces arise naturally and frequently in mathematics and physics, typically as function spaces.

The adjoint state method is a numerical method for efficiently computing the gradient of a function or operator in a numerical optimization problem. It has applications in geophysics, seismic imaging, photonics and more recently in neural networks.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective; the difference is that the augmented Lagrangian method adds yet another term, designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with the method of Lagrange multipliers.

In mathematics, the Lyapunov–Schmidt reduction or Lyapunov–Schmidt construction is used to study solutions to nonlinear equations in the case when the implicit function theorem does not work. It permits the reduction of infinite-dimensional equations in Banach spaces to finite-dimensional equations. It is named after Aleksandr Lyapunov and Erhard Schmidt.

Sequential linear-quadratic programming (SLQP) is an iterative method for nonlinear optimization problems where objective function and constraints are twice continuously differentiable. Similarly to sequential quadratic programming (SQP), SLQP proceeds by solving a sequence of optimization subproblems. The difference between the two approaches is that:

In numerical partial differential equations, the Ladyzhenskaya–Babuška–Brezzi (LBB) condition is a sufficient condition for a saddle point problem to have a unique solution that depends continuously on the input data. Saddle point problems arise in the discretization of Stokes flow and in the mixed finite element discretization of Poisson's equation. For positive-definite problems, like the unmixed formulation of the Poisson equation, most discretization schemes will converge to the true solution in the limit as the mesh is refined. For saddle point problems, however, many discretizations are unstable, giving rise to artifacts such as spurious oscillations. The LBB condition gives criteria for when a discretization of a saddle point problem is stable.

References

This article incorporates material from Lagrange multipliers on Banach spaces on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.