The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. [a] Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.
A simple example of such a problem is to find the curve of shortest length connecting two points. If there are no constraints, the solution is a straight line between the points. However, if the curve is constrained to lie on a surface in space, then the solution is less obvious, and possibly many solutions may exist. Such solutions are known as geodesics. A related problem is posed by Fermat's principle: light follows the path of shortest optical length connecting two points, which depends upon the material of the medium. One corresponding concept in mechanics is the principle of least (or stationary) action.
Many important problems involve functions of several variables. Solutions of boundary value problems for the Laplace equation satisfy Dirichlet's principle. Plateau's problem requires finding a surface of minimal area that spans a given contour in space: a solution can often be found by dipping a frame in soapy water. Although such experiments are relatively easy to perform, their mathematical formulation is far from simple: there may be more than one locally minimizing surface, and they may have non-trivial topology.
The calculus of variations may be said to begin with Newton's minimal resistance problem in 1687, followed by the brachistochrone curve problem raised by Johann Bernoulli (1696). [2] It immediately occupied the attention of Jacob Bernoulli and the Marquis de l'Hôpital, but Leonhard Euler first elaborated the subject, beginning in 1733. Joseph-Louis Lagrange was influenced by Euler's work to contribute significantly to the theory. After Euler saw the 1755 work of the 19-year-old Lagrange, Euler dropped his own partly geometric approach in favor of Lagrange's purely analytic approach and renamed the subject the calculus of variations in his 1756 lecture Elementa Calculi Variationum. [3] [4] [b]
Adrien-Marie Legendre (1786) laid down a method, not entirely satisfactory, for the discrimination of maxima and minima. Isaac Newton and Gottfried Leibniz also gave some early attention to the subject. [5] To this discrimination Vincenzo Brunacci (1810), Carl Friedrich Gauss (1829), Siméon Poisson (1831), Mikhail Ostrogradsky (1834), and Carl Jacobi (1837) have been among the contributors. An important general work is that of Pierre Frédéric Sarrus (1842), which was condensed and improved by Augustin-Louis Cauchy (1844). Other valuable treatises and memoirs have been written by Strauch (1849), John Hewitt Jellett (1850), Otto Hesse (1857), Alfred Clebsch (1858), and Lewis Buffett Carll (1885), but perhaps the most important work of the century is that of Karl Weierstrass. His celebrated course on the theory is epoch-making, and it may be asserted that he was the first to place it on a firm and unquestionable foundation. The 20th and 23rd Hilbert problems, published in 1900, encouraged further development. [5]
In the 20th century David Hilbert, Oskar Bolza, Gilbert Ames Bliss, Emmy Noether, Leonida Tonelli, Henri Lebesgue and Jacques Hadamard among others made significant contributions. [5] Marston Morse applied calculus of variations in what is now called Morse theory. [6] Lev Pontryagin, Ralph Rockafellar and F. H. Clarke developed new mathematical tools for the calculus of variations in optimal control theory. [6] The dynamic programming of Richard Bellman is an alternative to the calculus of variations. [7] [8] [9] [c]
The calculus of variations is concerned with the maxima or minima (collectively called extrema) of functionals. A functional maps functions to scalars, so functionals have been described as "functions of functions." Functionals have extrema with respect to the elements $y$ of a given function space defined over a given domain. A functional $J[y]$ is said to have an extremum at the function $f$ if $\Delta J = J[y] - J[f]$ has the same sign for all $y$ in an arbitrarily small neighborhood of $f.$ [d] The function $f$ is called an extremal function or extremal. [e] The extremum $J[f]$ is called a local maximum if $\Delta J \leq 0$ everywhere in an arbitrarily small neighborhood of $f,$ and a local minimum if $\Delta J \geq 0$ there. For a function space of continuous functions, extrema of the corresponding functionals are called strong extrema or weak extrema, depending on how the neighborhood of $f$ is measured: for a strong extremum, $y$ need only be close to $f$ in value, whereas for a weak extremum the first derivatives of $y$ and $f$ must also be close. [11]
Both strong and weak extrema are defined for functionals on a space of continuous functions, but the comparison functions for a weak extremum must in addition have continuous first derivatives that stay close to those of $f.$ Since this is a smaller class of comparison functions, a strong extremum is also a weak extremum, but the converse may not hold. Finding strong extrema is more difficult than finding weak extrema. [12] An example of a necessary condition that is used for finding weak extrema is the Euler–Lagrange equation. [13] [f]
Finding the extrema of functionals is similar to finding the maxima and minima of functions. The maxima and minima of a function may be located by finding the points where its derivative vanishes (i.e., is equal to zero). The extrema of functionals may be obtained by finding functions for which the functional derivative is equal to zero. This leads to solving the associated Euler–Lagrange equation. [g]
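Consider the functional
$$J[y] = \int_{x_1}^{x_2} L\left(x, y(x), y'(x)\right)\,dx,$$
where $x_1, x_2$ are constants, $y(x)$ is twice continuously differentiable, $y'(x) = \frac{dy}{dx},$ and $L\left(x, y(x), y'(x)\right)$ is twice continuously differentiable with respect to its arguments $x,$ $y,$ and $y'.$
If the functional $J[y]$ attains a local minimum at $f,$ and $\eta(x)$ is an arbitrary function that has at least one derivative and vanishes at the endpoints $x_1$ and $x_2,$ then for any number $\varepsilon$ close to 0,
$$J[f] \leq J[f + \varepsilon \eta].$$
The term $\varepsilon \eta$ is called the variation of the function $f$ and is denoted by $\delta f.$ [1] [h]
Substituting $f + \varepsilon \eta$ for $y$ in the functional $J[y],$ the result is a function of $\varepsilon,$
$$\Phi(\varepsilon) = J[f + \varepsilon \eta].$$
Since the functional $J[y]$ has a minimum for $y = f,$ the function $\Phi(\varepsilon)$ has a minimum at $\varepsilon = 0$ and thus, [i]
$$\Phi'(0) \equiv \left.\frac{d\Phi}{d\varepsilon}\right|_{\varepsilon = 0} = \int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx = 0.$$
Taking the total derivative of $L\left[x, y, y'\right],$ where $y = f + \varepsilon \eta$ and $y' = f' + \varepsilon \eta'$ are considered as functions of $\varepsilon$ rather than $x,$ yields
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\frac{dy}{d\varepsilon} + \frac{\partial L}{\partial y'}\frac{dy'}{d\varepsilon},$$
and because $\frac{dy}{d\varepsilon} = \eta$ and $\frac{dy'}{d\varepsilon} = \eta',$
$$\frac{dL}{d\varepsilon} = \frac{\partial L}{\partial y}\eta + \frac{\partial L}{\partial y'}\eta'.$$
Therefore,
$$\begin{aligned}
\int_{x_1}^{x_2} \left.\frac{dL}{d\varepsilon}\right|_{\varepsilon = 0} dx
&= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f}\eta + \frac{\partial L}{\partial f'}\eta'\right) dx \\
&= \int_{x_1}^{x_2} \frac{\partial L}{\partial f}\eta\,dx + \left.\frac{\partial L}{\partial f'}\eta\right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta\,\frac{d}{dx}\frac{\partial L}{\partial f'}\,dx \\
&= \int_{x_1}^{x_2} \left(\frac{\partial L}{\partial f}\eta - \eta\,\frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx,
\end{aligned}$$
where $L\left[x, y, y'\right] \to L\left[x, f, f'\right]$ when $\varepsilon = 0$ and we have used integration by parts on the second term. The second term on the second line vanishes because $\eta = 0$ at $x_1$ and $x_2$ by definition. Also, as previously mentioned, the left side of the equation is zero, so that
$$\int_{x_1}^{x_2} \eta(x)\left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right) dx = 0.$$
According to the fundamental lemma of calculus of variations, the part of the integrand in parentheses is zero, i.e.
$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0,$$
which is called the Euler–Lagrange equation. The left hand side of this equation is called the functional derivative of $J[f]$ and is denoted $\delta J / \delta f(x).$
In general this gives a second-order ordinary differential equation which can be solved to obtain the extremal function $f(x).$ The Euler–Lagrange equation is a necessary, but not sufficient, condition for an extremum $J[f].$ A sufficient condition for a minimum is given in the section Variations and sufficient condition for a minimum.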
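In order to illustrate this process, consider the problem of finding the extremal function $y = f(x),$ which is the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right).$ The arc length of the curve is given by
$$A[y] = \int_{x_1}^{x_2} \sqrt{1 + [y'(x)]^2}\,dx,$$
with
$$y'(x) = \frac{dy}{dx}, \quad y_1 = f(x_1), \quad y_2 = f(x_2).$$
Note that assuming $y$ is a function of $x$ loses generality; ideally both should be functions of some other parameter. This approach is good solely for instructive purposes.
The Euler–Lagrange equation will now be used to find the extremal function $f(x)$ that minimizes the functional $A[y]:$
$$\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'} = 0$$
with
$$L = \sqrt{1 + [f'(x)]^2}.$$
Since $f$ does not appear explicitly in $L,$ the first term in the Euler–Lagrange equation vanishes for all $f(x)$ and thus,
$$\frac{d}{dx}\frac{\partial L}{\partial f'} = 0.$$
Substituting for $L$ and taking the derivative,
$$\frac{d}{dx}\,\frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = 0.$$
Thus
$$\frac{f'(x)}{\sqrt{1 + [f'(x)]^2}} = c$$
for some constant $c.$ Then
$$\frac{[f'(x)]^2}{1 + [f'(x)]^2} = c^2,$$
where $0 \leq c^2 < 1.$ Solving, we get
$$[f'(x)]^2 = \frac{c^2}{1 - c^2},$$
which implies that $f'(x) = m$ is a constant and therefore that the shortest curve that connects two points $\left(x_1, y_1\right)$ and $\left(x_2, y_2\right)$ is
$$f(x) = m x + b \qquad \text{with} \quad m = \frac{y_2 - y_1}{x_2 - x_1} \quad \text{and} \quad b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1},$$
and we have thus found the extremal function $f(x)$ that minimizes the functional $A[y]$ so that $A[f]$ is a minimum. The equation for a straight line is $y = m x + b.$ In other words, the shortest distance between two points is a straight line. [j]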
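The conclusion can be checked numerically. The following Python sketch approximates the arc length by a polyline and evaluates it for the straight line plus a small variation that vanishes at the endpoints. The endpoints, the sine-shaped variation and the names `arc_length`, `eta` and `phi` are assumptions made for illustration only.

```python
import math

def arc_length(f, x1, x2, n=2000):
    """Approximate the arc length of y = f(x) on [x1, x2] by a polyline with n segments."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * h
        xb = xa + h
        total += math.hypot(h, f(xb) - f(xa))
    return total

# Hypothetical endpoints, chosen only for illustration.
x1, y1, x2, y2 = 0.0, 1.0, 2.0, 4.0
m = (y2 - y1) / (x2 - x1)
b = (x2 * y1 - x1 * y2) / (x2 - x1)

def line(x):
    """Candidate extremal: the straight line through both endpoints."""
    return m * x + b

def eta(x):
    """A variation that vanishes at both endpoints."""
    return math.sin(math.pi * (x - x1) / (x2 - x1))

def phi(eps):
    """Arc length of the varied curve line + eps * eta."""
    return arc_length(lambda x: line(x) + eps * eta(x), x1, x2)

straight = math.hypot(x2 - x1, y2 - y1)
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"eps = {eps:+.1f}  length = {phi(eps):.6f}")
print(f"Euclidean distance = {straight:.6f}")
```

In this sketch the length at `eps = 0` agrees with the Euclidean distance between the endpoints, and every nonzero `eps` gives a longer curve, as expected for a minimum of the arc-length functional.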
In physics problems it may be the case that $\frac{\partial L}{\partial x} = 0,$ meaning the integrand is a function of $f(x)$ and $f'(x)$ but $x$ does not appear separately. In that case, the Euler–Lagrange equation can be simplified to the Beltrami identity [16]
$$L - f'\,\frac{\partial L}{\partial f'} = C,$$
where $C$ is a constant. The left hand side is the Legendre transformation of $L$ with respect to $f'(x).$
The intuition behind this result is that, if the variable $x$ is actually time, then the statement $\frac{\partial L}{\partial x} = 0$ implies that the Lagrangian is time-independent. By Noether's theorem, there is an associated conserved quantity. In this case, this quantity is the Hamiltonian, the Legendre transform of the Lagrangian, which (often) coincides with the energy of the system. This is (minus) the constant in Beltrami's identity.
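As a short worked check, the Beltrami identity can be applied to the arc-length integrand of the shortest-curve example, which does not depend explicitly on the independent variable. Writing the extremal as $f$ and the integrand as $L$ (symbol names assumed here for the sake of the example):

$$\begin{aligned}
L &= \sqrt{1 + f'^2}, \qquad \frac{\partial L}{\partial f'} = \frac{f'}{\sqrt{1 + f'^2}}, \\
L - f'\,\frac{\partial L}{\partial f'} &= \sqrt{1 + f'^2} - \frac{f'^2}{\sqrt{1 + f'^2}} = \frac{1}{\sqrt{1 + f'^2}} = C .
\end{aligned}$$

Hence $f'$ is constant, which again gives a straight line.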
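If $S$ depends on higher derivatives of $y(x),$ that is, if
$$S = \int_a^b f\left(x, y(x), y'(x), \dots, y^{(n)}(x)\right) dx,$$
then $y$ must satisfy the Euler–Poisson equation, [17]
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) + \dots + (-1)^n \frac{d^n}{dx^n}\left[\frac{\partial f}{\partial y^{(n)}}\right] = 0.$$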
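A minimal worked example, with an integrand chosen here purely for illustration: take the case $n = 2$ with integrand $f = (y'')^2.$ Then only the highest-order term of the Euler–Poisson equation survives:

$$\frac{\partial f}{\partial y} = 0, \qquad \frac{\partial f}{\partial y'} = 0, \qquad \frac{\partial f}{\partial y''} = 2 y'', \qquad \frac{d^2}{dx^2}\left(2 y''\right) = 2\,y'''' = 0 .$$

The extremals of this particular integrand are therefore cubic polynomials in $x.$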
The discussion thus far has assumed that extremal functions possess two continuous derivatives, although the existence of the integral $J$ requires only first derivatives of trial functions. The condition that the first variation vanishes at an extremal may be regarded as a weak form of the Euler–Lagrange equation. The theorem of Du Bois-Reymond asserts that this weak form implies the strong form. If $L$ has continuous first and second derivatives with respect to all of its arguments, and if
$$\frac{\partial^2 L}{\partial f'^2} \neq 0,$$
then $f$ has two continuous derivatives, and it satisfies the Euler–Lagrange equation.
Hilbert was the first to give good conditions for the Euler–Lagrange equations to give a stationary solution. Within a convex area and a positive thrice differentiable Lagrangian the solutions are composed of a countable collection of sections that either go along the boundary or satisfy the Euler–Lagrange equations in the interior.
However, Lavrentiev in 1926 showed that there are circumstances where there is no optimum solution but one can be approached arbitrarily closely by increasing numbers of sections. The Lavrentiev Phenomenon identifies a difference in the infimum of a minimization problem across different classes of admissible functions. For instance the following problem, presented by Manià in 1934: [18]
$$L[x] = \int_0^1 \left(x^3 - t\right)^2 x'^6\,dt,$$
$$A = \left\{x \in W^{1,1}(0,1) : x(0) = 0,\ x(1) = 1\right\}.$$
Clearly, $x(t) = t^{1/3}$ minimizes the functional, but we find any function $x \in W^{1,\infty}$ gives a value bounded away from the infimum.
Examples (in one dimension) are traditionally manifested across $W^{1,1}$ and $W^{1,\infty},$ but Ball and Mizel [19] procured the first functional that displayed Lavrentiev's Phenomenon across $W^{1,p}$ and $W^{1,q}$ for $1 \leq p < q < \infty.$ There are several results that give criteria under which the phenomenon does not occur, for instance 'standard growth', a Lagrangian with no dependence on the second variable, or an approximating sequence satisfying Cesari's Condition (D), but such results are often particular and applicable only to a small class of functionals.
Connected with the Lavrentiev Phenomenon is the repulsion property: any functional displaying Lavrentiev's Phenomenon will display the weak repulsion property. [20]
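For example, if $\varphi(x, y)$ denotes the displacement of a membrane above the domain $D$ in the $x,y$ plane, then its potential energy is proportional to its surface area:
$$U[\varphi] = \iint_D \sqrt{1 + \nabla\varphi \cdot \nabla\varphi}\,dx\,dy.$$
Plateau's problem consists of finding a function that minimizes the surface area while assuming prescribed values on the boundary of $D$; the solutions are called minimal surfaces. The Euler–Lagrange equation for this problem is nonlinear:
$$\varphi_{xx}\left(1 + \varphi_y^2\right) + \varphi_{yy}\left(1 + \varphi_x^2\right) - 2\varphi_x\varphi_y\varphi_{xy} = 0.$$
See Courant (1950) for details.
It is often sufficient to consider only small displacements of the membrane, whose energy difference from no displacement is approximated by
$$V[\varphi] = \frac{1}{2}\iint_D \nabla\varphi \cdot \nabla\varphi\,dx\,dy.$$
The functional $V$ is to be minimized among all trial functions $\varphi$ that assume prescribed values on the boundary of $D.$ If $u$ is the minimizing function and $v$ is an arbitrary smooth function that vanishes on the boundary of $D,$ then the first variation of $V[u + \varepsilon v]$ must vanish:
$$\left.\frac{d}{d\varepsilon} V[u + \varepsilon v]\right|_{\varepsilon = 0} = \iint_D \nabla u \cdot \nabla v\,dx\,dy = 0.$$
Provided that $u$ has two derivatives, we may apply the divergence theorem to obtain
$$\iint_D \nabla \cdot (v \nabla u)\,dx\,dy = \iint_D \left(\nabla u \cdot \nabla v + v\,\nabla \cdot \nabla u\right) dx\,dy = \int_C v\,\frac{\partial u}{\partial n}\,ds,$$
where $C$ is the boundary of $D,$ $s$ is arclength along $C$ and $\partial u/\partial n$ is the normal derivative of $u$ on $C.$ Since $v$ vanishes on $C$ and the first variation vanishes, the result is
$$\iint_D v\,\nabla \cdot \nabla u\,dx\,dy = 0$$
for all smooth functions $v$ that vanish on the boundary of $D.$ The proof for the case of one dimensional integrals may be adapted to this case to show that
$$\nabla \cdot \nabla u = 0$$
in $D.$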
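The vanishing of the first variation at a harmonic function can be illustrated on a grid. The Python sketch below replaces the membrane energy by a finite-difference analogue. The grid size, the boundary data taken from $x^2 - y^2$ and the test function $\sin(\pi x)\sin(\pi y)$ are assumptions chosen for illustration, and `dirichlet_energy` is a hypothetical helper name.

```python
import math

def dirichlet_energy(u):
    """Discrete analogue of 1/2 * integral of |grad u|^2 on a uniform square grid.

    In two dimensions the grid spacing cancels, so the energy is half the sum
    of squared differences between neighbouring grid values.
    """
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (u[i + 1][j] - u[i][j]) ** 2
            if j + 1 < n:
                total += (u[i][j + 1] - u[i][j]) ** 2
    return 0.5 * total

n = 21                      # grid points per side (hypothetical resolution)
h = 1.0 / (n - 1)

# Grid values of the harmonic function x^2 - y^2. Its five-point discrete
# Laplacian is exactly zero, so it also minimizes the discrete energy
# for its own boundary values.
u = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]

def interior(i, j):
    return 0 < i < n - 1 and 0 < j < n - 1

# Smooth test function v, set to zero on the boundary of the square.
v = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h) if interior(i, j) else 0.0
      for j in range(n)] for i in range(n)]

def perturbed(eps):
    """Grid values of u + eps * v."""
    return [[u[i][j] + eps * v[i][j] for j in range(n)] for i in range(n)]

base = dirichlet_energy(u)
for eps in (-0.1, 0.0, 0.1):
    delta = dirichlet_energy(perturbed(eps)) - base
    print(f"eps = {eps:+.2f}  energy change = {delta:.6e}")
```

Because the term linear in `eps` drops out for this choice of `u`, the energy change in the sketch is quadratic in `eps` and positive for either sign.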
The difficulty with this reasoning is the assumption that the minimizing function $u$ must have two derivatives. Riemann argued that the existence of a smooth minimizing function was assured by the connection with the physical problem: membranes do indeed assume configurations with minimal potential energy. Riemann named this idea the Dirichlet principle in honor of his teacher Peter Gustav Lejeune Dirichlet. However, Weierstrass gave an example of a variational problem with no solution: minimize
$$W[\varphi] = \int_{-1}^{1} \left(x\,\varphi'\right)^2 dx$$
among all functions $\varphi$ that satisfy $\varphi(-1) = -1$ and $\varphi(1) = 1.$ $W$ can be made arbitrarily small by choosing piecewise linear functions that make a transition between −1 and 1 in a small neighborhood of the origin. However, there is no function that makes $W = 0.$ [k] Eventually it was shown that Dirichlet's principle is valid, but it requires a sophisticated application of the regularity theory for elliptic partial differential equations; see Jost and Li–Jost (1998).
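Weierstrass's example can be made concrete with a short Python sketch. It assumes the piecewise linear trial function that equals −1 for $x \leq -\delta,$ rises linearly to 1 on $[-\delta, \delta],$ and equals 1 afterwards; the transition width `delta` and the helper name `weierstrass_W` are introduced here for illustration. For this family the integral can be evaluated by hand as $2\delta/3,$ which the midpoint rule reproduces.

```python
def weierstrass_W(delta, n=20000):
    """Midpoint-rule value of the integral over [-1, 1] of (x * phi'(x))^2 dx
    for the piecewise linear phi that rises from -1 to 1 on [-delta, delta]."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * h
        # phi' is 1/delta inside the transition layer and 0 outside it.
        slope = 1.0 / delta if abs(x) < delta else 0.0
        total += (x * slope) ** 2 * h
    return total

for delta in (0.5, 0.1, 0.01):
    numeric = weierstrass_W(delta)
    exact = 2.0 * delta / 3.0
    print(f"delta = {delta}  numeric = {numeric:.6f}  closed form = {exact:.6f}")
```

The values shrink in proportion to `delta` but stay positive, which matches the statement that the infimum 0 is approached but not attained.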
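A more general expression for the potential energy of a membrane is
$$V[\varphi] = \iint_D \left[\frac{1}{2}\nabla\varphi \cdot \nabla\varphi + f(x, y)\,\varphi\right] dx\,dy + \int_C \left[\frac{1}{2}\sigma(s)\,\varphi^2 + g(s)\,\varphi\right] ds.$$
This corresponds to an external force density $f(x, y)$ in $D,$ an external force $g(s)$ on the boundary $C,$ and elastic forces with modulus $\sigma(s)$ acting on $C.$ The function that minimizes the potential energy with no restriction on its boundary values will be denoted by $u.$ Provided that $f$ and $g$ are continuous, regularity theory implies that the minimizing function $u$ will have two derivatives. In taking the first variation, no boundary condition need be imposed on the increment $v.$ The first variation of $V[u + \varepsilon v]$ is given by
$$\iint_D \left[\nabla u \cdot \nabla v + f v\right] dx\,dy + \int_C \left[\sigma u v + g v\right] ds = 0.$$
If we apply the divergence theorem, the result is
$$\iint_D \left[-v\,\nabla \cdot \nabla u + v f\right] dx\,dy + \int_C v \left[\frac{\partial u}{\partial n} + \sigma u + g\right] ds = 0.$$
If we first set $v = 0$ on $C,$ the boundary integral vanishes, and we conclude as before that
$$-\nabla \cdot \nabla u + f = 0$$
in $D.$ Then if we allow $v$ to assume arbitrary boundary values, this implies that $u$ must satisfy the boundary condition
$$\frac{\partial u}{\partial n} + \sigma u + g = 0$$
on $C.$ This boundary condition is a consequence of the minimizing property of $u$: it is not imposed beforehand. Such conditions are called natural boundary conditions.
The preceding reasoning is not valid if $\sigma$ vanishes identically on $C.$ In such a case, we could allow a trial function $\varphi \equiv c,$ where $c$ is a constant. For such a trial function,
$$V[c] = c\left[\iint_D f\,dx\,dy + \int_C g\,ds\right].$$
By appropriate choice of $c,$ $V$ can assume any value unless the quantity inside the brackets vanishes. Therefore, the variational problem is meaningless unless
$$\iint_D f\,dx\,dy + \int_C g\,ds = 0.$$
This condition implies that net external forces on the system are in equilibrium. If these forces are in equilibrium, then the variational problem has a solution, but it is not unique, since an arbitrary constant may be added. Further details and examples are in Courant and Hilbert (1953).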
Both one-dimensional and multi-dimensional eigenvalue problems can be formulated as variational problems.
The Sturm–Liouville eigenvalue problem involves a general quadratic form
$$Q[y] = \int_{x_1}^{x_2} \left[p(x)\,y'(x)^2 + q(x)\,y(x)^2\right] dx,$$
where $y$ is restricted to functions that satisfy the boundary conditions
$$y(x_1) = 0, \quad y(x_2) = 0.$$
Let $R$ be a normalization integral
$$R[y] = \int_{x_1}^{x_2} r(x)\,y(x)^2\,dx.$$
The functions $p(x)$ and $r(x)$ are required to be everywhere positive and bounded away from zero. The primary variational problem is to minimize the ratio $Q/R$ among all $y$ satisfying the endpoint conditions, which is equivalent to minimizing $Q[y]$ under the constraint that $R[y]$ is constant. It is shown below that the Euler–Lagrange equation for the minimizing $u$ is
$$-(p u')' + q u - \lambda r u = 0,$$
where $\lambda$ is the quotient
$$\lambda = \frac{Q[u]}{R[u]}.$$
It can be shown (see Gelfand and Fomin 1963) that the minimizing $u$ has two derivatives and satisfies the Euler–Lagrange equation. The associated $\lambda$ will be denoted by $\lambda_1$; it is the lowest eigenvalue for this equation and boundary conditions. The associated minimizing function will be denoted by $u_1(x).$ This variational characterization of eigenvalues leads to the Rayleigh–Ritz method: choose an approximating $u$ as a linear combination of basis functions (for example trigonometric functions) and carry out a finite-dimensional minimization among such linear combinations. This method is often surprisingly accurate.
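The Rayleigh–Ritz idea can be sketched in a few lines of Python. The model problem below (coefficients $p = 1,$ $q = 0,$ $r = 1$ on $[0, 1],$ for which the lowest eigenvalue is $\pi^2$), the two polynomial basis functions and all helper names are assumptions chosen for illustration; polynomials are used instead of trigonometric functions so that the approximation is not trivially exact.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3.0

# Assumed model problem: p = 1, q = 0, r = 1 on [0, 1] with y(0) = y(1) = 0.
def p(x): return 1.0
def q(x): return 0.0
def r(x): return 1.0

# Basis functions that vanish at both endpoints, paired with their derivatives.
basis = [
    (lambda x: x * (1 - x), lambda x: 1 - 2 * x),
    (lambda x: (x * (1 - x)) ** 2, lambda x: 2 * x * (1 - x) * (1 - 2 * x)),
]

def q_form(i, j):
    """Bilinear form behind Q, evaluated on basis functions i and j."""
    (fi, dfi), (fj, dfj) = basis[i], basis[j]
    return simpson(lambda x: p(x) * dfi(x) * dfj(x) + q(x) * fi(x) * fj(x), 0.0, 1.0)

def r_form(i, j):
    """Bilinear form behind R, evaluated on basis functions i and j."""
    fi, fj = basis[i][0], basis[j][0]
    return simpson(lambda x: r(x) * fi(x) * fj(x), 0.0, 1.0)

a11, a12, a22 = q_form(0, 0), q_form(0, 1), q_form(1, 1)
b11, b12, b22 = r_form(0, 0), r_form(0, 1), r_form(1, 1)

# One basis function: the ratio Q/R itself.
lam_one_term = a11 / b11

# Two basis functions: minimizing Q/R over their span leads to
# det(A - lambda * B) = 0, a quadratic whose smaller root is the estimate.
qa = b11 * b22 - b12 ** 2
qb = -(a11 * b22 + a22 * b11 - 2 * a12 * b12)
qc = a11 * a22 - a12 ** 2
lam_two_term = (-qb - math.sqrt(qb ** 2 - 4 * qa * qc)) / (2 * qa)

print(f"one-term estimate : {lam_one_term:.6f}")
print(f"two-term estimate : {lam_two_term:.6f}")
print(f"pi^2              : {math.pi ** 2:.6f}")
```

For this model problem the single trial function $x(1 - x)$ gives the ratio 10, and adding the second basis function lowers the estimate to within about $10^{-4}$ of $\pi^2.$ Both estimates lie above the true eigenvalue, as they must for a minimization over a restricted set of trial functions.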
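The next smallest eigenvalue and eigenfunction can be obtained by minimizing $Q$ under the additional constraint
$$\int_{x_1}^{x_2} r(x)\,u_1(x)\,y(x)\,dx = 0.$$
This procedure can be extended to obtain the complete sequence of eigenvalues and eigenfunctions for the problem.
The variational problem also applies to more general boundary conditions. Instead of requiring that $y$ vanish at the endpoints, we may not impose any condition at the endpoints, and set
$$Q[y] = \int_{x_1}^{x_2} \left[p(x)\,y'(x)^2 + q(x)\,y(x)^2\right] dx + a_1\,y(x_1)^2 + a_2\,y(x_2)^2,$$
where $a_1$ and $a_2$ are arbitrary. If we set $y = u + \varepsilon v,$ the first variation for the ratio $Q/R$ is
$$V_1 = \frac{2}{R[u]}\left(\int_{x_1}^{x_2} \left[p(x)\,u'(x)\,v'(x) + q(x)\,u(x)\,v(x) - \lambda\,r(x)\,u(x)\,v(x)\right] dx + a_1\,u(x_1)\,v(x_1) + a_2\,u(x_2)\,v(x_2)\right),$$
where $\lambda$ is given by the ratio $Q[u]/R[u]$ as previously. After integration by parts,
$$\frac{R[u]}{2}\,V_1 = \int_{x_1}^{x_2} v(x)\left[-(p u')' + q u - \lambda r u\right] dx + v(x_1)\left[-p(x_1)\,u'(x_1) + a_1\,u(x_1)\right] + v(x_2)\left[p(x_2)\,u'(x_2) + a_2\,u(x_2)\right].$$
If we first require that $v$ vanish at the endpoints, the first variation will vanish for all such $v$ only if
$$-(p u')' + q u - \lambda r u = 0 \quad \text{for} \quad x_1 < x < x_2.$$
If $u$ satisfies this condition, then the first variation will vanish for arbitrary $v$ only if
$$-p(x_1)\,u'(x_1) + a_1\,u(x_1) = 0 \quad \text{and} \quad p(x_2)\,u'(x_2) + a_2\,u(x_2) = 0.$$
These latter conditions are the natural boundary conditions for this problem, since they are not imposed on trial functions for the minimization, but are instead a consequence of the minimization.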
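Eigenvalue problems in higher dimensions are defined in analogy with the one-dimensional case. For example, given a domain $D$ with boundary $B$ in three dimensions we may define
$$Q[\varphi] = \iiint_D \left[p(X)\,\nabla\varphi \cdot \nabla\varphi + q(X)\,\varphi^2\right] dx\,dy\,dz + \iint_B \sigma(S)\,\varphi^2\,dS,$$
and
$$R[\varphi] = \iiint_D r(X)\,\varphi(X)^2\,dx\,dy\,dz.$$
Let $u$ be the function that minimizes the quotient $Q[\varphi]/R[\varphi],$ with no condition prescribed on the boundary $B.$ The Euler–Lagrange equation satisfied by $u$ is
$$-\nabla \cdot \left(p(X)\,\nabla u\right) + q(X)\,u - \lambda\,r(X)\,u = 0,$$
where
$$\lambda = \frac{Q[u]}{R[u]}.$$
The minimizing $u$ must also satisfy the natural boundary condition
$$p(S)\,\frac{\partial u}{\partial n} + \sigma(S)\,u = 0$$
on the boundary $B.$ This result depends upon the regularity theory for elliptic partial differential equations; see Jost and Li–Jost (1998) for details. Many extensions, including completeness results, asymptotic properties of the eigenvalues and results concerning the nodes of the eigenfunctions are in Courant and Hilbert (1953).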
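Fermat's principle states that light takes a path that (locally) minimizes the optical length between its endpoints. If the $x$-coordinate is chosen as the parameter along the path, and $y = f(x)$ along the path, then the optical length is given by
$$A[f] = \int_{x_0}^{x_1} n(x, f(x))\,\sqrt{1 + f'(x)^2}\,dx,$$
where the refractive index $n(x, y)$ depends upon the material. If we try $f(x) = f_0(x) + \varepsilon f_1(x),$ then the first variation of $A$ (the derivative of $A$ with respect to $\varepsilon$) is
$$\delta A[f_0, f_1] = \int_{x_0}^{x_1} \left[\frac{n(x, f_0)\,f_0'(x)\,f_1'(x)}{\sqrt{1 + f_0'(x)^2}} + n_y(x, f_0)\,f_1\,\sqrt{1 + f_0'(x)^2}\right] dx.$$
After integration by parts of the first term within brackets, we obtain the Euler–Lagrange equation
$$-\frac{d}{dx}\left[\frac{n(x, f_0)\,f_0'}{\sqrt{1 + f_0'^2}}\right] + n_y(x, f_0)\,\sqrt{1 + f_0'(x)^2} = 0.$$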
The light rays may be determined by integrating this equation. This formalism is used in the context of Lagrangian optics and Hamiltonian optics.
There is a discontinuity of the refractive index when light enters or leaves a lens. Let
$$n(x, y) = \begin{cases} n_{(-)} & \text{if} \quad x < 0, \\ n_{(+)} & \text{if} \quad x > 0, \end{cases}$$
where $n_{(-)}$ and $n_{(+)}$ are constants. Then the Euler–Lagrange equation holds as before in the region where $x < 0$ or $x > 0,$ and in fact the path is a straight line there, since the refractive index is constant. At $x = 0,$ $f$ must be continuous, but $f'$ may be discontinuous. After integration by parts in the separate regions and using the Euler–Lagrange equations, the first variation takes the form
$$\delta A[f_0, f_1] = f_1(0)\left[n_{(-)}\,\frac{f_0'(0^-)}{\sqrt{1 + f_0'(0^-)^2}} - n_{(+)}\,\frac{f_0'(0^+)}{\sqrt{1 + f_0'(0^+)^2}}\right].$$
The factor multiplying $n_{(-)}$ is the sine of the angle of the incident ray with the $x$ axis, and the factor multiplying $n_{(+)}$ is the sine of the angle of the refracted ray with the $x$ axis. Snell's law for refraction requires that these terms be equal. As this calculation demonstrates, Snell's law is equivalent to vanishing of the first variation of the optical path length.
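The equivalence can be checked numerically. The Python sketch below minimizes the optical length of a two-segment path across an interface at $x = 0$ and then compares the two sides of Snell's law. The refractive indices, the endpoints and the golden-section routine are assumptions introduced for illustration.

```python
import math

# Hypothetical setup: interface at x = 0, source at (-1, 0), target at (1, 1).
n_minus, n_plus = 1.0, 1.5
ax, ay = -1.0, 0.0
bx, by = 1.0, 1.0

def optical_length(y0):
    """Optical length of the two-segment path crossing the interface at (0, y0)."""
    return (n_minus * math.hypot(0.0 - ax, y0 - ay)
            + n_plus * math.hypot(bx - 0.0, by - y0))

def golden_section_min(g, lo, hi, tol=1e-10):
    """Minimize a unimodal function g on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if g(c) < g(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

y0 = golden_section_min(optical_length, ay, by)

# Slopes of the two straight segments on either side of the interface.
slope_minus = (y0 - ay) / (0.0 - ax)
slope_plus = (by - y0) / (bx - 0.0)

# Sines of the angles that the two rays make with the x axis.
sin_minus = slope_minus / math.sqrt(1.0 + slope_minus ** 2)
sin_plus = slope_plus / math.sqrt(1.0 + slope_plus ** 2)

print(f"crossing height y0   = {y0:.6f}")
print(f"n_minus * sin(angle) = {n_minus * sin_minus:.6f}")
print(f"n_plus  * sin(angle) = {n_plus * sin_plus:.6f}")
```

The two printed products agree to within the accuracy of the one-dimensional search, which is limited to roughly the square root of machine precision because the optical length is flat near its minimum.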
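It is expedient to use vector notation: let $X = (x_1, x_2, x_3),$ let $t$ be a parameter, let $X(t)$ be the parametric representation of a curve $C,$ and let $\dot{X}(t)$ be its tangent vector. The optical length of the curve is given by
$$A[C] = \int_{t_0}^{t_1} n(X)\,\sqrt{\dot{X} \cdot \dot{X}}\,dt.$$
Note that this integral is invariant with respect to changes in the parametric representation of $C.$ The Euler–Lagrange equations for a minimizing curve have the symmetric form
$$\frac{d}{dt} P = \sqrt{\dot{X} \cdot \dot{X}}\,\nabla n,$$
where
$$P = \frac{n(X)\,\dot{X}}{\sqrt{\dot{X} \cdot \dot{X}}}.$$
It follows from the definition that $P$ satisfies
$$P \cdot P = n(X)^2.$$
Therefore, the integral may also be written as
$$A[C] = \int_{t_0}^{t_1} P \cdot \dot{X}\,dt.$$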
This form suggests that if we can find a function $\psi$ whose gradient is given by $P,$ then the integral $A$ is given by the difference of $\psi$ at the endpoints of the interval of integration. Thus the problem of studying the curves that make the integral stationary can be related to the study of the level surfaces of $\psi.$ In order to find such a function, we turn to the wave equation, which governs the propagation of light. This formalism is used in the context of Lagrangian optics and Hamiltonian optics.
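The wave equation for an inhomogeneous medium is
$$u_{tt} = c^2\,\nabla \cdot \nabla u,$$
where $c$ is the velocity, which generally depends upon $X.$ Wave fronts for light are characteristic surfaces for this partial differential equation: they satisfy
$$\varphi_t^2 = c(X)^2\,\nabla\varphi \cdot \nabla\varphi.$$
We may look for solutions in the form
$$\varphi(t, X) = t - \psi(X).$$
In that case, $\psi$ satisfies
$$\nabla\psi \cdot \nabla\psi = n^2,$$
where $n = 1/c.$ According to the theory of first-order partial differential equations, if $P = \nabla\psi,$ then $P$ satisfies
$$\frac{dP}{ds} = n\,\nabla n$$
along a system of curves (the light rays) that are given by
$$\frac{dX}{ds} = P.$$
These equations for solution of a first-order partial differential equation are identical to the Euler–Lagrange equations if we make the identification
$$\frac{ds}{dt} = \frac{\sqrt{\dot{X} \cdot \dot{X}}}{n}.$$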
We conclude that the function $\psi$ is the value of the minimizing integral $A$ as a function of the upper end point. That is, when a family of minimizing curves is constructed, the values of the optical length satisfy the characteristic equation corresponding to the wave equation. Hence, solving the associated partial differential equation of first order is equivalent to finding families of solutions of the variational problem. This is the essential content of the Hamilton–Jacobi theory, which applies to more general variational problems.
In classical mechanics, the action, $S,$ is defined as the time integral of the Lagrangian, $L.$ The Lagrangian is the difference of energies,
$$L = T - U,$$
where $T$ is the kinetic energy of a mechanical system and $U$ its potential energy. Hamilton's principle (or the action principle) states that the motion of a conservative holonomic (integrable constraints) mechanical system is such that the action integral
$$S = \int_{t_0}^{t_1} L(x, \dot{x}, t)\,dt$$
is stationary with respect to variations in the path $x(t).$ The Euler–Lagrange equations for this system are known as Lagrange's equations:
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x},$$
and they are equivalent to Newton's equations of motion (for such systems).
The conjugate momenta $p$ are defined by
$$p = \frac{\partial L}{\partial \dot{x}}.$$
For example, if
$$T = \frac{1}{2} m \dot{x}^2,$$
then
$$p = m \dot{x}.$$
Hamiltonian mechanics results if the conjugate momenta are introduced in place of $\dot{x}$ by a Legendre transformation of the Lagrangian $L$ into the Hamiltonian $H$ defined by
$$H(x, p, t) = p\,\dot{x} - L(x, \dot{x}, t).$$
The Hamiltonian is the total energy of the system: $H = T + U.$ Analogy with Fermat's principle suggests that solutions of Lagrange's equations (the particle trajectories) may be described in terms of level surfaces of some function of $x.$ This function is a solution of the Hamilton–Jacobi equation:
$$\frac{\partial \psi}{\partial t} + H\left(x, \frac{\partial \psi}{\partial x}, t\right) = 0.$$
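As a brief worked case, assume a single particle of mass $m$ moving in one dimension with potential energy $U(x)$ (a setting chosen here only for illustration). Lagrange's equation then reduces to Newton's second law, and the Legendre transformation gives the total energy:

$$\begin{aligned}
L &= \tfrac{1}{2} m \dot{x}^2 - U(x), \qquad \frac{\partial L}{\partial \dot{x}} = m \dot{x}, \qquad \frac{\partial L}{\partial x} = -\frac{dU}{dx}, \\
\frac{d}{dt}\left(m \dot{x}\right) &= -\frac{dU}{dx} \quad \Longrightarrow \quad m \ddot{x} = -\frac{dU}{dx}, \\
H &= p\,\dot{x} - L = \frac{p^2}{m} - \frac{p^2}{2m} + U(x) = \frac{p^2}{2m} + U(x) = T + U .
\end{aligned}$$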
Further applications of the calculus of variations include the following:
Calculus of variations is concerned with variations of functionals, which are small changes in the functional's value due to small changes in the function that is its argument. The first variation [l] is defined as the linear part of the change in the functional, and the second variation [m] is defined as the quadratic part. [22]
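For example, if $J[y]$ is a functional with the function $y = y(x)$ as its argument, and there is a small change in its argument from $y$ to $y + h,$ where $h = h(x)$ is a function in the same function space as $y,$ then the corresponding change in the functional is [n]
$$\Delta J[h] = J[y + h] - J[y].$$
The functional $J[y]$ is said to be differentiable if
$$\Delta J[h] = \varphi[h] + \varepsilon \|h\|,$$
where $\varphi[h]$ is a linear functional, [o] $\|h\|$ is the norm of $h,$ [p] and $\varepsilon \to 0$ as $\|h\| \to 0.$ The linear functional $\varphi[h]$ is the first variation of $J[y]$ and is denoted by, [26]
$$\delta J[h] = \varphi[h].$$
The functional $J[y]$ is said to be twice differentiable if
$$\Delta J[h] = \varphi_1[h] + \varphi_2[h] + \varepsilon \|h\|^2,$$
where $\varphi_1[h]$ is a linear functional (the first variation), $\varphi_2[h]$ is a quadratic functional, [q] and $\varepsilon \to 0$ as $\|h\| \to 0.$ The quadratic functional $\varphi_2[h]$ is the second variation of $J[y]$ and is denoted by, [28]
$$\delta^2 J[h] = \varphi_2[h].$$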
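A simple case in which both variations can be read off directly is a functional that is quadratic in the derivative. The functional below and the interval $[a, b]$ are chosen here purely for illustration:

$$\begin{aligned}
J[y] &= \int_a^b y'(x)^2\,dx, \\
\Delta J[h] = J[y + h] - J[y] &= \underbrace{2\int_a^b y'(x)\,h'(x)\,dx}_{\text{linear in } h} + \underbrace{\int_a^b h'(x)^2\,dx}_{\text{quadratic in } h} .
\end{aligned}$$

Here the first variation is the linear term, the second variation is the quadratic term, and the remainder vanishes identically.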
The second variation $\delta^2 J[h]$ is said to be strongly positive if
$$\delta^2 J[h] \geq k \|h\|^2$$
for all $h$ and for some constant $k > 0.$ [29]
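Using the above definitions, especially the definitions of first variation, second variation, and strongly positive, the following sufficient condition for a minimum of a functional can be stated: the functional $J[y]$ has a minimum at $y = \hat{y}$ if its first variation $\delta J[h] = 0$ at $y = \hat{y}$ and its second variation $\delta^2 J[h]$ is strongly positive at $y = \hat{y}.$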