Functional derivative

In the calculus of variations, a field of mathematical analysis, the functional derivative (or variational derivative) [1] relates a change in a functional (a functional in this sense is a function that acts on functions) to a change in a function on which the functional depends.

In the calculus of variations, functionals are usually expressed in terms of an integral of functions, their arguments, and their derivatives. In an integrand L of a functional, if a function f is varied by adding to it another function δf that is arbitrarily small, and the resulting integrand is expanded in powers of δf, the coefficient of δf in the first order term is called the functional derivative.

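For example, consider the functional

$$J[f] = \int_a^b L(x, f(x), f'(x))\,dx,$$

where $f'(x) \equiv df/dx$. If $f$ is varied by adding to it a function $\delta f$, and the resulting integrand $L(x, f+\delta f, f'+\delta f')$ is expanded in powers of $\delta f$, then the change in the value of $J$ to first order in $\delta f$ can be expressed as follows: [1] [Note 1]

$$\delta J = \int_a^b \left(\frac{\partial L}{\partial f}\,\delta f(x) + \frac{\partial L}{\partial f'}\,\frac{d}{dx}\delta f(x)\right)dx = \int_a^b \left(\frac{\partial L}{\partial f} - \frac{d}{dx}\frac{\partial L}{\partial f'}\right)\delta f(x)\,dx + \frac{\partial L}{\partial f'}(b)\,\delta f(b) - \frac{\partial L}{\partial f'}(a)\,\delta f(a),$$

where the variation in the derivative, $\delta f'$, was rewritten as the derivative of the variation, $(\delta f)'$, and integration by parts was applied to the term containing it.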

Definition

In this section, the functional differential (or variation or first variation) [Note 2] is defined. Then the functional derivative is defined in terms of the functional differential.

Functional differential

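Suppose $B$ is a Banach space and $F$ is a functional defined on $B$. The differential of $F$ at a point $\rho \in B$ is the linear functional $\delta F[\rho, \cdot]$ on $B$ defined [2] by the condition that, for all $\phi \in B$,

$$F[\rho + \phi] - F[\rho] = \delta F[\rho, \phi] + \varepsilon\,\|\phi\|,$$

where $\varepsilon$ is a real number that depends on $\|\phi\|$ in such a way that $\varepsilon \to 0$ as $\|\phi\| \to 0$. This means that $\delta F[\rho, \cdot]$ is the Fréchet derivative of $F$ at $\rho$.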

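However, this notion of functional differential is so strong that it may not exist, [3] and in those cases a weaker notion, such as the Gateaux derivative, is preferred. In many practical cases, the functional differential is defined [4] as the directional derivative

$$\delta F[\rho, \phi] = \lim_{\varepsilon \to 0} \frac{F[\rho + \varepsilon\phi] - F[\rho]}{\varepsilon} = \left[\frac{d}{d\varepsilon} F[\rho + \varepsilon\phi]\right]_{\varepsilon = 0}.$$

Note that this notion of the functional differential can even be defined without a norm.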
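As a numerical illustration of the directional-derivative definition, the following Python sketch estimates $\delta F[\rho,\phi]$ by a central difference in $\varepsilon$. The functional $F[\rho] = \int_0^1 \rho(x)^3\,dx$, the grid, and the particular $\rho$ and $\phi$ are assumptions chosen for illustration and are not taken from the cited sources. For this functional the definition gives $\delta F[\rho,\phi] = \int_0^1 3\rho(x)^2\phi(x)\,dx$, which the code uses as the comparison value.

```python
import math

def integrate(values, dx):
    """Midpoint-rule approximation of an integral from samples on a uniform grid."""
    return sum(values) * dx

def gateaux_derivative(F, rho, phi, eps=1e-5):
    """Central-difference estimate of d/d(eps) F[rho + eps*phi] at eps = 0."""
    plus = [r + eps * p for r, p in zip(rho, phi)]
    minus = [r - eps * p for r, p in zip(rho, phi)]
    return (F(plus) - F(minus)) / (2.0 * eps)

n = 1000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]          # cell midpoints of [0, 1]
rho = [1.0 + math.sin(math.pi * x) for x in xs]  # illustrative point rho
phi = [math.cos(3.0 * x) for x in xs]            # illustrative direction phi

def F(f):
    """Illustrative functional F[f] = integral of f(x)^3 over [0, 1]."""
    return integrate([v ** 3 for v in f], dx)

numeric = gateaux_derivative(F, rho, phi)
analytic = integrate([3.0 * r * r * p for r, p in zip(rho, phi)], dx)
print(numeric, analytic)
```

Only values of $F$ along the line $\rho + \varepsilon\phi$ are needed, which reflects the remark that this notion of differential does not require a norm.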

Functional derivative

In many applications, the domain of the functional $F$ is a space of differentiable functions $\rho$ defined on some space $\Omega$, and $F$ is of the form

$$F[\rho] = \int_\Omega L(x, \rho(x), D\rho(x))\,dx$$

for some function $L(x, \rho(x), D\rho(x))$ that may depend on $x$, the value $\rho(x)$ and the derivative $D\rho(x)$. If this is the case and, moreover, $\delta F[\rho, \phi]$ can be written as the integral of $\phi$ times another function (denoted δF/δρ),

$$\delta F[\rho, \phi] = \int_\Omega \frac{\delta F}{\delta\rho}(x)\,\phi(x)\,dx,$$

then this function δF/δρ is called the functional derivative of F at ρ. [5] [6] If $F$ is restricted to only certain functions $\rho$ (for example, if there are some boundary conditions imposed) then $\phi$ is restricted to functions such that $\rho + \varepsilon\phi$ continues to satisfy these conditions.

Heuristically, $\phi$ is the change in $\rho$, so we 'formally' have $\phi = \delta\rho$, and then this is similar in form to the total differential of a function $F(\rho_1, \rho_2, \dots, \rho_n)$,

$$dF = \sum_{i=1}^n \frac{\partial F}{\partial\rho_i}\,d\rho_i,$$

where $\rho_1, \rho_2, \dots, \rho_n$ are independent variables. Comparing the last two equations, the functional derivative $\delta F/\delta\rho(x)$ has a role similar to that of the partial derivative $\partial F/\partial\rho_i$, where the variable of integration $x$ is like a continuous version of the summation index $i$. [7] One thinks of δF/δρ as the gradient of F at the point ρ, so the value δF/δρ(x) measures how much the functional F will change if the function ρ is changed at the point x. Hence the formula

$$\int \frac{\delta F}{\delta\rho}(x)\,\phi(x)\,dx$$

is regarded as the directional derivative at the point $\rho$ in the direction of $\phi$. This is analogous to vector calculus, where the inner product of a vector $v$ with the gradient gives the directional derivative in the direction of $v$.

Properties

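Like the derivative of a function, the functional derivative satisfies the following properties, where F[ρ] and G[ρ] are functionals: [Note 3]

Linearity: [8]

$$\frac{\delta(\lambda F + \mu G)[\rho]}{\delta\rho(x)} = \lambda\,\frac{\delta F[\rho]}{\delta\rho(x)} + \mu\,\frac{\delta G[\rho]}{\delta\rho(x)},$$

where $\lambda$ and $\mu$ are constants.

Product rule: [9]

$$\frac{\delta(FG)[\rho]}{\delta\rho(x)} = \frac{\delta F[\rho]}{\delta\rho(x)}\,G[\rho] + F[\rho]\,\frac{\delta G[\rho]}{\delta\rho(x)}.$$

Chain rules: if $F$ is a functional and $G$ is another functional, then [10]

$$\frac{\delta F[G[\rho]]}{\delta\rho(y)} = \int dx\,\left.\frac{\delta F[G]}{\delta G(x)}\right|_{G = G[\rho]} \frac{\delta G[\rho](x)}{\delta\rho(y)}.$$

If $G$ is an ordinary differentiable function (local functional) $g$, then this reduces to [11]

$$\frac{\delta F[g(\rho)]}{\delta\rho(y)} = \frac{\delta F[g(\rho)]}{\delta g[\rho(y)]}\,\frac{dg(\rho)}{d\rho(y)}.$$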

Determining functional derivatives

A formula for the functional derivative can be given for a common class of functionals, namely those that can be written as the integral of a function and its derivatives. This is a generalization of the Euler–Lagrange equation: indeed, the functional derivative was introduced in physics within the derivation of the Lagrange equation of the second kind from the principle of least action in Lagrangian mechanics (18th century). The first three examples below are taken from density functional theory (20th century), the fourth from statistical mechanics (19th century).

Formula

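Given a functional

$$F[\rho] = \int f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}))\,d\mathbf{r}$$

and a function $\phi(\mathbf{r})$ that vanishes on the boundary of the region of integration, the definition given in the Definition section yields

$$\begin{aligned}
\int \frac{\delta F}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r}
&= \left[\frac{d}{d\varepsilon}\int f(\mathbf{r}, \rho + \varepsilon\phi, \nabla\rho + \varepsilon\nabla\phi)\,d\mathbf{r}\right]_{\varepsilon = 0} \\
&= \int \left(\frac{\partial f}{\partial\rho}\,\phi + \frac{\partial f}{\partial\nabla\rho}\cdot\nabla\phi\right)d\mathbf{r} \\
&= \int \left[\frac{\partial f}{\partial\rho}\,\phi + \nabla\cdot\left(\frac{\partial f}{\partial\nabla\rho}\,\phi\right) - \left(\nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi\right]d\mathbf{r} \\
&= \int \left[\frac{\partial f}{\partial\rho}\,\phi - \left(\nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi\right]d\mathbf{r} \\
&= \int \left(\frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial\nabla\rho}\right)\phi(\mathbf{r})\,d\mathbf{r}.
\end{aligned}$$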

The second line is obtained using the total derivative, where ∂f /∂∇ρ is a derivative of a scalar with respect to a vector. [Note 4]

The third line was obtained by use of a product rule for divergence. The fourth line was obtained using the divergence theorem and the condition that $\phi = 0$ on the boundary of the region of integration. Since $\phi$ is also an arbitrary function, applying the fundamental lemma of calculus of variations to the last line, the functional derivative is

$$\frac{\delta F}{\delta\rho(\mathbf{r})} = \frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial\nabla\rho},$$

where $\rho = \rho(\mathbf{r})$ and $f = f(\mathbf{r}, \rho, \nabla\rho)$. This formula is for the case of the functional form given by F[ρ] at the beginning of this section. For other functional forms, the definition of the functional derivative can be used as the starting point for its determination. (See the example Coulomb potential energy functional.)
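The formula $\delta F/\delta\rho = \partial f/\partial\rho - \nabla\cdot\,\partial f/\partial\nabla\rho$ can be checked numerically in one dimension. The sketch below assumes the illustrative integrand $f = \tfrac12\rho'^2 + \tfrac14\rho^4$ on $[0,1]$ (not an example from the cited sources), for which the formula gives $\rho^3 - \rho''$. It compares $\frac{d}{d\varepsilon}F[\rho+\varepsilon\phi]$ at $\varepsilon = 0$, estimated by a central difference, with $\int(\rho^3 - \rho'')\,\phi\,dx$ for a test function $\phi$ that vanishes at both end points. The grid size and the choices of $\rho$ and $\phi$ are likewise assumptions.

```python
import math

n = 2000
h = 1.0 / n
xs = [i * h for i in range(n + 1)]                            # grid on [0, 1]
rho = [2.0 + math.sin(math.pi * x) for x in xs]               # illustrative rho
rho_xx = [-math.pi ** 2 * math.sin(math.pi * x) for x in xs]  # exact second derivative of rho
phi = [(x * (1.0 - x)) ** 2 for x in xs]                      # test function, zero at both ends

def trapezoid(values):
    """Trapezoidal rule on the uniform grid."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def F(f):
    """F[f] = integral of (1/2) f'(x)^2 + (1/4) f(x)^4, with f' from finite differences."""
    gradient_part = sum(((f[i + 1] - f[i]) / h) ** 2 for i in range(n)) * h / 2.0
    local_part = trapezoid([v ** 4 / 4.0 for v in f])
    return gradient_part + local_part

eps = 1e-5
plus = [r + eps * p for r, p in zip(rho, phi)]
minus = [r - eps * p for r, p in zip(rho, phi)]
lhs = (F(plus) - F(minus)) / (2.0 * eps)   # d/d(eps) F[rho + eps*phi] at eps = 0

# Integral of (df/drho - d/dx df/drho') * phi, which here is (rho^3 - rho'') * phi.
rhs = trapezoid([(r ** 3 - rxx) * p for r, rxx, p in zip(rho, rho_xx, phi)])
print(lhs, rhs)
```

The two numbers agree up to discretization error only because $\phi$ vanishes on the boundary; with a $\phi$ that is nonzero at an end point, the boundary term from the integration by parts would have to be added.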

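The above equation for the functional derivative can be generalized to the case that includes higher dimensions and higher order derivatives. The functional would be

$$F[\rho(\mathbf{r})] = \int f(\mathbf{r}, \rho(\mathbf{r}), \nabla\rho(\mathbf{r}), \nabla^{(2)}\rho(\mathbf{r}), \dots, \nabla^{(N)}\rho(\mathbf{r}))\,d\mathbf{r},$$

where the vector $\mathbf{r} \in \mathbb{R}^n$, and $\nabla^{(i)}$ is a tensor whose $n^i$ components are partial derivative operators of order $i$, [Note 5]

$$\left[\nabla^{(i)}\right]_{\alpha_1\alpha_2\cdots\alpha_i} = \frac{\partial^{\,i}}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}} \qquad\text{where}\qquad \alpha_1, \alpha_2, \dots, \alpha_i = 1, 2, \dots, n.$$

An analogous application of the definition of the functional derivative yields

$$\begin{aligned}
\frac{\delta F[\rho]}{\delta\rho}
&= \frac{\partial f}{\partial\rho} - \nabla\cdot\frac{\partial f}{\partial(\nabla\rho)} + \nabla^{(2)}\cdot\frac{\partial f}{\partial(\nabla^{(2)}\rho)} + \dots + (-1)^N\,\nabla^{(N)}\cdot\frac{\partial f}{\partial(\nabla^{(N)}\rho)} \\
&= \frac{\partial f}{\partial\rho} + \sum_{i=1}^N (-1)^i\,\nabla^{(i)}\cdot\frac{\partial f}{\partial(\nabla^{(i)}\rho)}.
\end{aligned}$$

In the last two equations, the $n^i$ components of the tensor $\partial f/\partial(\nabla^{(i)}\rho)$ are partial derivatives of $f$ with respect to partial derivatives of $\rho$,

$$\left[\frac{\partial f}{\partial(\nabla^{(i)}\rho)}\right]_{\alpha_1\alpha_2\cdots\alpha_i} = \frac{\partial f}{\partial\rho_{\alpha_1\alpha_2\cdots\alpha_i}} \qquad\text{where}\qquad \rho_{\alpha_1\alpha_2\cdots\alpha_i} \equiv \frac{\partial^{\,i}\rho}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}},$$

and the tensor scalar product is [Note 6]

$$\nabla^{(i)}\cdot\frac{\partial f}{\partial(\nabla^{(i)}\rho)} = \sum_{\alpha_1,\alpha_2,\dots,\alpha_i = 1}^n \frac{\partial^{\,i}}{\partial r_{\alpha_1}\partial r_{\alpha_2}\cdots\partial r_{\alpha_i}}\,\frac{\partial f}{\partial\rho_{\alpha_1\alpha_2\cdots\alpha_i}}.$$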

Examples

Thomas–Fermi kinetic energy functional

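The Thomas–Fermi model of 1927 used a kinetic energy functional for a noninteracting uniform electron gas in a first attempt at a density-functional theory of electronic structure:

$$T_\mathrm{TF}[\rho] = C_\mathrm{F}\int \rho^{5/3}(\mathbf{r})\,d\mathbf{r}.$$

Since the integrand of $T_\mathrm{TF}[\rho]$ does not involve derivatives of $\rho(\mathbf{r})$, the functional derivative of $T_\mathrm{TF}[\rho]$ is [12]

$$\frac{\delta T_\mathrm{TF}}{\delta\rho(\mathbf{r})} = C_\mathrm{F}\,\frac{\partial\rho^{5/3}(\mathbf{r})}{\partial\rho(\mathbf{r})} = \frac{5}{3}\,C_\mathrm{F}\,\rho^{2/3}(\mathbf{r}).$$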
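The result $\delta T_\mathrm{TF}/\delta\rho = \tfrac53 C_\mathrm{F}\rho^{2/3}$ can be illustrated with the partial-derivative analogy. On a grid with cell size $\Delta x$ the functional becomes an ordinary function of the values $\rho_i$, and $(1/\Delta x)\,\partial T/\partial\rho_i$ approximates the functional derivative at $x_i$. The sketch below makes several assumptions for illustration: a one-dimensional grid in place of three-dimensional space, a made-up positive density, and the atomic-units value $C_\mathrm{F} = \tfrac{3}{10}(3\pi^2)^{2/3}$, which the text above does not state.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)  # Thomas-Fermi constant, atomic units (assumed)

n = 200
dx = 4.0 / n
xs = [-2.0 + (i + 0.5) * dx for i in range(n)]   # cell midpoints of [-2, 2]
rho = [math.exp(-x * x) + 0.1 for x in xs]       # illustrative positive density

def T_TF(density):
    """Discretized T_TF[rho] = C_F * integral of rho^(5/3), on a 1D grid for illustration."""
    return C_F * sum(d ** (5.0 / 3.0) for d in density) * dx

def functional_derivative_at(i, eps=1e-4):
    """(1/dx) * dT/d(rho_i): vary rho in grid cell i only, by central difference."""
    plus = list(rho)
    minus = list(rho)
    plus[i] += eps
    minus[i] -= eps
    return (T_TF(plus) - T_TF(minus)) / (2.0 * eps * dx)

i = n // 2
numeric = functional_derivative_at(i)
analytic = (5.0 / 3.0) * C_F * rho[i] ** (2.0 / 3.0)
print(numeric, analytic)
```

The division by `dx` is what turns the ordinary partial derivative with respect to one grid value into a density in $x$, mirroring the replacement of the sum over $i$ by an integral over $x$.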

Coulomb potential energy functional

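For the electron-nucleus potential, Thomas and Fermi employed the Coulomb potential energy functional

$$V[\rho] = \int \frac{\rho(\mathbf{r})}{|\mathbf{r}|}\,d\mathbf{r}.$$

Applying the definition of functional derivative,

$$\int \frac{\delta V}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r} = \left[\frac{d}{d\varepsilon}\int \frac{\rho(\mathbf{r}) + \varepsilon\phi(\mathbf{r})}{|\mathbf{r}|}\,d\mathbf{r}\right]_{\varepsilon = 0} = \int \frac{1}{|\mathbf{r}|}\,\phi(\mathbf{r})\,d\mathbf{r}.$$

So,

$$\frac{\delta V}{\delta\rho(\mathbf{r})} = \frac{1}{|\mathbf{r}|}.$$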

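For the classical part of the electron-electron interaction, Thomas and Fermi employed the Coulomb potential energy functional

$$J[\rho] = \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'.$$

From the definition of the functional derivative,

$$\begin{aligned}
\int \frac{\delta J}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r}
&= \left[\frac{d}{d\varepsilon}\,J[\rho + \varepsilon\phi]\right]_{\varepsilon = 0} \\
&= \left[\frac{d}{d\varepsilon}\left(\frac{1}{2}\iint \frac{[\rho(\mathbf{r}) + \varepsilon\phi(\mathbf{r})]\,[\rho(\mathbf{r}') + \varepsilon\phi(\mathbf{r}')]}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'\right)\right]_{\varepsilon = 0} \\
&= \frac{1}{2}\iint \frac{\rho(\mathbf{r}')\,\phi(\mathbf{r})}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' + \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\phi(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}'.
\end{aligned}$$

The first and second terms on the right hand side of the last equation are equal, since $\mathbf{r}$ and $\mathbf{r}'$ in the second term can be interchanged without changing the value of the integral. Therefore,

$$\int \frac{\delta J}{\delta\rho(\mathbf{r})}\,\phi(\mathbf{r})\,d\mathbf{r} = \int \left(\int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}'\right)\phi(\mathbf{r})\,d\mathbf{r},$$

and the functional derivative of the electron-electron Coulomb potential energy functional $J[\rho]$ is [13]

$$\frac{\delta J}{\delta\rho(\mathbf{r})} = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}'.$$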

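The second functional derivative is

$$\frac{\delta^2 J[\rho]}{\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r})} = \frac{\partial}{\partial\rho(\mathbf{r}')}\left(\frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\right) = \frac{1}{|\mathbf{r} - \mathbf{r}'|}.$$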
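The role of the symmetry under interchange of $\mathbf{r}$ and $\mathbf{r}'$ can be seen in a discretized version of $J[\rho]$. The sketch below is only an illustration under stated assumptions: it works on a one-dimensional grid and replaces the singular kernel $1/|\mathbf{r}-\mathbf{r}'|$ by a softened, symmetric stand-in $1/\sqrt{(x-y)^2 + a^2}$ with a hypothetical parameter `a`, so that the diagonal terms are finite. It compares $(1/\Delta x)\,\partial J/\partial\rho_i$ with the discretized $\int K(x_i, y)\,\rho(y)\,dy$.

```python
import math

n = 60
dx = 6.0 / n
xs = [-3.0 + (i + 0.5) * dx for i in range(n)]   # cell midpoints of [-3, 3]
rho = [math.exp(-x * x) for x in xs]             # illustrative density

def kernel(x, y, a=0.5):
    """Softened 1D stand-in for 1/|r - r'| (assumption); symmetric in x and y."""
    return 1.0 / math.sqrt((x - y) ** 2 + a * a)

K = [[kernel(x, y) for y in xs] for x in xs]

def J(density):
    """Discretized J[rho] = (1/2) * double integral of rho(x) K(x, y) rho(y)."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += density[i] * K[i][j] * density[j]
    return 0.5 * total * dx * dx

def numeric_derivative_at(i, eps=1e-4):
    """(1/dx) * dJ/d(rho_i) by central difference."""
    plus = list(rho)
    minus = list(rho)
    plus[i] += eps
    minus[i] -= eps
    return (J(plus) - J(minus)) / (2.0 * eps * dx)

i = 20
numeric = numeric_derivative_at(i)
analytic = sum(K[i][j] * rho[j] for j in range(n)) * dx  # integral of K(x_i, y) rho(y) dy
print(numeric, analytic)
```

The factor $\tfrac12$ in $J$ is cancelled because $\rho_i$ appears once as the left factor and once as the right factor of the double sum, which is the discrete counterpart of the two equal terms in the derivation.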

Weizsäcker kinetic energy functional

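In 1935 von Weizsäcker proposed adding a gradient correction to the Thomas–Fermi kinetic energy functional to make it better suit a molecular electron cloud:

$$T_\mathrm{W}[\rho] = \frac{1}{8}\int \frac{\nabla\rho(\mathbf{r})\cdot\nabla\rho(\mathbf{r})}{\rho(\mathbf{r})}\,d\mathbf{r} = \int t_\mathrm{W}\,d\mathbf{r},$$

where

$$t_\mathrm{W} \equiv \frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho} \qquad\text{and}\qquad \rho = \rho(\mathbf{r}).$$

Using the previously derived formula for the functional derivative,

$$\begin{aligned}
\frac{\delta T_\mathrm{W}}{\delta\rho(\mathbf{r})}
&= \frac{\partial t_\mathrm{W}}{\partial\rho} - \nabla\cdot\frac{\partial t_\mathrm{W}}{\partial\nabla\rho} \\
&= -\frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2} - \left(\frac{1}{4}\,\frac{\nabla^2\rho}{\rho} - \frac{1}{4}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2}\right) \qquad\text{where}\qquad \nabla^2 = \nabla\cdot\nabla,
\end{aligned}$$

and the result is [14]

$$\frac{\delta T_\mathrm{W}}{\delta\rho(\mathbf{r})} = \frac{1}{8}\,\frac{\nabla\rho\cdot\nabla\rho}{\rho^2} - \frac{1}{4}\,\frac{\nabla^2\rho}{\rho}.$$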

Entropy

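The entropy of a discrete random variable is a functional of the probability mass function:

$$H[p(x)] = -\sum_x p(x)\log p(x).$$

Thus,

$$\begin{aligned}
\sum_x \frac{\delta H}{\delta p(x)}\,\phi(x)
&= \left[\frac{d}{d\varepsilon}\,H[p(x) + \varepsilon\phi(x)]\right]_{\varepsilon = 0} \\
&= \left[-\frac{d}{d\varepsilon}\sum_x \left[p(x) + \varepsilon\phi(x)\right]\log\left[p(x) + \varepsilon\phi(x)\right]\right]_{\varepsilon = 0} \\
&= -\sum_x \left[1 + \log p(x)\right]\phi(x).
\end{aligned}$$

Therefore,

$$\frac{\delta H}{\delta p(x)} = -1 - \log p(x).$$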
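Because the entropy is defined by a sum rather than an integral, the functional derivative here is an ordinary gradient, and the result $\delta H/\delta p(x) = -1 - \log p(x)$ can be checked directly. The sketch below uses a made-up four-point probability mass function (an assumption for illustration) and the natural logarithm.

```python
import math

p = [0.1, 0.2, 0.3, 0.4]   # illustrative probability mass function

def entropy(q):
    """H[q] = -sum_x q(x) log q(x), natural logarithm."""
    return -sum(v * math.log(v) for v in q)

def numeric_derivative_at(i, eps=1e-6):
    """Central difference of H with respect to p(x_i).

    The entries are varied independently; normalization of p is not enforced.
    """
    plus = list(p)
    minus = list(p)
    plus[i] += eps
    minus[i] -= eps
    return (entropy(plus) - entropy(minus)) / (2.0 * eps)

numeric = [numeric_derivative_at(i) for i in range(len(p))]
analytic = [-1.0 - math.log(v) for v in p]
print(numeric)
print(analytic)
```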

Exponential

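Let

$$F[\varphi(x)] = e^{\int \varphi(x)\,g(x)\,dx}.$$

Using the delta function as a test function,

$$\begin{aligned}
\frac{\delta F[\varphi(x)]}{\delta\varphi(y)}
&= \lim_{\varepsilon \to 0}\frac{F[\varphi(x) + \varepsilon\delta(x - y)] - F[\varphi(x)]}{\varepsilon} \\
&= \lim_{\varepsilon \to 0}\frac{e^{\int(\varphi(x) + \varepsilon\delta(x - y))\,g(x)\,dx} - e^{\int\varphi(x)\,g(x)\,dx}}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,\lim_{\varepsilon \to 0}\frac{e^{\varepsilon\int\delta(x - y)\,g(x)\,dx} - 1}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,\lim_{\varepsilon \to 0}\frac{e^{\varepsilon g(y)} - 1}{\varepsilon} \\
&= e^{\int\varphi(x)\,g(x)\,dx}\,g(y).
\end{aligned}$$

Thus,

$$\frac{\delta F[\varphi(x)]}{\delta\varphi(y)} = g(y)\,F[\varphi(x)].$$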

This is particularly useful in calculating the correlation functions from the partition function in quantum field theory.
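The delta-function recipe for the exponential functional, which leads to $\delta F/\delta\varphi(y) = g(y)\,F[\varphi]$, can be imitated on a grid, where the delta function becomes a spike of height $1/\Delta x$ in a single cell. The sketch below is a minimal illustration under assumptions: the functions $\varphi$ and $g$, the interval $[0,1]$ and the grid size are all made up for the example.

```python
import math

n = 400
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]           # cell midpoints of [0, 1]
phi = [math.sin(2.0 * math.pi * x) for x in xs]   # illustrative phi
g = [1.0 + x * x for x in xs]                     # illustrative g

def F(f):
    """F[f] = exp( integral of f(x) g(x) dx ), midpoint rule."""
    return math.exp(sum(fv * gv for fv, gv in zip(f, g)) * dx)

def derivative_at(k, eps=1e-6):
    """Forward difference with f perturbed by eps times a discrete delta at x_k.

    The discrete delta is a spike of height 1/dx in cell k, so it integrates to one.
    """
    bumped = list(phi)
    bumped[k] += eps / dx
    return (F(bumped) - F(phi)) / eps

k = 100
numeric = derivative_at(k)
analytic = g[k] * F(phi)
print(numeric, analytic)
```

On the grid the spike is a perfectly ordinary perturbation; the lack of rigour of the delta-function recipe only appears in the continuum limit $\Delta x \to 0$, where the spike ceases to be a function.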

Functional derivative of a function

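A function can be written in the form of an integral like a functional. For example,

$$\rho(\mathbf{r}) = F[\rho] = \int \rho(\mathbf{r}')\,\delta(\mathbf{r} - \mathbf{r}')\,d\mathbf{r}'.$$

Since the integrand does not depend on derivatives of ρ, the functional derivative of ρ(r) is

$$\frac{\delta\rho(\mathbf{r})}{\delta\rho(\mathbf{r}')} \equiv \frac{\delta F}{\delta\rho(\mathbf{r}')} = \frac{\partial}{\partial\rho(\mathbf{r}')}\left[\rho(\mathbf{r}')\,\delta(\mathbf{r} - \mathbf{r}')\right] = \delta(\mathbf{r} - \mathbf{r}').$$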

Functional derivative of iterated function

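The functional derivative of the iterated function $f(f(x))$ is given by

$$\frac{\delta f(f(x))}{\delta f(y)} = f'(f(x))\,\delta(x - y) + \delta(f(x) - y)$$

and

$$\frac{\delta f(f(f(x)))}{\delta f(y)} = f'(f(f(x)))\left(f'(f(x))\,\delta(x - y) + \delta(f(x) - y)\right) + \delta(f(f(x)) - y).$$

In general, writing $f^N$ for the $N$-fold iterate of $f$:

$$\frac{\delta f^N(x)}{\delta f(y)} = f'(f^{N-1}(x))\,\frac{\delta f^{N-1}(x)}{\delta f(y)} + \delta(f^{N-1}(x) - y).$$

Putting in N = 0, for which $f^0(x) = x$ does not depend on $f$ and the left-hand side vanishes, gives:

$$\frac{\delta f^{-1}(x)}{\delta f(y)} = -\frac{\delta(f^{-1}(x) - y)}{f'(f^{-1}(x))}.$$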

Using the delta function as a test function

In physics, it is common to use the Dirac delta function $\delta(x - y)$ in place of a generic test function $\phi(x)$, yielding the functional derivative at the point $y$ (this is one point of the whole functional derivative, just as a partial derivative is one component of the gradient): [15]

$$\frac{\delta F[\rho(x)]}{\delta\rho(y)} = \lim_{\varepsilon \to 0}\frac{F[\rho(x) + \varepsilon\delta(x - y)] - F[\rho(x)]}{\varepsilon}.$$

This works in cases when $F[\rho(x) + \varepsilon f(x)]$ can formally be expanded as a series (or at least up to first order) in $\varepsilon$. The formula is however not mathematically rigorous, since $F[\rho(x) + \varepsilon\delta(x - y)]$ is usually not even defined.

The definition given in a previous section is based on a relationship that holds for all test functions $\phi(x)$, so one might think that it should hold also when $\phi(x)$ is chosen to be a specific function such as the delta function. However, the latter is not a valid test function (it is not even a proper function).

In the definition, the functional derivative describes how the functional $F[\rho(x)]$ changes as a result of a small change in the entire function $\rho(x)$. The particular form of the change in $\rho(x)$ is not specified, but it should stretch over the whole interval on which $x$ is defined. Employing the particular form of the perturbation given by the delta function means that $\rho(x)$ is varied only at the point $y$. Except for this point, there is no variation in $\rho(x)$.

Notes

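  1. According to Giaquinta & Hildebrandt (1996), p. 18, this notation is customary in physical literature.
  2. Called first variation in (Giaquinta & Hildebrandt 1996, p. 3), variation or first variation in (Courant & Hilbert 1953, p. 186), variation or differential in (Gelfand & Fomin 2000, p. 11, § 3.2) and differential in (Parr & Yang 1989, p. 246).
  3. Here the notation $\frac{\delta F}{\delta\rho}(x) \equiv \frac{\delta F}{\delta\rho(x)}$ is introduced.
  4. For a three-dimensional Cartesian coordinate system, $\frac{\partial f}{\partial\nabla\rho} = \frac{\partial f}{\partial\rho_x}\,\hat{\mathbf{i}} + \frac{\partial f}{\partial\rho_y}\,\hat{\mathbf{j}} + \frac{\partial f}{\partial\rho_z}\,\hat{\mathbf{k}}$, where $\rho_x = \partial\rho/\partial x$, $\rho_y = \partial\rho/\partial y$, $\rho_z = \partial\rho/\partial z$ and $\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$, $\hat{\mathbf{k}}$ are unit vectors along the x, y, z axes.
  5. For example, for the case of three dimensions (n = 3) and second order derivatives (i = 2), the tensor $\nabla^{(2)}$ has components $\left[\nabla^{(2)}\right]_{\alpha\beta} = \frac{\partial^2}{\partial r_\alpha\,\partial r_\beta}$, where $\alpha$ and $\beta$ can be 1, 2, 3.
  6. For example, for the case n = 3 and i = 2, the tensor scalar product is $\nabla^{(2)}\cdot\frac{\partial f}{\partial(\nabla^{(2)}\rho)} = \sum_{\alpha,\beta = 1}^{3}\frac{\partial^2}{\partial r_\alpha\,\partial r_\beta}\,\frac{\partial f}{\partial\rho_{\alpha\beta}}$, where $\rho_{\alpha\beta} \equiv \frac{\partial^2\rho}{\partial r_\alpha\,\partial r_\beta}$.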

Footnotes

  1. Giaquinta & Hildebrandt (1996), p. 18.
  2. Gelfand & Fomin (2000), p. 11.
  3. Giaquinta & Hildebrandt (1996), p. 10.
  4. Giaquinta & Hildebrandt (1996), p. 10.
  5. Parr & Yang (1989), p. 246, Eq. A.2.
  6. Greiner & Reinhardt (1996), pp. 36-37.
  7. Parr & Yang (1989), p. 246.
  8. Parr & Yang (1989), p. 247, Eq. A.3.
  9. Parr & Yang (1989), p. 247, Eq. A.4.
  10. Greiner & Reinhardt (1996), p. 38, Eq. 6.
  11. Greiner & Reinhardt (1996), p. 38, Eq. 7.
  12. Parr & Yang (1989), p. 247, Eq. A.6.
  13. Parr & Yang (1989), p. 248, Eq. A.11.
  14. Parr & Yang (1989), p. 247, Eq. A.9.
  15. Greiner & Reinhardt (1996), p. 37.
