Symmetry of second derivatives


In mathematics, the symmetry of second derivatives (also called the equality of mixed partials) refers to the possibility of interchanging the order of taking partial derivatives of a function $f(x_1, x_2, \ldots, x_n)$ of n variables without changing the result under certain conditions (see below). The symmetry is the assertion that the second-order partial derivatives satisfy the identity

$$\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)$$

so that they form an n×n symmetric matrix, known as the function's Hessian matrix. Sufficient conditions for the above symmetry to hold are established by a result known as Schwarz's theorem, Clairaut's theorem, or Young's theorem. [1] [2]

In the context of partial differential equations it is called the Schwarz integrability condition.

Formal expressions of symmetry

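In symbols, the symmetry may be expressed as:

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \qquad\text{or}\qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}.$$

Another notation is:

$$\partial_x \partial_y f = \partial_y \partial_x f \qquad\text{or}\qquad f_{yx} = f_{xy}.$$

In terms of composition of the differential operator $D_i$ which takes the partial derivative with respect to $x_i$:

$$D_i \circ D_j = D_j \circ D_i.$$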

From this relation it follows that the ring of differential operators with constant coefficients, generated by the $D_i$, is commutative; but this is only true as operators over a domain of sufficiently differentiable functions. It is easy to check the symmetry as applied to monomials, so that one can take polynomials in the $x_i$ as a domain. In fact smooth functions are another valid domain.

History

The result on the equality of mixed partial derivatives under certain conditions has a long history. The list of unsuccessful proposed proofs started with Euler's, published in 1740, [3] although already in 1721 Bernoulli had implicitly assumed the result with no formal justification. [4] Clairaut also published a proposed proof in 1740, with no other attempts until the end of the 18th century. Starting then, for a period of 70 years, a number of incomplete proofs were proposed. The proof of Lagrange (1797) was improved by Cauchy (1823), but assumed the existence and continuity of the partial derivatives $\partial^2 f/\partial x^2$ and $\partial^2 f/\partial y^2$. [5] Other attempts were made by P. Blanchet (1841), Duhamel (1856), Sturm (1857), Schlömilch (1862), and Bertrand (1864). Finally in 1867 Lindelöf systematically analyzed all the earlier flawed proofs and was able to exhibit a specific counterexample where mixed derivatives failed to be equal. [6] [7]

Six years after that, Schwarz succeeded in giving the first rigorous proof. [8] Dini later contributed by finding more general conditions than those of Schwarz. Eventually a clean and more general version was found by Jordan in 1883 that is still the proof found in most textbooks. Minor variants of earlier proofs were published by Laurent (1885), Peano (1889 and 1893), J. Edwards (1892), P. Haag (1893), J. K. Whittemore (1898), Vivanti (1899) and Pierpont (1905). Further progress was made in 1907-1909 when E. W. Hobson and W. H. Young found proofs with weaker conditions than those of Schwarz and Dini. In 1918, Carathéodory gave a different proof based on the Lebesgue integral. [7]

Schwarz's theorem

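In mathematical analysis, Schwarz's theorem (or Clairaut's theorem on equality of mixed partials), [9] named after Alexis Clairaut and Hermann Schwarz, states that for a function $f \colon \Omega \to \mathbb{R}$ defined on a set $\Omega \subset \mathbb{R}^n$, if $\mathbf{p} \in \mathbb{R}^n$ is a point such that some neighborhood of $\mathbf{p}$ is contained in $\Omega$ and $f$ has continuous second partial derivatives on that neighborhood of $\mathbf{p}$, then for all i and j in $\{1, 2, \ldots, n\}$,

$$\frac{\partial^2}{\partial x_i\,\partial x_j} f(\mathbf{p}) = \frac{\partial^2}{\partial x_j\,\partial x_i} f(\mathbf{p}).$$

The partial derivatives of this function commute at that point.

One easy way to establish this theorem (in the case where $n = 2$, $i = 1$, and $j = 2$, which readily entails the result in general) is by applying Green's theorem to the gradient of $f$.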

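An elementary proof for functions on open subsets of the plane is as follows (by a simple reduction, the general case for the theorem of Schwarz easily reduces to the planar case). [10] Let $f(x,y)$ be a differentiable function on an open rectangle $\Omega$ containing a point $(a,b)$ and suppose that $df$ is continuous with continuous $\partial_x \partial_y f$ and $\partial_y \partial_x f$ over $\Omega$. Define

$$\begin{aligned}
u(h,k) &= f(a+h,\,b+k) - f(a+h,\,b),\\
v(h,k) &= f(a+h,\,b+k) - f(a,\,b+k),\\
w(h,k) &= f(a+h,\,b+k) - f(a+h,\,b) - f(a,\,b+k) + f(a,\,b).
\end{aligned}$$

These functions are defined for $|h|, |k| < \varepsilon$, where $\varepsilon > 0$ and $[a-\varepsilon,\,a+\varepsilon] \times [b-\varepsilon,\,b+\varepsilon]$ is contained in $\Omega$.

By the mean value theorem, for fixed h and k non-zero, $\theta, \theta', \phi, \phi'$ can be found in the open interval $(0,1)$ with

$$\begin{aligned}
w(h,k) &= u(h,k) - u(0,k) = h\,\partial_x u(\theta h,\,k)\\
&= h\,[\partial_x f(a+\theta h,\,b+k) - \partial_x f(a+\theta h,\,b)]\\
&= hk\,\partial_y \partial_x f(a+\theta h,\,b+\theta' k),\\
w(h,k) &= v(h,k) - v(h,0) = k\,\partial_y v(h,\,\phi k)\\
&= k\,[\partial_y f(a+h,\,b+\phi k) - \partial_y f(a,\,b+\phi k)]\\
&= hk\,\partial_x \partial_y f(a+\phi' h,\,b+\phi k).
\end{aligned}$$

Since $h, k \neq 0$, the first equality below can be divided by $hk$:

$$\begin{aligned}
hk\,\partial_y \partial_x f(a+\theta h,\,b+\theta' k) &= hk\,\partial_x \partial_y f(a+\phi' h,\,b+\phi k),\\
\partial_y \partial_x f(a+\theta h,\,b+\theta' k) &= \partial_x \partial_y f(a+\phi' h,\,b+\phi k).
\end{aligned}$$

Letting $h, k$ tend to zero in the last equality, the continuity assumptions on $\partial_y \partial_x f$ and $\partial_x \partial_y f$ now imply that

$$\frac{\partial^2}{\partial x\,\partial y} f(a,b) = \frac{\partial^2}{\partial y\,\partial x} f(a,b).$$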

This account is a straightforward classical method found in many textbooks, for example in Burkill, Apostol and Rudin. [10] [11] [12]
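The second-order difference used in this proof can also be explored numerically. The following is a minimal sketch, assuming a hypothetical smooth test function `f` and a hand-computed mixed partial `mixed_partial`; it shows the quotient of the four-point difference by the product of the two step sizes approaching the common value of the mixed partials as the steps shrink.

```python
import math


def f(x, y):
    # Hypothetical smooth test function chosen for illustration.
    return math.sin(x * y) + x**3 * y**2


def mixed_partial(x, y):
    # Mixed second partial of f computed by hand (both orders agree).
    return math.cos(x * y) - x * y * math.sin(x * y) + 6 * x**2 * y


def second_difference_quotient(func, a, b, h, k):
    # Four-point second difference at (a, b), divided by h * k.
    w = func(a + h, b + k) - func(a + h, b) - func(a, b + k) + func(a, b)
    return w / (h * k)


a, b = 0.5, 1.0
for step in (1e-1, 1e-2, 1e-3):
    approx = second_difference_quotient(f, a, b, step, step)
    print(step, approx, abs(approx - mixed_partial(a, b)))
```

The printed error shrinks roughly in proportion to the step size, which matches the first-order accuracy expected of this one-sided difference.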

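Although the derivation above is elementary, the approach can also be viewed from a more conceptual perspective so that the result becomes more apparent. [13] [14] [15] [16] [17] Indeed the difference operators $\Delta^t_x, \Delta^t_y$ commute and $\Delta^t_x f, \Delta^t_y f$ tend to $\partial_x f, \partial_y f$ as $t$ tends to 0, with a similar statement for second order operators. [lower-alpha 1] Here, for $z$ a vector in the plane and $u$ a directional vector $(1,0)$ or $(0,1)$, the difference operator is defined by

$$\Delta^t_u f(z) = \frac{f(z + tu) - f(z)}{t}.$$

By the fundamental theorem of calculus for $C^1$ functions $f$ on an open interval $I$ with $[a,b] \subset I$,

$$\int_a^b f'(x)\,dx = f(b) - f(a).$$

Hence

$$|f(b) - f(a)| \le (b - a) \sup_{c \in (a,b)} |f'(c)|.$$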

This is a generalized version of the mean value theorem. Recall that the elementary discussion on maxima or minima for real-valued functions implies that if $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$, then there is a point $c$ in $(a,b)$ such that

$$\frac{f(b) - f(a)}{b - a} = f'(c).$$

For vector-valued functions $f$ with values in a finite-dimensional normed space $V$, there is no analogue of the equality above; indeed it fails. But since $\inf f' \le f'(c) \le \sup f'$ in the real-valued case, the inequality above is a useful substitute. Moreover, pairing with elements of the dual of $V$, equipped with its dual norm, yields the following inequality:

$$\|f(b) - f(a)\| \le (b - a) \sup_{c \in (a,b)} \|f'(c)\|.$$

These versions of the mean value theorem are discussed in Rudin, Hörmander and elsewhere. [19] [20]

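For $f$ a $C^2$ function on an open set in the plane, define $D_1 = \partial_x$ and $D_2 = \partial_y$. Furthermore for $t \neq 0$ set

$$\Delta_1^t f(x,y) = \frac{f(x+t,\,y) - f(x,y)}{t}, \qquad \Delta_2^t f(x,y) = \frac{f(x,\,y+t) - f(x,y)}{t}.$$

Then for $(x_0, y_0)$ in the open set, the generalized mean value theorem can be applied twice:

$$\begin{aligned}
\left|\Delta_1^t \Delta_2^t f(x_0,y_0) - D_1 D_2 f(x_0,y_0)\right|
&\le \sup_{0 \le s \le 1} \left|\Delta_1^t D_2 f(x_0,\,y_0 + ts) - D_1 D_2 f(x_0,y_0)\right|\\
&\le \sup_{0 \le r,s \le 1} \left|D_1 D_2 f(x_0 + tr,\,y_0 + ts) - D_1 D_2 f(x_0,y_0)\right|.
\end{aligned}$$

Thus $\Delta_1^t \Delta_2^t f(x_0,y_0)$ tends to $D_1 D_2 f(x_0,y_0)$ as $t$ tends to 0. The same argument shows that $\Delta_2^t \Delta_1^t f(x_0,y_0)$ tends to $D_2 D_1 f(x_0,y_0)$. Hence, since the difference operators commute, so do the partial differential operators $D_1$ and $D_2$, as claimed. [21] [22] [23] [24] [25]

Remark. By two applications of the classical mean value theorem,

$$\Delta_1^t \Delta_2^t f(x_0,y_0) = D_1 D_2 f(x_0 + t\theta,\,y_0 + t\theta')$$

for some $\theta$ and $\theta'$ in $(0,1)$. Thus the first elementary proof can be reinterpreted using difference operators. Conversely, instead of using the generalized mean value theorem in the second proof, the classical mean value theorem could be used.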
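The fact that the difference operators commute exactly, while only approximating the mixed partial, can be illustrated with rational arithmetic. This is a minimal sketch: the helper `delta`, the polynomial `f`, and the chosen point and step are assumptions made for the example.

```python
from fractions import Fraction


def delta(func, axis, t):
    """Return the difference quotient of func along axis 1 (x) or 2 (y)."""
    def quotient(x, y):
        if axis == 1:
            return (func(x + t, y) - func(x, y)) / t
        return (func(x, y + t) - func(x, y)) / t
    return quotient


def f(x, y):
    # Hypothetical polynomial test function; D_1 D_2 f = 6 x^2 y + 2.
    return x**3 * y**2 + 2 * x * y


t = Fraction(1, 1000)
x0, y0 = Fraction(1, 2), Fraction(3, 4)

d12 = delta(delta(f, 2, t), 1, t)(x0, y0)  # difference in y first, then in x
d21 = delta(delta(f, 1, t), 2, t)(x0, y0)  # difference in x first, then in y
exact = 6 * x0**2 * y0 + 2

assert d12 == d21  # exact equality in rational arithmetic
print(float(d12), float(exact))
```

Using `Fraction` avoids floating-point rounding, so the equality of the two iterated differences is exact, while the gap to the derivative is of the order of the step `t`.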


Proof of Clairaut's theorem using iterated integrals

The properties of repeated Riemann integrals of a continuous function F on a compact rectangle [a,b] × [c,d] are easily established. [26] The uniform continuity of F implies immediately that the functions $g(x) = \int_c^d F(x,y)\,dy$ and $h(y) = \int_a^b F(x,y)\,dx$ are continuous. [27] It follows that

$$\int_a^b \int_c^d F(x,y)\,dy\,dx = \int_c^d \int_a^b F(x,y)\,dx\,dy;$$

moreover it is immediate that the iterated integral is positive if F is positive. [28] The equality above is a simple case of Fubini's theorem, involving no measure theory. Titchmarsh (1939) proves it in a straightforward way using Riemann approximating sums corresponding to subdivisions of a rectangle into smaller rectangles.

To prove Clairaut's theorem, assume f is a differentiable function on an open set U, for which the mixed second partial derivatives $f_{yx}$ and $f_{xy}$ exist and are continuous. Using the fundamental theorem of calculus twice,

$$\int_c^d \int_a^b f_{yx}(x,y)\,dx\,dy = \int_c^d \left[f_y(b,y) - f_y(a,y)\right] dy = f(b,d) - f(a,d) - f(b,c) + f(a,c).$$

Similarly

$$\int_a^b \int_c^d f_{xy}(x,y)\,dy\,dx = \int_a^b \left[f_x(x,d) - f_x(x,c)\right] dx = f(b,d) - f(a,d) - f(b,c) + f(a,c).$$

The two iterated integrals are therefore equal. On the other hand, since $f_{xy}(x,y)$ is continuous, the second iterated integral can be performed by first integrating over x and then afterwards over y. But then the iterated integral of $f_{yx} - f_{xy}$ on [a,b] × [c,d] must vanish. However, if the iterated integral of a continuous function F vanishes for all rectangles, then F must be identically zero; for otherwise F or −F would be strictly positive at some point and therefore, by continuity, on a rectangle, which is not possible. Hence $f_{yx} - f_{xy}$ must vanish identically, so that $f_{yx} = f_{xy}$ everywhere. [29] [30] [31] [32] [33]
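The identity between the iterated integrals of the mixed partial and the four corner values of f can be checked numerically. The sketch below assumes a hypothetical test function `f`, its hand-computed mixed partial `mixed`, and a simple midpoint rule named `midpoint`; none of these come from the cited sources.

```python
import math


def f(x, y):
    # Hypothetical smooth test function.
    return x**3 * y**2 + math.sin(x * y)


def mixed(x, y):
    # Mixed second partial of f, computed by hand.
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)


def midpoint(g, lo, hi, n=200):
    # Composite midpoint rule for a function of one variable.
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def integrate_y_then_x(F, a, b, c, d):
    return midpoint(lambda x: midpoint(lambda y: F(x, y), c, d), a, b)


def integrate_x_then_y(F, a, b, c, d):
    return midpoint(lambda y: midpoint(lambda x: F(x, y), a, b), c, d)


a, b, c, d = 0.0, 1.0, 0.0, 2.0
corner_sum = f(b, d) - f(a, d) - f(b, c) + f(a, c)
inner_y_first = integrate_y_then_x(mixed, a, b, c, d)
inner_x_first = integrate_x_then_y(mixed, a, b, c, d)
print(corner_sum, inner_y_first, inner_x_first)
```

Both orders of integration approximate the same corner expression, up to the discretization error of the midpoint rule.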

Sufficiency of twice-differentiability

A condition weaker than continuity of the second partial derivatives (and implied by it) that still suffices to ensure symmetry is that all first partial derivatives are themselves differentiable. [34] Another strengthening of the theorem, in which existence of the permuted mixed partial is asserted, was provided by Peano in a short 1890 note in Mathesis:

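If $f \colon E \to \mathbb{R}$ is defined on an open set $E \subset \mathbb{R}^2$; $\partial_1 f(x,y)$ and $\partial_{2,1} f(x,y)$ exist everywhere on $E$; $\partial_{2,1} f$ is continuous at $(x_0, y_0) \in E$, and if $\partial_2 f(x, y_0)$ exists in a neighborhood of $x = x_0$, then $\partial_{1,2} f$ exists at $(x_0, y_0)$ and $\partial_{1,2} f(x_0, y_0) = \partial_{2,1} f(x_0, y_0)$. [35]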

Distribution theory formulation

The theory of distributions (generalized functions) eliminates analytic problems with the symmetry. The derivative of an integrable function can always be defined as a distribution, and symmetry of mixed partial derivatives always holds as an equality of distributions. The use of formal integration by parts to define differentiation of distributions puts the symmetry question back onto the test functions, which are smooth and certainly satisfy this symmetry. In more detail (where f is a distribution, written as an operator on test functions, and φ is a test function),

$$(D_1 D_2 f)[\varphi] = -(D_2 f)[D_1 \varphi] = f[D_2 D_1 \varphi] = f[D_1 D_2 \varphi] = -(D_1 f)[D_2 \varphi] = (D_2 D_1 f)[\varphi].$$

Another approach, which defines the Fourier transform of a function, is to note that on such transforms partial derivatives become multiplication operators that commute much more obviously. [lower-alpha 1]

Requirement of continuity

The symmetry may be broken if the function fails to have differentiable partial derivatives, which is possible only when the hypotheses of Clairaut's theorem are not satisfied (that is, when the second partial derivatives are not continuous).

Figure: The function f(x, y), as shown in equation (1), does not have symmetric second derivatives at the origin.

An example of non-symmetry is the function (due to Peano) [36] [37]

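$$f(x,y) = \begin{cases} \dfrac{xy\,(x^2 - y^2)}{x^2 + y^2} & \text{for } (x,y) \neq (0,0),\\[1ex] 0 & \text{for } (x,y) = (0,0). \end{cases} \tag{1}$$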

This can be visualized by the polar form $f(r\cos\theta,\,r\sin\theta) = \frac{r^2 \sin(4\theta)}{4}$; it is everywhere continuous, but its derivatives at (0,0) cannot be computed algebraically. Rather, the limit of difference quotients shows that $f_x(0,0) = f_y(0,0) = 0$, so the graph $z = f(x,y)$ has a horizontal tangent plane at (0,0), and the partial derivatives $f_x, f_y$ exist and are everywhere continuous. However, the second partial derivatives are not continuous at (0,0), and the symmetry fails. In fact, along the x-axis the y-derivative is $f_y(x,0) = x$, and so:

$$f_{yx}(0,0) = \lim_{\varepsilon \to 0} \frac{f_y(\varepsilon, 0) - f_y(0,0)}{\varepsilon} = 1.$$

In contrast, along the y-axis the x-derivative is $f_x(0,y) = -y$, and so $f_{xy}(0,0) = -1$. That is, $f_{yx} \neq f_{xy}$ at (0,0), although the mixed partial derivatives do exist, and at every other point the symmetry does hold.
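The two values at the origin can be reproduced numerically with nested difference quotients. The following is a rough sketch under stated assumptions: the helper names `f_x` and `f_y`, the central-difference step `h`, and the outer step `eps` are choices made for this illustration, and the results are approximations rather than exact derivatives.

```python
def f(x, y):
    # Peano's example: x*y*(x^2 - y^2)/(x^2 + y^2), extended by 0 at the origin.
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)


def f_x(x, y, h=1e-6):
    # Central-difference approximation of the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def f_y(x, y, h=1e-6):
    # Central-difference approximation of the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


eps = 1e-3
# Differentiate f_y along the x-axis, and f_x along the y-axis, at the origin.
mixed_yx = (f_y(eps, 0.0) - f_y(0.0, 0.0)) / eps
mixed_xy = (f_x(0.0, eps) - f_x(0.0, 0.0)) / eps
print(mixed_yx, mixed_xy)
```

The inner step `h` is taken much smaller than the outer step `eps` so that the inner quotient behaves like a derivative before the outer quotient is formed; the two results come out close to 1 and −1 respectively.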

The above function, written in a cylindrical coordinate system, can be expressed as

$$f(r, \theta) = \frac{r^2 \sin(4\theta)}{4},$$

showing that the function oscillates four times when traveling once around an arbitrarily small loop containing the origin. Intuitively, therefore, the local behavior of the function at (0,0) cannot be described as a quadratic form, and the Hessian matrix thus fails to be symmetric.

In general, the interchange of limiting operations need not commute. Given two variables near (0,0) and two limiting processes on

$$f(h,k) - f(h,0) - f(0,k) + f(0,0)$$

corresponding to making h → 0 first, and to making k → 0 first. It can matter, looking at the first-order terms, which is applied first. This leads to the construction of pathological examples in which second derivatives are non-symmetric. This kind of example belongs to the theory of real analysis where the pointwise value of functions matters. When viewed as a distribution the second partial derivative's values can be changed at an arbitrary set of points as long as this has Lebesgue measure 0. Since in the example the Hessian is symmetric everywhere except (0,0), there is no contradiction with the fact that the Hessian, viewed as a Schwartz distribution, is symmetric.
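For Peano's function the two limiting processes can be carried out by hand. The worked computation below is a sketch; it uses only the facts that the function equals $xy(x^2 - y^2)/(x^2 + y^2)$ away from the origin and vanishes on both coordinate axes, so the four-term expression reduces to $f(h,k)$.

```latex
\[
\frac{f(h,k) - f(h,0) - f(0,k) + f(0,0)}{hk}
  = \frac{f(h,k)}{hk}
  = \frac{h^2 - k^2}{h^2 + k^2}
  \qquad (h, k \neq 0),
\]
\[
\lim_{k \to 0}\,\lim_{h \to 0} \frac{h^2 - k^2}{h^2 + k^2} = -1,
\qquad
\lim_{h \to 0}\,\lim_{k \to 0} \frac{h^2 - k^2}{h^2 + k^2} = +1 .
\]
```

Taking h → 0 first gives −1 and taking k → 0 first gives +1, so the order in which the limits are taken changes the result.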

In Lie theory

Consider the first-order differential operators $D_i$ to be infinitesimal operators on Euclidean space. That is, $D_i$ in a sense generates the one-parameter group of translations parallel to the $x_i$-axis. These groups commute with each other, and therefore the infinitesimal generators do also; the Lie bracket

$$[D_i, D_j] = 0$$

is this property's reflection. In other words, the Lie derivative of one coordinate with respect to another is zero.

Application to differential forms

The Clairaut-Schwarz theorem is the key fact needed to prove that for every $C^\infty$ (or at least twice differentiable) differential form $\omega \in \Omega^k(M)$, the second exterior derivative vanishes: $d^2\omega := d(d\omega) = 0$. This implies that every differentiable exact form (i.e., a form $\alpha$ such that $\alpha = d\omega$ for some form $\omega$) is closed (i.e., $d\alpha = 0$), since $d\alpha = d(d\omega) = 0$. [38]

In the middle of the 18th century, the theory of differential forms was first studied in the simplest case of 1-forms in the plane, i.e. $A\,dx + B\,dy$, where $A$ and $B$ are functions in the plane. The study of 1-forms and the differentials of functions began with Clairaut's papers in 1739 and 1740. At that stage his investigations were interpreted as ways of solving ordinary differential equations. Formally Clairaut showed that a 1-form $\omega = A\,dx + B\,dy$ on an open rectangle is closed, i.e. $d\omega = 0$, if and only if $\omega$ has the form $df$ for some function $f$ on the rectangle. The solution for $f$ can be written by Cauchy's integral formula

$$f(x,y) = \int_{x_0}^{x} A(s, y)\,ds + \int_{y_0}^{y} B(x_0, t)\,dt;$$

while if $\omega = df$, the closed property $d\omega = 0$ is the identity $\partial_x \partial_y f = \partial_y \partial_x f$. (In modern language this is one version of the Poincaré lemma.) [39]
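The integral formula for the potential of a closed 1-form can be tried out on a concrete case. This is a sketch under stated assumptions: the coefficient functions `A` and `B`, the base point (0, 0), and the midpoint-rule helper are hypothetical choices for illustration, with `A` and `B` picked so that ∂A/∂y = ∂B/∂x = 2x.

```python
def A(x, y):
    # Hypothetical coefficient of dx; its y-derivative is 2x.
    return 2 * x * y + 1


def B(x, y):
    # Hypothetical coefficient of dy; its x-derivative is 2x, so the form is closed.
    return x**2 + 3 * y**2


def midpoint(g, lo, hi, n=1000):
    # Composite midpoint rule for a function of one variable.
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def potential(x, y, x0=0.0, y0=0.0):
    # Integrate A along the horizontal segment at height y,
    # then B along the vertical segment at x = x0.
    return midpoint(lambda s: A(s, y), x0, x) + midpoint(lambda t: B(x0, t), y0, y)


def exact_potential(x, y):
    # Potential found by hand for this A and B: x^2 y + x + y^3.
    return x**2 * y + x + y**3


print(potential(1.5, 2.0), exact_potential(1.5, 2.0))
```

Differentiating `exact_potential` by hand recovers `A` and `B`, and the numerical value agrees with it up to quadrature error.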

Notes

  [lower-alpha 1] These can also be rephrased in terms of the action of operators on Schwartz functions on the plane. Under Fourier transform, the difference and differential operators are just multiplication operators. [18]
  1. "Young's Theorem" (PDF). University of California Berkeley. Archived from the original (PDF) on 2006-05-18. Retrieved 2015-01-02.
  2. Allen 1964, pp. 300–305.
  3. Euler 1740.
  4. Sandifer 2007, pp. 142–147, footnote: Comm. Acad. Sci. Imp. Petropol. 7 (1734/1735) 1740, 174-189, 180-183; Opera Omnia, 1.22, 34-56.
  5. Minguzzi 2015.
  6. Lindelöf 1867.
  7. Higgins 1940.
  8. Schwarz 1873.
  9. James 1966.
  10. Burkill 1962, pp. 154–155.
  11. Apostol 1965.
  12. Rudin 1976.
  13. Hörmander 2015, pp. 7, 11. This condensed account is possibly the shortest.
  14. Dieudonné 1960, pp. 179–180.
  15. Godement 1998b, pp. 287–289.
  16. Lang 1969, pp. 108–111.
  17. Cartan 1971, pp. 64–67.
  18. Hörmander 2015, Chapter VII.
  19. Hörmander 2015, p. 6.
  20. Rudin 1976.
  21. Hörmander 2015, p. 11.
  22. Dieudonné 1960.
  23. Godement 1998a.
  24. Lang 1969.
  25. Cartan 1971.
  26. Titchmarsh 1939.
  27. Titchmarsh 1939, pp. 23–25.
  28. Titchmarsh 1939, pp. 49–50.
  29. Spivak 1965, p. 61.
  30. McGrath 2014.
  31. Aksoy & Martelli 2002.
  32. Axler 2020, pp. 142–143.
  33. Marshall, Donald E., Theorems of Fubini and Clairaut (PDF), University of Washington
  34. Hubbard & Hubbard 2015, pp. 732–733.
  35. Rudin 1976, pp. 235–236.
  36. Hobson 1921, pp. 403–404.
  37. Apostol 1974, pp. 358–359.
  38. Tu 2010.
  39. Katz 1981.

