Covariant derivative

Last updated

In mathematics, the covariant derivative is a way of specifying a derivative along tangent vectors of a manifold. Alternatively, the covariant derivative is a way of introducing and working with a connection on a manifold by means of a differential operator, to be contrasted with the approach given by a principal connection on the frame bundle – see affine connection. In the special case of a manifold isometrically embedded into a higher-dimensional Euclidean space, the covariant derivative can be viewed as the orthogonal projection of the Euclidean directional derivative onto the manifold's tangent space. In this case the Euclidean derivative is broken into two parts, the extrinsic normal component (dependent on the embedding) and the intrinsic covariant derivative component.

Contents

The name is motivated by the importance of changes of coordinate in physics: the covariant derivative transforms covariantly under a general coordinate transformation, that is, linearly via the Jacobian matrix of the transformation. [1]

This article presents an introduction to the covariant derivative of a vector field with respect to a vector field, both in a coordinate-free language and using a local coordinate system and the traditional index notation. The covariant derivative of a tensor field is presented as an extension of the same concept. The covariant derivative generalizes straightforwardly to a notion of differentiation associated to a connection on a vector bundle, also known as a Koszul connection.

History

Historically, at the turn of the 20th century, the covariant derivative was introduced by Gregorio Ricci-Curbastro and Tullio Levi-Civita in the theory of Riemannian and pseudo-Riemannian geometry. [2] Ricci and Levi-Civita (following ideas of Elwin Bruno Christoffel) observed that the Christoffel symbols used to define the curvature could also provide a notion of differentiation which generalized the classical directional derivative of vector fields on a manifold. [3] [4] This new derivative – the Levi-Civita connection – was covariant in the sense that it satisfied Riemann's requirement that objects in geometry should be independent of their description in a particular coordinate system.

It was soon noted by other mathematicians, prominent among these being Hermann Weyl, Jan Arnoldus Schouten, and Élie Cartan, [5] that a covariant derivative could be defined abstractly without the presence of a metric. The crucial feature was not a particular dependence on the metric, but that the Christoffel symbols satisfied a certain precise second-order transformation law. This transformation law could serve as a starting point for defining the derivative in a covariant manner. Thus the theory of covariant differentiation forked off from the strictly Riemannian context to include a wider range of possible geometries.

In the 1940s, practitioners of differential geometry began introducing other notions of covariant differentiation in general vector bundles which were, in contrast to the classical bundles of interest to geometers, not part of the tensor analysis of the manifold. By and large, these generalized covariant derivatives had to be specified ad hoc by some version of the connection concept. In 1950, Jean-Louis Koszul unified these new ideas of covariant differentiation in a vector bundle by means of what is known today as a Koszul connection or a connection on a vector bundle. [6] Using ideas from Lie algebra cohomology, Koszul successfully converted many of the analytic features of covariant differentiation into algebraic ones. In particular, Koszul connections eliminated the need for awkward manipulations of Christoffel symbols (and other analogous non-tensorial objects) in differential geometry. Thus they quickly supplanted the classical notion of covariant derivative in many post-1950 treatments of the subject.

Motivation

Kovariantnaia proizvodnaia c.jpg

The covariant derivative is a generalization of the directional derivative from vector calculus. As with the directional derivative, the covariant derivative is a rule, , which takes as its inputs: (1) a vector, u, defined at a point P, and (2) a vector field v defined in a neighborhood of P. [7] The output is the vector , also at the point P. The primary difference from the usual directional derivative is that must, in a certain precise sense, be independent of the manner in which it is expressed in a coordinate system.

A vector may be described as a list of numbers in terms of a basis, but as a geometrical object the vector retains its identity regardless of how it is described. For a geometric vector written in components with respect to one basis, when the basis is changed the components transform according to a change of basis formula, with the coordinates undergoing a covariant transformation. The covariant derivative is required to transform, under a change in coordinates, by a covariant transformation in the same way as a basis does (hence the name).

In the case of Euclidean space, one usually defines the directional derivative of a vector field in terms of the difference between two vectors at two nearby points. In such a system one translates one of the vectors to the origin of the other, keeping it parallel, then taking their difference within the same vector space. With a Cartesian (fixed orthonormal) coordinate system "keeping it parallel" amounts to keeping the components constant. This ordinary directional derivative on Euclidean space is the first example of a covariant derivative.

Next, one must take into account changes of the coordinate system. For example, if the Euclidean plane is described by polar coordinates, "keeping it parallel" does not amount to keeping the polar components constant under translation, since the coordinate grid itself "rotates". Thus, the same covariant derivative written in polar coordinates contains extra terms that describe how the coordinate grid itself rotates, or how in more general coordinates the grid expands, contracts, twists, interweaves, etc.

Consider the example of a particle moving along a curve γ(t) in the Euclidean plane. In polar coordinates, γ may be written in terms of its radial and angular coordinates by γ(t) = (r(t), θ(t)). A vector at a particular time t [8] (for instance, a constant acceleration of the particle) is expressed in terms of , where and are unit tangent vectors for the polar coordinates, serving as a basis to decompose a vector in terms of radial and tangential components. At a slightly later time, the new basis in polar coordinates appears slightly rotated with respect to the first set. The covariant derivative of the basis vectors (the Christoffel symbols) serve to express this change.

In a curved space, such as the surface of the Earth (regarded as a sphere), the translation of tangent vectors between different points is not well defined, and its analog, parallel transport, depends on the path along which the vector is translated. A vector on a globe on the equator at point Q is directed to the north. Suppose we transport the vector (keeping it parallel) first along the equator to the point P, then drag it along a meridian to the N pole, and finally transport it along another meridian back to Q. Then we notice that the parallel-transported vector along a closed circuit does not return as the same vector; instead, it has another orientation. This would not happen in Euclidean space and is caused by the curvature of the surface of the globe. The same effect occurs if we drag the vector along an infinitesimally small closed surface subsequently along two directions and then back. This infinitesimal change of the vector is a measure of the curvature, and can be defined in terms of the covariant derivative.

Remarks

  • The definition of the covariant derivative does not use the metric in space. However, for each metric there is a unique torsion-free covariant derivative called the Levi-Civita connection such that the covariant derivative of the metric is zero.
  • The properties of a derivative imply that depends on the values of u on an arbitrarily small neighborhood of a point p in the same way as e.g. the derivative of a scalar function f along a curve at a given point p depends on the values of f in an arbitrarily small neighborhood of p.
  • The information on the neighborhood of a point p in the covariant derivative can be used to define parallel transport of a vector. Also the curvature, torsion, and geodesics may be defined only in terms of the covariant derivative or other related variation on the idea of a linear connection.

Informal definition using an embedding into Euclidean space

Suppose an open subset of a -dimensional Riemannian manifold is embedded into Euclidean space via a twice continuously-differentiable (C2) mapping such that the tangent space at is spanned by the vectors

and the scalar product on is compatible with the metric on M:

(Since the manifold metric is always assumed to be regular, the compatibility condition implies linear independence of the partial derivative tangent vectors.)

For a tangent vector field, , one has

The last term is not tangential to M, but can be expressed as a linear combination of the tangent space base vectors using the Christoffel symbols as linear factors plus a vector orthogonal to the tangent space:

In the case of the Levi-Civita connection, the covariant derivative , also written , is defined as the orthogonal projection of the usual derivative onto tangent space:


To obtain the relation between Christoffel symbols for the Levi-Civita connection and the metric, first we must note that, since in previous equation is orthogonal to tangent space:

Second, the partial derivative of a component of the metric is:

implies for a basis , using the symmetry of the scalar product and swapping the order of partial differentiation:

adding first row to second and subtracting third one:

and yields the Christoffel symbols for the Levi-Civita connection in terms of the metric:

Which if is nondegenerate can be written as:

For a very simple example that captures the essence of the description above, draw a circle on a flat sheet of paper. Travel around the circle at a constant speed. The derivative of your velocity, your acceleration vector, always points radially inward. Roll this sheet of paper into a cylinder. Now the (Euclidean) derivative of your velocity has a component that sometimes points inward toward the axis of the cylinder depending on whether you're near a solstice or an equinox. (At the point of the circle when you are moving parallel to the axis, there is no inward acceleration. Conversely, at a point (1/4 of a circle later) when the velocity is along the cylinder's bend, the inward acceleration is maximum.) This is the (Euclidean) normal component. The covariant derivative component is the component parallel to the cylinder's surface, and is the same as that before you rolled the sheet into a cylinder.

Formal definition

A covariant derivative is a (Koszul) connection on the tangent bundle and other tensor bundles: it differentiates vector fields in a way analogous to the usual differential on functions. The definition extends to a differentiation on the dual of vector fields (i.e. covector fields) and to arbitrary tensor fields, in a unique way that ensures compatibility with the tensor product and trace operations (tensor contraction).

Functions

Given a point of the manifold , a real function on the manifold and a tangent vector , the covariant derivative of f at p along v is the scalar at p, denoted , that represents the principal part of the change in the value of f when the argument of f is changed by the infinitesimal displacement vector v. (This is the differential of f evaluated against the vector v.) Formally, there is a differentiable curve such that and , and the covariant derivative of f at p is defined by

When is a vector field on , the covariant derivative is the function that associates with each point p in the common domain of f and v the scalar .

For a scalar function f and vector field v, the covariant derivative coincides with the Lie derivative , and with the exterior derivative .

Vector fields

Given a point of the manifold , a vector field defined in a neighborhood of p and a tangent vector , the covariant derivative of u at p along v is the tangent vector at p, denoted , such that the following properties hold (for any tangent vectors v, x and y at p, vector fields u and w defined in a neighborhood of p, scalar values g and h at p, and scalar function f defined in a neighborhood of p):

  1. is linear in so
  2. is additive in so:
  3. obeys the product rule; i.e., where is defined above,

Note that depends not only on the value of u at p but also on values of u in an infinitesimal neighborhood of p because of the last property, the product rule.

If u and v are both vector fields defined over a common domain, then denotes the vector field whose value at each point p of the domain is the tangent vector .

Covector fields

Given a field of covectors (or one-form) defined in a neighborhood of p, its covariant derivative is defined in a way to make the resulting operation compatible with tensor contraction and the product rule. That is, is defined as the unique one-form at p such that the following identity is satisfied for all vector fields u in a neighborhood of p

The covariant derivative of a covector field along a vector field v is again a covector field.

Tensor fields

Once the covariant derivative is defined for fields of vectors and covectors it can be defined for arbitrary tensor fields by imposing the following identities for every pair of tensor fields and in a neighborhood of the point p:

and for and of the same valence

The covariant derivative of a tensor field along a vector field v is again a tensor field of the same type.

Explicitly, let T be a tensor field of type (p, q). Consider T to be a differentiable multilinear map of smooth sections α1, α2, ..., αq of the cotangent bundle TM and of sections X1, X2, ..., Xp of the tangent bundle TM, written T(α1, α2, ..., X1, X2, ...) into R. The covariant derivative of T along Y is given by the formula

Coordinate description

Given coordinate functions

any tangent vector can be described by its components in the basis

The covariant derivative of a basis vector along a basis vector is again a vector and so can be expressed as a linear combination . To specify the covariant derivative it is enough to specify the covariant derivative of each basis vector field along .

the coefficients are the components of the connection with respect to a system of local coordinates. In the theory of Riemannian and pseudo-Riemannian manifolds, the components of the Levi-Civita connection with respect to a system of local coordinates are called Christoffel symbols.

Then using the rules in the definition, we find that for general vector fields and we get

so

The first term in this formula is responsible for "twisting" the coordinate system with respect to the covariant derivative and the second for changes of components of the vector field u. In particular

In words: the covariant derivative is the usual derivative along the coordinates with correction terms which tell how the coordinates change.

For covectors similarly we have

where .

The covariant derivative of a type (r, s) tensor field along is given by the expression:

Or, in words: take the partial derivative of the tensor and add: for every upper index , and for every lower index .

If instead of a tensor, one is trying to differentiate a tensor density (of weight +1), then one also adds a term

If it is a tensor density of weight W, then multiply that term by W. For example, is a scalar density (of weight +1), so we get:

where semicolon ";" indicates covariant differentiation and comma "," indicates partial differentiation. Incidentally, this particular expression is equal to zero, because the covariant derivative of a function solely of the metric is always zero.

Notation

In textbooks on physics, the covariant derivative is sometimes simply stated in terms of its components in this equation.

Often a notation is used in which the covariant derivative is given with a semicolon, while a normal partial derivative is indicated by a comma. In this notation we write the same as:

In case two or more indexes appear after the semicolon, all of them must be understood as covariant derivatives:

In some older texts (notably Adler, Bazin & Schiffer, Introduction to General Relativity), the covariant derivative is denoted by a double pipe and the partial derivative by single pipe:

Covariant derivative by field type

For a scalar field , covariant differentiation is simply partial differentiation:

For a contravariant vector field , we have:

For a covariant vector field , we have:

For a type (2,0) tensor field , we have:

For a type (0,2) tensor field , we have:

For a type (1,1) tensor field , we have:

The notation above is meant in the sense

Properties

In general, covariant derivatives do not commute. By example, the covariant derivatives of vector field . The Riemann tensor is defined such that:

or, equivalently,

The covariant derivative of a (2,0)-tensor field fulfills:

The latter can be shown by taking (without loss of generality) that .

Derivative along a curve

Since the covariant derivative of a tensor field at a point depends only on the value of the vector field at one can define the covariant derivative along a smooth curve in a manifold:

Note that the tensor field only needs to be defined on the curve for this definition to make sense.

In particular, is a vector field along the curve itself. If vanishes then the curve is called a geodesic of the covariant derivative. If the covariant derivative is the Levi-Civita connection of a positive-definite metric then the geodesics for the connection are precisely the geodesics of the metric that are parametrized by arc length.

The derivative along a curve is also used to define the parallel transport along the curve.

Sometimes the covariant derivative along a curve is called absolute or intrinsic derivative.

Relation to Lie derivative

A covariant derivative introduces an extra geometric structure on a manifold that allows vectors in neighboring tangent spaces to be compared: there is no canonical way to compare vectors from different tangent spaces because there is no canonical coordinate system.

There is however another generalization of directional derivatives which is canonical: the Lie derivative, which evaluates the change of one vector field along the flow of another vector field. Thus, one must know both vector fields in an open neighborhood, not merely at a single point. The covariant derivative on the other hand introduces its own change for vectors in a given direction, and it only depends on the vector direction at a single point, rather than a vector field in an open neighborhood of a point. In other words, the covariant derivative is linear (over C(M)) in the direction argument, while the Lie derivative is linear in neither argument.

Note that the antisymmetrized covariant derivative uv − ∇vu, and the Lie derivative Luv differ by the torsion of the connection, so that if a connection is torsion free, then its antisymmetrization is the Lie derivative.

See also

Notes

  1. Einstein, Albert (1922). "The General Theory of Relativity". The Meaning of Relativity.
  2. Ricci, G.; Levi-Civita, T. (1901). "Méthodes de calcul différential absolu et leurs applications". Mathematische Annalen. 54 (1–2): 125–201. doi:10.1007/bf01454201. S2CID   120009332.
  3. Riemann, G. F. B. (1866). "Über die Hypothesen, welche der Geometrie zu Grunde liegen". Gesammelte Mathematische Werke.; reprint, ed. Weber, H. (1953), New York: Dover.
  4. Christoffel, E. B. (1869). "Über die Transformation der homogenen Differentialausdrücke zweiten Grades". Journal für die reine und angewandte Mathematik . 70: 46–70.
  5. cf. with Cartan, É (1923). "Sur les variétés à connexion affine et la theorie de la relativité généralisée". Annales, École Normale. 40: 325–412. doi: 10.24033/asens.751 .
  6. Koszul, J. L. (1950). "Homologie et cohomologie des algebres de Lie". Bulletin de la Société Mathématique. 78: 65–127. doi: 10.24033/bsmf.1410 .
  7. The covariant derivative is also denoted variously by vu, Dvu, or other notations.
  8. In many applications, it may be better not to think of t as corresponding to time, at least for applications in general relativity. It is simply regarded as an abstract parameter varying smoothly and monotonically along the path.

Related Research Articles

In vector calculus and differential geometry the generalized Stokes theorem, also called the Stokes–Cartan theorem, is a statement about the integration of differential forms on manifolds, which both simplifies and generalizes several theorems from vector calculus. In particular, the fundamental theorem of calculus is the special case where the manifold is a line segment, Green’s theorem and Stokes' theorem are the cases of a surface in or and the divergence theorem is the case of a volume in Hence, the theorem is sometimes referred to as the Fundamental Theorem of Multivariate Calculus.

In particle physics, the Dirac equation is a relativistic wave equation derived by British physicist Paul Dirac in 1928. In its free form, or including electromagnetic interactions, it describes all spin-12 massive particles, called "Dirac particles", such as electrons and quarks for which parity is a symmetry. It is consistent with both the principles of quantum mechanics and the theory of special relativity, and was the first theory to account fully for special relativity in the context of quantum mechanics. It was validated by accounting for the fine structure of the hydrogen spectrum in a completely rigorous way.

<span class="mw-page-title-main">Four-vector</span> 4-dimensional vector in relativity

In special relativity, a four-vector is an object with four components, which transform in a specific way under Lorentz transformations. Specifically, a four-vector is an element of a four-dimensional vector space considered as a representation space of the standard representation of the Lorentz group, the representation. It differs from a Euclidean vector in how its magnitude is determined. The transformations that preserve this magnitude are the Lorentz transformations, which include spatial rotations and boosts.

In quantum field theory, the Dirac spinor is the spinor that describes all known fundamental particles that are fermions, with the possible exception of neutrinos. It appears in the plane-wave solution to the Dirac equation, and is a certain combination of two Weyl spinors, specifically, a bispinor that transforms "spinorially" under the action of the Lorentz group.

In differential geometry, the Lie derivative, named after Sophus Lie by Władysław Ślebodziński, evaluates the change of a tensor field, along the flow defined by another vector field. This change is coordinate invariant and therefore the Lie derivative is defined on any differentiable manifold.

In mathematics and physics, the Christoffel symbols are an array of numbers describing a metric connection. The metric connection is a specialization of the affine connection to surfaces or other manifolds endowed with a metric, allowing distances to be measured on that surface. In differential geometry, an affine connection can be defined without reference to a metric, and many additional concepts follow: parallel transport, covariant derivatives, geodesics, etc. also do not require the concept of a metric. However, when a metric is available, these concepts can be directly tied to the "shape" of the manifold itself; that shape is determined by how the tangent space is attached to the cotangent space by the metric tensor. Abstractly, one would say that the manifold has an associated (orthonormal) frame bundle, with each "frame" being a possible choice of a coordinate frame. An invariant metric implies that the structure group of the frame bundle is the orthogonal group O(p, q). As a result, such a manifold is necessarily a (pseudo-)Riemannian manifold. The Christoffel symbols provide a concrete representation of the connection of (pseudo-)Riemannian geometry in terms of coordinates on the manifold. Additional concepts, such as parallel transport, geodesics, etc. can then be expressed in terms of Christoffel symbols.

In differential geometry, the four-gradient is the four-vector analogue of the gradient from vector calculus.

When studying and formulating Albert Einstein's theory of general relativity, various mathematical structures and techniques are utilized. The main tools used in this geometrical theory of gravitation are tensor fields defined on a Lorentzian manifold representing spacetime. This article is a general description of the mathematics of general relativity.

<span class="mw-page-title-main">Electromagnetic tensor</span> Mathematical object that describes the electromagnetic field in spacetime

In electromagnetism, the electromagnetic tensor or electromagnetic field tensor is a mathematical object that describes the electromagnetic field in spacetime. The field tensor was first used after the four-dimensional tensor formulation of special relativity was introduced by Hermann Minkowski. The tensor allows related physical laws to be written very concisely, and allows for the quantization of the electromagnetic field by Lagrangian formulation described below.

In physics, the gauge covariant derivative is a means of expressing how fields vary from place to place, in a way that respects how the coordinate systems used to describe a physical phenomenon can themselves change from place to place. The gauge covariant derivative is used in many areas of physics, including quantum field theory and fluid dynamics and in a very special way general relativity.

The following are important identities involving derivatives and integrals in vector calculus.

The theoretical and experimental justification for the Schrödinger equation motivates the discovery of the Schrödinger equation, the equation that describes the dynamics of nonrelativistic particles. The motivation uses photons, which are relativistic particles with dynamics described by Maxwell's equations, as an analogue for all types of particles.

<span class="mw-page-title-main">Mathematical descriptions of the electromagnetic field</span> Formulations of electromagnetism

There are various mathematical descriptions of the electromagnetic field that are used in the study of electromagnetism, one of the four fundamental interactions of nature. In this article, several approaches are discussed, although the equations are in terms of electric and magnetic fields, potentials, and charges with currents, generally speaking.

In mathematical physics, spacetime algebra (STA) is the application of Clifford algebra Cl1,3(R), or equivalently the geometric algebra G(M4) to physics. Spacetime algebra provides a "unified, coordinate-free formulation for all of relativistic physics, including the Dirac equation, Maxwell equation and General Relativity" and "reduces the mathematical divide between classical, quantum and relativistic physics."

The derivation of the Navier–Stokes equations as well as their application and formulation for different families of fluids, is an important exercise in fluid dynamics with applications in mechanical engineering, physics, chemistry, heat transfer, and electrical engineering. A proof explaining the properties and bounds of the equations, such as Navier–Stokes existence and smoothness, is one of the important unsolved problems in mathematics.

In fluid dynamics, the Oseen equations describe the flow of a viscous and incompressible fluid at small Reynolds numbers, as formulated by Carl Wilhelm Oseen in 1910. Oseen flow is an improved description of these flows, as compared to Stokes flow, with the (partial) inclusion of convective acceleration.

<span class="mw-page-title-main">Stokes' theorem</span> Theorem in vector calculus

Stokes' theorem, also known as the Kelvin–Stokes theorem after Lord Kelvin and George Stokes, the fundamental theorem for curls or simply the curl theorem, is a theorem in vector calculus on . Given a vector field, the theorem relates the integral of the curl of the vector field over some surface, to the line integral of the vector field around the boundary of the surface. The classical theorem of Stokes can be stated in one sentence: The line integral of a vector field over a loop is equal to the surface integral of its curl over the enclosed surface. It is illustrated in the figure, where the direction of positive circulation of the bounding contour ∂Σ, and the direction n of positive flux through the surface Σ, are related by a right-hand-rule. For the right hand the fingers circulate along ∂Σ and the thumb is directed along n.

Curvilinear coordinates can be formulated in tensor calculus, with important applications in physics and engineering, particularly for describing transportation of physical quantities and deformation of matter in fluid mechanics and continuum mechanics.

<span class="mw-page-title-main">Loop representation in gauge theories and quantum gravity</span> Description of gauge theories using loop operators

Attempts have been made to describe gauge theories in terms of extended objects such as Wilson loops and holonomies. The loop representation is a quantum hamiltonian representation of gauge theories in terms of loops. The aim of the loop representation in the context of Yang–Mills theories is to avoid the redundancy introduced by Gauss gauge symmetries allowing to work directly in the space of physical states. The idea is well known in the context of lattice Yang–Mills theory. Attempts to explore the continuous loop representation was made by Gambini and Trias for canonical Yang–Mills theory, however there were difficulties as they represented singular objects. As we shall see the loop formalism goes far beyond a simple gauge invariant description, in fact it is the natural geometrical framework to treat gauge theories and quantum gravity in terms of their fundamental physical excitations.

In mathematical physics, the Gordon decomposition of the Dirac current is a splitting of the charge or particle-number current into a part that arises from the motion of the center of mass of the particles and a part that arises from gradients of the spin density. It makes explicit use of the Dirac equation and so it applies only to "on-shell" solutions of the Dirac equation.

References