Einstein notation

In mathematics, especially the usage of linear algebra in mathematical physics and differential geometry, Einstein notation (also known as the Einstein summation convention or Einstein summation notation) is a notational convention that implies summation over a set of indexed terms in a formula, thus achieving brevity. As part of mathematics it is a notational subset of Ricci calculus; however, it is often used in physics applications that do not distinguish between tangent and cotangent spaces. It was introduced to physics by Albert Einstein in 1916.[1]

Introduction

Statement of convention

According to this convention, when an index variable appears twice in a single term and is not otherwise defined (see Free and bound variables), it implies summation of that term over all the values of the index. So where the indices can range over the set {1, 2, 3},

$$y = \sum_{i=1}^{3} x^i e_i = x^1 e_1 + x^2 e_2 + x^3 e_3$$

is simplified by the convention to:

$$y = x^i e_i$$

The upper indices are not exponents but are indices of coordinates, coefficients or basis vectors. That is, in this context $x^2$ should be understood as the second component of $x$ rather than the square of $x$ (this can occasionally lead to ambiguity). The upper index position in $x^i$ is because, typically, an index occurs once in an upper (superscript) and once in a lower (subscript) position in a term (see § Application below). Typically, $(x^1, x^2, x^3)$ would be equivalent to the traditional $(x, y, z)$.
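For concreteness, the convention maps directly onto code. The following is a minimal sketch using NumPy, whose einsum function accepts an index string in essentially this notation (the component values here are arbitrary illustrations):

```python
import numpy as np

# Components x^i and basis vectors e_i, for i ranging over {1, 2, 3}
x = np.array([2.0, -1.0, 3.0])   # x^1, x^2, x^3
e = np.eye(3)                    # rows are the basis vectors e_1, e_2, e_3

# y = x^i e_i: the index i appears twice, so it is summed over
y = np.einsum('i,ij->j', x, e)

# Identical to writing out the sum x^1 e_1 + x^2 e_2 + x^3 e_3
assert np.allclose(y, x[0] * e[0] + x[1] * e[1] + x[2] * e[2])
```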

In general relativity, a common convention is that

- the Greek alphabet is used for space and time components, where indices take the values 0, 1, 2, or 3 (frequently used letters are μ, ν, ...),
- the Latin alphabet is used for spatial components only, where indices take the values 1, 2, or 3 (frequently used letters are i, j, ...).

In general, indices can range over any indexing set, including an infinite set. This should not be confused with a typographically similar convention used to distinguish between tensor index notation and the closely related but distinct basis-independent abstract index notation.

An index that is summed over is a summation index, in this case "i". It is also called a dummy index since any symbol can replace "i" without changing the meaning of the expression (provided that it does not collide with other index symbols in the same term).

An index that is not summed over is a free index and should appear only once per term. If such an index does appear, it usually also appears in every other term in an equation. An example of a free index is the "i" in the equation $v_i = a_i b_j x^j$, which is equivalent to the equation $v_i = \sum_j a_i b_j x^j$.
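The distinction can be illustrated in a short NumPy sketch (arbitrary component values, assuming the equation above): the dummy index j can be renamed freely, while the free index i survives into the result.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
x = np.array([7.0, 8.0, 9.0])

# v_i = a_i b_j x^j: j is summed (dummy), i is free and indexes the result
v = np.einsum('i,j,j->i', a, b, x)

# Renaming the dummy index (j -> k) does not change the expression
assert np.allclose(v, np.einsum('i,k,k->i', a, b, x))
```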

Application

Einstein notation can be applied in slightly different ways. Typically, each index occurs once in an upper (superscript) and once in a lower (subscript) position in a term; however, the convention can be applied more generally to any repeated indices within a term.[2] When dealing with covariant and contravariant vectors, where the position of an index also indicates the type of vector, the first case usually applies; a covariant vector can only be contracted with a contravariant vector, corresponding to summation of the products of coefficients. On the other hand, when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts; see § Superscripts and subscripts versus only subscripts below.

Vector representations

Superscripts and subscripts versus only subscripts

In terms of covariance and contravariance of vectors,

- upper indices represent components of contravariant vectors (vectors),
- lower indices represent components of covariant vectors (covectors).

They transform contravariantly or covariantly, respectively, with respect to change of basis.

In recognition of this fact, the following notation uses the same symbol both for a vector or covector and its components, as in:

$$v = v^i e_i, \qquad w = w_i e^i,$$

where $v$ is the vector and $v^i$ are its components (not the $i$-th covector $v_i$), $w$ is the covector and $w_i$ are its components. The basis vector elements $e_i$ are each column vectors, and the covector basis elements $e^i$ are each row covectors. (See also § Abstract description; duality, below and the examples.)

In the presence of a non-degenerate form (an isomorphism $V \to V^*$, for instance a Riemannian metric or Minkowski metric), one can raise and lower indices.

A basis gives such a form (via the dual basis), hence when working on $\mathbb{R}^n$ with a Euclidean metric and a fixed orthonormal basis, one has the option to work with only subscripts.

However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see Covariance and contravariance of vectors.

Mnemonics

In the above example, vectors are represented as n × 1 matrices (column vectors), while covectors are represented as 1 × n matrices (row covectors).

When using the column vector convention:

- "Upper indices go up to down; lower indices go left to right."
- Contravariant vectors are column vectors, so an upper index indicates the row you are in.
- Covariant vectors (covectors) are row vectors, so a lower index indicates the column you are in.

Abstract description

The virtue of Einstein notation is that it represents the invariant quantities with a simple notation.

In physics, a scalar is invariant under transformations of basis. In particular, a Lorentz scalar is invariant under a Lorentz transformation. The individual terms in the sum are not: when the basis is changed, the components of a vector change by a linear transformation described by a matrix. This led Einstein to propose the convention that repeated indices imply that the summation is to be done.

As for covectors, they change by the inverse matrix. This is designed to guarantee that the linear function associated with the covector, the sum above, is the same no matter what the basis is.
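This invariance is easy to verify numerically. The sketch below assumes a generic (hence invertible) random matrix A whose columns express the new basis in terms of the old: vector components then transform by the inverse of A, covector components by its transpose, and the paired sum $w_i v^i$ is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(3)         # contravariant components v^i
w = rng.standard_normal(3)         # covariant components w_i
A = rng.standard_normal((3, 3))    # change-of-basis matrix (generic, so invertible)

# Components transform oppositely: v^i by the inverse of A, w_i by its transpose
v_new = np.linalg.inv(A) @ v
w_new = A.T @ w

# The scalar w_i v^i is the same in either basis
assert np.isclose(np.einsum('i,i->', w, v), np.einsum('i,i->', w_new, v_new))
```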

The value of the Einstein convention is that it applies to other vector spaces built from $V$ using the tensor product and duality. For example, $V \otimes V$, the tensor product of $V$ with itself, has a basis consisting of tensors of the form $e_{ij} = e_i \otimes e_j$. Any tensor $T$ in $V \otimes V$ can be written as:

$$T = T^{ij} e_{ij}.$$

$V^*$, the dual of $V$, has a basis $e^1, e^2, \dots, e^n$ which obeys the rule

$$e^i(e_j) = \delta^i_j,$$

where $\delta$ is the Kronecker delta. As

$$\operatorname{Hom}(V, W) = V^* \otimes W,$$

the row/column coordinates on a matrix correspond to the upper/lower indices on the tensor product.
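As a small numerical check of the dual-basis rule, the sketch below takes a (non-orthonormal) basis of $\mathbb{R}^3$ as the columns of a matrix E; the dual basis covectors are then the rows of E's inverse, and evaluating each covector on each basis vector reproduces the Kronecker delta:

```python
import numpy as np

# Columns of E are the basis vectors e_1, e_2, e_3 (a non-orthonormal example)
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Rows of the inverse are the dual basis covectors e^1, e^2, e^3
E_dual = np.linalg.inv(E)

# e^i(e_j) = delta^i_j: evaluating covectors on vectors gives the identity matrix
assert np.allclose(E_dual @ E, np.eye(3))
```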

Common operations in this notation

In Einstein notation, the usual element reference $A_{mn}$ for the $m$-th row and $n$-th column of matrix $A$ becomes $A^m{}_n$. We can then write the following operations in Einstein notation as follows.

Inner product

Using an orthogonal basis, the inner product (vector dot product) is the sum of corresponding components multiplied together:

$$\langle \mathbf{u}, \mathbf{v} \rangle = u_j v^j.$$

This can also be calculated by multiplying the covector on the vector.
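In code, this is a single contraction. A minimal NumPy sketch (arbitrary values):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# <u, v> = u_j v^j: the repeated index j is contracted away, leaving a scalar
inner = np.einsum('j,j->', u, v)
assert np.isclose(inner, np.dot(u, v))
```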

Vector cross product

Again using an orthogonal basis (in 3 dimensions), the cross product intrinsically involves summations over permutations of components:

$$\mathbf{u} \times \mathbf{v} = \varepsilon^i{}_{jk} u^j v^k \, e_i,$$

where

$$\varepsilon^i{}_{jk} = \delta^{il} \varepsilon_{ljk},$$

$\varepsilon_{ijk}$ is the Levi-Civita symbol, and $\delta^{il}$ is the generalized Kronecker delta. Based on this definition of $\varepsilon$, there is no difference between $\varepsilon^i{}_{jk}$ and $\varepsilon_{ijk}$ but the position of indices.
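A sketch of the same formula in NumPy, constructing the Levi-Civita symbol explicitly (component values arbitrary):

```python
import numpy as np

# Levi-Civita symbol in three dimensions: +1 on even permutations, -1 on odd
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# (u x v)^i = eps^i_jk u^j v^k: j and k are contracted, i remains free
cross = np.einsum('ijk,j,k->i', eps, u, v)
assert np.allclose(cross, np.cross(u, v))
```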

Matrix-vector multiplication

The product of a matrix $A^i{}_j$ with a column vector $v^j$ is:

$$u^i = A^i{}_j v^j,$$

equivalent to

$$u^i = \sum_{j=1}^{n} A^i{}_j v^j.$$

This is a special case of matrix multiplication.
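The corresponding einsum contraction, as a sketch with an arbitrary 2 × 3 matrix:

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)   # matrix components A^i_j
v = np.array([1.0, 2.0, 3.0])      # vector components v^j

# u^i = A^i_j v^j: the shared index j is summed over
u = np.einsum('ij,j->i', A, v)
assert np.allclose(u, A @ v)
```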

Matrix multiplication

The matrix product of two matrices $A^i{}_j$ and $B^j{}_k$ is:

$$C^i{}_k = A^i{}_j B^j{}_k,$$

equivalent to

$$C^i{}_k = \sum_{j=1}^{N} A^i{}_j B^j{}_k.$$
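The same contraction in NumPy (arbitrary compatible shapes):

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)    # A^i_j
B = np.arange(12.0).reshape(3, 4)   # B^j_k

# C^i_k = A^i_j B^j_k: the inner index j is contracted
C = np.einsum('ij,jk->ik', A, B)
assert np.allclose(C, A @ B)
```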

Trace

For a square matrix $A^i{}_j$, the trace is the sum of the diagonal elements, hence the sum over a common index: $A^i{}_i$.
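In einsum notation, repeating the index on a single operand performs exactly this contraction:

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)

# tr(A) = A^i_i: setting both indices equal and summing over them
trace = np.einsum('ii->', A)
assert np.isclose(trace, np.trace(A))
```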

Outer product

The outer product of the column vector $u^i$ by the row vector $v_j$ yields an $m \times n$ matrix $A$:

$$A^i{}_j = u^i v_j.$$

Since i and j represent two different indices, there is no summation and the indices are not eliminated by the multiplication.
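A sketch of the outer product, where no index repeats and hence nothing is summed (m = 2, n = 3 chosen arbitrarily):

```python
import numpy as np

u = np.array([1.0, 2.0])           # u^i, with m = 2
v = np.array([3.0, 4.0, 5.0])      # v_j, with n = 3

# A^i_j = u^i v_j: i and j are distinct free indices, so no summation occurs
A = np.einsum('i,j->ij', u, v)
assert A.shape == (2, 3)
assert np.allclose(A, np.outer(u, v))
```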

Raising and lowering indices

Given a tensor, one can raise an index or lower an index by contracting the tensor with the metric tensor, $g_{\mu\nu}$. For example, taking the tensor $T^{\alpha\beta}$, one can lower an index:

$$g_{\mu\sigma} T^{\sigma\beta} = T_\mu{}^\beta.$$

Or one can raise an index:

$$g^{\mu\sigma} T_\sigma{}^\alpha = T^{\mu\alpha}.$$
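A numerical sketch, assuming for illustration the Minkowski metric with signature (−, +, +, +) and an arbitrary tensor $T^{\alpha\beta}$; raising with the inverse metric undoes the lowering:

```python
import numpy as np

# Minkowski metric g_{mu nu} with signature (-, +, +, +); its inverse raises indices
g = np.diag([-1.0, 1.0, 1.0, 1.0])
g_inv = np.linalg.inv(g)

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 4))    # components T^{alpha beta}

# Lower the first index: T_mu^beta = g_{mu sigma} T^{sigma beta}
T_lowered = np.einsum('ms,sb->mb', g, T)

# Raise it back: g^{mu sigma} T_sigma^beta recovers T^{mu beta}
assert np.allclose(np.einsum('ms,sb->mb', g_inv, T_lowered), T)
```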

Notes

  1. This applies only for numerical indices. The situation is the opposite for abstract indices. Then, vectors themselves carry upper abstract indices and covectors carry lower abstract indices, as per the example in the introduction of this article. Elements of a basis of vectors may carry a lower numerical index and an upper abstract index.

References

  1. Einstein, Albert (1916). "The Foundation of the General Theory of Relativity". Annalen der Physik. 354 (7): 769. Bibcode:1916AnP...354..769E. doi:10.1002/andp.19163540702. Archived from the original (PDF) on 2006-08-29. Retrieved 2006-09-03.
  2. "Einstein Summation". Wolfram Mathworld. Retrieved 13 April 2011.
