Change of basis

Last updated October 16, 2024

Linear combinations of the vectors of one basis (purple) yield new vectors (red). If these are linearly independent, they form a new basis. The linear combinations relating the first basis to the other extend to a linear transformation, called the change of basis.

A vector represented by two different bases (purple and red arrows).

In mathematics, an ordered basis of a vector space of finite dimension $n$ allows representing uniquely any element of the vector space by a coordinate vector, which is a sequence of $n$ scalars called coordinates. If two different bases are considered, the coordinate vector that represents a vector $v$ on one basis is, in general, different from the coordinate vector that represents $v$ on the other basis. A change of basis consists of converting every assertion expressed in terms of coordinates relative to one basis into an assertion expressed in terms of coordinates relative to the other basis.^[1]^[2]^[3]

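Such a conversion results from the change-of-basis formula, which expresses the coordinates relative to one basis in terms of the coordinates relative to the other basis. Using matrices, this formula can be written

\mathbf {x} _{\mathrm {old} }=A\,\mathbf {x} _{\mathrm {new} },

where "old" and "new" refer respectively to the initially defined basis and the other basis, $\mathbf {x} _{\mathrm {old} }$ and $\mathbf {x} _{\mathrm {new} }$ are the column vectors of the coordinates of the same vector on the two bases, and $A$ is the change-of-basis matrix (also called transition matrix), which is the matrix whose columns are the coordinates of the new basis vectors on the old basis.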

A change of basis is sometimes called a change of coordinates, although it excludes many coordinate transformations. For applications in physics and especially in mechanics, a change of basis often involves the transformation of an orthonormal basis, understood as a rotation in physical space, thus excluding translations. This article deals mainly with finite-dimensional vector spaces. However, many of the principles are also valid for infinite-dimensional vector spaces.

Change of basis formula

Let $B_{\mathrm {old} }=(v_{1},\ldots ,v_{n})$ be a basis of a finite-dimensional vector space $V$ over a field $F$ .^{[lower-alpha 1]}

For $j=1,\ldots ,n,$ one can define a vector $w_{j}$ by its coordinates $a_{i,j}$ over $B_{\mathrm {old} }\colon$

w_{j}=\sum _{i=1}^{n}a_{i,j}v_{i}.

Let

A=\left(a_{i,j}\right)_{i,j}

be the matrix whose $j$ th column is formed by the coordinates of $w_{j}$ . (Here and in what follows, the index $i$ refers always to the rows of $A$ and the $v_{i},$ while the index $j$ refers always to the columns of $A$ and the $w_{j};$ such a convention is useful for avoiding errors in explicit computations.)

Setting $B_{\mathrm {new} }=(w_{1},\ldots ,w_{n}),$ one has that $B_{\mathrm {new} }$ is a basis of $V$ if and only if the matrix $A$ is invertible, or equivalently if it has a nonzero determinant. In this case, $A$ is said to be the change-of-basis matrix from the basis $B_{\mathrm {old} }$ to the basis $B_{\mathrm {new} }.$
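The invertibility criterion can be illustrated numerically. The following is a minimal sketch for the $2\times 2$ case only; the matrices `A_good` and `A_bad` and the helper `det2` are hypothetical names chosen for this illustration and do not come from the article.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    return a * d - b * c

# Hypothetical coefficient matrices: column j lists the coordinates of w_j
# over an old basis (v_1, v_2).
A_good = [[2, 1],
          [1, 1]]   # w_1 = 2 v_1 + v_2,  w_2 = v_1 + v_2
A_bad = [[1, 2],
         [2, 4]]    # w_1 = v_1 + 2 v_2,  w_2 = 2 v_1 + 4 v_2 = 2 w_1

for name, m in (("A_good", A_good), ("A_bad", A_bad)):
    # A nonzero determinant means the w_j form a basis.
    print(name, "det =", det2(m), "basis:", det2(m) != 0)
```

In the second case the two new vectors are proportional, so they cannot form a basis, which matches the vanishing determinant.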

Given a vector $z\in V,$ let $(x_{1},\ldots ,x_{n})$ be the coordinates of $z$ over $B_{\mathrm {old} },$ and $(y_{1},\ldots ,y_{n})$ its coordinates over $B_{\mathrm {new} };$ that is

z=\sum _{i=1}^{n}x_{i}v_{i}=\sum _{j=1}^{n}y_{j}w_{j}.

(One could take the same summation index for the two sums, but systematically choosing the index $i$ for the old basis and $j$ for the new one makes the formulas that follow clearer, and helps avoid errors in proofs and explicit computations.)

The change-of-basis formula expresses the coordinates over the old basis in terms of the coordinates over the new basis. With the above notation, it is

x_{i}=\sum _{j=1}^{n}a_{i,j}y_{j}\qquad {\text{for }}i=1,\ldots ,n.

In terms of matrices, the change of basis formula is

\mathbf {x} =A\,\mathbf {y} ,

where $\mathbf {x}$ and $\mathbf {y}$ are the column vectors of the coordinates of $z$ over $B_{\mathrm {old} }$ and $B_{\mathrm {new} },$ respectively.

Proof: Using the above definition of the change-of-basis matrix, one has

{\begin{aligned}z&=\sum _{j=1}^{n}y_{j}w_{j}\\&=\sum _{j=1}^{n}\left(y_{j}\sum _{i=1}^{n}a_{i,j}v_{i}\right)\\&=\sum _{i=1}^{n}\left(\sum _{j=1}^{n}a_{i,j}y_{j}\right)v_{i}.\end{aligned}}

As $z=\textstyle \sum _{i=1}^{n}x_{i}v_{i},$ the change-of-basis formula results from the uniqueness of the decomposition of a vector over a basis.
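The formula can also be checked on concrete numbers. The sketch below assumes a hypothetical old basis of $\mathbb {R} ^{2}$ and a hypothetical matrix $A$ (neither is taken from the article); it builds the new basis from the columns of $A$, computes $\mathbf {x} =A\,\mathbf {y}$, and confirms that both coordinate vectors describe the same vector $z$.

```python
# Hypothetical data for illustration only.
v = [(1, 1), (0, 2)]   # old basis vectors v_1, v_2, written as tuples in R^2
A = [[2, 1],
     [1, 1]]           # column j holds the coordinates of w_j over the old basis

def combine(coords, basis):
    """Return sum_k coords[k] * basis[k] as a tuple."""
    dim = len(basis[0])
    return tuple(sum(c * b[d] for c, b in zip(coords, basis)) for d in range(dim))

n = len(v)

# w_j = sum_i a_{i,j} v_i
w = [combine([A[i][j] for i in range(n)], v) for j in range(n)]

y = [3, -1]                                                     # new coordinates of z
x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]   # x = A y

z_from_new = combine(y, w)   # z = sum_j y_j w_j
z_from_old = combine(x, v)   # z = sum_i x_i v_i

print("new basis:", w)
print("old coordinates:", x)
print("z from new:", z_from_new, "z from old:", z_from_old)
```

Both reconstructions of $z$ agree, as the uniqueness argument in the proof predicts.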

Example

Consider the Euclidean vector space $\mathbb {R} ^{2}$ and a basis consisting of the vectors $v_{1}=(1,0)$ and $v_{2}=(0,1).$ If one rotates them by an angle of $t$ , one has a new basis formed by $w_{1}=(\cos t,\sin t)$ and $w_{2}=(-\sin t,\cos t).$

So, the change-of-basis matrix is ${\begin{bmatrix}\cos t&-\sin t\\\sin t&\cos t\end{bmatrix}}.$

The change-of-basis formula asserts that, if $y_{1},y_{2}$ are the new coordinates of a vector $(x_{1},x_{2}),$ then one has

{\begin{bmatrix}x_{1}\\x_{2}\end{bmatrix}}={\begin{bmatrix}\cos t&-\sin t\\\sin t&\cos t\end{bmatrix}}\,{\begin{bmatrix}y_{1}\\y_{2}\end{bmatrix}}.

That is,

x_{1}=y_{1}\cos t-y_{2}\sin t\qquad {\text{and}}\qquad x_{2}=y_{1}\sin t+y_{2}\cos t.

This may be verified by writing

{\begin{aligned}x_{1}v_{1}+x_{2}v_{2}&=(y_{1}\cos t-y_{2}\sin t)v_{1}+(y_{1}\sin t+y_{2}\cos t)v_{2}\\&=y_{1}(\cos(t)v_{1}+\sin(t)v_{2})+y_{2}(-\sin(t)v_{1}+\cos(t)v_{2})\\&=y_{1}w_{1}+y_{2}w_{2}.\end{aligned}}
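The same verification can be carried out numerically. This is a minimal sketch in which the angle and the new coordinates are arbitrary values chosen for illustration; since the old basis is the standard basis, the old coordinates of a vector coincide with its components.

```python
import math

t = math.pi / 6   # hypothetical rotation angle

# New basis vectors, as in the example above.
w1 = (math.cos(t), math.sin(t))
w2 = (-math.sin(t), math.cos(t))

def old_coords(y1, y2):
    """Change-of-basis formula x = A y for the rotation matrix A."""
    x1 = y1 * math.cos(t) - y2 * math.sin(t)
    x2 = y1 * math.sin(t) + y2 * math.cos(t)
    return (x1, x2)

y1, y2 = 2.0, 1.0          # hypothetical new coordinates
x = old_coords(y1, y2)

# The vector y_1 w_1 + y_2 w_2, computed directly from the new basis.
z = (y1 * w1[0] + y2 * w2[0], y1 * w1[1] + y2 * w2[1])

print("x =", x)
print("z =", z)
```

Up to floating-point rounding, `x` and `z` are equal.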

In terms of linear maps

Normally, a matrix represents a linear map, and the product of a matrix and a column vector represents the function application of the corresponding linear map to the vector whose coordinates form the column vector. The change-of-basis formula is a specific case of this general principle, although this is not immediately clear from its definition and proof.

When one says that a matrix represents a linear map, one refers implicitly to bases of the vector spaces involved, and to the fact that the choice of a basis induces an isomorphism between a vector space and $F^{n}$ , where $F$ is the field of scalars. When only one basis is considered for each vector space, it is worthwhile to leave this isomorphism implicit, and to work up to an isomorphism. As several bases of the same vector space are considered here, a more accurate wording is required.

Let $F$ be a field. The set $F^{n}$ of the $n$ -tuples is an $F$ -vector space whose addition and scalar multiplication are defined component-wise. Its standard basis is the basis that has as its $i$ th element the tuple with all components equal to $0$ except the $i$ th, which is $1$ .

A basis $B=(v_{1},\ldots ,v_{n})$ of an $F$ -vector space $V$ defines a linear isomorphism $\phi \colon F^{n}\to V$ by

\phi (x_{1},\ldots ,x_{n})=\sum _{i=1}^{n}x_{i}v_{i}.

Conversely, such a linear isomorphism defines a basis, which is the image by $\phi$ of the standard basis of $F^{n}.$

Let $B_{\mathrm {old} }=(v_{1},\ldots ,v_{n})$ be the "old basis" of a change of basis, and $\phi _{\mathrm {old} }$ the associated isomorphism. Given a change-of-basis matrix $A$ , one can consider it as the matrix of an endomorphism $\psi _{A}$ of $F^{n}.$ Finally, define

\phi _{\mathrm {new} }=\phi _{\mathrm {old} }\circ \psi _{A}

(where $\circ$ denotes function composition), and

B_{\mathrm {new} }=\phi _{\mathrm {new} }(\phi _{\mathrm {old} }^{-1}(B_{\mathrm {old} })).

A straightforward verification shows that this definition of $B_{\mathrm {new} }$ is the same as that of the preceding section.

Now, by composing the equation $\phi _{\mathrm {new} }=\phi _{\mathrm {old} }\circ \psi _{A}$ with $\phi _{\mathrm {old} }^{-1}$ on the left and $\phi _{\mathrm {new} }^{-1}$ on the right, one gets

\phi _{\mathrm {old} }^{-1}=\psi _{A}\circ \phi _{\mathrm {new} }^{-1}.

It follows that, for $v\in V,$ one has

\phi _{\mathrm {old} }^{-1}(v)=\psi _{A}(\phi _{\mathrm {new} }^{-1}(v)),

which is the change-of-basis formula expressed in terms of linear maps instead of coordinates.
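The construction can be mirrored with ordinary functions. The sketch below assumes $F=\mathbb {R}$ , $n=2$ , and hypothetical values for the old basis and for $A$ ; the Python names `phi_old`, `psi_A` and `phi_new` simply transcribe the maps defined above.

```python
# Hypothetical data for illustration only.
OLD_BASIS = [(1, 1), (0, 2)]   # v_1, v_2
A = [[2, 1],
     [1, 1]]

def phi_old(x):
    """Isomorphism F^n -> V attached to the old basis: x |-> sum_i x_i v_i."""
    return tuple(sum(xi * vi[d] for xi, vi in zip(x, OLD_BASIS)) for d in range(2))

def psi_A(y):
    """Endomorphism of F^n whose matrix is A: y |-> A y."""
    return tuple(sum(A[i][j] * y[j] for j in range(2)) for i in range(2))

def phi_new(y):
    """phi_new = phi_old composed with psi_A."""
    return phi_old(psi_A(y))

# The new basis is the image of the standard basis of F^n under phi_new.
w1, w2 = phi_new((1, 0)), phi_new((0, 1))

y = (3, -1)      # coordinates of some vector over the new basis
v = phi_new(y)   # the vector itself
x = psi_A(y)     # its coordinates over the old basis

print("new basis:", w1, w2)
print("vector:", v, "old coordinates:", x)
```

Here `phi_old(x)` returns the same vector `v`, which is the identity $\phi _{\mathrm {old} }^{-1}(v)=\psi _{A}(\phi _{\mathrm {new} }^{-1}(v))$ read on a particular vector.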

Function defined on a vector space

A function that has a vector space as its domain is commonly specified as a multivariate function whose variables are the coordinates on some basis of the vector on which the function is applied.

When the basis is changed, the expression of the function is changed. This change can be computed by replacing the "old" coordinates with their expressions in terms of the "new" coordinates. More precisely, if $f(x)$ is the expression of the function in terms of the old coordinates, and if $x=Ay$ is the change-of-basis formula, then $f(Ay)$ is the expression of the same function in terms of the new coordinates.

The fact that the change-of-basis formula expresses the old coordinates in terms of the new ones may seem unnatural, but it proves useful, as no matrix inversion is needed here.
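As a small illustration of this substitution, the sketch below takes a hypothetical function $f(x)=x_{1}^{2}+x_{2}^{2}$ and a hypothetical change-of-basis matrix $A$ (both chosen for this example), and compares $f(Ay)$ with its expansion computed by hand.

```python
A = [[2, 1],
     [1, 1]]   # hypothetical change-of-basis matrix

def f(x):
    """A function written in the old coordinates: f(x) = x_1^2 + x_2^2."""
    return x[0] ** 2 + x[1] ** 2

def to_old(y):
    """Change-of-basis formula x = A y."""
    return [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]

def g(y):
    """The same function in the new coordinates: g(y) = f(A y)."""
    return f(to_old(y))

def g_expanded(y):
    """Hand expansion of (2 y_1 + y_2)^2 + (y_1 + y_2)^2."""
    return 5 * y[0] ** 2 + 6 * y[0] * y[1] + 2 * y[1] ** 2

for y in [(3, -1), (0, 1), (1, 1)]:
    print(y, g(y), g_expanded(y))
```

No inverse of $A$ is computed anywhere, and the polynomial character of the function survives the substitution.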

As the change-of-basis formula involves only linear functions, many function properties are kept by a change of basis. This allows defining these properties as properties of functions of a variable vector that are not related to any specific basis. So, a function whose domain is a vector space or a subset of it is said to be linear, polynomial, continuous, differentiable, smooth, or analytic if the multivariate function that represents it on some basis, and thus on every basis, has the same property.

This is especially useful in the theory of manifolds, as this allows extending the concepts of continuous, differentiable, smooth and analytic functions to functions that are defined on a manifold.

Linear maps

Consider a linear map $T:W\to V$ from a vector space $W$ of dimension $n$ to a vector space $V$ of dimension $m$ . It is represented on "old" bases of $V$ and $W$ by an $m\times n$ matrix $M$ . A change of bases is defined by an $m\times m$ change-of-basis matrix $P$ for $V$ , and an $n\times n$ change-of-basis matrix $Q$ for $W$ .

On the "new" bases, the matrix of $T$ is

P^{-1}MQ.

This is a straightforward consequence of the change-of-basis formula.
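The computation can be sketched as follows. The notation is introduced here for illustration only: $\mathbf {x} _{\mathrm {old} },\mathbf {x} _{\mathrm {new} }$ denote the coordinate columns of a vector of $W$ on the old and new bases of $W$ , and $\mathbf {y} _{\mathrm {old} },\mathbf {y} _{\mathrm {new} }$ those of its image by $T$ on the old and new bases of $V$ . The change-of-basis formula gives $\mathbf {x} _{\mathrm {old} }=Q\,\mathbf {x} _{\mathrm {new} }$ and $\mathbf {y} _{\mathrm {old} }=P\,\mathbf {y} _{\mathrm {new} },$ while $\mathbf {y} _{\mathrm {old} }=M\,\mathbf {x} _{\mathrm {old} }$ by definition of $M$ . Therefore

{\begin{aligned}\mathbf {y} _{\mathrm {new} }&=P^{-1}\mathbf {y} _{\mathrm {old} }\\&=P^{-1}M\,\mathbf {x} _{\mathrm {old} }\\&=\left(P^{-1}MQ\right)\mathbf {x} _{\mathrm {new} }.\end{aligned}}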

Endomorphisms

Endomorphisms are linear maps from a vector space $V$ to itself. For a change of basis, the formula of the preceding section applies, with the same change-of-basis matrix on both sides of the formula. That is, if $M$ is the square matrix of an endomorphism of $V$ over an "old" basis, and $P$ is a change-of-basis matrix, then the matrix of the endomorphism on the "new" basis is

P^{-1}MP.

As every invertible matrix can be used as a change-of-basis matrix, this implies that two matrices are similar if and only if they represent the same endomorphism on two different bases.
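A numerical check of the formula $P^{-1}MP$ may look as follows. This is a minimal sketch restricted to $2\times 2$ matrices with hypothetical entries; the helpers `matmul`, `matvec` and `inverse_2x2` are written for this example and use exact rational arithmetic.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(X, u):
    """Product of a matrix and a column vector given as a list."""
    return [sum(X[i][k] * u[k] for k in range(len(u))) for i in range(len(X))]

def inverse_2x2(P):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[2, 1],
     [0, 3]]   # hypothetical matrix of an endomorphism on the old basis
P = [[1, 1],
     [0, 1]]   # hypothetical change-of-basis matrix

M_new = matmul(matmul(inverse_2x2(P), M), P)   # matrix on the new basis

y = [1, 2]                                  # new coordinates of some vector
image_old = matvec(M, matvec(P, y))         # apply the map in old coordinates
image_new_as_old = matvec(P, matvec(M_new, y))  # apply in new coordinates, convert

print("M_new =", M_new)
print(image_old, image_new_as_old)
```

With these values the columns of `P` happen to be eigenvectors of `M`, so `M_new` is diagonal; both ways of computing the image give the same old coordinates.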

Bilinear forms

A bilinear form on a vector space $V$ over a field $F$ is a function $V\times V\to F$ which is linear in both arguments. That is, $B:V\times V\to F$ is bilinear if the maps $v\mapsto B(v,w)$ and $v\mapsto B(w,v)$ are linear for every fixed $w\in V.$

The matrix $\mathbf {B}$ of a bilinear form $B$ on a basis $(v_{1},\ldots ,v_{n})$ (the "old" basis in what follows) is the matrix whose entry of the $i$ th row and $j$ th column is $B(v_{i},v_{j})$ . It follows that if $\mathbf {v}$ and $\mathbf {w}$ are the column vectors of the coordinates of two vectors $v$ and $w$ , one has

B(v,w)=\mathbf {v} ^{\mathsf {T}}\mathbf {B} \mathbf {w} ,

where $\mathbf {v} ^{\mathsf {T}}$ denotes the transpose of the matrix $\mathbf {v}$ .

If $P$ is a change-of-basis matrix, then a straightforward computation shows that the matrix of the bilinear form on the new basis is

P^{\mathsf {T}}\mathbf {B} P.
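The computation can be sketched as follows, with notation introduced here for illustration only: $\mathbf {v} _{\mathrm {old} },\mathbf {w} _{\mathrm {old} }$ are the coordinate columns of $v$ and $w$ on the old basis, and $\mathbf {v} _{\mathrm {new} },\mathbf {w} _{\mathrm {new} }$ those on the new basis, so that $\mathbf {v} _{\mathrm {old} }=P\,\mathbf {v} _{\mathrm {new} }$ and $\mathbf {w} _{\mathrm {old} }=P\,\mathbf {w} _{\mathrm {new} }.$ Then

{\begin{aligned}B(v,w)&=\mathbf {v} _{\mathrm {old} }^{\mathsf {T}}\,\mathbf {B} \,\mathbf {w} _{\mathrm {old} }\\&=(P\,\mathbf {v} _{\mathrm {new} })^{\mathsf {T}}\,\mathbf {B} \,(P\,\mathbf {w} _{\mathrm {new} })\\&=\mathbf {v} _{\mathrm {new} }^{\mathsf {T}}\left(P^{\mathsf {T}}\mathbf {B} P\right)\mathbf {w} _{\mathrm {new} }.\end{aligned}}

Note that the transpose $P^{\mathsf {T}}$ appears here where the inverse $P^{-1}$ appears in the formula for endomorphisms.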

A symmetric bilinear form is a bilinear form $B$ such that $B(v,w)=B(w,v)$ for every $v$ and $w$ in $V$ . It follows that the matrix of $B$ on any basis is symmetric. This implies that the property of being a symmetric matrix must be kept by the above change-of-basis formula. One can also check this by noting that the transpose of a matrix product is the product of the transposes computed in the reverse order. In particular,

(P^{\mathsf {T}}\mathbf {B} P)^{\mathsf {T}}=P^{\mathsf {T}}\mathbf {B} ^{\mathsf {T}}P,

and the two members of this equation equal $P^{\mathsf {T}}\mathbf {B} P$ if the matrix $\mathbf {B}$ is symmetric.

If the characteristic of the ground field $F$ is not two, then for every symmetric bilinear form there is a basis for which the matrix is diagonal. Moreover, the resulting nonzero entries on the diagonal are defined up to multiplication by a square. So, if the ground field is the field $\mathbb {R}$ of the real numbers, these nonzero entries can be chosen to be either $1$ or $-1$ . Sylvester's law of inertia is a theorem that asserts that the numbers of $1$ and of $-1$ depend only on the bilinear form, and not on the change of basis.
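As a small worked illustration over $\mathbb {R}$ (the matrices are chosen for this example and do not come from the article), take a diagonal matrix $\mathbf {B}$ with entries $4$ and $-9$ and a diagonal change-of-basis matrix $P$ with entries ${\tfrac {1}{2}}$ and ${\tfrac {1}{3}}$ :

P^{\mathsf {T}}\mathbf {B} P={\begin{bmatrix}{\tfrac {1}{2}}&0\\0&{\tfrac {1}{3}}\end{bmatrix}}{\begin{bmatrix}4&0\\0&-9\end{bmatrix}}{\begin{bmatrix}{\tfrac {1}{2}}&0\\0&{\tfrac {1}{3}}\end{bmatrix}}={\begin{bmatrix}1&0\\0&-1\end{bmatrix}}.

Each diagonal entry has been multiplied by a square ( ${\tfrac {1}{4}}$ and ${\tfrac {1}{9}}$ respectively), and the result has one entry equal to $1$ and one equal to $-1$ , the counts that Sylvester's law of inertia asserts to be independent of the basis.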

Symmetric bilinear forms over the reals are often encountered in geometry and physics, typically in the study of quadrics and of the inertia of a rigid body. In these cases, orthonormal bases are especially useful; this means that one generally prefers to restrict changes of basis to those that have an orthogonal change-of-basis matrix, that is, a matrix such that $P^{\mathsf {T}}=P^{-1}.$ Such matrices have the fundamental property that the change-of-basis formula is the same for a symmetric bilinear form and the endomorphism that is represented by the same symmetric matrix. The spectral theorem asserts that, given such a symmetric matrix, there is an orthogonal change of basis such that the resulting matrix (of both the bilinear form and the endomorphism) is a diagonal matrix with the eigenvalues of the initial matrix on the diagonal. It follows that, over the reals, if the matrix of an endomorphism is symmetric, then it is diagonalizable.
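The orthogonal case can be illustrated on a $2\times 2$ example. In the sketch below, the symmetric matrix `B` and the rotation `P` by an angle of $\pi /4$ are hypothetical values chosen so that the computation can be followed by hand; the helpers `matmul` and `transpose` are written for this example.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]

s = 1 / math.sqrt(2)
B = [[2.0, 1.0],
     [1.0, 2.0]]   # hypothetical symmetric matrix
P = [[s, -s],
     [s, s]]       # rotation by pi/4, an orthogonal matrix

PtP = matmul(transpose(P), P)              # should be the identity matrix
D = matmul(matmul(transpose(P), B), P)     # P^T B P, equal to P^{-1} B P here

print("P^T P =", PtP)
print("P^T B P =", D)
```

Up to rounding, `PtP` is the identity and `D` is diagonal with entries $3$ and $1$ , which are the eigenvalues of `B`.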

Notes

↑ Although a basis is generally defined as a set of vectors (for example, as a spanning set that is linearly independent), the tuple notation is convenient here, since the indexing by the first positive integers makes the basis an ordered basis.

References

[1] Anton (1987, pp. 221–237)
[2] Beauregard & Fraleigh (1973, pp. 240–243)
[3] Nering (1970, pp. 50–52)

Bibliography

Anton, Howard (1987), Elementary Linear Algebra (5th ed.), New York: Wiley, ISBN 0-471-84819-0
Beauregard, Raymond A.; Fraleigh, John B. (1973), A First Course In Linear Algebra: with Optional Introduction to Groups, Rings, and Fields, Boston: Houghton Mifflin Company, ISBN 0-395-14017-X
Nering, Evar D. (1970), Linear Algebra and Matrix Theory (2nd ed.), New York: Wiley, LCCN 76091646

External links

MIT Linear Algebra Lecture on Change of Basis, from MIT OpenCourseWare
Khan Academy Lecture on Change of Basis, from Khan Academy

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
