Idempotent matrix

In linear algebra, an idempotent matrix is a matrix which, when multiplied by itself, yields itself.[1][2] That is, the matrix $A$ is idempotent if and only if $A^2 = A$. For this product to be defined, $A$ must necessarily be a square matrix. Viewed this way, idempotent matrices are idempotent elements of matrix rings.

Example

Examples of $2 \times 2$ idempotent matrices are:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 3 & -6 \\ 1 & -2 \end{pmatrix}.$$

Examples of $3 \times 3$ idempotent matrices are:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{pmatrix}.$$
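
The defining property is easy to check numerically. A minimal NumPy sketch, using the non-identity example matrices shown above, verifies that each one squares to itself:

```python
# Minimal sketch: check that the example matrices above are idempotent.
import numpy as np

examples = [
    np.array([[3.0, -6.0],
              [1.0, -2.0]]),
    np.array([[ 2.0, -2.0, -4.0],
              [-1.0,  3.0,  4.0],
              [ 1.0, -2.0, -3.0]]),
]

for A in examples:
    assert np.allclose(A @ A, A)   # A squared equals A
```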

Real 2 × 2 case

If a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is idempotent, then

$$a = a^2 + bc, \qquad b = ab + bd, \qquad c = ca + cd, \qquad d = bc + d^2.$$

The off-diagonal conditions imply $b(1 - a - d) = 0$ and $c(1 - a - d) = 0$. Thus, a necessary condition for a $2 \times 2$ matrix to be idempotent is that either it is diagonal or its trace equals 1. For idempotent diagonal matrices, $a$ and $d$ must be either 1 or 0.

If $b = c$, the matrix $\begin{pmatrix} a & b \\ b & 1 - a \end{pmatrix}$ will be idempotent provided $a^2 + b^2 = a,$ so $a$ satisfies the quadratic equation

$$a^2 - a + b^2 = 0,$$

or

$$\left(a - \tfrac{1}{2}\right)^2 + b^2 = \tfrac{1}{4},$$

which is a circle with center (1/2, 0) and radius 1/2. In terms of an angle θ,

$$A = \frac{1}{2}\begin{pmatrix} 1 - \cos\theta & \sin\theta \\ \sin\theta & 1 + \cos\theta \end{pmatrix}$$

is idempotent.

However, $b = c$ is not a necessary condition: any matrix

$$\begin{pmatrix} a & b \\ c & 1 - a \end{pmatrix}$$

with $a^2 + bc = a$ is idempotent.
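
As an illustrative check of the parameterization above, the following sketch confirms numerically that the angle-parameterized matrix is idempotent and has trace 1; the sample angles are arbitrary:

```python
# Minimal sketch: (1/2) [[1 - cos t, sin t], [sin t, 1 + cos t]]
# is idempotent for every angle t, and its trace is 1.
import numpy as np

for theta in np.linspace(0.0, 2.0 * np.pi, 7):
    A = 0.5 * np.array([[1.0 - np.cos(theta), np.sin(theta)],
                        [np.sin(theta), 1.0 + np.cos(theta)]])
    assert np.allclose(A @ A, A)
    assert np.isclose(np.trace(A), 1.0)
```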

Properties

Singularity and regularity

The only non-singular idempotent matrix is the identity matrix; that is, if a non-identity matrix is idempotent, its number of independent rows (and columns) is less than its number of rows (and columns).

This can be seen from writing $A^2 = A$, assuming that $A$ has full rank (is non-singular), and pre-multiplying by $A^{-1}$ to obtain $A = A^{-1}A^2 = A^{-1}A = I$.

When an idempotent matrix is subtracted from the identity matrix, the result is also idempotent. This holds since

$$(I - A)(I - A) = I - A - A + A^2 = I - A - A + A = I - A.$$

If a matrix $A$ is idempotent then $A^n = A$ for all positive integers $n$. This can be shown using proof by induction. Clearly we have the result for $n = 1$, as $A^1 = A$. Suppose that $A^{k-1} = A$. Then $A^k = A^{k-1}A = AA = A$, since $A$ is idempotent. Hence by the principle of induction, the result follows.
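
Both facts are straightforward to confirm numerically; the sketch below uses one of the example matrices from above (the choice of matrix and of the exponent 5 is arbitrary):

```python
# Minimal sketch: I - A is idempotent, and A^n = A for positive integers n.
import numpy as np

A = np.array([[3.0, -6.0],
              [1.0, -2.0]])                            # an idempotent example from above
I = np.eye(2)

B = I - A
assert np.allclose(B @ B, B)                           # (I - A)^2 = I - A
assert np.allclose(np.linalg.matrix_power(A, 5), A)    # A^5 = A
```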

Eigenvalues

An idempotent matrix is always diagonalizable.[3] Its eigenvalues are either 0 or 1: if $\mathbf{x}$ is a non-zero eigenvector of some idempotent matrix $A$ and $\lambda$ its associated eigenvalue, then

$$\lambda\mathbf{x} = A\mathbf{x} = A^2\mathbf{x} = A(\lambda\mathbf{x}) = \lambda A\mathbf{x} = \lambda^2\mathbf{x},$$

which implies $\lambda(\lambda - 1) = 0$, so $\lambda \in \{0, 1\}$. This further implies that the determinant of an idempotent matrix is always 0 or 1. As stated above, if the determinant is equal to one, the matrix is invertible and is therefore the identity matrix.
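
A quick numerical illustration, using the 3 × 3 example matrix from above (any idempotent matrix would do):

```python
# Minimal sketch: the eigenvalues of an idempotent matrix are 0 or 1,
# so its determinant is 0 or 1.
import numpy as np

A = np.array([[ 2.0, -2.0, -4.0],
              [-1.0,  3.0,  4.0],
              [ 1.0, -2.0, -3.0]])          # an idempotent example from above

for lam in np.linalg.eigvals(A):
    assert np.isclose(lam, 0.0) or np.isclose(lam, 1.0)

det = np.linalg.det(A)
assert np.isclose(det, 0.0) or np.isclose(det, 1.0)
```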

Trace

The trace of an idempotent matrix — the sum of the elements on its main diagonal — equals the rank of the matrix and thus is always an integer. This provides an easy way of computing the rank, or alternatively an easy way of determining the trace of a matrix whose elements are not specifically known (which is helpful in statistics, for example, in establishing the degree of bias in using a sample variance as an estimate of a population variance).
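
For example, the 3 × 3 example matrix from above has trace 2 + 3 − 3 = 2 and rank 2, as the following sketch confirms:

```python
# Minimal sketch: for an idempotent matrix, trace equals rank.
import numpy as np

A = np.array([[ 2.0, -2.0, -4.0],
              [-1.0,  3.0,  4.0],
              [ 1.0, -2.0, -3.0]])          # an idempotent example from above

assert np.isclose(np.trace(A), np.linalg.matrix_rank(A))   # both equal 2
```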

Relationships between idempotent matrices

In regression analysis, the matrix $M = I - X(X^\mathsf{T}X)^{-1}X^\mathsf{T}$ is known to produce the residuals $e$ from the regression of the vector of dependent variables $y$ on the matrix of covariates $X$. (See the section on Applications.) Now, let $X_1$ be a matrix formed from a subset of the columns of $X$, and let $M_1 = I - X_1(X_1^\mathsf{T}X_1)^{-1}X_1^\mathsf{T}$. It is easy to show that both $M$ and $M_1$ are idempotent, but a somewhat surprising fact is that $MM_1 = M$. This is because $MX_1 = 0$, or in other words, the residuals from the regression of the columns of $X_1$ on $X$ are 0 since $X_1$ can be perfectly interpolated as it is a subset of $X$ (by direct substitution it is also straightforward to show that $MX = 0$). This leads to two other important results: one is that $(M_1 - M)$ is symmetric and idempotent, and the other is that $(M_1 - M)M = 0$, i.e., $(M_1 - M)$ is orthogonal to $M$. These results play a key role, for example, in the derivation of the F test.
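
These identities can be illustrated numerically. The sketch below uses simulated data; the names X, X1, and residual_maker are ad hoc choices for this illustration, not notation from the article:

```python
# Minimal sketch: M and M1 are idempotent, M X1 = 0, M M1 = M,
# and M1 - M is symmetric, idempotent, and orthogonal to M.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))            # simulated covariate matrix
X1 = X[:, :2]                               # a subset of the columns of X

def residual_maker(Z):
    """Return I - Z (Z^T Z)^{-1} Z^T."""
    n = Z.shape[0]
    return np.eye(n) - Z @ np.linalg.inv(Z.T @ Z) @ Z.T

M, M1 = residual_maker(X), residual_maker(X1)

assert np.allclose(M @ M, M) and np.allclose(M1 @ M1, M1)   # both idempotent
assert np.allclose(M @ X1, 0.0)             # residuals of X1 regressed on X are zero
assert np.allclose(M @ M1, M)               # M M1 = M
D = M1 - M
assert np.allclose(D, D.T) and np.allclose(D @ D, D)        # symmetric, idempotent
assert np.allclose(D @ M, 0.0)              # M1 - M is orthogonal to M
```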

Any matrix similar to an idempotent matrix is itself idempotent; that is, idempotency is preserved under a change of basis. This can be shown by multiplying out the transformed matrix $SAS^{-1}$, with $A$ idempotent and $S$ invertible:

$$\left(SAS^{-1}\right)^2 = SAS^{-1}SAS^{-1} = SA^2S^{-1} = SAS^{-1}.$$
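
A short numerical illustration, with an arbitrary invertible change-of-basis matrix S:

```python
# Minimal sketch: similarity preserves idempotency.
import numpy as np

A = np.array([[3.0, -6.0],
              [1.0, -2.0]])                 # an idempotent example from above
S = np.array([[1.0, 2.0],
              [0.0, 1.0]])                  # any invertible matrix

B = S @ A @ np.linalg.inv(S)
assert np.allclose(B @ B, B)
```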

Applications

Idempotent matrices arise frequently in regression analysis and econometrics. For example, in ordinary least squares, the regression problem is to choose a vector $\beta$ of coefficient estimates so as to minimize the sum of squared residuals (mispredictions) $e_i$: in matrix form,

Minimize $(y - X\beta)^\mathsf{T}(y - X\beta),$

where $y$ is a vector of dependent variable observations, and $X$ is a matrix each of whose columns is a column of observations on one of the independent variables. The resulting estimator is

$$\hat{\beta} = (X^\mathsf{T}X)^{-1}X^\mathsf{T}y,$$

where superscript T indicates a transpose, and the vector of residuals is[2]

$$\hat{e} = y - X\hat{\beta} = \left[I - X(X^\mathsf{T}X)^{-1}X^\mathsf{T}\right]y = My.$$

Here both $M$ and $X(X^\mathsf{T}X)^{-1}X^\mathsf{T}$ (the latter being known as the hat matrix) are idempotent and symmetric matrices, a fact which allows simplification when the sum of squared residuals is computed:

$$\hat{e}^\mathsf{T}\hat{e} = (My)^\mathsf{T}(My) = y^\mathsf{T}M^\mathsf{T}My = y^\mathsf{T}MMy = y^\mathsf{T}My.$$

The idempotency of $M$ plays a role in other calculations as well, such as in determining the variance of the estimator $\hat{\beta}$.
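
The simplification above can be illustrated with simulated data; the sample size, design matrix, and coefficient values below are arbitrary choices for this sketch:

```python
# Minimal sketch: the hat matrix and the residual maker M are idempotent
# and symmetric, and e^T e = y^T M y.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))                        # simulated design matrix
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(50)

H = X @ np.linalg.inv(X.T @ X) @ X.T                    # hat matrix
M = np.eye(len(y)) - H                                  # residual maker

assert np.allclose(H @ H, H) and np.allclose(M @ M, M)  # both idempotent
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
e = y - X @ beta_hat                                    # residual vector

assert np.allclose(e, M @ y)                            # residuals equal M y
assert np.allclose(e @ e, y @ M @ y)                    # e^T e = y^T M y
```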

An idempotent linear operator $P$ is a projection operator on the range space $R(P)$ along its null space $N(P)$. $P$ is an orthogonal projection operator if and only if it is idempotent and symmetric.
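
For instance, a projector built from an arbitrary column space (the matrix Z below is simulated data, introduced only for this sketch) is symmetric and idempotent, and the residual of any vector is orthogonal to that column space:

```python
# Minimal sketch: a symmetric idempotent matrix projects orthogonally.
import numpy as np

rng = np.random.default_rng(2)
Z = rng.standard_normal((5, 2))
P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T        # projector onto the column space of Z

assert np.allclose(P @ P, P) and np.allclose(P, P.T)

x = rng.standard_normal(5)
assert np.allclose(Z.T @ (x - P @ x), 0.0)  # x - P x is orthogonal to range(P)
```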

References

  1. Chiang, Alpha C. (1984). Fundamental Methods of Mathematical Economics (3rd ed.). New York: McGraw–Hill. p. 80. ISBN 0070108137.
  2. Greene, William H. (2003). Econometric Analysis (5th ed.). Upper Saddle River, NJ: Prentice–Hall. pp. 808–809. ISBN 0130661899.
  3. Horn, Roger A.; Johnson, Charles R. (1990). Matrix Analysis. Cambridge University Press. p. 148. ISBN 0521386322.