Generalized inverse

Last updated September 26, 2024

In mathematics, and in particular, algebra, a generalized inverse (or, g-inverse) of an element x is an element y that has some properties of an inverse element but not necessarily all of them. The purpose of constructing a generalized inverse of a matrix is to obtain a matrix that can serve as an inverse in some sense for a wider class of matrices than invertible matrices. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix $A$ .

A matrix $A^{\mathrm {g} }\in \mathbb {R} ^{n\times m}$ is a generalized inverse of a matrix $A\in \mathbb {R} ^{m\times n}$ if $AA^{\mathrm {g} }A=A.$ ^[1]^[2]^[3] A generalized inverse exists for an arbitrary matrix, and when a matrix has a regular inverse, this inverse is its unique generalized inverse.^[1]

Motivation

Consider the linear system

Ax=y

where $A$ is an $m\times n$ matrix and $y\in {\mathcal {C}}(A),$ the column space of $A$ . If $m=n$ and $A$ is nonsingular then $x=A^{-1}y$ will be the solution of the system. Note that, if $A$ is nonsingular, then

AA^{-1}A=A.

Now suppose $A$ is rectangular ( $m\neq n$ ), or square and singular. Then we need a right candidate $G$ of order $n\times m$ such that for all $y\in {\mathcal {C}}(A),$

AGy=y.

^[4]

That is, $x=Gy$ is a solution of the linear system $Ax=y$ . Equivalently, we need a matrix $G$ of order $n\times m$ such that

AGA=A.

Hence we can define the generalized inverse as follows: Given an $m\times n$ matrix $A$ , an $n\times m$ matrix $G$ is said to be a generalized inverse of $A$ if $AGA=A.$ ‍^[1]^[2]^[3] The matrix $A^{-1}$ has been termed a regular inverse of $A$ by some authors.^[5]

Types

Important types of generalized inverse include:

One-sided inverse (right inverse or left inverse)
- Right inverse: If the matrix $A$ has dimensions $m\times n$ and ${\textrm {rank}}(A)=m$ , then there exists an $n\times m$ matrix $A_{\mathrm {R} }^{-1}$ called the right inverse of $A$ such that $AA_{\mathrm {R} }^{-1}=I_{m}$ , where $I_{m}$ is the $m\times m$ identity matrix.
- Left inverse: If the matrix $A$ has dimensions $m\times n$ and ${\textrm {rank}}(A)=n$ , then there exists an $n\times m$ matrix $A_{\mathrm {L} }^{-1}$ called the left inverse of $A$ such that $A_{\mathrm {L} }^{-1}A=I_{n}$ , where $I_{n}$ is the $n\times n$ identity matrix.^[6]
Bott–Duffin inverse
Drazin inverse
Moore–Penrose inverse

Some generalized inverses are defined and classified based on the Penrose conditions:

$AA^{\mathrm {g} }A=A$
$A^{\mathrm {g} }AA^{\mathrm {g} }=A^{\mathrm {g} }$
$(AA^{\mathrm {g} })^{*}=AA^{\mathrm {g} }$
$(A^{\mathrm {g} }A)^{*}=A^{\mathrm {g} }A,$

where ${}^{*}$ denotes conjugate transpose. If $A^{\mathrm {g} }$ satisfies the first condition, then it is a generalized inverse of $A$ . If it satisfies the first two conditions, then it is a reflexive generalized inverse of $A$ . If it satisfies all four conditions, then it is the pseudoinverse of $A$ , which is denoted by $A^{+}$ and also known as the Moore–Penrose inverse, after the pioneering works by E. H. Moore and Roger Penrose.^[2]^[7]^[8]^[9]^[10]^[11] It is convenient to define an $I$ -inverse of $A$ as an inverse that satisfies the subset $I\subset \{1,2,3,4\}$ of the Penrose conditions listed above. Relations, such as $A^{(1,4)}AA^{(1,3)}=A^{+}$ , can be established between these different classes of $I$ -inverses.^[1]

When $A$ is non-singular, any generalized inverse $A^{\mathrm {g} }=A^{-1}$ and is therefore unique. For a singular $A$ , some generalised inverses, such as the Drazin inverse and the Moore–Penrose inverse, are unique, while others are not necessarily uniquely defined.

Examples

Reflexive generalized inverse

Let

A={\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}},\quad G={\begin{bmatrix}-{\frac {5}{3}}&{\frac {2}{3}}&0\\[4pt]{\frac {4}{3}}&-{\frac {1}{3}}&0\\[4pt]0&0&0\end{bmatrix}}.

Since $\det(A)=0$ , $A$ is singular and has no regular inverse. However, $A$ and $G$ satisfy Penrose conditions (1) and (2), but not (3) or (4). Hence, $G$ is a reflexive generalized inverse of $A$ .

One-sided inverse

Let

A={\begin{bmatrix}1&2&3\\4&5&6\end{bmatrix}},\quad A_{\mathrm {R} }^{-1}={\begin{bmatrix}-{\frac {17}{18}}&{\frac {8}{18}}\\[4pt]-{\frac {2}{18}}&{\frac {2}{18}}\\[4pt]{\frac {13}{18}}&-{\frac {4}{18}}\end{bmatrix}}.

Since $A$ is not square, $A$ has no regular inverse. However, $A_{\mathrm {R} }^{-1}$ is a right inverse of $A$ . The matrix $A$ has no left inverse.

Inverse of other semigroups (or rings)

The element b is a generalized inverse of an element a if and only if $a\cdot b\cdot a=a$ , in any semigroup (or ring, since the multiplication function in any ring is a semigroup).

The generalized inverses of the element 3 in the ring $\mathbb {Z} /12\mathbb {Z}$ are 3, 7, and 11, since in the ring $\mathbb {Z} /12\mathbb {Z}$ :

3\cdot 3\cdot 3=3

3\cdot 7\cdot 3=3

3\cdot 11\cdot 3=3

The generalized inverses of the element 4 in the ring $\mathbb {Z} /12\mathbb {Z}$ are 1, 4, 7, and 10, since in the ring $\mathbb {Z} /12\mathbb {Z}$ :

4\cdot 1\cdot 4=4

4\cdot 4\cdot 4=4

4\cdot 7\cdot 4=4

4\cdot 10\cdot 4=4

If an element a in a semigroup (or ring) has an inverse, the inverse must be the only generalized inverse of this element, like the elements 1, 5, 7, and 11 in the ring $\mathbb {Z} /12\mathbb {Z}$ .

In the ring $\mathbb {Z} /12\mathbb {Z}$ , any element is a generalized inverse of 0, however, 2 has no generalized inverse, since there is no b in $\mathbb {Z} /12\mathbb {Z}$ such that $2\cdot b\cdot 2=2$ .

Construction

The following characterizations are easy to verify:

A right inverse of a non-square matrix $A$ is given by $A_{\mathrm {R} }^{-1}=A^{\intercal }\left(AA^{\intercal }\right)^{-1}$ , provided $A$ has full row rank.^[6]
A left inverse of a non-square matrix $A$ is given by $A_{\mathrm {L} }^{-1}=\left(A^{\intercal }A\right)^{-1}A^{\intercal }$ , provided $A$ has full column rank.^[6]
If $A=BC$ is a rank factorization, then $G=C_{\mathrm {R} }^{-1}B_{\mathrm {L} }^{-1}$ is a g-inverse of $A$ , where $C_{\mathrm {R} }^{-1}$ is a right inverse of $C$ and $B_{\mathrm {L} }^{-1}$ is left inverse of $B$ .
If $A=P{\begin{bmatrix}I_{r}&0\\0&0\end{bmatrix}}Q$ for any non-singular matrices $P$ and $Q$ , then $G=Q^{-1}{\begin{bmatrix}I_{r}&U\\W&V\end{bmatrix}}P^{-1}$ is a generalized inverse of $A$ for arbitrary $U,V$ and $W$ .
Let $A$ be of rank $r$ . Without loss of generality, let $A={\begin{bmatrix}B&C\\D&E\end{bmatrix}},$ where $B_{r\times r}$ is the non-singular submatrix of $A$ . Then, $G={\begin{bmatrix}B^{-1}&0\\0&0\end{bmatrix}}$ is a generalized inverse of $A$ if and only if $E=DB^{-1}C$ .

Uses

Any generalized inverse can be used to determine whether a system of linear equations has any solutions, and if so to give all of them. If any solutions exist for the n × m linear system

Ax=b

,

with vector $x$ of unknowns and vector $b$ of constants, all solutions are given by

x=A^{\mathrm {g} }b+\left[I-A^{\mathrm {g} }A\right]w

,

parametric on the arbitrary vector $w$ , where $A^{\mathrm {g} }$ is any generalized inverse of $A$ . Solutions exist if and only if $A^{\mathrm {g} }b$ is a solution, that is, if and only if $AA^{\mathrm {g} }b=b$ . If A has full column rank, the bracketed expression in this equation is the zero matrix and so the solution is unique.^[12]

Generalized inverses of matrices

The generalized inverses of matrices can be characterized as follows. Let $A\in \mathbb {R} ^{m\times n}$ , and

$A=U{\begin{bmatrix}\Sigma _{1}&0\\0&0\end{bmatrix}}V^{\operatorname {T} }$

be its singular-value decomposition. Then for any generalized inverse $A^{g}$ , there exist^[1] matrices $X$ , $Y$ , and $Z$ such that

$A^{g}=V{\begin{bmatrix}\Sigma _{1}^{-1}&X\\Y&Z\end{bmatrix}}U^{\operatorname {T} }.$

Conversely, any choice of $X$ , $Y$ , and $Z$ for matrix of this form is a generalized inverse of $A$ .^[1] The $\{1,2\}$ -inverses are exactly those for which $Z=Y\Sigma _{1}X$ , the $\{1,3\}$ -inverses are exactly those for which $X=0$ , and the $\{1,4\}$ -inverses are exactly those for which $Y=0$ . In particular, the pseudoinverse is given by $X=Y=Z=0$ :

$A^{+}=V{\begin{bmatrix}\Sigma _{1}^{-1}&0\\0&0\end{bmatrix}}U^{\operatorname {T} }.$

Transformation consistency properties

In practical applications it is necessary to identify the class of matrix transformations that must be preserved by a generalized inverse. For example, the Moore–Penrose inverse, $A^{+},$ satisfies the following definition of consistency with respect to transformations involving unitary matrices U and V:

(UAV)^{+}=V^{*}A^{+}U^{*}

.

The Drazin inverse, $A^{\mathrm {D} }$ satisfies the following definition of consistency with respect to similarity transformations involving a nonsingular matrix S:

\left(SAS^{-1}\right)^{\mathrm {D} }=SA^{\mathrm {D} }S^{-1}

.

The unit-consistent (UC) inverse,^[13] $A^{\mathrm {U} },$ satisfies the following definition of consistency with respect to transformations involving nonsingular diagonal matrices D and E:

(DAE)^{\mathrm {U} }=E^{-1}A^{\mathrm {U} }D^{-1}

.

The fact that the Moore–Penrose inverse provides consistency with respect to rotations (which are orthonormal transformations) explains its widespread use in physics and other applications in which Euclidean distances must be preserved. The UC inverse, by contrast, is applicable when system behavior is expected to be invariant with respect to the choice of units on different state variables, e.g., miles versus kilometers.

Citations

1 2 3 4 5 6 Ben-Israel & Greville 2003 , pp. 2, 7
1 2 3 Nakamura 1991 , pp. 41–42
1 2 Rao & Mitra 1971 , pp. vii, 20
↑ Rao & Mitra 1971 , p. 24
↑ Rao & Mitra 1971 , pp. 19–20
1 2 3 Rao & Mitra 1971 , p. 19
↑ Rao & Mitra 1971 , pp. 20, 28, 50–51
↑ Ben-Israel & Greville 2003 , p. 7
↑ Campbell & Meyer 1991 , p. 10
↑ James 1978 , p. 114
↑ Nakamura 1991 , p. 42
↑ James 1978 , pp. 109–110
↑ Uhlmann 2018

Sources

Textbook

Ben-Israel, Adi; Greville, Thomas Nall Eden (2003). Generalized Inverses: Theory and Applications (2nd ed.). New York, NY: Springer. doi:10.1007/b97366. ISBN 978-0-387-00293-4.
Campbell, Stephen L.; Meyer, Carl D. (1991). Generalized Inverses of Linear Transformations . Dover. ISBN 978-0-486-66693-8.
Horn, Roger Alan; Johnson, Charles Royal (1985). Matrix Analysis. Cambridge University Press. ISBN 978-0-521-38632-6.
Nakamura, Yoshihiko (1991). Advanced Robotics: Redundancy and Optimization. Addison-Wesley. ISBN 978-0201151985.
Rao, C. Radhakrishna; Mitra, Sujit Kumar (1971). Generalized Inverse of Matrices and its Applications . New York: John Wiley & Sons. pp. 240. ISBN 978-0-471-70821-6.

Publication

James, M. (June 1978). "The generalised inverse". The Mathematical Gazette . 62 (420): 109–114. doi:10.2307/3617665. JSTOR 3617665.
Uhlmann, Jeffrey K. (2018). "A Generalized Matrix Inverse that is Consistent with Respect to Diagonal Transformations" (PDF). SIAM Journal on Matrix Analysis and Applications . 239 (2): 781–800. doi:10.1137/17M113890X.
Zheng, Bing; Bapat, Ravindra (2004). "Generalized inverse A(2)T,S and a rank equation". Applied Mathematics and Computation. 155 (2): 407–415. doi:10.1016/S0096-3003(03)00786-0.

Related Research Articles

In mathematical physics and mathematics, the Pauli matrices are a set of three $2 \times 2$ complex matrices that are traceless, Hermitian, involutory and unitary. Usually indicated by the Greek letter sigma, they are occasionally denoted by tau when used in connection with isospin symmetries.

In mathematics, a symmetric matrix $with real entries is positive-definite if the real number is positive for every nonzero real column vector where is the row vector transpose of More generally, a Hermitian matrix is positive-definite if the real number is positive for every nonzero complex column vector where denotes the conjugate transpose of$

In mathematics, the concept of an inverse element generalises the concepts of opposite and reciprocal of numbers.

In linear algebra, an orthogonal matrix, or orthonormal matrix, is a real square matrix whose columns and rows are orthonormal vectors.

In mechanics and geometry, the 3D rotation group, often denoted SO(3), is the group of all rotations about the origin of three-dimensional Euclidean space $under the operation of composition.$

In linear algebra, an $n$ -by- $n$ square matrix $A$ is called invertible if there exists an $n$ -by- $n$ square matrix $B$ such that $where I n denotes the n -by- n identity matrix and the multiplication used is ordinary matrix multiplication. If this is the case, then the matrix B is uniquely determined by A, and is called the (multiplicative) inverse of A, denoted by A -1 . Matrix inversion is the process of finding the matrix which when multiplied by the original matrix gives the identity matrix.$

In mathematics, the conjugate transpose, also known as the Hermitian transpose, of an $complex matrix is an matrix obtained by transposing and applying complex conjugation to each entry. There are several notations, such as or,, or .$

In linear algebra, a square matrix $is called diagonalizable or non-defective if it is similar to a diagonal matrix. That is, if there exists an invertible matrix and a diagonal matrix such that . This is equivalent to . This property exists for any linear map: for a finite-dimensional vector space, a linear map is called diagonalizable if there exists an ordered basis of consisting of eigenvectors of . These definitions are equivalent: if has a matrix representation as above, then the column vectors of form a basis consisting of eigenvectors of, and the diagonal entries of are the corresponding eigenvalues of; with respect to this eigenvector basis, is represented by .$

In mathematics, and in particular linear algebra, the Moore–Penrose inverse⁠ $⁠$ of a matrix ⁠ $⁠$ , often called the pseudoinverse, is the most widely known generalization of the inverse matrix. It was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951, and Roger Penrose in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. The terms pseudoinverse and generalized inverse are sometimes used as synonyms for the Moore–Penrose inverse of a matrix, but sometimes applied to other elements of algebraic structures which share some but not all properties expected for an inverse element.

The Schur complement of a block matrix, encountered in linear algebra and the theory of matrices, is defined as follows.

In multivariable calculus, the implicit function theorem is a tool that allows relations to be converted to functions of several real variables. It does so by representing the relation as the graph of a function. There may not be a single function whose graph can represent the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.

In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. It is used to solve systems of linear differential equations. In the theory of Lie groups, the matrix exponential gives the exponential map between a matrix Lie algebra and the corresponding Lie group.

In linear algebra, linear transformations can be represented by matrices. If $is a linear transformation mapping to and is a column vector with entries, then for some matrix, called the transformation matrix of . Note that has rows and columns, whereas the transformation is from to . There are alternative expressions of transformation matrices involving row vectors that are preferred by some authors.$

<span class="mw-page-title-main">Pushforward (differential)</span> Linear approximation of smooth maps on tangent spaces

In differential geometry, pushforward is a linear approximation of smooth maps on tangent spaces. Suppose that $is a smooth map between smooth manifolds; then the differential of at a point, denoted, is, in some sense, the best linear approximation of near . It can be viewed as a generalization of the total derivative of ordinary calculus. Explicitly, the differential is a linear map from the tangent space of at to the tangent space of at, . Hence it can be used to push tangent vectors on forward to tangent vectors on . The differential of a map is also called, by various authors, the derivative or total derivative of .$

In mathematics, a logarithm of a matrix is another matrix such that the matrix exponential of the latter matrix equals the original matrix. It is thus a generalization of the scalar logarithm and in some sense an inverse function of the matrix exponential. Not all matrices have a logarithm and those matrices that do have a logarithm may have more than one logarithm. The study of logarithms of matrices leads to Lie theory since when a matrix has a logarithm then it is in an element of a Lie group and the logarithm is the corresponding element of the vector space of the Lie algebra.

In mathematics and physics, in particular quantum information, the term generalized Pauli matrices refers to families of matrices which generalize the properties of the Pauli matrices. Here, a few classes of such matrices are summarized.

In mathematics, every analytic function can be used for defining a matrix function that maps square matrices with complex entries to square matrices of the same size.

In linear algebra, eigendecomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way. When the matrix being factorized is a normal or real symmetric matrix, the decomposition is called "spectral decomposition", derived from the spectral theorem.

Stokes' theorem, also known as the Kelvin–Stokes theorem after Lord Kelvin and George Stokes, the fundamental theorem for curls or simply the curl theorem, is a theorem in vector calculus on $. Given a vector field, the theorem relates the integral of the curl of the vector field over some surface, to the line integral of the vector field around the boundary of the surface. The classical theorem of Stokes can be stated in one sentence:$

Generalized pencil-of-function method (GPOF), also known as matrix pencil method, is a signal processing technique for estimating a signal or extracting information with complex exponentials. Being similar to Prony and original pencil-of-function methods, it is generally preferred to those for its robustness and computational efficiency.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[:0-1] 1 2 3 4 5 6 Ben-Israel & Greville 2003 , pp. 2, 7

[:1-2] 1 2 3 Nakamura 1991 , pp. 41–42

[:2-3] 1 2 Rao & Mitra 1971 , pp. vii, 20

[4] Rao & Mitra 1971 , p. 24

[5] Rao & Mitra 1971 , pp. 19–20

[:4-6] 1 2 3 Rao & Mitra 1971 , p. 19

[7] Rao & Mitra 1971 , pp. 20, 28, 50–51

[:3-8] Ben-Israel & Greville 2003 , p. 7

[9] Campbell & Meyer 1991 , p. 10

[10] James 1978 , p. 114

[11] Nakamura 1991 , p. 42

[12] James 1978 , pp. 109–110

[13] Uhlmann 2018

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]