Schur decomposition

In the mathematical discipline of linear algebra, the Schur decomposition or Schur triangulation, named after Issai Schur, is a matrix decomposition. It allows one to write an arbitrary complex square matrix as unitarily equivalent to an upper triangular matrix whose diagonal elements are the eigenvalues of the original matrix.

Statement

The Schur decomposition reads as follows: if A is an n × n square matrix with complex entries, then A can be expressed as [1] [2] [3]

A = QUQ−1

for some unitary matrix Q (so that the inverse Q−1 is also the conjugate transpose Q* of Q), and some upper triangular matrix U. This is called a Schur form of A. Since U is similar to A, it has the same spectrum, and since it is triangular, its eigenvalues are the diagonal entries of U.
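
The factorization can be checked numerically. Below is a minimal sketch in Python (NumPy and SciPy assumed); scipy.linalg.schur returns the pair (U, Q) of the complex Schur form:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(42)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# Complex Schur form: A = Q U Q*, with Q unitary and U upper triangular.
U, Q = schur(A, output='complex')

assert np.allclose(A, Q @ U @ Q.conj().T)       # A = Q U Q*
assert np.allclose(Q @ Q.conj().T, np.eye(5))   # Q is unitary
assert np.allclose(np.tril(U, -1), 0)           # U is upper triangular
# The diagonal of U holds the eigenvalues of A (in some order);
# lexicographic sorting is a safe comparison for this generic example.
assert np.allclose(np.sort(np.diag(U)), np.sort(np.linalg.eigvals(A)))
```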

The Schur decomposition implies that there exists a nested sequence of A-invariant subspaces {0} = V0 ⊂ V1 ⊂ ⋯ ⊂ Vn = Cn, and that there exists an ordered orthonormal basis (for the standard Hermitian form of Cn) such that the first i basis vectors span Vi for each i occurring in the nested sequence. Phrased somewhat differently, the first part says that a linear operator J on a complex finite-dimensional vector space stabilizes a complete flag (V1, ..., Vn).
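
The flag can be read off a computed Schur form: since U is upper triangular, A Q[:, :k] = Q[:, :k] U[:k, :k], so the span of the first k columns of Q is A-invariant for every k. A short check (Python, SciPy assumed, as above):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, Q = schur(A, output='complex')

# The first k Schur vectors span the A-invariant subspace V_k of the flag.
for k in range(1, 5):
    Vk = Q[:, :k]
    assert np.allclose(A @ Vk, Vk @ U[:k, :k])
```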

Proof

A constructive proof for the Schur decomposition is as follows: every operator A on a complex finite-dimensional vector space has an eigenvalue λ, corresponding to some eigenspace Vλ. Let Vλ⊥ denote its orthogonal complement. It is clear that, with respect to this orthogonal decomposition Cn = Vλ ⊕ Vλ⊥, A has the matrix representation (one can pick here any orthonormal bases Z1 and Z2 spanning Vλ and Vλ⊥ respectively)

A = [ λIλ   A12 ]
    [ 0     A22 ]

where Iλ is the identity operator on Vλ. The above matrix would be upper triangular except for the A22 block. But exactly the same procedure can be applied to the submatrix A22, viewed as an operator on Vλ⊥, and its submatrices. Continue this way until the resulting matrix is upper triangular. Since each conjugation increases the dimension of the upper-triangular block by at least one, this process takes at most n steps, after which the space Cn is exhausted and the procedure has yielded the desired result. [4]
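
The deflation argument translates almost line for line into code. The following is a didactic sketch (Python, NumPy assumed) that mirrors the proof: find one eigenvector, complete it to an orthonormal basis, conjugate, and recurse on the trailing block. It is not numerically robust; in practice one uses a library routine such as scipy.linalg.schur.

```python
import numpy as np

def schur_by_deflation(A, rng=np.random.default_rng(0)):
    """Return (Q, U) with A = Q U Q*, Q unitary, U upper triangular,
    built by the deflation argument of the proof above (didactic only)."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A.copy()
    # Pick any eigenvector v of A (an eigenvalue always exists over C).
    _, V = np.linalg.eig(A)
    v = V[:, 0]
    # Complete v to an orthonormal basis: QR against random columns keeps
    # the first basis vector parallel to v (full rank almost surely).
    Z, _ = np.linalg.qr(np.column_stack([v, rng.standard_normal((n, n - 1))]))
    # In this basis A is block upper triangular: the first column of B
    # is (lambda, 0, ..., 0)^T, and B[1:, 1:] plays the role of A22.
    B = Z.conj().T @ A @ Z
    Q2, _ = schur_by_deflation(B[1:, 1:], rng)
    Q = Z @ np.block([[np.eye(1), np.zeros((1, n - 1))],
                      [np.zeros((n - 1, 1)), Q2]])
    U = np.triu(Q.conj().T @ A @ Q)  # triu drops rounding noise below the diagonal
    return Q, U

# Quick check on a small complex matrix.
A = np.array([[2, 1j], [0.5, 3]], dtype=complex)
Q, U = schur_by_deflation(A)
assert np.allclose(A, Q @ U @ Q.conj().T)
```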

The above argument can be slightly restated as follows: let λ be an eigenvalue of A, corresponding to some eigenspace Vλ. A induces an operator T on the quotient space Cn/Vλ. This operator is precisely the A22 submatrix from above. As before, T has an eigenspace, say Wμ ⊂ Cn/Vλ. Notice the preimage of Wμ under the quotient map is an invariant subspace of A that contains Vλ. Continue this way until the resulting quotient space has dimension 0. Then the successive preimages of the eigenspaces found at each step form a flag that A stabilizes.

Notes

Although every square matrix has a Schur decomposition, in general this decomposition is not unique. For example, the eigenspace Vλ can have dimension > 1, in which case any orthonormal basis for Vλ would lead to the desired result.

Write the triangular matrix U as U = D + N, where D is diagonal and N is strictly upper triangular (and thus a nilpotent matrix). The diagonal matrix D contains the eigenvalues of A in arbitrary order (hence its Frobenius norm, squared, is the sum of the squared moduli of the eigenvalues of A, while the Frobenius norm of A, squared, is the sum of the squared singular values of A). The nilpotent part N is generally not unique either, but its Frobenius norm is uniquely determined by A (just because the Frobenius norm of A is equal to the Frobenius norm of U = D + N). [5]
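
For illustration, the splitting U = D + N and the norm identities can be checked numerically (Python, NumPy/SciPy assumed):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
U, Q = schur(A, output='complex')

D = np.diag(np.diag(U))   # diagonal part: the eigenvalues of A
N = np.triu(U, 1)         # strictly upper triangular, nilpotent part

fro2 = lambda M: np.linalg.norm(M, 'fro') ** 2
# ||A||_F^2 = ||U||_F^2 = ||D||_F^2 + ||N||_F^2, so ||N||_F is determined by A.
assert np.isclose(fro2(A), fro2(D) + fro2(N))
# ||D||_F^2 is the sum of squared moduli of the eigenvalues of A,
# and ||A||_F^2 is the sum of its squared singular values.
assert np.isclose(fro2(D), np.sum(np.abs(np.linalg.eigvals(A)) ** 2))
assert np.isclose(fro2(A), np.sum(np.linalg.svd(A, compute_uv=False) ** 2))
```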

It is clear that if A is a normal matrix, then U from its Schur decomposition must be a diagonal matrix and the column vectors of Q are the eigenvectors of A. Therefore, the Schur decomposition extends the spectral decomposition. In particular, if A is positive definite, the Schur decomposition of A, its spectral decomposition, and its singular value decomposition coincide.
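
A quick numerical check of the normal case, using a Hermitian matrix built for the purpose (Python, SciPy assumed):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.conj().T   # Hermitian, hence normal

U, Q = schur(A, output='complex')
# For a normal matrix the Schur form is diagonal, so the columns of Q
# are eigenvectors and the decomposition is the spectral decomposition.
assert np.allclose(U, np.diag(np.diag(U)), atol=1e-10)
```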

A commuting family {Ai} of matrices can be simultaneously triangularized, i.e. there exists a unitary matrix Q such that, for every Ai in the given family, Q Ai Q* is upper triangular. This can be readily deduced from the above proof. Take an element A from {Ai} and again consider an eigenspace VA. Then VA is invariant under all matrices in {Ai}. Therefore, all matrices in {Ai} must share one common eigenvector in VA. Induction then proves the claim. As a corollary, we have that every commuting family of normal matrices can be simultaneously diagonalized.
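
Simultaneous triangularization can also be observed numerically. The sketch below (Python, NumPy/SciPy assumed) builds a commuting pair as polynomials in one matrix and triangularizes both with the Schur vectors of a generic linear combination: such a combination generically has distinct eigenvalues, so its nested invariant subspaces are invariant for the whole family. This is an illustrative shortcut, not the inductive construction of the proof.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M @ M + 2 * M        # polynomials in the same matrix
B = 3 * M - np.eye(4)    # always commute with each other

# Schur vectors of a generic combination triangularize the whole family.
C = A + 0.618 * B
_, Q = schur(C, output='complex')

for X in (A, B):
    T = Q.conj().T @ X @ Q
    assert np.allclose(np.tril(T, -1), 0, atol=1e-8)
```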

In the infinite dimensional setting, not every bounded operator on a Banach space has an invariant subspace. However, the upper-triangularization of an arbitrary square matrix does generalize to compact operators. Every compact operator on a complex Banach space has a nest of closed invariant subspaces.

Computation

The Schur decomposition of a given matrix is numerically computed by the QR algorithm or its variants. In other words, the roots of the characteristic polynomial corresponding to the matrix are not necessarily computed in advance in order to obtain its Schur decomposition. Conversely, the QR algorithm can be used to compute the roots of any given characteristic polynomial by finding the Schur decomposition of its companion matrix. Similarly, the QR algorithm is used to compute the eigenvalues of any given matrix, which are the diagonal entries of the upper triangular matrix of the Schur decomposition. Although the QR algorithm is formally an infinite sequence of operations, convergence to machine precision is practically achieved in O(n³) operations. [6] See the Nonsymmetric Eigenproblems section in LAPACK Users' Guide. [7]
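
As a small illustration of the companion-matrix remark, the sketch below (Python, NumPy/SciPy assumed) reads the roots of p(x) = x³ − 6x² + 11x − 6 off the diagonal of the Schur form of its companion matrix; numpy's np.roots rests on the same idea.

```python
import numpy as np
from scipy.linalg import schur

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), monic coefficients:
coeffs = [1, -6, 11, -6]
n = len(coeffs) - 1

# Companion matrix: ones on the subdiagonal, negated coefficients
# (reversed, divided by the leading one) in the last column.
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -np.array(coeffs[:0:-1], dtype=float) / coeffs[0]

U, _ = schur(C, output='complex')
roots = np.sort(np.diag(U).real)   # the roots are real here; drop rounding noise
assert np.allclose(roots, [1.0, 2.0, 3.0])
```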

Applications

Lie theory applications include:

- Every invertible operator is contained in a Borel group.
- Every operator fixes a point of the flag manifold.

Generalized Schur decomposition

Given square matrices A and B, the generalized Schur decomposition factorizes both matrices as A = QSZ* and B = QTZ*, where Q and Z are unitary, and S and T are upper triangular. The generalized Schur decomposition is also sometimes called the QZ decomposition. [2] :375 [8]

The generalized eigenvalues λ that solve the generalized eigenvalue problem Ax = λBx (where x is an unknown nonzero vector) can be calculated as the ratio of the diagonal elements of S to those of T. That is, using subscripts to denote matrix elements, the ith generalized eigenvalue λi satisfies λi = Sii / Tii.
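
A minimal sketch of the generalized case (Python, SciPy assumed): scipy.linalg.qz returns the pair (S, T) together with the unitaries Q and Z, and the diagonal ratios reproduce the generalized eigenvalues from scipy.linalg.eigvals.

```python
import numpy as np
from scipy.linalg import qz, eigvals

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Generalized (QZ) Schur form: A = Q S Z*, B = Q T Z*.
S, T, Q, Z = qz(A, B, output='complex')
assert np.allclose(A, Q @ S @ Z.conj().T)
assert np.allclose(B, Q @ T @ Z.conj().T)

# The i-th generalized eigenvalue is S_ii / T_ii (B is generically nonsingular here).
lam = np.diag(S) / np.diag(T)
assert np.allclose(np.sort_complex(lam), np.sort_complex(eigvals(A, B)))
```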

References

  1. Horn, R. A. & Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press. ISBN 0-521-38632-2. (Section 2.3 and further at p. 79)
  2. Golub, G. H. & Van Loan, C. F. (1996). Matrix Computations (3rd ed.). Johns Hopkins University Press. ISBN 0-8018-5414-8. (Section 7.7 at p. 313)
  3. Schott, James R. (2016). Matrix Analysis for Statistics (3rd ed.). New York: John Wiley & Sons. pp. 175–178. ISBN 978-1-119-09247-6.
  4. Wagner, David. "Proof of Schur's Theorem" (PDF). Notes on Linear Algebra.
  5. Higham, Nick. "What Is a Schur Decomposition?".
  6. Trefethen, Lloyd N.; Bau, David (1997). Numerical Linear Algebra. Philadelphia: Society for Industrial and Applied Mathematics. pp. 193–194. ISBN 0-89871-361-7. OCLC 36084666.
  7. Anderson, E.; Bai, Z.; Bischof, C.; Blackford, S.; Demmel, J.; Dongarra, J.; Du Croz, J.; Greenbaum, A.; Hammarling, S.; McKenney, A.; Sorensen, D. (1995). LAPACK Users' Guide. Philadelphia, PA: Society for Industrial and Applied Mathematics. ISBN 0-89871-447-8.
  8. Kressner, Daniel (2005). Numerical Methods for General and Structured Eigenvalue Problems. Lecture Notes in Computational Science and Engineering 46. Springer. (Chapter 2)