Matrix sign function

Last updated

In mathematics, the matrix sign function is a matrix function on square matrices analogous to the complex sign function. [1]

Contents

It was introduced by J.D. Roberts in 1971 as a tool for model reduction and for solving Lyapunov and Algebraic Riccati equation in a technical report of Cambridge University, which was later published in a journal in 1980. [2] [3]

Definition

The matrix sign function is a generalization of the complex signum function

to the matrix valued analogue . Although the sign function is not analytic, the matrix function is well defined for all matrices that have no eigenvalue on the imaginary axis, see for example the Jordan-form-based definition (where the derivatives are all zero).

Properties

Theorem: Let , then . [1]

Theorem: Let , then is diagonalizable and has eigenvalues that are . [1]

Theorem: Let , then is a projector onto the invariant subspace associated with the eigenvalues in the right-half plane, and analogously for and the left-half plane. [1]

Theorem: Let , and be a Jordan decomposition such that corresponds to eigenvalues with positive real part and to eigenvalue with negative real part. Then , where and are identity matrices of sizes corresponding to and , respectively. [1]

Computational methods

The function can be computed with generic methods for matrix functions, but there are also specialized methods.

Newton iteration

The Newton iteration can be derived by observing that , which in terms of matrices can be written as , where we use the matrix square root. If we apply the Babylonian method to compute the square root of the matrix , that is, the iteration , and define the new iterate , we arrive at the iteration

,

where typically . Convergence is global, and locally it is quadratic. [1] [2]

The Newton iteration uses the explicit inverse of the iterates .

Newton–Schulz iteration

To avoid the need of an explicit inverse used in the Newton iteration, the inverse can be approximated with one step of the Newton iteration for the inverse, , derived by Schulz(de) in 1933. [4] Substituting this approximation into the previous method, the new method becomes

.

Convergence is (still) quadratic, but only local (guaranteed for ). [1]

Applications

Solutions of Sylvester equations

Theorem: [2] [3] Let and assume that and are stable, then the unique solution to the Sylvester equation, , is given by such that

Proof sketch: The result follows from the similarity transform

since

due to the stability of and .

The theorem is, naturally, also applicable to the Lyapunov equation. However, due to the structure the Newton iteration simplifies to only involving inverses of and .

Solutions of algebraic Riccati equations

There is a similar result applicable to the algebraic Riccati equation, . [1] [2] Define as

Under the assumption that are Hermitian and there exists a unique stabilizing solution, in the sense that is stable, that solution is given by the over-determined, but consistent, linear system

Proof sketch: The similarity transform

and the stability of implies that

for some matrix .

Computations of matrix square-root

The Denman–Beavers iteration for the square root of a matrix can be derived from the Newton iteration for the matrix sign function by noticing that is a degenerate algebraic Riccati equation [3] and by definition a solution is the square root of .

Related Research Articles

<span class="mw-page-title-main">Pauli matrices</span> Matrices important in quantum mechanics and the study of spin

In mathematical physics and mathematics, the Pauli matrices are a set of three 2 × 2 complex matrices which are Hermitian, involutory and unitary. Usually indicated by the Greek letter sigma, they are occasionally denoted by tau when used in connection with isospin symmetries.

In mathematics, a symmetric matrix with real entries is positive-definite if the real number is positive for every nonzero real column vector where is the transpose of . More generally, a Hermitian matrix is positive-definite if the real number is positive for every nonzero complex column vector where denotes the conjugate transpose of

Ray transfer matrix analysis is a mathematical form for performing ray tracing calculations in sufficiently simple problems which can be solved considering only paraxial rays. Each optical element is described by a 2×2 ray transfer matrix which operates on a vector describing an incoming light ray to calculate the outgoing ray. Multiplication of the successive matrices thus yields a concise ray transfer matrix describing the entire optical system. The same mathematics is also used in accelerator physics to track particles through the magnet installations of a particle accelerator, see electron optics.

<span class="mw-page-title-main">Orthogonal group</span> Type of group in mathematics

In mathematics, the orthogonal group in dimension , denoted , is the group of distance-preserving transformations of a Euclidean space of dimension that preserve a fixed point, where the group operation is given by composing transformations. The orthogonal group is sometimes called the general orthogonal group, by analogy with the general linear group. Equivalently, it is the group of orthogonal matrices, where the group operation is given by matrix multiplication. The orthogonal group is an algebraic group and a Lie group. It is compact.

<span class="mw-page-title-main">Special unitary group</span> Group of unitary matrices with determinant of 1

In mathematics, the special unitary group of degree n, denoted SU(n), is the Lie group of n × n unitary matrices with determinant 1.

<span class="mw-page-title-main">Covariance matrix</span> Measure of covariance of components of a random vector

In probability theory and statistics, a covariance matrix is a square matrix giving the covariance between each pair of elements of a given random vector.

In linear algebra, a square matrix  is called diagonalizable or non-defective if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix  and a diagonal matrix such that , or equivalently . For a finite-dimensional vector space , a linear map  is called diagonalizable if there exists an ordered basis of  consisting of eigenvectors of . These definitions are equivalent: if  has a matrix representation as above, then the column vectors of  form a basis consisting of eigenvectors of , and the diagonal entries of  are the corresponding eigenvalues of ; with respect to this eigenvector basis,  is represented by .Diagonalization is the process of finding the above  and .

<span class="mw-page-title-main">Jordan normal form</span> Form of a matrix indicating its eigenvalues and their algebraic multiplicities

In linear algebra, a Jordan normal form, also known as a Jordan canonical form (JCF), is an upper triangular matrix of a particular form called a Jordan matrix representing a linear operator on a finite-dimensional vector space with respect to some basis. Such a matrix has each non-zero off-diagonal entry equal to 1, immediately above the main diagonal, and with identical diagonal entries to the left and below them.

In mathematics, and in particular linear algebra, the Moore–Penrose inverse of a matrix is the most widely known generalization of the inverse matrix. It was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951, and Roger Penrose in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. When referring to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose inverse. The term generalized inverse is sometimes used as a synonym for pseudoinverse.

<span class="mw-page-title-main">Sign function</span> Mathematical function returning -1, 0 or 1

In mathematics, the sign function or signum function is a function that returns the sign of a real number. In mathematical notation the sign function is often represented as .

In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. It is used to solve systems of linear differential equations. In the theory of Lie groups, the matrix exponential gives the exponential map between a matrix Lie algebra and the corresponding Lie group.

In linear algebra, a rotation matrix is a transformation matrix that is used to perform a rotation in Euclidean space. For example, using the convention below, the matrix

In linear algebra, a circulant matrix is a square matrix in which all row vectors are composed of the same elements and each row vector is rotated one element to the right relative to the preceding row vector. It is a particular kind of Toeplitz matrix.

In mathematics, a logarithm of a matrix is another matrix such that the matrix exponential of the latter matrix equals the original matrix. It is thus a generalization of the scalar logarithm and in some sense an inverse function of the matrix exponential. Not all matrices have a logarithm and those matrices that do have a logarithm may have more than one logarithm. The study of logarithms of matrices leads to Lie theory since when a matrix has a logarithm then it is in an element of a Lie group and the logarithm is the corresponding element of the vector space of the Lie algebra.

In the mathematical discipline of matrix theory, a Jordan matrix, named after Camille Jordan, is a block diagonal matrix over a ring R, where each block along the diagonal, called a Jordan block, has the following form:

In mathematics, in the field of control theory, a Sylvester equation is a matrix equation of the form:

In mathematics, every analytic function can be used for defining a matrix function that maps square matrices with complex entries to square matrices of the same size.

In linear algebra, eigendecomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way. When the matrix being factorized is a normal or real symmetric matrix, the decomposition is called "spectral decomposition", derived from the spectral theorem.

In mathematics, the quadratic eigenvalue problem (QEP), is to find scalar eigenvalues , left eigenvectors and right eigenvectors such that

A matrix difference equation is a difference equation in which the value of a vector of variables at one point in time is related to its own value at one or more previous points in time, using matrices. The order of the equation is the maximum time gap between any two indicated values of the variable vector. For example,

References

  1. 1 2 3 4 5 6 7 8 Higham, Nicholas J. (2008). Functions of matrices : theory and computation. Society for Industrial and Applied Mathematics. Philadelphia, Pa.: Society for Industrial and Applied Mathematics (SIAM, 3600 Market Street, Floor 6, Philadelphia, PA 19104). ISBN   978-0-89871-777-8. OCLC   693957820.
  2. 1 2 3 4 Roberts, J. D. "Linear model reduction and solution of the algebraic Riccati equation by use of the sign function". International Journal of Control. 32 (4): 677–687. doi:10.1080/00207178008922881. ISSN   0020-7179.
  3. 1 2 3 Denman, Eugene D.; Beavers, Alex N. (1976). "The matrix sign function and computations in systems". Applied Mathematics and Computation. 2 (1): 63–94. doi:10.1016/0096-3003(76)90020-5. ISSN   0096-3003.
  4. Schulz, Günther (1933). "Iterative Berechung der reziproken Matrix". ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik. 13 (1): 57–59. doi:10.1002/zamm.19330130111. ISSN   1521-4001.