Sylvester equation

In mathematics, in the field of control theory, a Sylvester equation is a matrix equation of the form AX + XB = C. [1]

It is named after English mathematician James Joseph Sylvester. Given matrices A, B, and C, the problem is to find the matrices X that satisfy this equation. All matrices are assumed to have coefficients in the complex numbers. For the equation to make sense, the matrices must have appropriate sizes; for example, they could all be square matrices of the same size. More generally, A and B must be square matrices of sizes n and m respectively, and then X and C both have n rows and m columns.

A Sylvester equation has a unique solution for X exactly when there are no common eigenvalues of A and −B. More generally, the equation AX + XB = C has been considered as an equation of bounded operators on a (possibly infinite-dimensional) Banach space. In this case, the condition for the uniqueness of a solution X is almost the same: There exists a unique solution X exactly when the spectra of A and −B are disjoint. [2]
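As a concrete illustration of this condition, here is a minimal numerical check in NumPy (our sketch, not part of the original article; the example matrices and the use of a floating-point tolerance are choices made for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 4.0]])    # eigenvalues 1 and 4
B = np.array([[0.0, 1.0],
              [-6.0, -5.0]])  # -B has eigenvalues 2 and 3

# AX + XB = C is uniquely solvable for every C exactly when
# no eigenvalue of A coincides with an eigenvalue of -B.
eig_A = np.linalg.eigvals(A)
eig_negB = np.linalg.eigvals(-B)
shared = any(np.isclose(a, b) for a in eig_A for b in eig_negB)
print("unique solution for every C:", not shared)  # True here
```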

Existence and uniqueness of the solutions

Using the Kronecker product notation and the vectorization operator vec, we can rewrite Sylvester's equation in the form

(I_m ⊗ A + Bᵀ ⊗ I_n) vec X = vec C,

where A is of dimension n × n, B is of dimension m × m, X and C are of dimension n × m, and I_k is the k × k identity matrix. In this form, the equation can be seen as a linear system of dimension nm × nm. [3]
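This rewriting can be carried out directly in code. The following NumPy sketch (our illustration, not part of the original article) builds the Kronecker system and recovers X; note that vec stacks columns, which corresponds to Fortran-order flattening:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
C = rng.standard_normal((n, m))  # random A, B almost surely give a uniquely solvable system

# vec(AX + XB) = (I_m (x) A + B^T (x) I_n) vec(X), with vec stacking columns.
K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
x = np.linalg.solve(K, C.flatten(order="F"))
X = x.reshape((n, m), order="F")

print(np.linalg.norm(A @ X + X @ B - C))  # ~1e-15
```

As note [3] cautions, forming and solving this nm × nm system is costly and can be ill-conditioned, so the sketch is for illustration rather than production use.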

Theorem. Given matrices A ∈ ℂ^(n×n) and B ∈ ℂ^(m×m), the Sylvester equation AX + XB = C has a unique solution X ∈ ℂ^(n×m) for any C ∈ ℂ^(n×m) if and only if A and −B do not share any eigenvalue.

Proof. The equation AX + XB = C is a linear system with nm unknowns and the same number of equations. Hence it is uniquely solvable for any given C if and only if the homogeneous equation AX + XB = 0 admits only the trivial solution X = 0.

(i) Assume that A and −B do not share any eigenvalue. Let X be a solution to the abovementioned homogeneous equation. Then AX = X(−B), which can be lifted to A^k X = X(−B)^k for each k ≥ 0 by mathematical induction. Consequently, p(A)X = X p(−B) for any polynomial p. In particular, let p be the characteristic polynomial of A. Then p(A) = 0 due to the Cayley–Hamilton theorem; meanwhile, the spectral mapping theorem tells us σ(p(−B)) = p(σ(−B)), where σ(·) denotes the spectrum of a matrix. Since A and −B do not share any eigenvalue, p(σ(−B)) does not contain zero, and hence p(−B) is nonsingular. Thus 0 = p(A)X = X p(−B) forces X = 0, as desired. This proves the "if" part of the theorem.

(ii) Now assume that A and −B share an eigenvalue λ. Let u be a corresponding right eigenvector for A, v be a corresponding left eigenvector for −B, and X = uv*. Then X ≠ 0, and AX + XB = (Au)v* + u(v*B) = λuv* − λuv* = 0. Hence X is a nontrivial solution to the aforesaid homogeneous equation, justifying the "only if" part of the theorem. Q.E.D.
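The rank-one construction in part (ii) is easy to reproduce numerically. Below is a small NumPy sketch (our illustration, not part of the original article) in which A and −B are chosen to share the eigenvalue 2:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # eigenvalues 2 and 3
B = np.array([[-2.0, 0.0],
              [4.0, 5.0]])   # -B has eigenvalues 2 and -5

lam = 2.0

# Right eigenvector u of A for lam.
w, V = np.linalg.eig(A)
u = V[:, np.isclose(w, lam)][:, 0]

# Left eigenvector v of -B for lam, i.e. a right eigenvector of (-B).T.
w2, W = np.linalg.eig((-B).T)
v = W[:, np.isclose(w2, lam)][:, 0]

# X = u v* is then a nontrivial solution of the homogeneous equation.
X = np.outer(u, v.conj())
print(np.linalg.norm(A @ X + X @ B))  # ~1e-16, with X != 0
```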

As an alternative to the spectral mapping theorem, the nonsingularity of p(−B) in part (i) of the proof can also be demonstrated by Bézout's identity for coprime polynomials. Let q be the characteristic polynomial of −B. Since A and −B do not share any eigenvalue, p and q are coprime. Hence there exist polynomials f and g such that p(z)f(z) + q(z)g(z) ≡ 1. By the Cayley–Hamilton theorem, q(−B) = 0. Thus p(−B)f(−B) = I, implying that p(−B) is nonsingular.
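Both arguments hinge on p(A) vanishing while p(−B) stays invertible. A short NumPy sketch makes this concrete (illustrative only; the helper matpoly and the example matrices are ours, not the article's):

```python
import numpy as np

def matpoly(coeffs, M):
    """Horner evaluation of a polynomial (coefficients highest-degree
    first, as returned by np.poly) at a square matrix M."""
    R = np.zeros_like(M)
    I = np.eye(M.shape[0])
    for c in coeffs:
        R = R @ M + c * I
    return R

A = np.array([[1.0, 2.0],
              [0.0, 4.0]])    # eigenvalues 1 and 4
B = np.array([[0.0, 1.0],
              [-6.0, -5.0]])  # -B has eigenvalues 2 and 3, disjoint from A's

p = np.poly(A)  # coefficients of the characteristic polynomial of A
print(np.linalg.norm(matpoly(p, A)))  # ~0 by the Cayley-Hamilton theorem
print(np.linalg.det(matpoly(p, -B)))  # nonzero, so p(-B) is nonsingular
```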

The theorem remains true for real matrices, with the caveat that one considers their complex eigenvalues. The proof of the "if" part is still applicable; for the "only if" part, note that both the real part and the imaginary part of the solution X = uv* constructed in part (ii) satisfy the homogeneous equation AX + XB = 0, and they cannot both be zero.

Roth's removal rule

Given two square complex matrices A and B, of size n and m respectively, and a matrix C of size n by m, one can ask when the two square block matrices of size n + m, [A C; 0 B] and [A 0; 0 B], are similar to each other. The answer is that these two matrices are similar exactly when there exists a matrix X such that AX − XB = C. In other words, X is a solution to a Sylvester equation. This is known as Roth's removal rule. [4]

One easily checks one direction: if AX − XB = C, then

[I_n X; 0 I_m] [A C; 0 B] [I_n X; 0 I_m]⁻¹ = [A 0; 0 B].
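This identity is straightforward to verify numerically. Here is a minimal NumPy sketch (our illustration, with randomly chosen matrices) that builds X first and defines C = AX − XB so the similarity must hold:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
X = rng.standard_normal((n, m))
C = A @ X - X @ B                     # X solves AX - XB = C by construction

M1 = np.block([[A, C],
               [np.zeros((m, n)), B]])           # [A C; 0 B]
M2 = np.block([[A, np.zeros((n, m))],
               [np.zeros((m, n)), B]])           # [A 0; 0 B]
P = np.block([[np.eye(n), X],
              [np.zeros((m, n)), np.eye(m)]])    # [I X; 0 I]

print(np.linalg.norm(P @ M1 @ np.linalg.inv(P) - M2))  # ~1e-15
```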

Roth's removal rule does not generalize to infinite-dimensional bounded operators on a Banach space. [5] Nevertheless, Roth's removal rule does generalize to systems of Sylvester equations. [6]

Numerical solutions

A classical algorithm for the numerical solution of the Sylvester equation is the Bartels–Stewart algorithm, which consists of transforming A and B into Schur form by a QR algorithm, and then solving the resulting triangular system via back-substitution. This algorithm, whose computational cost is O(n³) arithmetic operations for n × n matrices, is used, among others, by LAPACK and the lyap function in GNU Octave. [7] See also the sylvester function in that language. [8] [9] In some specific image processing applications, the derived Sylvester equation has a closed-form solution. [10]
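For readers working in Python, SciPy exposes this approach as scipy.linalg.solve_sylvester, which is documented to implement a Bartels–Stewart-type method on top of LAPACK. A minimal usage sketch (our illustration, with random test matrices):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((4, 4))
C = rng.standard_normal((5, 4))

X = solve_sylvester(A, B, C)              # solves A X + X B = C
print(np.linalg.norm(A @ X + X @ B - C))  # ~1e-14
```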

Notes

  1. This equation is also commonly written in the equivalent form AX − XB = C.
  2. Bhatia and Rosenthal, 1997
  3. However, rewriting the equation in this form is not advised for the numerical solution since this version is costly to solve and can be ill-conditioned.
  4. Gerrish, F.; Ward, A. G. B. (Nov 1998). "Sylvester's matrix equation and Roth's removal rule". The Mathematical Gazette. 82 (495): 423–430. doi:10.2307/3619888. JSTOR 3619888. S2CID 126229881.
  5. Bhatia and Rosenthal, 1997, p. 3
  6. Dmytryshyn, Andrii; Kågström, Bo (2015). "Coupled Sylvester-type Matrix Equations and Block Diagonalization". SIAM Journal on Matrix Analysis and Applications. 36 (2): 580–593. CiteSeerX 10.1.1.710.6894. doi:10.1137/151005907.
  7. "Function Reference: Lyap".
  8. "Functions of a Matrix (GNU Octave (version 4.4.1))".
  9. The syl command has been deprecated since GNU Octave version 4.0.
  10. Wei, Q.; Dobigeon, N.; Tourneret, J.-Y. (2015). "Fast Fusion of Multi-Band Images Based on Solving a Sylvester Equation". IEEE Transactions on Image Processing. 24 (11): 4109–4121. arXiv:1502.03121. Bibcode:2015ITIP...24.4109W. doi:10.1109/TIP.2015.2458572. PMID 26208345. S2CID 665111.
