Bartels–Stewart algorithm

In numerical linear algebra, the Bartels–Stewart algorithm is used to numerically solve the Sylvester matrix equation $AX - XB = C$. Developed by R.H. Bartels and G.W. Stewart in 1971, [1] it was the first numerically stable method that could be systematically applied to solve such equations. The algorithm works by using the real Schur decompositions of $A$ and $B$ to transform $AX - XB = C$ into a triangular system that can then be solved using forward or backward substitution. In 1979, G. Golub, C. Van Loan and S. Nash introduced an improved version of the algorithm, [2] known as the Hessenberg–Schur algorithm. It remains a standard approach for solving Sylvester equations when $X$ is of small to moderate size.

The algorithm

Let $X, C \in \mathbb{R}^{m \times n}$, and assume that the eigenvalues of $A \in \mathbb{R}^{m \times m}$ are distinct from the eigenvalues of $B \in \mathbb{R}^{n \times n}$. Then, the matrix equation $AX - XB = C$ has a unique solution. The Bartels–Stewart algorithm computes $X$ by applying the following steps: [2]

1. Compute the real Schur decompositions

$$R = U^T A U, \qquad S = V^T B^T V.$$

The matrices $R$ and $S$ are block upper triangular, with diagonal blocks of size $1 \times 1$ or $2 \times 2$.

2. Set $F = U^T C V$.

3. Solve the simplified system $RY - YS^T = F$, where $Y = U^T X V$. This can be done using forward substitution on the blocks. Specifically, if $s_{k-1,k} = 0$, then

$$(R - s_{kk} I)\, y_k = f_k + \sum_{j=k+1}^{n} s_{kj}\, y_j,$$

where $y_k$ is the $k$th column of $Y$ and $f_k$ is the $k$th column of $F$. When $s_{k-1,k} \neq 0$, the columns $y_{k-1}$ and $y_k$ should be concatenated and solved for simultaneously.

4. Set $X = U Y V^T$.
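
The steps above translate directly into code. The sketch below, in Python with SciPy, is illustrative rather than a faithful reproduction of the real-arithmetic algorithm: it uses the complex Schur form of $B$ itself, so every diagonal block is $1 \times 1$ and the $2 \times 2$ block bookkeeping is avoided; this is an equivalent formulation, and the name bartels_stewart is hypothetical.

```python
import numpy as np
from scipy.linalg import schur, solve_triangular

def bartels_stewart(A, B, C):
    """Sketch: solve AX - XB = C, assuming A and B share no eigenvalues."""
    # Complex Schur forms: A = U R U^H and B = V S V^H, R and S upper triangular.
    R, U = schur(A, output="complex")
    S, V = schur(B, output="complex")
    F = U.conj().T @ C @ V                     # transformed right-hand side
    m, n = F.shape
    Y = np.zeros((m, n), dtype=complex)
    for k in range(n):                         # forward substitution over columns
        # Column k of RY - YS = F depends only on columns already computed.
        rhs = F[:, k] + Y[:, :k] @ S[:k, k]
        Y[:, k] = solve_triangular(R - S[k, k] * np.eye(m), rhs)
    X = U @ Y @ V.conj().T                     # transform back: X = U Y V^H
    return X.real if all(map(np.isrealobj, (A, B, C))) else X
```

Each column solve involves the shifted triangular matrix $R - s_{kk} I$, which is nonsingular precisely because no eigenvalue of $B$ is an eigenvalue of $A$.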

Computational cost

Using the QR algorithm, the real Schur decompositions in step 1 require approximately $10(m^3 + n^3)$ flops; this step dominates, so that the overall computational cost is $10(m^3 + n^3)$ flops. [2]
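
For example, when $m = n$ this comes to roughly $20n^3$ flops, about thirty times the $\tfrac{2}{3}n^3$ flops of an LU-based solve of a single $n \times n$ linear system, which is why the Schur decompositions dominate the budget.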

Simplifications and special cases

In the special case where $B = -A^T$ and $C$ is symmetric, the solution $X$ will also be symmetric. This symmetry can be exploited so that $Y$ is found more efficiently in step 3 of the algorithm. [1]
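
To make the special case concrete: substituting $B = -A^T$ into $AX - XB = C$ gives the continuous Lyapunov equation $AX + XA^T = C$, and transposing both sides shows that $X^T$ solves the same equation, so by uniqueness $X = X^T$. A minimal check with SciPy's generic dense Lyapunov solver, which follows this sign convention (it is not the symmetry-exploiting variant described above):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
C = rng.standard_normal((5, 5))
C = C + C.T                                # symmetric right-hand side

# With B = -A^T, the equation AX - XB = C becomes AX + XA^T = C.
X = solve_continuous_lyapunov(A, C)
print(np.allclose(A @ X + X @ A.T, C))     # residual check
print(np.allclose(X, X.T))                 # the solution inherits the symmetry
```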

The Hessenberg–Schur algorithm

The Hessenberg–Schur algorithm [2] replaces the decomposition $R = U^T A U$ in step 1 with the decomposition $H = Q^T A Q$, where $H$ is an upper Hessenberg matrix. This leads to a system of the form $HY - YS^T = F$ that can be solved using forward substitution. The advantage of this approach is that $H = Q^T A Q$ can be found using Householder reflections at a cost of $\tfrac{5}{3}m^3$ flops, compared to the $10m^3$ flops required to compute the real Schur decomposition of $A$.
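
In code, the only change relative to the sketch in "The algorithm" is that the Schur decomposition of $A$ is replaced by the cheaper Hessenberg reduction; each column solve then involves an upper Hessenberg matrix, which Gaussian elimination handles in $O(m^2)$ flops. Again a hedged sketch (complex Schur form of $B$ for simplicity; hessenberg_schur is a hypothetical name):

```python
import numpy as np
from scipy.linalg import hessenberg, schur, solve

def hessenberg_schur(A, B, C):
    """Sketch of the Hessenberg-Schur variant for AX - XB = C."""
    H, Q = hessenberg(A, calc_q=True)      # A = Q H Q^T, H upper Hessenberg
    S, V = schur(B, output="complex")      # B = V S V^H, S upper triangular
    F = Q.T @ C @ V
    m, n = F.shape
    Y = np.zeros((m, n), dtype=complex)
    for k in range(n):
        rhs = F[:, k] + Y[:, :k] @ S[:k, k]
        # H - s_kk I is upper Hessenberg; production code would use a
        # dedicated O(m^2) Hessenberg solver here instead of a dense solve.
        Y[:, k] = solve(H - S[k, k] * np.eye(m), rhs)
    X = Q @ Y @ V.conj().T
    return X.real if all(map(np.isrealobj, (A, B, C))) else X
```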

Software and implementation

The subroutines required for the Hessenberg–Schur variant of the Bartels–Stewart algorithm are implemented in the SLICOT library. These are used in the MATLAB control system toolbox.
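
Outside of MATLAB, SciPy exposes a Bartels–Stewart-based solver as scipy.linalg.solve_sylvester, built on the LAPACK routine *trsyl. Note its sign convention is $AX + XB = Q$, so the equation of this article corresponds to negating $B$:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((4, 3))

# solve_sylvester solves AX + XB = Q, so pass -B to solve AX - XB = C.
X = solve_sylvester(A, -B, C)
print(np.allclose(A @ X - X @ B, C))
```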

Alternative approaches

For large $m$ and $n$, the $O(m^3 + n^3)$ cost of the Bartels–Stewart algorithm can be prohibitive. When $A$ and $B$ are sparse or structured, so that linear solves and matrix–vector multiplies involving them are efficient, iterative algorithms can potentially perform better. These include projection-based methods, which use Krylov subspace iterations, methods based on the alternating direction implicit (ADI) iteration, and hybridizations that involve both projection and ADI. [3] Iterative methods can also be used to directly construct low-rank approximations to $X$ when solving $AX - XB = C$.


References

  1. Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation AX + XB = C [F4]". Communications of the ACM. 15 (9): 820–826. doi:10.1145/361573.361582. ISSN 0001-0782.
  2. Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286.
  3. Simoncini, V. (2016). "Computational Methods for Linear Matrix Equations". SIAM Review. 58 (3): 377–441. doi:10.1137/130912839. hdl:11585/586011. ISSN 0036-1445. S2CID 17271167.