Matrix splitting

In the mathematical discipline of numerical linear algebra, a matrix splitting is an expression which represents a given matrix as a sum or difference of matrices. Many iterative methods (for example, for systems of differential equations) depend upon the direct solution of matrix equations involving matrices more general than tridiagonal matrices. These matrix equations can often be solved directly and efficiently when written as a matrix splitting. The technique was devised by Richard S. Varga in 1960. [1]

Regular splittings

We seek to solve the matrix equation

$$ A\mathbf{x} = \mathbf{k} \qquad (1) $$

where A is a given n × n non-singular matrix, and k is a given column vector with n components. We split the matrix A into

$$ A = B - C \qquad (2) $$

where B and C are n × n matrices. If, for an arbitrary n × n matrix M, M has nonnegative entries, we write M ≥ 0. If M has only positive entries, we write M > 0. Similarly, if the matrix M₁ − M₂ has nonnegative entries, we write M₁ ≥ M₂.

Definition: A = B − C is a regular splitting of A if B⁻¹ ≥ 0 and C ≥ 0.
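
As an illustration of the definition, the following is a minimal NumPy sketch (the helper name is_regular_splitting is illustrative, not from the source) that checks whether a proposed pair (B, C) is a regular splitting of a given A:

```python
import numpy as np

def is_regular_splitting(A, B, C, tol=1e-12):
    """Check that A = B - C, B is non-singular with B^{-1} >= 0, and C >= 0."""
    if not np.allclose(A, B - C):
        return False                      # the pair must actually split A
    try:
        B_inv = np.linalg.inv(B)
    except np.linalg.LinAlgError:
        return False                      # B must be invertible
    return bool(np.all(B_inv >= -tol) and np.all(C >= -tol))
```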

We assume that matrix equations of the form

$$ B\mathbf{x} = \mathbf{g} \qquad (3) $$

where g is a given column vector, can be solved directly for the vector x. If (2) represents a regular splitting of A, then the iterative method

$$ B\mathbf{x}^{(m+1)} = C\mathbf{x}^{(m)} + \mathbf{k}, \qquad m = 0, 1, 2, \ldots \qquad (4) $$

where x(0) is an arbitrary vector, can be carried out. Equivalently, we write (4) in the form

$$ \mathbf{x}^{(m+1)} = B^{-1}C\mathbf{x}^{(m)} + B^{-1}\mathbf{k}, \qquad m = 0, 1, 2, \ldots \qquad (5) $$

The matrix D = B⁻¹C has nonnegative entries if (2) represents a regular splitting of A. [2]

It can be shown that if A⁻¹ > 0, then ρ(D) < 1, where ρ(D) denotes the spectral radius of D, and thus D is a convergent matrix. As a consequence, the iterative method (5) is necessarily convergent. [3] [4]

If, in addition, the splitting (2) is chosen so that the matrix B is a diagonal matrix (with the diagonal entries all non-zero, since B must be invertible), then B can be inverted in linear time (see Time complexity).
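
As a rough sketch of how iteration (4) can be carried out in practice (assuming NumPy; the function name splitting_iteration is illustrative, not from the source), each sweep solves B x(m+1) = C x(m) + k rather than forming B⁻¹ explicitly:

```python
import numpy as np

def splitting_iteration(B, C, k, x0, num_steps=25):
    """Run iteration (4) for the splitting A = B - C: B x^(m+1) = C x^(m) + k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        x = np.linalg.solve(B, C @ x + k)   # one sweep of iteration (4)
    return x
```

When B is diagonal, the solve reduces to an elementwise division by the diagonal entries, which is the linear-time case mentioned above.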

Matrix iterative methods

Many iterative methods can be described as a matrix splitting. If the diagonal entries of the matrix A are all nonzero, and we express the matrix A as the matrix sum

$$ A = D - U - L \qquad (6) $$

where D is the diagonal part of A, and U and L are respectively strictly upper and lower triangular n × n matrices, then we have the following.

The Jacobi method can be represented in matrix form as a splitting [5] [6]

$$ \mathbf{x}^{(m+1)} = D^{-1}(L + U)\mathbf{x}^{(m)} + D^{-1}\mathbf{k}, \qquad m = 0, 1, 2, \ldots \qquad (7) $$

The Gauss–Seidel method can be represented in matrix form as a splitting [7] [8]

$$ \mathbf{x}^{(m+1)} = (D - L)^{-1}U\mathbf{x}^{(m)} + (D - L)^{-1}\mathbf{k}, \qquad m = 0, 1, 2, \ldots \qquad (8) $$

The method of successive over-relaxation can be represented in matrix form as a splitting [9] [10]

$$ \mathbf{x}^{(m+1)} = (D - \omega L)^{-1}\left[(1 - \omega)D + \omega U\right]\mathbf{x}^{(m)} + \omega(D - \omega L)^{-1}\mathbf{k}, \qquad m = 0, 1, 2, \ldots \qquad (9) $$
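
The following sketch (assuming NumPy; the helper names iteration_matrices and spectral_radius are illustrative, not from the source) forms the splitting (6) from a given A and returns the matrices that multiply x(m) in the updates (7), (8) and (9); the spectral radius of each of these matrices governs whether, and how fast, the corresponding method converges:

```python
import numpy as np

def iteration_matrices(A, omega=1.1):
    """Build A = D - U - L as in (6) and the iteration matrices of (7), (8), (9)."""
    D = np.diag(np.diag(A))   # diagonal part of A
    U = -np.triu(A, k=1)      # strictly upper triangular part of A, negated
    L = -np.tril(A, k=-1)     # strictly lower triangular part of A, negated
    T_jacobi = np.linalg.solve(D, L + U)                                  # D^{-1}(L + U)
    T_gauss_seidel = np.linalg.solve(D - L, U)                            # (D - L)^{-1} U
    T_sor = np.linalg.solve(D - omega * L, (1 - omega) * D + omega * U)   # SOR matrix
    return T_jacobi, T_gauss_seidel, T_sor

def spectral_radius(T):
    """rho(T): the iteration converges for every starting vector iff rho(T) < 1."""
    return np.max(np.abs(np.linalg.eigvals(T)))
```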

Example

Regular splitting

In equation (1), let

$$ A = \begin{pmatrix} 6 & -2 & -3 \\ -1 & 4 & -2 \\ -3 & -1 & 5 \end{pmatrix}, \qquad \mathbf{k} = \begin{pmatrix} 5 \\ -12 \\ 10 \end{pmatrix} \qquad (10) $$

Let us apply the splitting (7) which is used in the Jacobi method: we split A in such a way that B consists of all of the diagonal elements of A, and C consists of all of the off-diagonal elements of A, negated. (Of course this is not the only useful way to split a matrix into two matrices.) We have

$$ B = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 2 & 3 \\ 1 & 0 & 2 \\ 3 & 1 & 0 \end{pmatrix} \qquad (11) $$

Since B⁻¹ ≥ 0 and C ≥ 0, the splitting (11) is a regular splitting. Since A⁻¹ > 0, the spectral radius ρ(D) < 1. (The approximate eigenvalues of D are 0.7998, −0.4600 and −0.3398, so ρ(D) ≈ 0.7998.) Hence, the matrix D is convergent and the method (5) necessarily converges for the problem (10). Note that the diagonal elements of A are all greater than zero, the off-diagonal elements of A are all less than zero and A is strictly diagonally dominant. [11]
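
A quick numerical check of this claim (a sketch assuming NumPy, not part of the source) computes the spectral radius of D = B⁻¹C directly:

```python
import numpy as np

B = np.diag([6., 4., 5.])
C = np.array([[0., 2., 3.],
              [1., 0., 2.],
              [3., 1., 0.]])
D = np.linalg.solve(B, C)                     # D = B^{-1} C
rho = np.max(np.abs(np.linalg.eigvals(D)))    # spectral radius of D
print(rho)                                    # approximately 0.7998 < 1
```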

The method (5) applied to the problem (10) then takes the form

$$ \mathbf{x}^{(m+1)} = \begin{pmatrix} 0 & 1/3 & 1/2 \\ 1/4 & 0 & 1/2 \\ 3/5 & 1/5 & 0 \end{pmatrix}\mathbf{x}^{(m)} + \begin{pmatrix} 5/6 \\ -3 \\ 2 \end{pmatrix}, \qquad m = 0, 1, 2, \ldots \qquad (12) $$

The exact solution to equation (12) is

$$ \mathbf{x} = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix} \qquad (13) $$

The first few iterates for equation (12) are listed in the table below, beginning with x(0) = (0.0, 0.0, 0.0)^T. From the table one can see that the method is evidently converging to the solution (13), albeit rather slowly.

m     x1(m)      x2(m)      x3(m)
0     0.0        0.0        0.0
1     0.83333   -3.0000     2.0000
2     0.83333   -1.7917     1.9000
3     1.1861    -1.8417     2.1417
4     1.2903    -1.6326     2.3433
5     1.4608    -1.5058     2.4477
6     1.5553    -1.4110     2.5753
7     1.6507    -1.3235     2.6510
8     1.7177    -1.2618     2.7257
9     1.7756    -1.2077     2.7783
10    1.8199    -1.1670     2.8238
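
A short script (assuming NumPy; a sketch, not part of the source) reproduces these Jacobi iterates for problem (10):

```python
import numpy as np

A = np.array([[ 6., -2., -3.],
              [-1.,  4., -2.],
              [-3., -1.,  5.]])
k = np.array([5., -12., 10.])

B = np.diag(np.diag(A))   # diagonal of A, as in splitting (11)
C = B - A                 # off-diagonal entries of A, negated

x = np.zeros(3)           # x^(0) = (0, 0, 0)^T
for m in range(10):
    x = np.linalg.solve(B, C @ x + k)   # one sweep of iteration (12)
print(x)   # after 10 sweeps: approximately (1.8199, -1.1670, 2.8238)
```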

Jacobi method

As stated above, the Jacobi method (7) is the same as the specific regular splitting (11) demonstrated above.

Gauss–Seidel method

Since the diagonal entries of the matrix A in problem (10) are all nonzero, we can express the matrix A as the splitting (6), where

$$ D = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \qquad U = \begin{pmatrix} 0 & 2 & 3 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}, \qquad L = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 3 & 1 & 0 \end{pmatrix} \qquad (14) $$

We then have

$$ (D - L)^{-1} \approx \begin{pmatrix} 0.1667 & 0 & 0 \\ 0.0417 & 0.2500 & 0 \\ 0.1083 & 0.0500 & 0.2000 \end{pmatrix} $$

The Gauss–Seidel method (8) applied to the problem (10) takes the form

$$ \mathbf{x}^{(m+1)} \approx \begin{pmatrix} 0 & 0.3333 & 0.5000 \\ 0 & 0.0833 & 0.6250 \\ 0 & 0.2167 & 0.4250 \end{pmatrix}\mathbf{x}^{(m)} + \begin{pmatrix} 0.8333 \\ -2.7917 \\ 1.9417 \end{pmatrix}, \qquad m = 0, 1, 2, \ldots \qquad (15) $$

The first few iterates for equation (15) are listed in the table below, beginning with x(0) = (0.0, 0.0, 0.0)^T. From the table one can see that the method is evidently converging to the solution (13), somewhat faster than the Jacobi method described above.

m     x1(m)      x2(m)      x3(m)
0     0.0        0.0        0.0
1     0.8333    -2.7917     1.9417
2     0.8736    -1.8107     2.1620
3     1.3108    -1.5913     2.4682
4     1.5370    -1.3817     2.6459
5     1.6957    -1.2531     2.7668
6     1.7990    -1.1668     2.8461
7     1.8675    -1.1101     2.8985
8     1.9126    -1.0726     2.9330
9     1.9423    -1.0479     2.9558
10    1.9619    -1.0316     2.9708
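
The corresponding Gauss–Seidel computation for problem (10) (again a sketch assuming NumPy, not part of the source) is:

```python
import numpy as np

A = np.array([[ 6., -2., -3.],
              [-1.,  4., -2.],
              [-3., -1.,  5.]])
k = np.array([5., -12., 10.])

D = np.diag(np.diag(A))   # splitting (14)
U = -np.triu(A, k=1)
L = -np.tril(A, k=-1)

x = np.zeros(3)
for m in range(10):
    x = np.linalg.solve(D - L, U @ x + k)   # one sweep of iteration (15)
print(x)   # after 10 sweeps: approximately (1.9619, -1.0316, 2.9708)
```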

Successive over-relaxation method

Let ω = 1.1. Using the splitting (14) of the matrix A in problem (10) for the successive over-relaxation method, we have

$$ \omega(D - \omega L)^{-1} \approx \begin{pmatrix} 0.1833 & 0 & 0 \\ 0.0504 & 0.2750 & 0 \\ 0.1321 & 0.0605 & 0.2200 \end{pmatrix} $$

The successive over-relaxation method (9) applied to the problem (10) takes the form

$$ \mathbf{x}^{(m+1)} \approx \begin{pmatrix} -0.1000 & 0.3667 & 0.5500 \\ -0.0275 & 0.0008 & 0.7013 \\ -0.0720 & 0.2422 & 0.4173 \end{pmatrix}\mathbf{x}^{(m)} + \begin{pmatrix} 0.9167 \\ -3.0479 \\ 2.1345 \end{pmatrix}, \qquad m = 0, 1, 2, \ldots \qquad (16) $$

The first few iterates for equation (16) are listed in the table below, beginning with x(0) = (0.0, 0.0, 0.0)^T. From the table one can see that the method is evidently converging to the solution (13), slightly faster than the Gauss–Seidel method described above.

m     x1(m)      x2(m)      x3(m)
0     0.0        0.0        0.0
1     0.9167    -3.0479     2.1345
2     0.8814    -1.5788     2.2209
3     1.4711    -1.5161     2.6153
4     1.6521    -1.2557     2.7526
5     1.8050    -1.1641     2.8599
6     1.8823    -1.0930     2.9158
7     1.9314    -1.0559     2.9508
8     1.9593    -1.0327     2.9709
9     1.9761    -1.0185     2.9829
10    1.9862    -1.0113     2.9901
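
A matching successive over-relaxation computation with ω = 1.1 (a sketch assuming NumPy, not part of the source) is:

```python
import numpy as np

A = np.array([[ 6., -2., -3.],
              [-1.,  4., -2.],
              [-3., -1.,  5.]])
k = np.array([5., -12., 10.])
omega = 1.1

D = np.diag(np.diag(A))   # splitting (14)
U = -np.triu(A, k=1)
L = -np.tril(A, k=-1)

x = np.zeros(3)
for m in range(10):
    rhs = ((1 - omega) * D + omega * U) @ x + omega * k
    x = np.linalg.solve(D - omega * L, rhs)   # one sweep of iteration (16)
print(x)   # after 10 sweeps: approximately (1.9862, -1.0113, 2.9901)
```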

See also

Notes


References