Incomplete Cholesky factorization

In numerical analysis, an incomplete Cholesky factorization of a symmetric positive definite matrix is a sparse approximation of the Cholesky factorization. An incomplete Cholesky factorization is often used as a preconditioner for algorithms like the conjugate gradient method.

The Cholesky factorization of a positive definite matrix A is A = LL* where L is a lower triangular matrix. An incomplete Cholesky factorization is given by a sparse lower triangular matrix K that is in some sense close to L. The corresponding preconditioner is KK*.

One popular way to find such a matrix K is to run the algorithm for the exact Cholesky decomposition while requiring K to have the same sparsity pattern as A: any entry of K is set to zero if the corresponding entry in A is zero. This gives an incomplete Cholesky factorization that is as sparse as the matrix A itself; this zero fill-in variant is commonly denoted IC(0).
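
As a minimal sketch of how such a preconditioner is typically used (assuming MATLAB or a recent GNU Octave with the built-in ichol and pcg functions; the test matrix and right-hand side below are arbitrary placeholders):

A = gallery('poisson', 20);       % example sparse SPD matrix (2-D Laplacian)
b = ones(size(A, 1), 1);          % arbitrary right-hand side
L = ichol(A);                     % zero fill-in incomplete Cholesky factor of A
x = pcg(A, b, 1e-8, 200, L, L');  % conjugate gradients preconditioned with L*L'

The built-in ichol keeps, by default, the sparsity pattern of the lower triangle of A, which is exactly the zero fill-in strategy described above.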

Motivation

Consider the following symmetric positive definite matrix as an example (the same matrix is used in the sparse example below):

$$A = \begin{pmatrix} 5 & -2 & 0 & -2 & -2 \\ -2 & 5 & -2 & 0 & 0 \\ 0 & -2 & 5 & -2 & 0 \\ -2 & 0 & -2 & 5 & -2 \\ -2 & 0 & 0 & -2 & 5 \end{pmatrix}$$

If we apply the full regular Cholesky decomposition, it yields (to two decimal places)

$$L = \begin{pmatrix} 2.24 & 0 & 0 & 0 & 0 \\ -0.89 & 2.05 & 0 & 0 & 0 \\ 0 & -0.98 & 2.01 & 0 & 0 \\ -0.89 & -0.39 & -1.18 & 1.63 & 0 \\ -0.89 & -0.39 & -0.19 & -1.95 & 0.45 \end{pmatrix}$$

and, by definition of the Cholesky decomposition, $LL^{T} = A$.

However, by applying the Cholesky decomposition, we observe that some entries that are zero in the original matrix end up being non-zero in the factor, such as the entries (4,2), (5,2) and (5,3) in this example. These new non-zero entries are known as "fill-ins".

This is not an issue per se, but it is very problematic when working with sparse matrices, since fill-in is largely unpredictable and reduces the sparsity of the factor, which degrades the efficiency of sparse matrix algorithms.
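
The fill-in positions can be inspected directly; the following is a small Octave/MATLAB sketch using the example matrix above and the built-in chol function:

A = [ 5 -2  0 -2 -2;
     -2  5 -2  0  0;
      0 -2  5 -2  0;
     -2  0 -2  5 -2;
     -2  0  0 -2  5];
L = chol(A, 'lower');                  % full (exact) Cholesky factor
fill_ins = (L ~= 0) & (tril(A) == 0)   % true exactly at the fill-in positions (4,2), (5,2) and (5,3)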

Therefore, given the importance of the Cholesky decomposition in matrix calculations, it is very useful to modify the regular method so that no fill-ins are generated. The incomplete Cholesky factorization does exactly that: it generates a matrix L similar to the one produced by the regular method, but preserving the zero elements of the original matrix.

Naturally, for the incomplete factor $LL^{T} \neq A$: multiplying the matrix L generated by the incomplete Cholesky factorization by its transpose does not reproduce the original matrix, but only a matrix close to it.

Algorithm

For $i$ from $1$ to $N$:

$L_{ii} = \left( a_{ii} - \sum_{k=1}^{i-1} L_{ik}^{2} \right)^{1/2}$

For $j$ from $i+1$ to $N$:

$L_{ji} = \frac{1}{L_{ii}} \left( a_{ji} - \sum_{k=1}^{i-1} L_{ik} L_{jk} \right)$, where $L_{ji}$ is set to zero whenever the corresponding entry $a_{ji}$ is zero.

Implementation

Implementation of the incomplete Cholesky factorization in the GNU Octave language. The factorization is stored as a lower triangular matrix, with the elements in the upper triangle set to zero.

function a = ichol(a)
    n = size(a,1);

    for k = 1:n
        % take the square root of the diagonal element of column k
        a(k,k) = sqrt(a(k,k));
        % divide the remaining elements of column k by the diagonal
        for i = (k+1):n
            if (a(i,k) != 0)
                a(i,k) = a(i,k)/a(k,k);
            endif
        endfor
        % update the trailing submatrix, touching only entries that are already non-zero
        for j = (k+1):n
            for i = j:n
                if (a(i,j) != 0)
                    a(i,j) = a(i,j) - a(i,k)*a(j,k);
                endif
            endfor
        endfor
    endfor

    % zero out the upper triangle so that only the factor L is returned
    for i = 1:n
        for j = i+1:n
            a(i,j) = 0;
        endfor
    endfor
endfunction
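
As a quick check (a sketch assuming the function above is saved as ichol.m and takes precedence over any built-in function of the same name), running it on the example matrix from the Motivation section shows that the zero pattern is preserved and that L*L' only approximates A:

A = [ 5 -2  0 -2 -2;
     -2  5 -2  0  0;
      0 -2  5 -2  0;
     -2  0 -2  5 -2;
     -2  0  0 -2  5];
L = ichol(A);          % incomplete factor computed by the function above
disp(L(4,2))           % 0: this fill-in position stays zero, unlike in the full factor
disp(norm(A - L*L'))   % non-zero: L*L' is only an approximation of A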

Sparse example

Consider again the matrix shown at the beginning of this article. Since it is symmetric and the method only uses the lower triangular elements, we can represent it by its lower triangle alone:

$$\begin{pmatrix} 5 & & & & \\ -2 & 5 & & & \\ 0 & -2 & 5 & & \\ -2 & 0 & -2 & 5 & \\ -2 & 0 & 0 & -2 & 5 \end{pmatrix}$$

More specifically, in its sparse form as a coordinate list, stored column by column with the row index varying fastest:

Value      5    -2    -2    -2     5    -2     5    -2     5    -2     5
Row        1     2     4     5     2     3     3     4     4     5     5
Col        1     1     1     1     2     2     3     3     4     4     5

Then, we take the square root of (1,1) and divide the other (i,1) elements by the result:

Value   2.24 -0.89 -0.89 -0.89 |     5    -2     5    -2     5    -2     5
Row        1     2     4     5 |     2     3     3     4     4     5     5
Col        1     1     1     1 |     2     2     3     3     4     4     5

After that, for all the other elements with column greater than 1, calculate (i,j)=(i,j)-(i,1)*(j,1) if (i,1) and (j,1) exist. For example: (5,4) = (5,4)-(5,1)*(4,1) = -2 -(-0.89*-0.89) = -2.8.

Value   2.24 -0.89 -0.89 -0.89 |   4.2    -2     5    -2   4.2  -2.8   4.2
Row        1     2     4     5 |     2     3     3     4     4     5     5
Col        1     1     1     1 |     2     2     3     3     4     4     5
(recalculated entries: (2,2), (4,4), (5,4) and (5,5))

The elements (2,2), (4,4), (5,4) and (5,5) have been recalculated, since they obey this rule. On the other hand, the elements (3,2), (3,3) and (4,3) are not recalculated, because the element (3,1) does not exist, even though the elements (2,1) and (4,1) do.


Now, repeat the process, but for (i,2). Take the square root of (2,2) and divide the other (i,2) elements by the result:

Value   2.24 -0.89 -0.89 -0.89 |  2.05 -0.98 |     5    -2   4.2  -2.8   4.2
Row        1     2     4     5 |     2     3 |     3     4     4     5     5
Col        1     1     1     1 |     2     2 |     3     3     4     4     5

Again, for elements with column greater than 2, calculate (i,j)=(i,j)-(i,2)*(j,2) if (i,2) and (j,2) exist:

Value   2.24 -0.89 -0.89 -0.89 |  2.05 -0.98 |  4.05    -2   4.2  -2.8   4.2
Row        1     2     4     5 |     2     3 |     3     4     4     5     5
Col        1     1     1     1 |     2     2 |     3     3     4     4     5
(recalculated entry: (3,3))

Repeat for (i,3). Take the square root of (3,3) and divide the other (i,3) elements by the result:

Value   2.24 -0.89 -0.89 -0.89  2.05 -0.98 |  2.01 -0.99 |   4.2  -2.8   4.2
Row        1     2     4     5     2     3 |     3     4 |     4     5     5
Col        1     1     1     1     2     2 |     3     3 |     4     4     5

For elements with column greater than 3, calculate (i,j)=(i,j)-(i,3)*(j,3) if (i,3) and (j,3) exist:

Value   2.24 -0.89 -0.89 -0.89  2.05 -0.98 |  2.01 -0.99 |  3.21  -2.8   4.2
Row        1     2     4     5     2     3 |     3     4 |     4     5     5
Col        1     1     1     1     2     2 |     3     3 |     4     4     5
(recalculated entry: (4,4))

Repeat for (i,4). Take the square root of (4,4) and divide the other (i,4):

Value   2.24 -0.89 -0.89 -0.89  2.05 -0.98  2.01 -0.99 |  1.79 -1.56 |   4.2
Row        1     2     4     5     2     3     3     4 |     4     5 |     5
Col        1     1     1     1     2     2     3     3 |     4     4 |     5

For elements with column greater than 4, calculate (i,j)=(i,j)-(i,4)*(j,4) if (i,4) and (j,4) exist:

Value   2.24 -0.89 -0.89 -0.89  2.05 -0.98  2.01 -0.99 |  1.79 -1.56 |  1.76
Row        1     2     4     5     2     3     3     4 |     4     5 |     5
Col        1     1     1     1     2     2     3     3 |     4     4 |     5
(recalculated entry: (5,5))

Finally take the square root of (5,5):

Value   2.24 -0.89 -0.89 -0.89  2.05 -0.98  2.01 -0.99  1.79 -1.56 |  1.33
Row        1     2     4     5     2     3     3     4     4     5 |     5
Col        1     1     1     1     2     2     3     3     4     4 |     5

Expanding the factor to its full matrix form:

$$L = \begin{pmatrix} 2.24 & 0 & 0 & 0 & 0 \\ -0.89 & 2.05 & 0 & 0 & 0 \\ 0 & -0.98 & 2.01 & 0 & 0 \\ -0.89 & 0 & -0.99 & 1.79 & 0 \\ -0.89 & 0 & 0 & -1.56 & 1.33 \end{pmatrix}$$

Note that in this case no fill-ins were generated compared to the original matrix. The elements (4,2), (5,2) and (5,3) are still zero.

However, if we multiply L by its transpose we obtain

$$LL^{T} \approx \begin{pmatrix} 5 & -2 & 0 & -2 & -2 \\ -2 & 5 & -2 & 0.8 & 0.8 \\ 0 & -2 & 5 & -2 & 0 \\ -2 & 0.8 & -2 & 5 & -2 \\ -2 & 0.8 & 0 & -2 & 5 \end{pmatrix}$$

a matrix slightly different from the original one: the entries (2,4), (2,5), (4,2) and (5,2) are 0.8 instead of 0, because the decomposition deliberately ignored some contributions in order to avoid fill-ins.

Sparse implementation

The sparse version of the incomplete Cholesky factorization (the same procedure presented above) implemented in MATLAB can be seen below. MATLAB has its own facilities for dealing with sparse matrices, but the code below is written out explicitly for pedagogical purposes. The algorithm is efficient in the sense that it sweeps the matrix as a sequential one-dimensional array of non-zero entries, so zero elements are never visited.

function A = Sp_ichol(A)
    n = size(A, 1);
    ncols = A(n).col;
    c_end = 0;
    for col = 1:ncols
        is_next_col = 0;
        c_start = c_end + 1;
        for i = c_start:n
            if A(i).col == col
                % in the current column (col):
                if A(i).col == A(i).row
                    A(i).val = sqrt(A(i).val);  % take the square root of the current column's diagonal element
                    div = A(i).val;
                else
                    A(i).val = A(i).val / div;  % divide the other current column's elements by the square root of the diagonal element
                end
            end
            if A(i).col > col
                % in the next columns (col+1 ... ncols):
                if is_next_col == 0
                    c_end = i - 1;
                    is_next_col = 1;
                end
                v1 = 0;
                v2 = 0;
                for j = c_start:c_end
                    if A(j).col == col
                        if A(j).row == A(i).row
                            % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) row index
                            v1 = A(j).val;
                        end
                        if A(j).row == A(i).col
                            % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) column index
                            v2 = A(j).val;
                        end
                        if v1 ~= 0 && v2 ~= 0
                            % if these elements exist in the current column (col), recalculate the current element A(i):
                            A(i).val = A(i).val - v1 * v2;
                            break;
                        end
                    end
                end
            end
        end
    end
end
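
One possible way to build the coordinate-list input expected by Sp_ichol and run it on the example matrix is sketched below; the struct-array layout, with fields row, col and val, is an assumption matching the code above:

A = [ 5 -2  0 -2 -2;
     -2  5 -2  0  0;
      0 -2  5 -2  0;
     -2  0 -2  5 -2;
     -2  0  0 -2  5];
[r, c, v] = find(tril(A));   % non-zeros of the lower triangle, column by column, row varying fastest
coo = struct('row', num2cell(r), 'col', num2cell(c), 'val', num2cell(v));
coo = Sp_ichol(coo);         % incomplete factor, still in coordinate-list form
% for instance, coo(1).val is now sqrt(5), approximately 2.24, matching the first step of the worked example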
