Tridiagonal matrix algorithm

In numerical linear algebra, the tridiagonal matrix algorithm, also known as the Thomas algorithm (named after Llewellyn Thomas), is a simplified form of Gaussian elimination that can be used to solve tridiagonal systems of equations. A tridiagonal system for $n$ unknowns may be written as

$$a_i x_{i-1} + b_i x_i + c_i x_{i+1} = d_i,$$

where $a_1 = 0$ and $c_n = 0$. In matrix form, the system is

$$\begin{bmatrix} b_1 & c_1 & & & 0 \\ a_2 & b_2 & c_2 & & \\ & a_3 & b_3 & \ddots & \\ & & \ddots & \ddots & c_{n-1} \\ 0 & & & a_n & b_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_n \end{bmatrix}.$$

For such systems, the solution can be obtained in $O(n)$ operations instead of the $O(n^3)$ required by Gaussian elimination. A first sweep eliminates the $a_i$'s, and then an (abbreviated) backward substitution produces the solution. Examples of such matrices commonly arise from the discretization of the 1D Poisson equation and from natural cubic spline interpolation.

Thomas' algorithm is not stable in general, but is so in several special cases, such as when the matrix is diagonally dominant (either by rows or columns) or symmetric positive definite; [1] [2] for a more precise characterization of stability of Thomas' algorithm, see Higham Theorem 9.12. [3] If stability is required in the general case, Gaussian elimination with partial pivoting (GEPP) is recommended instead. [2]
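
In practice, one of these sufficient conditions can be checked cheaply before choosing a solver. The helper below is a minimal sketch; the name is_diagonally_dominant is illustrative (not from the original source) and it uses the same array layout as the C routine further down:

#include <math.h>
#include <stdbool.h>

/* Illustrative helper: returns true when the tridiagonal matrix with
   subdiagonal a (a[0] unused), main diagonal b and superdiagonal c
   (c[n - 1] unused) is strictly diagonally dominant by rows. */
bool is_diagonally_dominant(const int n, const double a[], const double b[], const double c[])
{
    for (int i = 0; i < n; i++) {
        const double off = (i > 0 ? fabs(a[i]) : 0.0) + (i < n - 1 ? fabs(c[i]) : 0.0);
        if (fabs(b[i]) <= off)
            return false;
    }
    return true;
}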

Method

The forward sweep consists of the computation of new coefficients as follows, denoting the new coefficients with primes:

$$c'_i = \begin{cases} \dfrac{c_1}{b_1}, & i = 1, \\ \dfrac{c_i}{b_i - a_i c'_{i-1}}, & i = 2, 3, \dots, n-1, \end{cases}$$

and

$$d'_i = \begin{cases} \dfrac{d_1}{b_1}, & i = 1, \\ \dfrac{d_i - a_i d'_{i-1}}{b_i - a_i c'_{i-1}}, & i = 2, 3, \dots, n. \end{cases}$$

The solution is then obtained by back substitution:

$$x_n = d'_n,$$

$$x_i = d'_i - c'_i x_{i+1}, \qquad i = n-1, n-2, \dots, 1.$$
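
As a concrete illustration, consider the $3 \times 3$ system $2x_1 + x_2 = 3$, $x_1 + 2x_2 + x_3 = 4$, $x_2 + 2x_3 = 3$, whose exact solution is $x = (1, 1, 1)$. The forward sweep gives

$$c'_1 = \tfrac{1}{2}, \quad d'_1 = \tfrac{3}{2}, \qquad c'_2 = \frac{1}{2 - \tfrac{1}{2}} = \tfrac{2}{3}, \quad d'_2 = \frac{4 - \tfrac{3}{2}}{2 - \tfrac{1}{2}} = \tfrac{5}{3}, \qquad d'_3 = \frac{3 - \tfrac{5}{3}}{2 - \tfrac{2}{3}} = 1,$$

and back substitution recovers $x_3 = 1$, then $x_2 = \tfrac{5}{3} - \tfrac{2}{3} \cdot 1 = 1$ and $x_1 = \tfrac{3}{2} - \tfrac{1}{2} \cdot 1 = 1$.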

The method above does not modify the original coefficient vectors, but it must keep track of the new coefficients $c'_i$ and $d'_i$. If the coefficient vectors may be modified, then an algorithm with less bookkeeping is:

For $i = 2, 3, \dots, n$, do

$$w = \frac{a_i}{b_{i-1}},$$

$$b_i := b_i - w c_{i-1},$$

$$d_i := d_i - w d_{i-1},$$

followed by the back substitution

$$x_n = \frac{d_n}{b_n},$$

$$x_i = \frac{d_i - c_i x_{i+1}}{b_i}, \qquad i = n-1, n-2, \dots, 1.$$
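
Translated directly into C, the in-place sweep might look like the following minimal sketch; the name thomas_in_place and its signature are illustrative only (not from the original source), and the routine overwrites b and d:

/* Minimal sketch of the in-place variant: overwrites b[] and d[].
   The name and signature are illustrative. */
void thomas_in_place(const int n, const double a[], double b[],
                     const double c[], double d[], double x[])
{
    /* forward elimination: for rows 1 .. n-1 (0-based), eliminate the subdiagonal entry */
    for (int i = 1; i < n; i++) {
        const double w = a[i] / b[i - 1];
        b[i] -= w * c[i - 1];
        d[i] -= w * d[i - 1];
    }

    /* back substitution */
    x[n - 1] = d[n - 1] / b[n - 1];
    for (int i = n - 2; i >= 0; i--)
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i];
}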

An implementation as a C function, which uses scratch space to avoid modifying its inputs a, b, and c, allowing them to be reused:

void thomas(const int X, double x[restrict X],
            const double a[restrict X], const double b[restrict X],
            const double c[restrict X], double scratch[restrict X]) {
    /*
     solves Ax = d, where A is a tridiagonal matrix consisting of vectors a, b, c
     X = number of equations
     x[] = initially contains the input d, and returns x. indexed from [0, ..., X - 1]
     a[] = subdiagonal, indexed from [1, ..., X - 1]
     b[] = main diagonal, indexed from [0, ..., X - 1]
     c[] = superdiagonal, indexed from [0, ..., X - 2]
     scratch[] = scratch space of length X, provided by caller, allowing a, b, c to be const
     not performed in this example: manual expensive common subexpression elimination
     */
    scratch[0] = c[0] / b[0];
    x[0] = x[0] / b[0];

    /* loop from 1 to X - 1 inclusive */
    for (int ix = 1; ix < X; ix++) {
        if (ix < X - 1) {
            scratch[ix] = c[ix] / (b[ix] - a[ix] * scratch[ix - 1]);
        }
        x[ix] = (x[ix] - a[ix] * x[ix - 1]) / (b[ix] - a[ix] * scratch[ix - 1]);
    }

    /* loop from X - 2 to 0 inclusive */
    for (int ix = X - 2; ix >= 0; ix--)
        x[ix] -= scratch[ix] * x[ix + 1];
}
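
As a quick usage sketch, the routine can be applied to the $3 \times 3$ example worked above; the main function below is illustrative only and should print 1 1 1:

#include <stdio.h>

/* illustrative driver for the thomas() routine above */
int main(void)
{
    double a[] = {0.0, 1.0, 1.0};   /* subdiagonal, a[0] unused */
    double b[] = {2.0, 2.0, 2.0};   /* main diagonal */
    double c[] = {1.0, 1.0, 0.0};   /* superdiagonal, c[2] unused */
    double x[] = {3.0, 4.0, 3.0};   /* right-hand side, overwritten with the solution */
    double scratch[3];

    thomas(3, x, a, b, c, scratch);
    printf("%g %g %g\n", x[0], x[1], x[2]);
    return 0;
}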

Derivation

The derivation of the tridiagonal matrix algorithm is a special case of Gaussian elimination.

Suppose that the unknowns are $x_1, x_2, \dots, x_n$, and that the equations to be solved are:

$$b_1 x_1 + c_1 x_2 = d_1, \qquad i = 1,$$

$$a_i x_{i-1} + b_i x_i + c_i x_{i+1} = d_i, \qquad i = 2, \dots, n-1,$$

$$a_n x_{n-1} + b_n x_n = d_n, \qquad i = n.$$

Consider modifying the second ($i = 2$) equation with the first equation as follows:

$$\bigl(a_2 x_1 + b_2 x_2 + c_2 x_3\bigr) b_1 - \bigl(b_1 x_1 + c_1 x_2\bigr) a_2 = d_2 b_1 - d_1 a_2,$$

which would give:

$$\bigl(b_2 b_1 - c_1 a_2\bigr) x_2 + c_2 b_1 x_3 = d_2 b_1 - d_1 a_2.$$

Note that $x_1$ has been eliminated from the second equation. Using a similar tactic with the modified second equation on the third equation yields:

$$\bigl(b_3 (b_2 b_1 - c_1 a_2) - c_2 b_1 a_3\bigr) x_3 + c_3 \bigl(b_2 b_1 - c_1 a_2\bigr) x_4 = d_3 \bigl(b_2 b_1 - c_1 a_2\bigr) - \bigl(d_2 b_1 - d_1 a_2\bigr) a_3.$$

This time $x_2$ was eliminated. If this procedure is repeated until the $n$-th row, the (modified) $n$-th equation will involve only one unknown, $x_n$. This may be solved for and then used to solve the $(n-1)$-th equation, and so on until all of the unknowns are solved for.

Clearly, the coefficients on the modified equations get more and more complicated if stated explicitly. By examining the procedure, the modified coefficients (notated with tildes) may instead be defined recursively:

$$\tilde a_i = 0,$$

$$\tilde b_1 = b_1, \qquad \tilde b_i = b_i \tilde b_{i-1} - \tilde c_{i-1} a_i,$$

$$\tilde c_1 = c_1, \qquad \tilde c_i = c_i \tilde b_{i-1},$$

$$\tilde d_1 = d_1, \qquad \tilde d_i = d_i \tilde b_{i-1} - \tilde d_{i-1} a_i.$$

To further hasten the solution process, $\tilde b_i$ may be divided out (if there is no division-by-zero risk); the newer modified coefficients, each notated with a prime, will be:

$$a'_i = 0,$$

$$b'_i = 1,$$

$$c'_1 = \frac{c_1}{b_1}, \qquad c'_i = \frac{c_i}{b_i - a_i c'_{i-1}},$$

$$d'_1 = \frac{d_1}{b_1}, \qquad d'_i = \frac{d_i - a_i d'_{i-1}}{b_i - a_i c'_{i-1}}.$$

This gives the following system with the same unknowns and coefficients defined in terms of the original ones above:

$$x_i + c'_i x_{i+1} = d'_i, \qquad i = 1, 2, \dots, n-1,$$

$$x_n = d'_n.$$
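
Equivalently, in matrix form the reduced system is upper bidiagonal with a unit diagonal:

$$\begin{bmatrix} 1 & c'_1 & & \\ & 1 & \ddots & \\ & & \ddots & c'_{n-1} \\ & & & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} d'_1 \\ d'_2 \\ \vdots \\ d'_n \end{bmatrix}.$$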

The last equation involves only one unknown. Solving it in turn reduces the next-to-last equation to one unknown, so that this backward substitution can be used to find all of the unknowns:

$$x_n = d'_n,$$

$$x_i = d'_i - c'_i x_{i+1}, \qquad i = n-1, n-2, \dots, 1.$$

Variants

In some situations, particularly those involving periodic boundary conditions, a slightly perturbed form of the tridiagonal system may need to be solved:

$$a_1 x_n + b_1 x_1 + c_1 x_2 = d_1,$$

$$a_i x_{i-1} + b_i x_i + c_i x_{i+1} = d_i, \qquad i = 2, \dots, n-1,$$

$$a_n x_{n-1} + b_n x_n + c_n x_1 = d_n.$$

In this case, we can make use of the Sherman–Morrison formula to avoid the additional operations of Gaussian elimination and still use the Thomas algorithm. The method requires solving a modified non-cyclic version of the system for both the input and a sparse corrective vector, and then combining the solutions. This can be done efficiently if both solutions are computed at once, as the forward portion of the pure tridiagonal matrix algorithm can be shared.

If we indicate by:

$$A = \begin{bmatrix} b_1 & c_1 & & & a_1 \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & b_{n-1} & c_{n-1} \\ c_n & & & a_n & b_n \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix},$$

then the system to be solved is:

$$A x = d.$$

In this case the coefficients $a_1$ and $c_n$ are, generally speaking, non-zero, so their presence does not allow the Thomas algorithm to be applied directly. We can therefore consider $B \in \mathbb{R}^{n \times n}$ and $u, v \in \mathbb{R}^{n}$ as follows:

$$B = \begin{bmatrix} b_1 - \gamma & c_1 & & & 0 \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & b_{n-1} & c_{n-1} \\ 0 & & & a_n & b_n - \dfrac{c_n a_1}{\gamma} \end{bmatrix}, \qquad u = \begin{bmatrix} \gamma \\ 0 \\ \vdots \\ 0 \\ c_n \end{bmatrix}, \qquad v = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ \dfrac{a_1}{\gamma} \end{bmatrix},$$

where $\gamma \in \mathbb{R}$ is a parameter to be chosen. The matrix $A$ can be reconstructed as $A = B + u v^{\mathsf T}$. The solution is then obtained in the following way: [4] first we solve two tridiagonal systems of equations applying the Thomas algorithm:

$$B y = d, \qquad B q = u.$$

Then we reconstruct the solution $x$ using the Sherman–Morrison formula:

$$x = A^{-1} d = \left(B + u v^{\mathsf T}\right)^{-1} d = y - \frac{v^{\mathsf T} y}{1 + v^{\mathsf T} q}\, q.$$


An implementation as a C function, which uses scratch space to avoid modifying its inputs a, b, and c, allowing them to be reused:

void cyclic_thomas(const int X, double x[restrict X],
                   const double a[restrict X], const double b[restrict X],
                   const double c[restrict X], double cmod[restrict X],
                   double u[restrict X]) {
    /*
     solves Ax = v, where A is a cyclic tridiagonal matrix consisting of vectors a, b, c
     X = number of equations
     x[] = initially contains the input v, and returns x. indexed from [0, ..., X - 1]
     a[] = subdiagonal, regularly indexed from [1, ..., X - 1], a[0] is lower left corner
     b[] = main diagonal, indexed from [0, ..., X - 1]
     c[] = superdiagonal, regularly indexed from [0, ..., X - 2], c[X - 1] is upper right
     cmod[], u[] = scratch vectors each of length X
     */

    /* lower left and upper right corners of the cyclic tridiagonal system respectively */
    const double alpha = a[0];
    const double beta = c[X - 1];

    /* arbitrary, but chosen such that division by zero is avoided */
    const double gamma = -b[0];

    cmod[0] = c[0] / (b[0] - gamma);
    u[0] = gamma / (b[0] - gamma);
    x[0] /= (b[0] - gamma);

    /* loop from 1 to X - 2 inclusive */
    for (int ix = 1; ix + 1 < X; ix++) {
        const double m = 1.0 / (b[ix] - a[ix] * cmod[ix - 1]);
        cmod[ix] = c[ix] * m;
        u[ix] = (0.0 - a[ix] * u[ix - 1]) * m;
        x[ix] = (x[ix] - a[ix] * x[ix - 1]) * m;
    }

    /* handle X - 1 */
    const double m = 1.0 / (b[X - 1] - alpha * beta / gamma - a[X - 1] * cmod[X - 2]);
    u[X - 1] = (alpha - a[X - 1] * u[X - 2]) * m;
    x[X - 1] = (x[X - 1] - a[X - 1] * x[X - 2]) * m;

    /* loop from X - 2 to 0 inclusive */
    for (int ix = X - 2; ix >= 0; ix--) {
        u[ix] -= cmod[ix] * u[ix + 1];
        x[ix] -= cmod[ix] * x[ix + 1];
    }

    const double fact = (x[0] + x[X - 1] * beta / gamma) / (1.0 + u[0] + u[X - 1] * beta / gamma);

    /* loop from 0 to X - 1 inclusive */
    for (int ix = 0; ix < X; ix++)
        x[ix] -= fact * u[ix];
}
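
As a usage sketch, the routine above can be applied to a small symmetric cyclic system whose exact solution is $(1, 1, 1, 1)$; the main function below is illustrative only:

#include <stdio.h>

/* illustrative driver for the cyclic_thomas() routine above */
int main(void)
{
    double a[] = {1.0, 1.0, 1.0, 1.0};   /* subdiagonal; a[0] holds the lower left corner */
    double b[] = {4.0, 4.0, 4.0, 4.0};   /* main diagonal */
    double c[] = {1.0, 1.0, 1.0, 1.0};   /* superdiagonal; c[3] holds the upper right corner */
    double x[] = {6.0, 6.0, 6.0, 6.0};   /* right-hand side, overwritten with the solution */
    double cmod[4], u[4];

    cyclic_thomas(4, x, a, b, c, cmod, u);
    printf("%g %g %g %g\n", x[0], x[1], x[2], x[3]);   /* expect 1 1 1 1 */
    return 0;
}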

There is also another way to solve the slightly perturbed form of the tridiagonal system considered above. [5] Let us consider two auxiliary linear systems of dimension $(n-1)$:

$$\begin{aligned} b_2 u_2 + c_2 u_3 &= d_2, \\ a_i u_{i-1} + b_i u_i + c_i u_{i+1} &= d_i, \qquad i = 3, \dots, n-1, \\ a_n u_{n-1} + b_n u_n &= d_n, \end{aligned}$$

and

$$\begin{aligned} b_2 v_2 + c_2 v_3 &= -a_2, \\ a_i v_{i-1} + b_i v_i + c_i v_{i+1} &= 0, \qquad i = 3, \dots, n-1, \\ a_n v_{n-1} + b_n v_n &= -c_n. \end{aligned}$$

For convenience, we additionally define $u_1 = 0$ and $v_1 = 1$. We can now find the solutions $\{u_2, \dots, u_n\}$ and $\{v_2, \dots, v_n\}$ by applying the Thomas algorithm to the two auxiliary tridiagonal systems.

The solution $\{x_1, \dots, x_n\}$ can then be represented in the form:

$$x_i = u_i + x_1 v_i, \qquad i = 1, 2, \dots, n.$$

Indeed, multiplying each equation of the second auxiliary system by $x_1$, adding it to the corresponding equation of the first auxiliary system, and using the representation $x_i = u_i + x_1 v_i$, we immediately see that equations number $2$ through $n$ of the original system are satisfied; it only remains to satisfy equation number $1$. To do so, consider the formula for $i = 2$ and $i = n$ and substitute $x_2 = u_2 + x_1 v_2$ and $x_n = u_n + x_1 v_n$ into the first equation of the original system. This yields one scalar equation for $x_1$:

$$b_1 x_1 + c_1 (u_2 + x_1 v_2) + a_1 (u_n + x_1 v_n) = d_1.$$

As such, we find:

$$x_1 = \frac{d_1 - a_1 u_n - c_1 u_2}{b_1 + a_1 v_n + c_1 v_2}.$$

An implementation as a C function, which uses scratch space to avoid modifying its inputs a, b, and c, allowing them to be reused:

void cyclic_thomas(const int X, double x[restrict X],
                   const double a[restrict X], const double b[restrict X],
                   const double c[restrict X], double cmod[restrict X],
                   double v[restrict X]) {
    /* first solve a system of length X - 1 for two right hand sides, ignoring ix == 0 */
    cmod[1] = c[1] / b[1];
    v[1] = -a[1] / b[1];
    x[1] = x[1] / b[1];

    /* loop from 2 to X - 2 inclusive */
    for (int ix = 2; ix < X - 1; ix++) {
        const double m = 1.0 / (b[ix] - a[ix] * cmod[ix - 1]);
        cmod[ix] = c[ix] * m;
        v[ix] = (0.0 - a[ix] * v[ix - 1]) * m;
        x[ix] = (x[ix] - a[ix] * x[ix - 1]) * m;
    }

    /* handle X - 1 */
    const double m = 1.0 / (b[X - 1] - a[X - 1] * cmod[X - 2]);
    cmod[X - 1] = c[X - 1] * m;
    v[X - 1] = (-c[X - 1] - a[X - 1] * v[X - 2]) * m;
    x[X - 1] = (x[X - 1] - a[X - 1] * x[X - 2]) * m;

    /* loop from X - 2 to 1 inclusive */
    for (int ix = X - 2; ix >= 1; ix--) {
        v[ix] -= cmod[ix] * v[ix + 1];
        x[ix] -= cmod[ix] * x[ix + 1];
    }

    x[0] = (x[0] - a[0] * x[X - 1] - c[0] * x[1]) / (b[0] + a[0] * v[X - 1] + c[0] * v[1]);

    /* loop from 1 to X - 1 inclusive */
    for (int ix = 1; ix < X; ix++)
        x[ix] += x[0] * v[ix];
}

In both cases the auxiliary systems to be solved are genuinely tridiagonal, so the overall computational complexity of solving the system remains linear in the dimension $n$ of the system, that is, $O(n)$ arithmetic operations.

In other situations, the system of equations may be block tridiagonal (see block matrix), with smaller submatrices arranged as the individual elements in the above matrix system (e.g., the 2D Poisson problem). Simplified forms of Gaussian elimination have been developed for these situations. [6]

The textbook Numerical Mathematics by Quarteroni, Sacco and Saleri lists a modified version of the algorithm which avoids some of the divisions (using multiplications instead), which is beneficial on some computer architectures.
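
As a rough illustration of the idea, and only a sketch (the precise variant in the textbook may differ), the two divisions per row of the forward sweep can be replaced by computing the reciprocal of each pivot once and multiplying; the routine thomas_mul below is hypothetical and mirrors the interface of thomas() above:

/* Sketch of a division-reducing forward sweep: each pivot reciprocal is
   computed once and reused as a multiplier. Hypothetical helper, not taken
   from the textbook. */
void thomas_mul(const int X, double x[restrict X], const double a[restrict X],
                const double b[restrict X], const double c[restrict X],
                double scratch[restrict X])
{
    double m = 1.0 / b[0];                              /* reciprocal of the first pivot */
    scratch[0] = c[0] * m;
    x[0] = x[0] * m;

    for (int ix = 1; ix < X; ix++) {
        m = 1.0 / (b[ix] - a[ix] * scratch[ix - 1]);    /* one division per row */
        if (ix < X - 1)
            scratch[ix] = c[ix] * m;
        x[ix] = (x[ix] - a[ix] * x[ix - 1]) * m;
    }

    for (int ix = X - 2; ix >= 0; ix--)
        x[ix] -= scratch[ix] * x[ix + 1];
}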

Parallel tridiagonal solvers have been published for many vector and parallel architectures, including GPUs. [7] [8]

For an extensive treatment of parallel tridiagonal and block tridiagonal solvers, see [9].


References

  1. Pradip Niyogi (2006). Introduction to Computational Fluid Dynamics. Pearson Education India. p. 76. ISBN 978-81-7758-764-7.
  2. Biswa Nath Datta (2010). Numerical Linear Algebra and Applications, Second Edition. SIAM. p. 162. ISBN 978-0-89871-765-5.
  3. Nicholas J. Higham (2002). Accuracy and Stability of Numerical Algorithms: Second Edition. SIAM. p. 175. ISBN 978-0-89871-802-7.
  4. Batista, Milan; Ibrahim Karawia, Abdel Rahman A. (2009). "The use of the Sherman–Morrison–Woodbury formula to solve cyclic block tri-diagonal and cyclic block penta-diagonal linear systems of equations". Applied Mathematics and Computation. 210 (2): 558–563. doi:10.1016/j.amc.2009.01.003. ISSN 0096-3003.
  5. Ryaben’kii, Victor S.; Tsynkov, Semyon V. (2006). "Introduction". A Theoretical Introduction to Numerical Analysis. Chapman and Hall/CRC. pp. 1–19. doi:10.1201/9781420011166-1. ISBN 978-0-429-14339-7.
  6. Quarteroni, Alfio; Sacco, Riccardo; Saleri, Fausto (2007). "Section 3.8". Numerical Mathematics. Springer, New York. ISBN 978-3-540-34658-6.
  7. Chang, L.-W.; Hwu, W.-M. (2014). "A guide for implementing tridiagonal solvers on GPUs". In V. Kidratenko (ed.). Numerical Computations with GPUs. Springer. ISBN 978-3-319-06548-9.
  8. Venetis, I.E.; Kouris, A.; Sobczyk, A.; Gallopoulos, E.; Sameh, A. (2015). "A direct tridiagonal solver based on Givens rotations for GPU architectures". Parallel Computing. 49: 101–116. doi:10.1016/j.parco.2015.03.008.
  9. Gallopoulos, E.; Philippe, B.; Sameh, A.H. (2016). "Chapter 5". Parallelism in Matrix Computations. Springer. ISBN 978-94-017-7188-7.