Biconjugate gradient stabilized method

Last updated

In numerical linear algebra, the biconjugate gradient stabilized method, often abbreviated as BiCGSTAB, is an iterative method developed by H. A. van der Vorst for the numerical solution of nonsymmetric linear systems. It is a variant of the biconjugate gradient method (BiCG) and has faster and smoother convergence than the original BiCG as well as other variants such as the conjugate gradient squared method (CGS). It is a Krylov subspace method. Unlike the original BiCG method, it doesn't require multiplication by the transpose of the system matrix.

Contents

Algorithmic steps

Unpreconditioned BiCGSTAB

In the following sections, (x,y) = xTy denotes the dot product of vectors. To solve a linear system Ax = b, BiCGSTAB starts with an initial guess x0 and proceeds as follows:

  1. r0 = bAx0
  2. Choose an arbitrary vector 0 such that (0, r0) ≠ 0, e.g., 0 = r0
  3. ρ0 = (0, r0)
  4. p0 = r0
  5. For i = 1, 2, 3, …
    1. v = Api−1
    2. α = ρi−1/(0, v)
    3. h = xi−1 + αpi−1
    4. s = ri−1αv
    5. If h is accurate enough, i.e., if s is small enough, then set xi = h and quit
    6. t = As
    7. ω = (t, s)/(t, t)
    8. xi = h + ωs
    9. ri = sωt
    10. If xi is accurate enough, i.e., if ri is small enough, then quit
    11. ρi = (0, ri)
    12. β = (ρi/ρi−1)(α/ω)
    13. pi = ri + β(pi−1ωv)

In some cases, choosing the vector 0 randomly improves numerical stability. [1]

Preconditioned BiCGSTAB

Preconditioners are usually used to accelerate convergence of iterative methods. To solve a linear system Ax = b with a preconditioner K = K1K2A, preconditioned BiCGSTAB starts with an initial guess x0 and proceeds as follows:

  1. r0 = bAx0
  2. Choose an arbitrary vector 0 such that (0, r0) ≠ 0, e.g., 0 = r0
  3. ρ0 = (0, r0)
  4. p0 = r0
  5. For i = 1, 2, 3, …
    1. y = K −1
      2
       
      K −1
      1
       
      pi−1
    2. v = Ay
    3. α = ρi−1/(0, v)
    4. h = xi−1 + αy
    5. s = ri−1αv
    6. If h is accurate enough then xi = h and quit
    7. z = K −1
      2
       
      K −1
      1
       
      s
    8. t = Az
    9. ω = (K −1
      1
       
      t, K −1
      1
       
      s)/(K −1
      1
       
      t, K −1
      1
       
      t)
    10. xi = h + ωz
    11. ri = sωt
    12. If xi is accurate enough then quit
    13. ρi = (0, ri)
    14. β = (ρi/ρi−1)(α/ω)
    15. pi = ri + β(pi−1ωv)

This formulation is equivalent to applying unpreconditioned BiCGSTAB to the explicitly preconditioned system

Ãx̃ =

with à = K −1
1
 
AK −1
2
 
, = K2x and = K −1
1
 
b
. In other words, both left- and right-preconditioning are possible with this formulation.

Derivation

BiCG in polynomial form

In BiCG, the search directions pi and i and the residuals ri and i are updated using the following recurrence relations:

pi = ri−1 + βipi−1,
i = i−1 + βii−1,
ri = ri−1αiApi,
i = i−1αiATi.

The constants αi and βi are chosen to be

αi = ρi/(i, Api),
βi = ρi/ρi−1

where ρi = (i−1, ri−1) so that the residuals and the search directions satisfy biorthogonality and biconjugacy, respectively, i.e., for ij,

(i, rj) = 0,
(i, Apj) = 0.

It is straightforward to show that

ri = Pi(A)r0,
i = Pi(AT)0,
pi+1 = Ti(A)r0,
i+1 = Ti(AT)0

where Pi(A) and Ti(A) are ith-degree polynomials in A. These polynomials satisfy the following recurrence relations:

Pi(A) = Pi−1(A) − αiATi−1(A),
Ti(A) = Pi(A) + βi+1Ti−1(A).

Derivation of BiCGSTAB from BiCG

It is unnecessary to explicitly keep track of the residuals and search directions of BiCG. In other words, the BiCG iterations can be performed implicitly. In BiCGSTAB, one wishes to have recurrence relations for

i = Qi(A)Pi(A)r0

where Qi(A) = (Iω1A)(Iω2A)⋯(IωiA) with suitable constants ωj instead of ri = Pi(A)r0 in the hope that Qi(A) will enable faster and smoother convergence in i than ri.

It follows from the recurrence relations for Pi(A) and Ti(A) and the definition of Qi(A) that

Qi(A)Pi(A)r0 = (IωiA)(Qi−1(A)Pi−1(A)r0αiAQi−1(A)Ti−1(A)r0),

which entails the necessity of a recurrence relation for Qi(A)Ti(A)r0. This can also be derived from the BiCG relations:

Qi(A)Ti(A)r0 = Qi(A)Pi(A)r0 + βi+1(IωiA)Qi−1(A)Pi−1(A)r0.

Similarly to defining i, BiCGSTAB defines

i+1 = Qi(A)Ti(A)r0.

Written in vector form, the recurrence relations for i and i are

i = i−1 + βi(Iωi−1A)i−1,
i = (IωiA)(i−1αiAi).

To derive a recurrence relation for xi, define

si = i−1αiAi.

The recurrence relation for i can then be written as

i = i−1αiAiωiAsi,

which corresponds to

xi = xi−1 + αii + ωisi.

Determination of BiCGSTAB constants

Now it remains to determine the BiCG constants αi and βi and choose a suitable ωi.

In BiCG, βi = ρi/ρi−1 with

ρi = (i−1, ri−1) = (Pi−1(AT)0, Pi−1(A)r0).

Since BiCGSTAB does not explicitly keep track of i or ri, ρi is not immediately computable from this formula. However, it can be related to the scalar

ρ̃i = (Qi−1(AT)0, Pi−1(A)r0) = (0, Qi−1(A)Pi−1(A)r0) = (0, ri−1).

Due to biorthogonality, ri−1 = Pi−1(A)r0 is orthogonal to Ui−2(AT)0 where Ui−2(AT) is any polynomial of degree i − 2 in AT. Hence, only the highest-order terms of Pi−1(AT) and Qi−1(AT) matter in the dot products (Pi−1(AT)0, Pi−1(A)r0) and (Qi−1(AT)0, Pi−1(A)r0). The leading coefficients of Pi−1(AT) and Qi−1(AT) are (−1)i−1α1α2αi−1 and (−1)i−1ω1ω2ωi−1, respectively. It follows that

ρi = (α1/ω1)(α2/ω2)⋯(αi−1/ωi−1)ρ̃i,

and thus

βi = ρi/ρi−1 = (ρ̃i/ρ̃i−1)(αi−1/ωi−1).

A simple formula for αi can be similarly derived. In BiCG,

αi = ρi/(i, Api) = (Pi−1(AT)0, Pi−1(A)r0)/(Ti−1(AT)0, ATi−1(A)r0).

Similarly to the case above, only the highest-order terms of Pi−1(AT) and Ti−1(AT) matter in the dot products thanks to biorthogonality and biconjugacy. It happens that Pi−1(AT) and Ti−1(AT) have the same leading coefficient. Thus, they can be replaced simultaneously with Qi−1(AT) in the formula, which leads to

αi = (Qi−1(AT)0, Pi−1(A)r0)/(Qi−1(AT)0, ATi−1(A)r0) = ρ̃i/(0, AQi−1(A)Ti−1(A)r0) = ρ̃i/(0, Ap̃i).

Finally, BiCGSTAB selects ωi to minimize i = (IωiA)si in 2-norm as a function of ωi. This is achieved when

((IωiA)si, Asi) = 0,

giving the optimal value

ωi = (Asi, si)/(Asi, Asi).

Generalization

BiCGSTAB can be viewed as a combination of BiCG and GMRES where each BiCG step is followed by a GMRES(1) (i.e., GMRES restarted at each step) step to repair the irregular convergence behavior of CGS, as an improvement of which BiCGSTAB was developed. However, due to the use of degree-one minimum residual polynomials, such repair may not be effective if the matrix A has large complex eigenpairs. In such cases, BiCGSTAB is likely to stagnate, as confirmed by numerical experiments.

One may expect that higher-degree minimum residual polynomials may better handle this situation. This gives rise to algorithms including BiCGSTAB2 and the more general BiCGSTAB(l) . In BiCGSTAB(l), a GMRES(l) step follows every l BiCG steps. BiCGSTAB2 is equivalent to BiCGSTAB(l) with l = 2.

See also

Related Research Articles

<span class="mw-page-title-main">Chinese remainder theorem</span> Theorem for solving simultaneous congruences

In mathematics, the Chinese remainder theorem states that if one knows the remainders of the Euclidean division of an integer n by several integers, then one can determine uniquely the remainder of the division of n by the product of these integers, under the condition that the divisors are pairwise coprime.

In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the n-th approximation is derived from the previous ones.

In arithmetic and computer programming, the extended Euclidean algorithm is an extension to the Euclidean algorithm, and computes, in addition to the greatest common divisor (gcd) of integers a and b, also the coefficients of Bézout's identity, which are integers x and y such that

In arithmetic, long division is a standard division algorithm suitable for dividing multi-digit Hindu-Arabic numerals that is simple enough to perform by hand. It breaks down a division problem into a series of easier steps.

<span class="mw-page-title-main">Spline (mathematics)</span> Mathematical function defined piecewise by polynomials

In mathematics, a spline is a function defined piecewise by polynomials. In interpolating problems, spline interpolation is often preferred to polynomial interpolation because it yields similar results, even when using low degree polynomials, while avoiding Runge's phenomenon for higher degrees.

<span class="mw-page-title-main">Conjugate gradient method</span> Mathematical optimization algorithm

In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-definite. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems.

<span class="mw-page-title-main">Interior-point method</span> Algorithms for solving convex optimization problems

Interior-point methods are algorithms for solving linear and non-linear convex optimization problems. IPMs combine two advantages of previously-known algorithms:

In linear algebra, the order-rKrylov subspace generated by an n-by-n matrix A and a vector b of dimension n is the linear subspace spanned by the images of b under the first r powers of A, that is,

In mathematics, more specifically in numerical linear algebra, the biconjugate gradient method is an algorithm to solve systems of linear equations

In mathematics, the generalized minimal residual method (GMRES) is an iterative method for the numerical solution of an indefinite nonsymmetric system of linear equations. The method approximates the solution by the vector in a Krylov subspace with minimal residual. The Arnoldi iteration is used to find this vector.

Limited-memory BFGS is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. The algorithm's target problem is to minimize over unconstrained values of the real-vector where is a differentiable scalar function.

IML++, or the Iterative Methods Library, is a C++ library for solving linear systems of equations. It is said to be "templated" in the sense that the same source code works for dense, sparse, and distributed matrices.

In algebra, the greatest common divisor of two polynomials is a polynomial, of the highest possible degree, that is a factor of both the two original polynomials. This concept is analogous to the greatest common divisor of two integers.

Hendrik "Henk" Albertus van der Vorst is a Dutch mathematician and Emeritus Professor of Numerical Analysis at Utrecht University. According to the Institute for Scientific Information (ISI), his paper on the BiCGSTAB method was the most cited paper in the field of mathematics in the 1990s. He is a member of the Royal Netherlands Academy of Arts and Sciences (KNAW) since 2002 and the Netherlands Academy of Technology and Innovation. In 2006 he was awarded a knighthood of the Order of the Netherlands Lion. Henk van der Vorst is a Fellow of Society for Industrial and Applied Mathematics (SIAM).

In numerical linear algebra, the Chebyshev iteration is an iterative method for determining the solutions of a system of linear equations. The method is named after Russian mathematician Pafnuty Chebyshev.

Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is a matrix-free method for finding the largest eigenvalues and the corresponding eigenvectors of a symmetric generalized eigenvalue problem

In numerical linear algebra, the conjugate gradient method is an iterative method for numerically solving the linear system

<span class="mw-page-title-main">Gradient discretisation method</span>

In numerical mathematics, the gradient discretisation method (GDM) is a framework which contains classical and recent numerical schemes for diffusion problems of various kinds: linear or non-linear, steady-state or time-dependent. The schemes may be conforming or non-conforming, and may rely on very general polygonal or polyhedral meshes.

<span class="mw-page-title-main">Minimal residual method</span> Computational method

The Minimal Residual Method or MINRES is a Krylov subspace method for the iterative solution of symmetric linear equation systems. It was proposed by mathematicians Christopher Conway Paige and Michael Alan Saunders in 1975.

In numerical linear algebra, the conjugate gradient squared method (CGS) is an iterative algorithm for solving systems of linear equations of the form , particularly in cases where computing the transpose is impractical. The CGS method was developed as an improvement to the biconjugate gradient method.

References

  1. Schoutrop, Chris; Boonkkamp, Jan ten Thije; Dijk, Jan van (July 2022). "Reliability Investigation of BiCGStab and IDR Solvers for the Advection-Diffusion-Reaction Equation". Communications in Computational Physics. 32 (1): 156–188. doi:10.4208/cicp.oa-2021-0182. ISSN   1815-2406.