In numerical linear algebra, the biconjugate gradient stabilized method, often abbreviated as BiCGSTAB, is an iterative method developed by H. A. van der Vorst for the numerical solution of nonsymmetric linear systems. It is a variant of the biconjugate gradient method (BiCG) and has faster and smoother convergence than the original BiCG as well as other variants such as the conjugate gradient squared method (CGS). It is a Krylov subspace method. Unlike the original BiCG method, it does not require multiplication by the transpose of the system matrix.
In the following sections, (x,y) = xTy denotes the dot product of vectors. To solve a linear system Ax = b, BiCGSTAB starts with an initial guess x0 and proceeds as follows:
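The steps of this iteration are standard; the following is a minimal dense NumPy sketch for illustration. The function name, the tolerance handling, and the common choice r̂0 = r0 are assumptions of the sketch, not part of the article.

```python
import numpy as np

def bicgstab(A, b, x0=None, tol=1e-8, max_iter=1000):
    """Minimal unpreconditioned BiCGSTAB sketch (dense, for illustration only)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    r = b - A @ x                 # initial residual r0
    r_hat = r.copy()              # shadow residual r̂0; must satisfy (r̂0, r0) ≠ 0
    rho_old = alpha = omega = 1.0
    p = np.zeros_like(r)
    v = np.zeros_like(r)
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        rho = r_hat @ r                       # ρ_i = (r̂0, r_{i-1})
        beta = (rho / rho_old) * (alpha / omega)
        p = r + beta * (p - omega * v)        # search direction update
        v = A @ p
        alpha = rho / (r_hat @ v)
        s = r - alpha * v                     # intermediate ("half-step") residual
        if np.linalg.norm(s) < tol * b_norm:  # converged at the half step
            x = x + alpha * p
            break
        t = A @ s
        omega = (t @ s) / (t @ t)             # minimizes ||(I - ωA)s|| in the 2-norm
        x = x + alpha * p + omega * s
        r = s - omega * t
        if np.linalg.norm(r) < tol * b_norm:
            break
        rho_old = rho
    return x
```

For realistic problems one would work with sparse matrices and a library routine such as scipy.sparse.linalg.bicgstab rather than a dense sketch like this.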
In some cases, choosing the vector r̂0 randomly improves numerical stability.[1]
Preconditioners are usually used to accelerate convergence of iterative methods. To solve a linear system Ax = b with a preconditioner K = K1K2 ≈ A, preconditioned BiCGSTAB starts with an initial guess x0 and proceeds as follows:
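Relative to the unpreconditioned sketch above, the preconditioned iteration inserts two preconditioning solves per step. Below is a hedged sketch in the same style, where solve_K is an assumed user-supplied routine applying K⁻¹ = K2⁻¹K1⁻¹ to a vector; the names are illustrative.

```python
import numpy as np

def bicgstab_prec(A, b, solve_K, x0=None, tol=1e-8, max_iter=1000):
    """Sketch of preconditioned BiCGSTAB; solve_K(v) applies K^{-1} = K2^{-1} K1^{-1} to v."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    r = b - A @ x
    r_hat = r.copy()
    rho_old = alpha = omega = 1.0
    p = np.zeros_like(r)
    v = np.zeros_like(r)
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        rho = r_hat @ r
        beta = (rho / rho_old) * (alpha / omega)
        p = r + beta * (p - omega * v)
        y = solve_K(p)                        # preconditioning solve for the search direction
        v = A @ y
        alpha = rho / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) < tol * b_norm:
            x = x + alpha * y
            break
        z = solve_K(s)                        # preconditioning solve for the half-step residual
        t = A @ z
        omega = (t @ s) / (t @ t)             # some presentations use K1^{-1}-scaled t and s here
        x = x + alpha * y + omega * z         # note: the updates use the preconditioned vectors y, z
        r = s - omega * t
        if np.linalg.norm(r) < tol * b_norm:
            break
        rho_old = rho
    return x
```

With solve_K(v) = v (i.e., K = I) this reduces to the unpreconditioned sketch above.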
This formulation is equivalent to applying unpreconditioned BiCGSTAB to the explicitly preconditioned system
Ãx̃ = b̃ with Ã = K1⁻¹AK2⁻¹, x̃ = K2x and b̃ = K1⁻¹b. In other words, both left- and right-preconditioning are possible with this formulation.
In BiCG, the search directions pi and p̂i and the residuals ri and r̂i are updated using the recurrence relations collected below. The constants αi and βi, with ρi = (r̂i−1, ri−1), are chosen so that the residuals and the search directions satisfy biorthogonality and biconjugacy, respectively, i.e., (r̂i, rj) = 0 and (p̂i, Apj) = 0 for i ≠ j.
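Written out, following the standard BiCG derivation, the recurrences and coefficient choices are:

```latex
\begin{align*}
  p_i &= r_{i-1} + \beta_i p_{i-1}, & \hat{p}_i &= \hat{r}_{i-1} + \beta_i \hat{p}_{i-1}, \\
  r_i &= r_{i-1} - \alpha_i A p_i,  & \hat{r}_i &= \hat{r}_{i-1} - \alpha_i A^T \hat{p}_i, \\
  \alpha_i &= \frac{\rho_i}{(\hat{p}_i, A p_i)}, & \beta_i &= \frac{\rho_i}{\rho_{i-1}}.
\end{align*}
```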
It is straightforward to show that ri = Pi(A)r0, r̂i = Pi(AT)r̂0, pi = Ti−1(A)r0 and p̂i = Ti−1(AT)r̂0, where Pi(A) and Ti(A) are ith-degree polynomials in A. These polynomials satisfy the recurrence relations given below.
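Substituting the polynomial representations into the BiCG recurrences gives:

```latex
\begin{align*}
  P_i(A) &= P_{i-1}(A) - \alpha_i A\, T_{i-1}(A), \\
  T_i(A) &= P_i(A) + \beta_{i+1} T_{i-1}(A).
\end{align*}
```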
It is unnecessary to explicitly keep track of the residuals and search directions of BiCG. In other words, the BiCG iterations can be performed implicitly. Instead of ri = Pi(A)r0, BiCGSTAB works with recurrence relations for r̃i = Qi(A)Pi(A)r0, where Qi(A) = (I − ω1A)(I − ω2A)⋯(I − ωiA) with suitable constants ωj, in the hope that Qi(A) will enable faster and smoother convergence of r̃i than of ri.
It follows from the recurrence relations for Pi(A) and Ti(A) and the definition of Qi(A) that r̃i = Qi(A)Pi(A)r0 can be updated as shown below, which entails the need for a recurrence relation for Qi(A)Ti(A)r0 as well; the latter can likewise be derived from the BiCG relations.
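Both relations, written out (a reconstruction from the recurrences above):

```latex
\begin{align*}
  Q_i(A)P_i(A)r_0 &= (I - \omega_i A)\bigl(Q_{i-1}(A)P_{i-1}(A)r_0 - \alpha_i A\, Q_{i-1}(A)T_{i-1}(A)r_0\bigr), \\
  Q_i(A)T_i(A)r_0 &= Q_i(A)P_i(A)r_0 + \beta_{i+1}(I - \omega_i A)\, Q_{i-1}(A)T_{i-1}(A)r_0.
\end{align*}
```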
Similarly to defining r̃i, BiCGSTAB defines p̃i = Qi−1(A)Ti−1(A)r0. Written in vector form, the recurrence relations for p̃i and r̃i are as follows.
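These follow by combining the polynomial recurrences with the definition of Qi(A):

```latex
\begin{align*}
  \tilde{p}_i &= \tilde{r}_{i-1} + \beta_i (I - \omega_{i-1} A)\, \tilde{p}_{i-1}, \\
  \tilde{r}_i &= (I - \omega_i A)(\tilde{r}_{i-1} - \alpha_i A \tilde{p}_i).
\end{align*}
```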
To derive a recurrence relation for xi, define si = r̃i−1 − αiAp̃i. The recurrence relation for r̃i can then be rewritten in terms of si, and it corresponds to an update of the approximate solution xi, as shown below.
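Since r̃i plays the role of the residual b − Axi, the rewritten recurrence and the matching solution update read:

```latex
\begin{align*}
  \tilde{r}_i &= (I - \omega_i A)\, s_i = \tilde{r}_{i-1} - \alpha_i A \tilde{p}_i - \omega_i A s_i, \\
  x_i &= x_{i-1} + \alpha_i \tilde{p}_i + \omega_i s_i.
\end{align*}
```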
Now it remains to determine the BiCG constants αi and βi and choose a suitable ωi.
In BiCG, βi = ρi/ρi−1 with ρi = (r̂i−1, ri−1) = (Pi−1(AT)r̂0, Pi−1(A)r0).
Since BiCGSTAB does not explicitly keep track of r̂i or ri, ρi is not immediately computable from this formula. However, it can be related to the scalar ρ̃i = (Qi−1(AT)r̂0, Pi−1(A)r0) = (r̂0, Qi−1(A)Pi−1(A)r0) = (r̂0, r̃i−1).
Due to biorthogonality, ri−1 = Pi−1(A)r0 is orthogonal to Ui−2(AT)r̂0, where Ui−2(AT) is any polynomial of degree i − 2 in AT. Hence, only the highest-order terms of Pi−1(AT) and Qi−1(AT) matter in the dot products (Pi−1(AT)r̂0, Pi−1(A)r0) and (Qi−1(AT)r̂0, Pi−1(A)r0). The leading coefficients of Pi−1(AT) and Qi−1(AT) are (−1)^(i−1)α1α2⋯αi−1 and (−1)^(i−1)ω1ω2⋯ωi−1, respectively. It follows that ρi, and hence βi, can be expressed in terms of ρ̃i as shown below.
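Comparing leading coefficients gives:

```latex
\begin{align*}
  \rho_i &= \left(\frac{\alpha_1}{\omega_1}\right)\left(\frac{\alpha_2}{\omega_2}\right)\cdots\left(\frac{\alpha_{i-1}}{\omega_{i-1}}\right)\tilde{\rho}_i, \\
  \beta_i &= \frac{\rho_i}{\rho_{i-1}} = \frac{\tilde{\rho}_i}{\tilde{\rho}_{i-1}} \cdot \frac{\alpha_{i-1}}{\omega_{i-1}}.
\end{align*}
```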
A simple formula for αi can be similarly derived. In BiCG, αi = ρi/(p̂i, Api) = (Pi−1(AT)r̂0, Pi−1(A)r0)/(Ti−1(AT)r̂0, ATi−1(A)r0). Similarly to the case above, only the highest-order terms of Pi−1(AT) and Ti−1(AT) matter in these dot products thanks to biorthogonality and biconjugacy. It happens that Pi−1(AT) and Ti−1(AT) have the same leading coefficient, so they can both be replaced by Qi−1(AT) in the formula, which leads to the expression shown below.
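Moving Qi−1 to the other side of each dot product then gives:

```latex
\[
\alpha_i
  = \frac{(Q_{i-1}(A^T)\hat{r}_0,\, P_{i-1}(A)r_0)}{(Q_{i-1}(A^T)\hat{r}_0,\, A\, T_{i-1}(A)r_0)}
  = \frac{(\hat{r}_0,\, Q_{i-1}(A)P_{i-1}(A)r_0)}{(\hat{r}_0,\, A\, Q_{i-1}(A)T_{i-1}(A)r_0)}
  = \frac{\tilde{\rho}_i}{(\hat{r}_0,\, A\tilde{p}_i)}.
\]
```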
Finally, BiCGSTAB selects ωi to minimize r̃i = (I − ωiA)si in the 2-norm as a function of ωi. This is achieved when (I − ωiA)si is orthogonal to Asi, which gives the optimal value shown below.
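That is, setting the derivative of ‖(I − ωA)si‖² with respect to ω to zero:

```latex
\[
\bigl((I - \omega_i A)s_i,\; A s_i\bigr) = 0
\quad\Longrightarrow\quad
\omega_i = \frac{(A s_i,\, s_i)}{(A s_i,\, A s_i)}.
\]
```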
BiCGSTAB can be viewed as a combination of BiCG and GMRES in which each BiCG step is followed by a GMRES(1) step (i.e., GMRES restarted at each step) to repair the irregular convergence behavior of CGS, the method that BiCGSTAB was developed to improve upon. However, because it uses degree-one minimum residual polynomials, this repair may not be effective if the matrix A has complex eigenvalues with large imaginary parts. In such cases, BiCGSTAB is likely to stagnate, as confirmed by numerical experiments.
One may expect that higher-degree minimum residual polynomials can better handle this situation, which gives rise to algorithms such as BiCGSTAB2 and the more general BiCGSTAB(l). In BiCGSTAB(l), a GMRES(l) step follows every l BiCG steps. BiCGSTAB2 is equivalent to BiCGSTAB(l) with l = 2.