Minimal residual method

[Figure: A comparison of the norm of error and residual in the CG method (blue) and the MINRES method (green). The matrix used comes from a 2D boundary-value problem.]

The Minimal Residual Method (MINRES) is a Krylov subspace method for the iterative solution of symmetric systems of linear equations. It was proposed by mathematicians Christopher Conway Paige and Michael Alan Saunders in 1975.[1]


In contrast to the popular CG method, the MINRES method does not assume that the matrix is positive definite; only symmetry of the matrix is required.

GMRES vs. MINRES

The GMRES method is essentially a generalization of MINRES to arbitrary matrices. Both minimize the 2-norm of the residual and perform the same calculations in exact arithmetic when the matrix is symmetric. MINRES is a short-recurrence method with a constant memory requirement, whereas GMRES must store a basis of the whole Krylov subspace, so its memory requirement grows roughly in proportion to the number of iterations. On the other hand, GMRES tends to suffer less from loss of orthogonality.[1][2]
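This equivalence is easy to observe numerically. The following sketch is illustrative only (it assumes MATLAB's built-in minres and gmres; core GNU Octave ships gmres but, depending on the version, possibly not minres, in which case the implementation given later in this article can stand in; the test matrix is an arbitrary choice). In exact arithmetic the two residual histories coincide on a symmetric system; in floating point they drift apart once MINRES loses orthogonality.

n = 100;
A = diag(linspace(-1, 1, n));                    % symmetric indefinite test matrix
b = ones(n, 1);
[~, ~, ~, ~, rv1] = minres(A, b, 1e-10, n);      % short recurrence: O(1) stored vectors
[~, ~, ~, ~, rv2] = gmres(A, b, [], 1e-10, n);   % no restart: stores the whole basis
semilogy(0:numel(rv1) - 1, rv1, 0:numel(rv2) - 1, rv2);
legend('MINRES', 'GMRES');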

Properties of the MINRES method

The MINRES method iteratively computes an approximate solution of a linear system of equations of the form

$$Ax = b,$$

where $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix and $b \in \mathbb{R}^n$ a vector.

For this, the norm of the residual $r(x) = b - Ax$ is minimized over a $k$-dimensional Krylov subspace

$$x \in x_0 + \operatorname{span}\{r_0, Ar_0, \ldots, A^{k-1} r_0\}.$$

Here $x_0 \in \mathbb{R}^n$ is an initial value (often zero) and $r_0 = b - Ax_0$.

More precisely, we define the approximate solutions through

$$x_k = \operatorname{argmin}_{x \in x_0 + \operatorname{span}\{r_0, Ar_0, \ldots, A^{k-1} r_0\}} \|b - Ax\|,$$

where $\|\cdot\|$ is the standard Euclidean norm on $\mathbb{R}^n$.

Because of the symmetry of $A$, unlike in the GMRES method, it is possible to carry out this minimization process recursively, storing only two previous steps (short recurrence). This saves memory.
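The defining minimization can be checked directly on a small example. The following sketch is illustrative only (the matrix, the dimensions, and the explicit monomial Krylov basis are arbitrary choices; that basis becomes ill-conditioned as $k$ grows, so it is useful only for demonstration). It builds a basis of the Krylov subspace explicitly and solves the least-squares problem with a dense method; in exact arithmetic the resulting residual norm equals the $k$-th MINRES residual norm.

n = 50; k = 10;
A = diag(linspace(-2, 3, n));    % symmetric (here diagonal and indefinite) test matrix
b = ones(n, 1);
x0 = zeros(n, 1);
r0 = b - A * x0;
V = zeros(n, k);                 % columns span the Krylov subspace K_k(A, r0)
V(:, 1) = r0;
for j = 2:k
  V(:, j) = A * V(:, j - 1);     % monomial basis: r0, A*r0, ..., A^(k-1)*r0
end
y = (A * V) \ (b - A * x0);      % dense least squares: min_y ||b - A*(x0 + V*y)||
xk = x0 + V * y;
disp(norm(b - A * xk))           % the k-th MINRES residual norm (exact arithmetic)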

MINRES algorithm

Note: The MINRES method is more complicated than the algebraically equivalent conjugate residual (CR) method, which is therefore presented below in its place. It differs from MINRES in that in MINRES the columns of a basis of the Krylov space (denoted below by $p_k$) are orthogonalized, whereas in CR their images under $A$ (denoted below by $s_k$) are orthogonalized via the Lanczos recursion. There are more efficient and preconditioned variants with fewer AXPY operations; compare the article on the conjugate residual method.

First one chooses $x_0 \in \mathbb{R}^n$ arbitrarily and computes

$$r_0 = b - Ax_0, \qquad p_0 = r_0, \qquad s_0 = Ap_0.$$

Then we iterate for $k = 1, 2, \dots$ through the following steps:

1. Compute the step length and update the iterate and the residual:
$$\alpha_k = \frac{r_{k-1}^T s_{k-1}}{s_{k-1}^T s_{k-1}}, \qquad x_k = x_{k-1} + \alpha_k p_{k-1}, \qquad r_k = r_{k-1} - \alpha_k s_{k-1}.$$
Stop if the norm of the residual $r_k$ is below a prescribed tolerance.

2. Generate the next search direction and its image:
$$p_k = s_{k-1}, \qquad s_k = A s_{k-1}.$$

3. Orthogonalize against the two previous directions:
$$\beta_1 = \frac{s_k^T s_{k-1}}{s_{k-1}^T s_{k-1}}, \qquad p_k \leftarrow p_k - \beta_1 p_{k-1}, \qquad s_k \leftarrow s_k - \beta_1 s_{k-1},$$
and, if $k > 1$,
$$\beta_2 = \frac{s_k^T s_{k-2}}{s_{k-2}^T s_{k-2}}, \qquad p_k \leftarrow p_k - \beta_2 p_{k-2}, \qquad s_k \leftarrow s_k - \beta_2 s_{k-2}.$$

Convergence rate of the MINRES method

In the case of positive definite matrices, the convergence rate of the MINRES method can be estimated in a way similar to that of the CG method.[3] In contrast to the CG method, however, the estimate does not apply to the errors of the iterates, but to the residual. The following applies:

$$\|r_k\| \le 2 \left( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^k \|r_0\|,$$

where $\kappa(A)$ is the condition number of the matrix $A$. Because $A$ is normal, we have

$$\kappa(A) = \frac{|\lambda_{\max}(A)|}{|\lambda_{\min}(A)|},$$

where $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ are the maximal and minimal eigenvalues of $A$, respectively.
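The bound can be checked numerically. The sketch below is an illustration, not a proof: the positive definite test matrix is an arbitrary choice, and it assumes the function from the next section has been saved as minres.m on the path. It runs a fixed number of iterations and compares the attained residual norm with the right-hand side of the estimate.

n = 200;
A = diag(linspace(1, 100, n));   % symmetric positive definite, kappa(A) = 100
b = ones(n, 1);
kappa = 100;
c = (sqrt(kappa) - 1) / (sqrt(kappa) + 1);
for k = [5 10 20 40]
  [~, r] = minres(A, b, zeros(n, 1), k, 0);   % tol = 0: run exactly k iterations
  fprintf('k = %2d: ||r_k|| = %.2e <= bound %.2e\n', k, norm(r), 2 * c^k * norm(b));
end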

Implementation in GNU Octave / MATLAB

function [x, r] = minres(A, b, x0, maxit, tol)
  % Conjugate-residual form of MINRES for symmetric A.
  % Returns the approximate solution x and its residual r = b - A*x.
  x = x0;
  r = b - A * x0;          % initial residual
  p0 = r;                  % current search direction
  s0 = A * p0;             % its image under A
  p1 = p0;
  s1 = s0;
  for iter = 1:maxit
    p2 = p1; p1 = p0;      % shift the direction history
    s2 = s1; s1 = s0;
    alpha = r' * s1 / (s1' * s1);   % step length minimizing ||r - alpha*s1||
    x = x + alpha * p1;
    r = r - alpha * s1;
    if (r' * r < tol^2)    % stop once ||r|| < tol
      break
    end
    p0 = s1;               % next direction and its image
    s0 = A * s1;
    beta1 = s0' * s1 / (s1' * s1);  % orthogonalize against the previous image
    p0 = p0 - beta1 * p1;
    s0 = s0 - beta1 * s1;
    if iter > 1            % ... and against the one before that
      beta2 = s0' * s2 / (s2' * s2);
      p0 = p0 - beta2 * p2;
      s0 = s0 - beta2 * s2;
    end
  end
end
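As a usage sketch (the test matrix, iteration limit, and tolerance are arbitrary choices; note that a file minres.m defined as above shadows MATLAB's built-in minres, which has a different argument list), the function can be applied to a symmetric indefinite system, for which CG is not applicable:

n = 100;
A = diag([linspace(-4, -1, n/2), linspace(1, 4, n/2)]);  % symmetric, indefinite
b = ones(n, 1);
[x, r] = minres(A, b, zeros(n, 1), 200, 1e-8);
disp(norm(b - A * x))   % agrees with norm(r) up to rounding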


References

  1. Christopher C. Paige, Michael A. Saunders (1975). "Solution of sparse indefinite systems of linear equations". SIAM Journal on Numerical Analysis. 12 (4): 617–629. doi:10.1137/0712047.
  2. Nifa, M. Naoufal. "Efficient solvers for constrained optimization in parameter identification problems" (PDF) (Doctoral Thesis). pp. 51–52.
  3. Sven Gross, Arnold Reusken. Numerical Methods for Two-phase Incompressible Flows. Springer, section 5.2. ISBN 978-3-642-19685-0.