Aitken's delta-squared process

Last updated September 25, 2024

In numerical analysis, Aitken's delta-squared process or Aitken extrapolation is a series acceleration method used to speed up the convergence of a sequence. It is named after Alexander Aitken, who introduced this method in 1926.^[1] It is most useful for accelerating sequences that converge linearly. A precursor form was known to Seki Kōwa (1642–1708) and was applied to the rectification of the circle, i.e., to the calculation of π.

Definition

Given a sequence $X={(x_{n})}$ with $n=0,1,2,3,\ldots ,$ Aitken's delta-squared process associates to this sequence the new sequence

$A[X]=(a_{n})={\left({\frac {x_{n}\,x_{n+2}-x_{n+1}^{2}}{x_{n}+x_{n+2}-2\,x_{n+1}}}\right)},$

which can also be written as

$A[X]=\left(x_{n}-{\frac {(\Delta x_{n})^{2}}{\Delta ^{2}x_{n}}}\right),$

with ${\textstyle \Delta x_{n}=x_{n+1}-x_{n}}$ and ${\textstyle \Delta ^{2}x_{n}=x_{n}-2x_{n+1}+x_{n+2}=\Delta x_{n+1}-\Delta x_{n}.}$ Both are the same sequence algebraically but the latter has improved numerical stability in computational implementation.

${\textstyle A[X]}$ is ill-defined if the sequence ${\textstyle \Delta ^{2}[X]=(\Delta ^{2}x_{n})}$ contains a zero element, which occurs exactly when the sequence of forward differences, ${\textstyle \Delta [X]=(\Delta x_{n}),}$ has two consecutive equal terms. From a theoretical point of view, if that occurs only for a finite number of indices, one could apply the Aitken process to only the part of the sequence $X$ with indices $n>n_{0}$ such that $n_{0}$ is the last index for which the sequence ${\textstyle \Delta [X]}$ repeats. In practice, the first few terms of the sequence usually provide the desired precision. When computing the sequence numerically, one also has to stop the computation before rounding errors in the denominator become too large, because the ${\textstyle \Delta ^{2}}$ sequence transformation may cancel significant digits.
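The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `aitken` and the choice to report undefined entries as `None` are assumptions made here.

```python
from typing import List, Optional, Sequence


def aitken(xs: Sequence[float]) -> List[Optional[float]]:
    """Return a_n = x_n - (dx_n)^2 / d2x_n for every n with x_{n+2} available.

    Entries whose second difference is exactly zero are returned as None,
    because A[X] is not defined there (an assumption of this sketch).
    """
    out: List[Optional[float]] = []
    for n in range(len(xs) - 2):
        dx = xs[n + 1] - xs[n]                # forward difference
        d2x = (xs[n + 2] - xs[n + 1]) - dx    # second difference
        out.append(None if d2x == 0 else xs[n] - dx * dx / d2x)
    return out


# The geometric sequence 1 + 0.5**n is mapped exactly onto its limit 1.
print(aitken([2.0, 1.5, 1.25, 1.125]))   # [1.0, 1.0]
# An arithmetic sequence has a zero second difference everywhere.
print(aitken([1.0, 2.0, 3.0, 4.0]))      # [None, None]
```

Note that a sequence of length $N$ yields only $N-2$ accelerated terms, since each $a_{n}$ needs $x_{n},$ $x_{n+1}$ and $x_{n+2}.$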

Properties

Aitken's delta-squared process is an acceleration of convergence method and a particular case of a nonlinear sequence transformation.

A sequence ${\textstyle X=(x_{n})}$ that converges to a limiting value ${\textstyle \ell }$ is said to converge linearly, or more technically Q-linearly, if there is some number ${\textstyle \mu \in (0,1)}$ for which

$\lim _{n\to \infty }{\frac {|x_{n+1}-\ell |}{|x_{n}-\ell |}}=\mu .$

This means that asymptotically, the distance between the sequence and its limit shrinks by nearly the same proportion, $\mu ,$ on every step and the ratio of reduction becomes closer and closer to that proportion. This is also sometimes called "geometric convergence," since it is a characteristic property for geometric series, or "exponential convergence," since it is convergence like $\mu ^{n}=\exp(n\ln \mu ).$

Aitken's method will accelerate the convergence of a sequence $X$ if $A[X]=(a_{n}),$ with terms defined above, satisfies ${\textstyle \lim _{n\to \infty }{\frac {a_{n}-\ell }{x_{n}-\ell }}=0.}$
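Both limits can be estimated numerically. The sketch below uses a hypothetical test sequence that does not appear elsewhere in this article, $x_{n+1}=\cos(x_{n})$ with $x_{0}=1,$ which converges linearly with $\mu =|\sin \ell |.$ The helper name `aitken_term` and the index $n=10$ are arbitrary choices.

```python
import math


def aitken_term(x0: float, x1: float, x2: float) -> float:
    """One Aitken extrapolate built from three consecutive terms."""
    dx = x1 - x0
    return x0 - dx * dx / ((x2 - x1) - dx)


# Hypothetical test sequence: x_{n+1} = cos(x_n), starting from x_0 = 1.
xs = [1.0]
for _ in range(200):
    xs.append(math.cos(xs[-1]))
limit = xs[-1]   # after 200 steps the iteration has settled to rounding level

n = 10
# Ratio of successive errors: an estimate of mu.
mu_estimate = abs(xs[n + 1] - limit) / abs(xs[n] - limit)
# Ratio of accelerated error to original error: should be small.
a_n = aitken_term(xs[n], xs[n + 1], xs[n + 2])
acceleration_ratio = abs(a_n - limit) / abs(xs[n] - limit)

print(f"mu estimate at n={n}: {mu_estimate:.4f}")
print(f"|sin(limit)|        : {abs(math.sin(limit)):.4f}")
print(f"|a_n - l| / |x_n - l|: {acceleration_ratio:.2e}")
```

The first two printed numbers should nearly agree, and the third should be much smaller than 1, which is what the acceleration condition asks for in the limit.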

$A$ is not a linear operator on sequences, but it is linear with respect to addition of constant sequences: ${\textstyle A[X-C]=A[X]-C,}$ if $C$ is any constant sequence $C=(c)$ , constant for all $n.$ This is clear from the expression of $A[X]$ in terms of the finite difference operator $\Delta .$
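The shift property can be verified in one line from the difference form, because subtracting a constant $c$ leaves both differences unchanged:

```latex
\begin{aligned}
\Delta (x_{n}-c) &= (x_{n+1}-c)-(x_{n}-c) = \Delta x_{n},
\qquad \Delta^{2}(x_{n}-c) = \Delta^{2}x_{n}, \\
A[X-C]_{n} &= (x_{n}-c)-\frac{(\Delta x_{n})^{2}}{\Delta^{2}x_{n}} = a_{n}-c .
\end{aligned}
```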

The new process does not in general converge quadratically, but for an iterated function sequence satisfying $x_{n+1}=f(x_{n})$ for some function $f$ converging to a fixed point, the accelerated sequence's convergence is quadratic. In this case, the technique is known as Steffensen's method.

Empirically, the A-operation eliminates the "most important error term". One can check this by considering a sequence of the form $x_{n}=\ell +a^{n}+b^{n}$ , where $0<b<a<1$ : The sequence $A[X]$ will then go to the limit ${\textstyle \ell }$ like $b^{n}$ goes to zero.
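This claim can be checked with exact rational arithmetic, which avoids any rounding in the differences. The values $\ell =1,$ $a=1/2,$ $b=1/4$ below are hypothetical choices for illustration. Expanding the definition gives $a_{n}-\ell =a^{n}b^{n}(a-b)^{2}/\Delta ^{2}x_{n},$ so the ratio $(a_{n}-\ell )/b^{n}$ should approach $\left({\frac {a-b}{a-1}}\right)^{2}.$

```python
from fractions import Fraction

# Hypothetical parameters with 0 < b < a < 1.
ell, a, b = Fraction(1), Fraction(1, 2), Fraction(1, 4)


def x(n: int) -> Fraction:
    """Term x_n = ell + a^n + b^n, computed exactly."""
    return ell + a**n + b**n


def aitken_term(n: int) -> Fraction:
    """Exact Aitken extrapolate a_n built from x_n, x_{n+1}, x_{n+2}."""
    dx = x(n + 1) - x(n)
    d2x = x(n + 2) - 2 * x(n + 1) + x(n)
    return x(n) - dx * dx / d2x


# The error of A[X], measured in units of b^n, settles to a constant.
for n in (5, 10, 20):
    ratio = (aitken_term(n) - ell) / b**n
    print(n, float(ratio))

predicted = ((a - b) / (a - 1)) ** 2
print("predicted limit of the ratio:", float(predicted))   # 0.25
```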

Geometrically, the graph of an exponential function $f(t)$ that satisfies $f(n)=x_{n}$ , $f(n+1)=x_{n+1}$ and $f(n+2)=x_{n+2}$ has a horizontal asymptote at ${\frac {x_{n}x_{n+2}-x_{n+1}^{2}}{x_{n}-2x_{n+1}+x_{n+2}}}$ (if $x_{n}-2x_{n+1}+x_{n+2}\neq 0$ ).
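A short derivation of the asymptote, assuming the exponential is written as $f(t)=\ell +B\,r^{t}$ with $B\neq 0,$ $r>0$ and $r\neq 1$ (the symbols $B$ and $r$ are introduced here only for illustration): consecutive values differ from $\ell$ by the constant factor $r,$ and eliminating $r$ gives the Aitken formula.

```latex
\begin{aligned}
x_{n+1}-\ell &= r\,(x_{n}-\ell), \qquad x_{n+2}-\ell = r\,(x_{n+1}-\ell) \\
\Longrightarrow\quad (x_{n+1}-\ell)^{2} &= (x_{n}-\ell)(x_{n+2}-\ell) \\
\Longrightarrow\quad \ell\,(x_{n}-2x_{n+1}+x_{n+2}) &= x_{n}x_{n+2}-x_{n+1}^{2} \\
\Longrightarrow\quad \ell &= \frac{x_{n}x_{n+2}-x_{n+1}^{2}}{x_{n}-2x_{n+1}+x_{n+2}} .
\end{aligned}
```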

One can also show that if a sequence $X$ converges to its limit $\ell$ with an order of convergence strictly greater than 1, $A[X]$ does not have a better rate of convergence. (In practice, such fast convergence is rare: quadratic convergence, for example, would mean over 30 (respectively 100) correct decimal places after 5 (respectively 7) iterations when starting with 1 correct digit. Usually no acceleration is needed in that case.)

In practice, $A[X]$ often converges much faster to the limit than $X$ does, as demonstrated by the example calculations below. Usually, it is much cheaper to calculate $A[X]$ (involving only calculation of differences, one multiplication and one division) than to calculate many more terms of the sequence $X$ . Care must be taken, however, to avoid introducing errors due to insufficient precision when calculating the differences in the numerator and denominator of the expression.

Example calculations

Example 1: The value of ${\sqrt {2}}\approx 1.4142136$ can be approximated by assuming an initial value for $x_{0}$ and iterating the following sequence, called Heron's method: $x_{n+1}={\frac {x_{n}+{\frac {2}{x_{n}}}}{2}}.$ Starting with $x_{0}=1:$

$n$	$X$	$A[X]$
0	1	1.4285714
1	1.5	1.4141414
2	1.4166667	1.4142136
3	1.4142157	--
4	1.4142136	--

Note that Aitken's method does not save the cost of calculating two iterations here: computing the first three ${\textstyle A[X]}$ values required the first five ${\textstyle X}$ values. Also, the second ${\textstyle A[X]}$ value is less accurate than the fourth ${\textstyle X}$ value. This is not surprising, because Aitken's process is best suited to sequences that converge linearly, whereas Heron's method for calculating square roots converges quadratically.
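The table for Example 1 can be reproduced with the following sketch. The function names `heron` and `aitken` are chosen here for illustration, and values are printed to seven decimal places to match the table.

```python
from typing import List


def heron(n_terms: int, x0: float = 1.0) -> List[float]:
    """First n_terms iterates of x_{n+1} = (x_n + 2/x_n) / 2."""
    xs = [x0]
    for _ in range(n_terms - 1):
        xs.append((xs[-1] + 2.0 / xs[-1]) / 2.0)
    return xs


def aitken(xs: List[float]) -> List[float]:
    """Aitken extrapolates; assumes no second difference is zero."""
    out = []
    for n in range(len(xs) - 2):
        dx = xs[n + 1] - xs[n]
        d2x = (xs[n + 2] - xs[n + 1]) - dx
        out.append(xs[n] - dx * dx / d2x)
    return out


xs = heron(5)
ax = aitken(xs)
for n, x_n in enumerate(xs):
    a_n = f"{ax[n]:.7f}" if n < len(ax) else "--"
    print(n, f"{x_n:.7f}", a_n)
```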

Example 2: The value of ${\frac {\pi }{4}}$ may be calculated as an infinite sum via the Leibniz formula for π:

${\frac {\pi }{4}}=\sum _{n=0}^{\infty }{\frac {(-1)^{n}}{2n+1}}\approx 0.785398$

$n$	Series Terms	$X$ = Partial Sums	$A[X]$
0	1	1	0.79166667
1	−0.33333333	0.66666667	0.78333333
2	0.2	0.86666667	0.78630952
3	−0.14285714	0.72380952	0.78492063
4	0.11111111	0.83492063	0.78567821
5	−0.09090909	0.74401154	0.78522034
6	0.07692308	0.82093462	0.78551795
7	−0.06666667	0.75426795	--
8	0.05882353	0.81309148	--

In this example, Aitken's method is applied to a sublinearly converging series and accelerates convergence considerably. The convergence is still sublinear, but much faster than the original convergence: the first ${\textstyle A[X]}$ value, whose computation required the first three ${\textstyle X}$ values, is closer to the limit than the eighth ${\textstyle X}$ value.
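A sketch that recomputes the partial sums and their Aitken extrapolates for Example 2, and prints the distance of each from $\pi /4.$ The function names are illustrative assumptions.

```python
import math
from typing import List


def leibniz_partial_sums(count: int) -> List[float]:
    """Partial sums of sum_{n>=0} (-1)^n / (2n + 1)."""
    sums, total = [], 0.0
    for n in range(count):
        total += (-1) ** n / (2 * n + 1)
        sums.append(total)
    return sums


def aitken(xs: List[float]) -> List[float]:
    """Aitken extrapolates; assumes no second difference is zero."""
    out = []
    for n in range(len(xs) - 2):
        dx = xs[n + 1] - xs[n]
        d2x = (xs[n + 2] - xs[n + 1]) - dx
        out.append(xs[n] - dx * dx / d2x)
    return out


target = math.pi / 4
xs = leibniz_partial_sums(9)
ax = aitken(xs)
for n, x_n in enumerate(xs):
    line = f"{n}  x={x_n:.8f}  |x-pi/4|={abs(x_n - target):.2e}"
    if n < len(ax):
        line += f"  A={ax[n]:.8f}  |A-pi/4|={abs(ax[n] - target):.2e}"
    print(line)
```

Comparing the two error columns row by row shows how much closer each extrapolate is to $\pi /4$ than the partial sums it was built from.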

Example pseudocode for Aitken extrapolation

The following is an example of using the Aitken extrapolation to help find the limit of the sequence $x_{n+1}=f(x_{n})$ when given some initial $x_{0},$ where the limit of this sequence is assumed to be a fixed point of $f$ (say $\alpha =f(\alpha )$ ). For instance, if the sequence is given by ${\textstyle x_{n+1}={\frac {1}{2}}\left(x_{n}+{\frac {2}{x_{n}}}\right)}$ with starting point $x_{0}=1,$ then the function will be ${\textstyle f(x):={\frac {1}{2}}\left(x+{\frac {2}{x}}\right),}$ which has $\alpha ={\sqrt {2}}$ as a fixed point (see Methods of computing square roots); it is this fixed point whose value will be approximated.

This pseudocode also computes an approximation to $|f^{\prime }(\alpha )|,$ denoted by lambda. The Aitken extrapolates will be denoted by aitkenX. During the computation of the extrapolate, it is important to check whether the denominator becomes too small, which could happen if we already have a large amount of accuracy; without this check, a large amount of error could be introduced by the division. The threshold below which the denominator is considered too small will be denoted by epsilon. Because the binary representation of the fixed point could be infinite (or at least too large to fit in the available memory), the calculation will stop once two successive approximations agree to within tolerance.

    % These choices depend on the problem being solved
    x0 = 1                       % The initial value
    f(x) = (1/2)*(x + 2/x)       % The function that finds the next element in the sequence
    tolerance = 10^-10           % 10 digit accuracy is desired
    epsilon = 10^-16             % Do not divide by a number smaller than this

    maxIterations = 20           % Do not allow the iterations to continue indefinitely
    haveWeFoundSolution = false  % Were we able to find the solution to within the desired tolerance? not yet

    for i = 1 : maxIterations
        x1 = f(x0)
        x2 = f(x1)

        if (x1 ~= x0)
            lambda = absoluteValue((x2 - x1) / (x1 - x0))  % OPTIONAL: Computes an approximation of |f'(fixedPoint)|, which is denoted by lambda
        end

        denominator = (x2 - x1) - (x1 - x0);

        if (absoluteValue(denominator) < epsilon)  % To avoid greatly increasing error, do not divide by too small of a number
            print('WARNING: denominator is too small')
            break  % Leave the loop
        end

        aitkenX = x2 - ((x2 - x1)^2) / denominator

        if (absoluteValue(aitkenX - x2) < tolerance)  % If the value is within tolerance
            print("The fixed point is ", aitkenX)  % Display the result of the Aitken extrapolation
            haveWeFoundSolution = true
            break  % Done, so leave the loop
        end

        x0 = aitkenX  % Update x0 to start again
    end

    if (haveWeFoundSolution == false)  % If we were not able to find a solution to within the desired tolerance
        print("Warning: Not able to find solution to within the desired tolerance of ", tolerance)
        print("The last computed extrapolate was ", aitkenX)
    end
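For readers who want something runnable, the pseudocode above can be transcribed into Python roughly as follows. This is a sketch under the same assumptions as the pseudocode: the function name `aitken_fixed_point` and its return convention (last extrapolate plus a success flag) are choices made here, and the optional lambda estimate is omitted.

```python
from typing import Callable, Optional, Tuple


def aitken_fixed_point(
    f: Callable[[float], float],
    x0: float,
    tolerance: float = 1e-10,
    epsilon: float = 1e-16,
    max_iterations: int = 20,
) -> Tuple[Optional[float], bool]:
    """Return (last computed extrapolate or None, whether tolerance was met)."""
    aitken_x: Optional[float] = None
    for _ in range(max_iterations):
        x1 = f(x0)
        x2 = f(x1)
        denominator = (x2 - x1) - (x1 - x0)
        if abs(denominator) < epsilon:
            # Dividing by a tiny second difference would amplify rounding error.
            return aitken_x, False
        aitken_x = x2 - (x2 - x1) ** 2 / denominator
        if abs(aitken_x - x2) < tolerance:
            return aitken_x, True
        x0 = aitken_x   # restart the fixed-point iteration from the extrapolate
    return aitken_x, False


root, converged = aitken_fixed_point(lambda x: 0.5 * (x + 2.0 / x), 1.0)
print(root, converged)
```

Returning a flag instead of printing warnings leaves the decision about how to report failure to the caller.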

Notes

[1] Aitken, Alexander (1926). "On Bernoulli's numerical solution of algebraic equations". Proceedings of the Royal Society of Edinburgh. 46: 289–305. doi:10.1017/S0370164600022070.



