Shanks transformation

Last updated

In numerical analysis, the Shanks transformation is a non-linear series acceleration method to increase the rate of convergence of a sequence. This method is named after Daniel Shanks, who rediscovered this sequence transformation in 1955. It was first derived and published by R. Schmidt in 1941. [1]

Contents

One can calculate only a few terms of a perturbation expansion, usually no more than two or three, and almost never more than seven. The resulting series is often slowly convergent, or even divergent. Yet those few terms contain a remarkable amount of information, which the investigator should do his best to extract.
This viewpoint has been persuasively set forth in a delightful paper by Shanks (1955), who displays a number of amazing examples, including several from fluid mechanics.

Milton D. Van Dyke (1975) Perturbation methods in fluid mechanics, p. 202.

Formulation

For a sequence the series

is to be determined. First, the partial sum is defined as:

and forms a new sequence . Provided the series converges, will also approach the limit as The Shanks transformation of the sequence is the new sequence defined by [2] [3]

where this sequence often converges more rapidly than the sequence Further speed-up may be obtained by repeated use of the Shanks transformation, by computing etc.

Note that the non-linear transformation as used in the Shanks transformation is essentially the same as used in Aitken's delta-squared process so that as with Aitken's method, the right-most expression in 's definition (i.e. ) is more numerically stable than the expression to its left (i.e. ). Both Aitken's method and the Shanks transformation operate on a sequence, but the sequence the Shanks transformation operates on is usually thought of as being a sequence of partial sums, although any sequence may be viewed as a sequence of partial sums.

Example

Absolute error as a function of
n
{\displaystyle n}
in the partial sums
A
n
{\displaystyle A_{n}}
and after applying the Shanks transformation once or several times:
S
(
A
n
)
,
{\displaystyle S(A_{n}),}
S
2
(
A
n
)
{\displaystyle S^{2}(A_{n})}
and
S
3
(
A
n
)
.
{\displaystyle S^{3}(A_{n}).}
The series used is
4
(
1
-
1
3
+
1
5
-
1
7
+
1
9
-
[?]
)
,
{\displaystyle \scriptstyle 4\left(1-{\frac {1}{3}}+{\frac {1}{5}}-{\frac {1}{7}}+{\frac {1}{9}}-\cdots \right),}
which has the exact sum
p
.
{\displaystyle \pi .} Shanks transformation.svg
Absolute error as a function of in the partial sums and after applying the Shanks transformation once or several times: and The series used is which has the exact sum

As an example, consider the slowly convergent series [3]

which has the exact sum π  3.14159265. The partial sum has only one digit accuracy, while six-figure accuracy requires summing about 400,000 terms.

In the table below, the partial sums , the Shanks transformation on them, as well as the repeated Shanks transformations and are given for up to 12. The figure to the right shows the absolute error for the partial sums and Shanks transformation results, clearly showing the improved accuracy and convergence rate.

04.00000000
12.666666673.16666667
23.466666673.133333333.14210526
32.895238103.145238103.141450223.14159936
43.339682543.139682543.141643323.14159086
52.976046183.142712843.141571293.14159323
63.283738483.140881343.141602843.14159244
73.017071823.142071823.141587323.14159274
83.252365933.141254823.141595663.14159261
93.041839623.141839623.141590863.14159267
103.232315813.141406723.141593773.14159264
113.058402773.141736103.141591923.14159266
123.218402773.141479693.141593143.14159265

The Shanks transformation already has two-digit accuracy, while the original partial sums only establish the same accuracy at Remarkably, has six digits accuracy, obtained from repeated Shank transformations applied to the first seven terms As mentioned before, only obtains 6-digit accuracy after summing about 400,000 terms.

Motivation

The Shanks transformation is motivated by the observation that — for larger — the partial sum quite often behaves approximately as [2]

with so that the sequence converges transiently to the series result for So for and the respective partial sums are:

These three equations contain three unknowns: and Solving for gives [2]

In the (exceptional) case that the denominator is equal to zero: then for all

Generalized Shanks transformation

The generalized kth-order Shanks transformation is given as the ratio of the determinants: [4]

with It is the solution of a model for the convergence behaviour of the partial sums with distinct transients:

This model for the convergence behaviour contains unknowns. By evaluating the above equation at the elements and solving for the above expression for the kth-order Shanks transformation is obtained. The first-order generalized Shanks transformation is equal to the ordinary Shanks transformation:

The generalized Shanks transformation is closely related to Padé approximants and Padé tables. [4]

Note: The calculation of determinants requires many arithmetic operations to make, however Peter Wynn discovered a recursive evaluation procedure called epsilon-algorithm which avoids calculating the determinants. [5] [6]

See also

Notes

  1. Weniger (2003).
  2. 1 2 3 Bender & Orszag (1999), pp. 368–375.
  3. 1 2 Van Dyke (1975), pp. 202–205.
  4. 1 2 Bender & Orszag (1999), pp. 389–392.
  5. Wynn (1956)
  6. Wynn (1962)

Related Research Articles

<span class="mw-page-title-main">Dirac delta function</span> Generalized function whose value is zero everywhere except at zero

In mathematical analysis, the Dirac delta distribution, also known as the unit impulse, is a generalized function or distribution over the real numbers, whose value is zero everywhere except at zero, and whose integral over the entire real line is equal to one.

In probability theory, the central limit theorem (CLT) establishes that, in many situations, for independent and identically distributed random variables, the sampling distribution of the standardized sample mean tends towards the standard normal distribution even if the original variables themselves are not normally distributed.

<span class="mw-page-title-main">Noether's theorem</span> Statement relating differentiable symmetries to conserved quantities

Noether's theorem or Noether's first theorem states that every differentiable symmetry of the action of a physical system with conservative forces has a corresponding conservation law. The theorem was proven by mathematician Emmy Noether in 1915 and published in 1918. The action of a physical system is the integral over time of a Lagrangian function, from which the system's behavior can be determined by the principle of least action. This theorem only applies to continuous and smooth symmetries over physical space.

In mathematics, a generating function is a way of encoding an infinite sequence of numbers by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary series, the formal power series is not required to converge: in fact, the generating function is not actually regarded as a function, and the "variable" remains an indeterminate. Generating functions were first introduced by Abraham de Moivre in 1730, in order to solve the general linear recurrence problem. One can generalize to formal power series in more than one indeterminate, to encode information about infinite multi-dimensional arrays of numbers.

In mathematics, particularly in linear algebra, tensor analysis, and differential geometry, the Levi-Civita symbol or Levi-Civita epsilon represents a collection of numbers; defined from the sign of a permutation of the natural numbers 1, 2, ..., n, for some positive integer n. It is named after the Italian mathematician and physicist Tullio Levi-Civita. Other names include the permutation symbol, antisymmetric symbol, or alternating symbol, which refer to its antisymmetric property and definition in terms of permutations.

<span class="mw-page-title-main">Product rule</span> Formula for the derivative of a product

In calculus, the product rule is a formula used to find the derivatives of products of two or more functions. For two functions, it may be stated in Lagrange's notation as

In mathematics, the Hodge star operator or Hodge star is a linear map defined on the exterior algebra of a finite-dimensional oriented vector space endowed with a nondegenerate symmetric bilinear form. Applying the operator to an element of the algebra produces the Hodge dual of the element. This map was introduced by W. V. D. Hodge.

<span class="mw-page-title-main">Rigid body dynamics</span> Study of the effects of forces on undeformable bodies

In the physical science of dynamics, rigid-body dynamics studies the movement of systems of interconnected bodies under the action of external forces. The assumption that the bodies are rigid simplifies analysis, by reducing the parameters that describe the configuration of the system to the translation and rotation of reference frames attached to each body. This excludes bodies that display fluid, highly elastic, and plastic behavior.

In mathematics, the Poisson summation formula is an equation that relates the Fourier series coefficients of the periodic summation of a function to values of the function's continuous Fourier transform. Consequently, the periodic summation of a function is completely defined by discrete samples of the original function's Fourier transform. And conversely, the periodic summation of a function's Fourier transform is completely defined by discrete samples of the original function. The Poisson summation formula was discovered by Siméon Denis Poisson and is sometimes called Poisson resummation.

In Hamiltonian mechanics, a canonical transformation is a change of canonical coordinates (q, p, t) → that preserves the form of Hamilton's equations. This is sometimes known as form invariance. It need not preserve the form of the Hamiltonian itself. Canonical transformations are useful in their own right, and also form the basis for the Hamilton–Jacobi equations and Liouville's theorem.

In mathematics, a divergent series is an infinite series that is not convergent, meaning that the infinite sequence of the partial sums of the series does not have a finite limit.

<span class="mw-page-title-main">Gauss–Newton algorithm</span> Mathematical algorithm

The Gauss–Newton algorithm is used to solve non-linear least squares problems, which is equivalent to minimizing a sum of squared function values. It is an extension of Newton's method for finding a minimum of a non-linear function. Since a sum of squares must be nonnegative, the algorithm can be viewed as using Newton's method to iteratively approximate zeroes of the components of the sum, and thus minimizing the sum. In this sense, the algorithm is also an effective method for solving overdetermined systems of equations. It has the advantage that second derivatives, which can be challenging to compute, are not required.

In numerical analysis, Aitken's delta-squared process or Aitken extrapolation is a series acceleration method, used for accelerating the rate of convergence of a sequence. It is named after Alexander Aitken, who introduced this method in 1926. Its early form was known to Seki Kōwa and was found for rectification of the circle, i.e. the calculation of π. It is most useful for accelerating the convergence of a sequence that is converging linearly.

In mathematics, series acceleration is one of a collection of sequence transformations for improving the rate of convergence of a series. Techniques for series acceleration are often applied in numerical analysis, where they are used to improve the speed of numerical integration. Series acceleration techniques may also be used, for example, to obtain a variety of identities on special functions. Thus, the Euler transform applied to the hypergeometric series gives some of the classic, well-known hypergeometric series identities.

Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.

In computational chemistry, a constraint algorithm is a method for satisfying the Newtonian motion of a rigid body which consists of mass points. A restraint algorithm is used to ensure that the distance between mass points is maintained. The general steps involved are: (i) choose novel unconstrained coordinates, (ii) introduce explicit constraint forces, (iii) minimize constraint forces implicitly by the technique of Lagrange multipliers or projection methods.

In classical mechanics, Appell's equation of motion is an alternative general formulation of classical mechanics described by Josiah Willard Gibbs in 1879 and Paul Émile Appell in 1900.

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. In economic theory, the non-linear least squares method is applied in (i) the probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box–Cox transformed regressors ().

In mathematics, Capelli's identity, named after Alfredo Capelli (1887), is an analogue of the formula det(AB) = det(A) det(B), for certain matrices with noncommuting entries, related to the representation theory of the Lie algebra . It can be used to relate an invariant ƒ to the invariant Ωƒ, where Ω is Cayley's Ω process.

<span class="mw-page-title-main">Lagrangian mechanics</span> Formulation of classical mechanics

In physics, Lagrangian mechanics is a formulation of classical mechanics founded on the stationary-action principle. It was introduced by the Italian-French mathematician and astronomer Joseph-Louis Lagrange in his presentation to the Turin Academy of Science in 1760 culminating in his 1788 grand opus, Mécanique analytique.

References