Halley's method

Last updated December 20, 2024

In numerical analysis, Halley's method is a root-finding algorithm used for functions of one real variable with a continuous second derivative. Edmond Halley was an English mathematician and astronomer who introduced the method now called by his name.

Method

Halley's method is a numerical algorithm for solving the nonlinear equation $f (x) = 0$ . In this case, the function f has to be a function of one real variable. The method consists of a sequence of iterations:

x_{n+1}=x_{n}-{\frac {2f(x_{n})f'(x_{n})}{2{[f'(x_{n})]}^{2}-f(x_{n})f''(x_{n})}}

beginning with an initial guess $x 0$ .^[3]

If f is a three times continuously differentiable function and a is a zero of f but not of its derivative, then, in a neighborhood of a, the iterates $x n$ satisfy:

|x_{n+1}-a|\leq K\cdot {|x_{n}-a|}^{3},{\text{ for some }}K>0.

This means that the iterates converge to the zero if the initial guess is sufficiently close, and that the convergence is cubic.^[4]

The following alternative formulation shows the similarity between Halley's method and Newton's method. The expression $f(x_{n})/f'(x_{n})$ is computed only once, and it is particularly useful when $f''(x_{n})/f'(x_{n})$ can be simplified:

x_{n+1}=x_{n}-{\frac {f(x_{n})}{f'(x_{n})-{\frac {f(x_{n})}{f'(x_{n})}}{\frac {f''(x_{n})}{2}}}}=x_{n}-{\frac {f(x_{n})}{f'(x_{n})}}\left[1-{\frac {f(x_{n})}{f'(x_{n})}}\cdot {\frac {f''(x_{n})}{2f'(x_{n})}}\right]^{-1}.

When the second derivative is very close to zero, the Halley's method iteration is almost the same as the Newton's method iteration.

Derivation

Consider the function

g(x)={\frac {f(x)}{\sqrt {|f'(x)|}}}.

Any root r of f that is not a root of its derivative is a root of g (i.e., $g(r)=0$ when $f(r)=0\neq {\sqrt {|f'(r)|}}$ ), and any root r of g must be a root of f provided the derivative of f at r is not infinite. Applying Newton's method to g gives

x_{n+1}=x_{n}-{\frac {g(x_{n})}{g'(x_{n})}}

with

g'(x)={\frac {2[f'(x)]^{2}-f(x)f''(x)}{2f'(x){\sqrt {|f'(x)|}}}},

and the result follows. Notice that if $f'(c) = 0$ , then one cannot apply this at c because $g (c)$ would be undefined. ^{[ further explanation needed ]}

Cubic convergence

Suppose a is a root of f but not of its derivative. And suppose that the third derivative of f exists and is continuous in a neighborhood of a and $x n$ is in that neighborhood. Then Taylor's theorem implies:

0=f(a)=f(x_{n})+f'(x_{n})(a-x_{n})+{\frac {f''(x_{n})}{2}}(a-x_{n})^{2}+{\frac {f'''(\xi )}{6}}(a-x_{n})^{3}

and also

0=f(a)=f(x_{n})+f'(x_{n})(a-x_{n})+{\frac {f''(\eta )}{2}}(a-x_{n})^{2},

where ξ and η are numbers lying between a and $x n$ . Multiply the first equation by $2f'(x_{n})$ and subtract from it the second equation times $f''(x_{n})(a-x_{n})$ to give:

{\begin{aligned}0&=2f(x_{n})f'(x_{n})+2[f'(x_{n})]^{2}(a-x_{n})+f'(x_{n})f''(x_{n})(a-x_{n})^{2}+{\frac {f'(x_{n})f'''(\xi )}{3}}(a-x_{n})^{3}\\&\qquad -f(x_{n})f''(x_{n})(a-x_{n})-f'(x_{n})f''(x_{n})(a-x_{n})^{2}-{\frac {f''(x_{n})f''(\eta )}{2}}(a-x_{n})^{3}.\end{aligned}}

Canceling $f'(x_{n})f''(x_{n})(a-x_{n})^{2}$ and re-organizing terms yields:

0=2f(x_{n})f'(x_{n})+\left(2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})\right)(a-x_{n})+\left({\frac {f'(x_{n})f'''(\xi )}{3}}-{\frac {f''(x_{n})f''(\eta )}{2}}\right)(a-x_{n})^{3}.

Put the second term on the left side and divide through by

2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})

to get:

a-x_{n}={\frac {-2f(x_{n})f'(x_{n})}{2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})}}-{\frac {2f'(x_{n})f'''(\xi )-3f''(x_{n})f''(\eta )}{6(2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n}))}}(a-x_{n})^{3}.

Thus:

a-x_{n+1}=-{\frac {2f'(x_{n})f'''(\xi )-3f''(x_{n})f''(\eta )}{12[f'(x_{n})]^{2}-6f(x_{n})f''(x_{n})}}(a-x_{n})^{3}.

The limit of the coefficient on the right side as $x n \to a$ is:

-{\frac {2f'(a)f'''(a)-3f''(a)f''(a)}{12[f'(a)]^{2}-6f(a)f''(a)}}.

If we take K to be a little larger than the absolute value of this, we can take absolute values of both sides of the formula and replace the absolute value of coefficient by its upper bound near a to get:

|a-x_{n+1}|\leq K|a-x_{n}|^{3}

which is what was to be proved.

To summarize,

\Delta x_{i+1}={\frac {3(f'')^{2}-2f'f'''}{12(f')^{2}}}(\Delta x_{i})^{3}+O[\Delta x_{i}]^{4},\qquad \Delta x_{i}\triangleq x_{i}-a.

^[5]

Halley's irrational method

Halley actually developed two third-order root-finding methods. The above, using only a division, is referred to as Halley's rational method. A second, "irrational" method uses a square root as well:^[6]^[7]^[8]

x_{n+1}=x_{n}-{\frac {f'(x_{n})-{\sqrt {[f'(x_{n})]^{2}-2f(x_{n})f''(x_{n})}}}{f''(x_{n})}}

This iteration was "deservedly preferred" to the rational method by Halley^[7] on the grounds that the denominator is smaller, making the division easier. A second advantage is that it tends to have about half of the error of the rational method, a benefit which multiplies as it is iterated. On a computer, it would appear to be slower as it has two slow operations (division and square root) instead of one, but on modern computers the reciprocal of the denominator can be computed at the same time as the square root via instruction pipelining, so the latency of each iteration differs very little.^[8]^: 24

Related Research Articles

In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the i-th approximation is derived from the previous ones.

<span class="mw-page-title-main">Newton's method</span> Algorithm for finding zeros of functions

In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a real-valued function $f$ , its derivative $f'$ , and an initial guess $x 0$ for a root of $f$ . If $f$ satisfies certain assumptions and the initial guess is close, then

In mathematical analysis, the Dirac delta function, also known as the unit impulse, is a generalized function on the real numbers, whose value is zero everywhere except at zero, and whose integral over the entire real line is equal to one. Thus it can be represented heuristically as

In numerical analysis, a root-finding algorithm is an algorithm for finding zeros, also called "roots", of continuous functions. A zero of a function $f$ is a number $x$ such that $f (x) = 0$ . As, generally, the zeros of a function cannot be computed exactly nor expressed in closed form, root-finding algorithms provide approximations to zeros. For functions from the real numbers to real numbers or from the complex numbers to the complex numbers, these are expressed either as floating-point numbers without error bounds or as floating-point values together with error bounds. The latter, approximations with error bounds, are equivalent to small isolating intervals for real roots or disks for complex roots.

In calculus, the trapezoidal rule is a technique for numerical integration, i.e., approximating the definite integral:

In numerical analysis, the secant method is a root-finding algorithm that uses a succession of roots of secant lines to better approximate a root of a function f. The secant method can be thought of as a finite-difference approximation of Newton's method, so it is considered a quasi-Newton method. Historically, it is as an evolution of the method of false position, which predates Newton's method by over 3000 years.

Stochastic gradient descent is an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient by an estimate thereof. Especially in high-dimensional optimization problems this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.

Muller's method is a root-finding algorithm, a numerical method for solving equations of the form f(x) = 0. It was first presented by David E. Muller in 1956.

In physics and fluid mechanics, a Blasius boundary layer describes the steady two-dimensional laminar boundary layer that forms on a semi-infinite plate which is held parallel to a constant unidirectional flow. Falkner and Skan later generalized Blasius' solution to wedge flow, i.e. flows in which the plate is not parallel to the flow.

Simple rational approximation (SRA) is a subset of interpolating methods using rational functions. Especially, SRA interpolates a given function with a specific rational function whose poles and zeros are simple, which means that there is no multiplicity in poles and zeros. Sometimes, it only implies simple poles.

In numerical analysis, fixed-point iteration is a method of computing fixed points of a function.

<span class="mw-page-title-main">Peridynamics</span>

Peridynamics is a non-local formulation of continuum mechanics that is oriented toward deformations with discontinuities, especially fractures. Originally, bond-based peridynamic has been introduced, wherein, internal interaction forces between a material point and all the other ones with which it can interact, are modeled as a central forces field. This type of force fields can be imagined as a mesh of bonds connecting each point of the body with every other interacting point within a certain distance which depends on material property, called peridynamic horizon. Later, to overcome bond-based framework limitations for the material Poisson’s ratio, state-base peridynamics, has been formulated. Its characteristic feature is that the force exchanged between a point and another one is influenced by the deformation state of all other bonds relative to its interaction zone.

In mathematics, and more specifically in numerical analysis, Householder's methods are a class of root-finding algorithms that are used for functions of one real variable with continuous derivatives up to some order $d + 1$ . Each of these methods is characterized by the number $d$ , which is known as the order of the method. The algorithm is iterative and has a rate of convergence of $d + 1$ .

In mathematics, prolate spheroidal wave functions are eigenfunctions of the Laplacian in prolate spheroidal coordinates, adapted to boundary conditions on certain ellipsoids of revolution. Related are the oblate spheroidal wave functions.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

In fracture mechanics, the energy release rate, $, is the rate at which energy is transformed as a material undergoes fracture. Mathematically, the energy release rate is expressed as the decrease in total potential energy per increase in fracture surface area, and is thus expressed in terms of energy per unit area. Various energy balances can be constructed relating the energy released during fracture to the energy of the resulting new surface, as well as other dissipative processes such as plasticity and heat generation. The energy release rate is central to the field of fracture mechanics when solving problems and estimating material properties related to fracture and fatigue.$

In fluid dynamics, a cnoidal wave is a nonlinear and exact periodic wave solution of the Korteweg–de Vries equation. These solutions are in terms of the Jacobi elliptic function cn, which is why they are coined cnoidal waves. They are used to describe surface gravity waves of fairly long wavelength, as compared to the water depth.

In applied mathematics, oblate spheroidal wave functions are involved in the solution of the Helmholtz equation in oblate spheroidal coordinates. When solving this equation, $, by the method of separation of variables,, with:$

Mean-field particle methods are a broad class of interacting type Monte Carlo algorithms for simulating from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability measures can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depends on the distributions of the current random states. A natural way to simulate these sophisticated nonlinear Markov processes is to sample a large number of copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and Markov chain Monte Carlo methods these mean-field particle techniques rely on sequential interacting samples. The terminology mean-field reflects the fact that each of the samples interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes. In other words, starting with a chaotic configuration based on independent copies of initial state of the nonlinear Markov chain model, the chaos propagates at any time horizon as the size the system tends to infinity; that is, finite blocks of particles reduces to independent copies of the nonlinear Markov process. This result is called the propagation of chaos property. The terminology "propagation of chaos" originated with the work of Mark Kac in 1976 on a colliding mean-field kinetic gas model.

In image analysis, the generalized structure tensor (GST) is an extension of the Cartesian structure tensor to curvilinear coordinates. It is mainly used to detect and to represent the "direction" parameters of curves, just as the Cartesian structure tensor detects and represents the direction in Cartesian coordinates. Curve families generated by pairs of locally orthogonal functions have been the best studied.

References

↑ Gundersen, Geir; Steihaug, Trond (2010). "On large scale unconstrained optimization problems and higher order methods" (PDF). Optimization Methods & Software. 25 (3): 337–358. doi:10.1080/10556780903239071 . Retrieved 2 September 2024.
↑ Boyd, John P. (2013). "Finding the Zeros of a Univariate Equation: Proxy Rootfinders, Chebyshev Interpolation, and the Companion Matrix". SIAM Review. 55 (2): 375–396. doi:10.1137/110838297.
↑ Scavo, T. R.; Thoo, J. B. (1995). "On the geometry of Halley's method". American Mathematical Monthly. 102 (5): 417–426. doi:10.2307/2975033. JSTOR 2975033.
↑ Alefeld, G. (1981). "On the convergence of Halley's method". American Mathematical Monthly. 88 (7): 530–536. doi:10.2307/2321760. JSTOR 2321760.
↑ Proinov, Petko D.; Ivanov, Stoil I. (2015). "On the convergence of Halley's method for simultaneous computation of polynomial zeros". J. Numer. Math. 23 (4): 379–394. doi:10.1515/jnma-2015-0026. S2CID 10356202.
↑ Bateman, Harry (January 1938). "Halley's methods for solving equations". The American Mathematical Monthly. 45 (1): 11–17. doi:10.2307/2303467. JSTOR 2303467.
1 2 Halley, Edmond (May 1694). "Methodus nova accurata & facilis inveniendi radices æqnationum quarumcumque generaliter, sine praviæ reductione". Philosophical Transactions of the Royal Society (in Latin). 18 (210): 136–148. doi: 10.1098/rstl.1694.0029 . An English translation was published as Halley, Edmond (1809) [May 1694]. "A new, exact, and easy Method of finding the Roots of any Equations generally, and that without any previous Reduction". In C. Hutton; G. Shaw; R. Pearson (eds.). The Philosophical Transactions of the Royal Society of London, from their commencement, in 1665, to the year 1800. Vol. III from 1683 to 1694. pp. 640–649.
1 2 Leroy, Robin (21 June 2021). A correctly rounded binary64 cube root (PDF) (Technical report).

External links

Weisstein, Eric W. "Halley's method". MathWorld .
Newton's method and high order iterations , Pascal Sebah and Xavier Gourdon, 2001 (the site has a link to a Postscript version for better formula display)

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Gundersen, Geir; Steihaug, Trond (2010). "On large scale unconstrained optimization problems and higher order methods" (PDF). Optimization Methods & Software. 25 (3): 337–358. doi:10.1080/10556780903239071 . Retrieved 2 September 2024.

[2] Boyd, John P. (2013). "Finding the Zeros of a Univariate Equation: Proxy Rootfinders, Chebyshev Interpolation, and the Companion Matrix". SIAM Review. 55 (2): 375–396. doi:10.1137/110838297.

[3] Scavo, T. R.; Thoo, J. B. (1995). "On the geometry of Halley's method". American Mathematical Monthly. 102 (5): 417–426. doi:10.2307/2975033. JSTOR 2975033.

[4] Alefeld, G. (1981). "On the convergence of Halley's method". American Mathematical Monthly. 88 (7): 530–536. doi:10.2307/2321760. JSTOR 2321760.

[5] Proinov, Petko D.; Ivanov, Stoil I. (2015). "On the convergence of Halley's method for simultaneous computation of polynomial zeros". J. Numer. Math. 23 (4): 379–394. doi:10.1515/jnma-2015-0026. S2CID 10356202.

[6] Bateman, Harry (January 1938). "Halley's methods for solving equations". The American Mathematical Monthly. 45 (1): 11–17. doi:10.2307/2303467. JSTOR 2303467.

[Halley1694-7] 1 2 Halley, Edmond (May 1694). "Methodus nova accurata & facilis inveniendi radices æqnationum quarumcumque generaliter, sine praviæ reductione". Philosophical Transactions of the Royal Society (in Latin). 18 (210): 136–148. doi: 10.1098/rstl.1694.0029 . An English translation was published as Halley, Edmond (1809) [May 1694]. "A new, exact, and easy Method of finding the Roots of any Equations generally, and that without any previous Reduction". In C. Hutton; G. Shaw; R. Pearson (eds.). The Philosophical Transactions of the Royal Society of London, from their commencement, in 1665, to the year 1800. Vol. III from 1683 to 1694. pp. 640–649.

[eggrobin-8] 1 2 Leroy, Robin (21 June 2021). A correctly rounded binary64 cube root (PDF) (Technical report).

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

v t e Root-finding algorithms
Bracketing (no derivative)	Bisection method Regula falsi ITP method
Householder	Newton's method Halley's method
Quasi-Newton	Broyden's method Secant method Newton–Krylov method Steffensen's method
Hybrid methods	Brent's method Ridders' method
Polynomial methods	Aberth method Bairstow's method Durand–Kerner method Graeffe's method Jenkins–Traub algorithm Lehmer–Schur algorithm Laguerre's method Splitting circle method
Other methods	Fixed-point iteration Inverse quadratic interpolation Muller's method Sidi's generalized secant method