Halley's method

Last updated January 30, 2024

In numerical analysis, Halley's method is a root-finding algorithm used for functions of one real variable with a continuous second derivative. Edmond Halley was an English mathematician and astronomer who introduced the method now called by his name.

Method

Halley's method is a numerical algorithm for solving the nonlinear equation $f (x) = 0$ . In this case, the function f has to be a function of one real variable. The method consists of a sequence of iterations:

x_{n+1}=x_{n}-{\frac {2f(x_{n})f'(x_{n})}{2{[f'(x_{n})]}^{2}-f(x_{n})f''(x_{n})}}

beginning with an initial guess $x 0$ .^[2]

If f is a three times continuously differentiable function and a is a zero of f but not of its derivative, then, in a neighborhood of a, the iterates $x n$ satisfy:

|x_{n+1}-a|\leq K\cdot {|x_{n}-a|}^{3},{\text{ for some }}K>0.

This means that the iterates converge to the zero if the initial guess is sufficiently close, and that the convergence is cubic.^[3]

The following alternative formulation shows the similarity between Halley's method and Newton's method. The expression $f(x_{n})/f'(x_{n})$ is computed only once, and it is particularly useful when $f''(x_{n})/f'(x_{n})$ can be simplified:

x_{n+1}=x_{n}-{\frac {f(x_{n})}{f'(x_{n})-{\frac {f(x_{n})}{f'(x_{n})}}{\frac {f''(x_{n})}{2}}}}=x_{n}-{\frac {f(x_{n})}{f'(x_{n})}}\left[1-{\frac {f(x_{n})}{f'(x_{n})}}\cdot {\frac {f''(x_{n})}{2f'(x_{n})}}\right]^{-1}.

When the second derivative is very close to zero, the Halley's method iteration is almost the same as the Newton's method iteration.

Derivation

Consider the function

g(x)={\frac {f(x)}{\sqrt {|f'(x)|}}}.

Any root r of f that is not a root of its derivative is a root of g (i.e., $g(r)=0$ when $f(r)=0\neq {\sqrt {|f'(r)|}}$ ), and any root r of g must be a root of f provided the derivative of f at r is not infinite. Applying Newton's method to g gives

x_{n+1}=x_{n}-{\frac {g(x_{n})}{g'(x_{n})}}

with

g'(x)={\frac {2[f'(x)]^{2}-f(x)f''(x)}{2f'(x){\sqrt {|f'(x)|}}}},

and the result follows. Notice that if $f'(c) = 0$ , then one cannot apply this at c because $g (c)$ would be undefined. ^{[ further explanation needed ]}

Cubic convergence

Suppose a is a root of f but not of its derivative. And suppose that the third derivative of f exists and is continuous in a neighborhood of a and $x n$ is in that neighborhood. Then Taylor's theorem implies:

0=f(a)=f(x_{n})+f'(x_{n})(a-x_{n})+{\frac {f''(x_{n})}{2}}(a-x_{n})^{2}+{\frac {f'''(\xi )}{6}}(a-x_{n})^{3}

and also

0=f(a)=f(x_{n})+f'(x_{n})(a-x_{n})+{\frac {f''(\eta )}{2}}(a-x_{n})^{2},

where ξ and η are numbers lying between a and $x n$ . Multiply the first equation by $2f'(x_{n})$ and subtract from it the second equation times $f''(x_{n})(a-x_{n})$ to give:

{\begin{aligned}0&=2f(x_{n})f'(x_{n})+2[f'(x_{n})]^{2}(a-x_{n})+f'(x_{n})f''(x_{n})(a-x_{n})^{2}+{\frac {f'(x_{n})f'''(\xi )}{3}}(a-x_{n})^{3}\\&\qquad -f(x_{n})f''(x_{n})(a-x_{n})-f'(x_{n})f''(x_{n})(a-x_{n})^{2}-{\frac {f''(x_{n})f''(\eta )}{2}}(a-x_{n})^{3}.\end{aligned}}

Canceling $f'(x_{n})f''(x_{n})(a-x_{n})^{2}$ and re-organizing terms yields:

0=2f(x_{n})f'(x_{n})+\left(2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})\right)(a-x_{n})+\left({\frac {f'(x_{n})f'''(\xi )}{3}}-{\frac {f''(x_{n})f''(\eta )}{2}}\right)(a-x_{n})^{3}.

Put the second term on the left side and divide through by

2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})

to get:

a-x_{n}={\frac {-2f(x_{n})f'(x_{n})}{2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n})}}-{\frac {2f'(x_{n})f'''(\xi )-3f''(x_{n})f''(\eta )}{6(2[f'(x_{n})]^{2}-f(x_{n})f''(x_{n}))}}(a-x_{n})^{3}.

Thus:

a-x_{n+1}=-{\frac {2f'(x_{n})f'''(\xi )-3f''(x_{n})f''(\eta )}{12[f'(x_{n})]^{2}-6f(x_{n})f''(x_{n})}}(a-x_{n})^{3}.

The limit of the coefficient on the right side as $x n \to a$ is:

-{\frac {2f'(a)f'''(a)-3f''(a)f''(a)}{12[f'(a)]^{2}-6f(a)f''(a)}}.

If we take K to be a little larger than the absolute value of this, we can take absolute values of both sides of the formula and replace the absolute value of coefficient by its upper bound near a to get:

|a-x_{n+1}|\leq K|a-x_{n}|^{3}

which is what was to be proved.

To summarize,

\Delta x_{i+1}={\frac {3(f'')^{2}-2f'f'''}{12(f')^{2}}}(\Delta x_{i})^{3}+O[\Delta x_{i}]^{4},\qquad \Delta x_{i}\triangleq x_{i}-a.

^[4]

Related Research Articles

In numerical analysis, Newton's method, also known as the Newton–Raphson method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a real-valued function $f$ , its derivative $f'$ , and an initial guess $x 0$ for a root of $f$ . If $f$ satisfies certain assumptions and the initial guess is close, then

In mathematical analysis, the Dirac delta function, also known as the unit impulse, is a generalized function on the real numbers, whose value is zero everywhere except at zero, and whose integral over the entire real line is equal to one. Since there is no function having this property, to model the delta "function" rigorously involves the use of limits or, as is common in mathematics, measure theory and the theory of distributions.

A finite difference is a mathematical expression of the form $f (x + b) - f (x + a)$ . If a finite difference is divided by $b - a$ , one gets a difference quotient. The approximation of derivatives by finite differences plays a central role in finite difference methods for the numerical solution of differential equations, especially boundary value problems.

In numerical analysis, a root-finding algorithm is an algorithm for finding zeros, also called "roots", of continuous functions. A zero of a function $f$ , from the real numbers to real numbers or from the complex numbers to the complex numbers, is a number $x$ such that $f (x) = 0$ . As, generally, the zeros of a function cannot be computed exactly nor expressed in closed form, root-finding algorithms provide approximations to zeros, expressed either as floating-point numbers or as small isolating intervals, or disks for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).

In calculus, the trapezoidal rule is a technique for numerical integration, i.e., approximating the definite integral:

In numerical analysis, the secant method is a root-finding algorithm that uses a succession of roots of secant lines to better approximate a root of a function f. The secant method can be thought of as a finite-difference approximation of Newton's method. However, the secant method predates Newton's method by over 3000 years.

In mathematics, Laplace's method, named after Pierre-Simon Laplace, is a technique used to approximate integrals of the form

Second-order linear partial differential equations (PDEs) are classified as either elliptic, hyperbolic, or parabolic. Any second-order linear PDE in two variables can be written in the form

The Gauss–Newton algorithm is used to solve non-linear least squares problems, which is equivalent to minimizing a sum of squared function values. It is an extension of Newton's method for finding a minimum of a non-linear function. Since a sum of squares must be nonnegative, the algorithm can be viewed as using Newton's method to iteratively approximate zeroes of the components of the sum, and thus minimizing the sum. In this sense, the algorithm is also an effective method for solving overdetermined systems of equations. It has the advantage that second derivatives, which can be challenging to compute, are not required.

Stochastic gradient descent is an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient by an estimate thereof. Especially in high-dimensional optimization problems this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.

Muller's method is a root-finding algorithm, a numerical method for solving equations of the form f(x) = 0. It was first presented by David E. Muller in 1956.

Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.

<span class="mw-page-title-main">Peridynamics</span>

Peridynamics is a non-local formulation of continuum mechanics that is oriented toward deformations with discontinuities, especially fractures. Originally, bond-based peridynamic has been introduced, wherein, internal interaction forces between a material point and all the other ones with which it can interact, are modeled as a central forces field. This type of force fields can be imagined as a mesh of bonds connecting each point of the body with every other interacting point within a certain distance which depends on material property, called peridynamic horizon. Later, to overcome bond-based framework limitations for the material Poisson’s ratio, state-base peridynamics, has been formulated. Its characteristic feature is that the force exchanged between a point and another one is influenced by the deformation state of all other bonds relative to its interaction zone.

In mathematics, and more specifically in numerical analysis, Householder's methods are a class of root-finding algorithms that are used for functions of one real variable with continuous derivatives up to some order $d + 1$ . Each of these methods is characterized by the number $d$ , which is known as the order of the method. The algorithm is iterative and has a rate of convergence of $d + 1$ .

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

In fracture mechanics, the energy release rate, $, is the rate at which energy is transformed as a material undergoes fracture. Mathematically, the energy release rate is expressed as the decrease in total potential energy per increase in fracture surface area, and is thus expressed in terms of energy per unit area. Various energy balances can be constructed relating the energy released during fracture to the energy of the resulting new surface, as well as other dissipative processes such as plasticity and heat generation. The energy release rate is central to the field of fracture mechanics when solving problems and estimating material properties related to fracture and fatigue.$

In mathematical physics, the Berezin integral, named after Felix Berezin,, is a way to define integration for functions of Grassmann variables. It is not an integral in the Lebesgue sense; the word "integral" is used because the Berezin integral has properties analogous to the Lebesgue integral and because it extends the path integral in physics, where it is used as a sum over histories for fermions.

In fluid dynamics, a cnoidal wave is a nonlinear and exact periodic wave solution of the Korteweg–de Vries equation. These solutions are in terms of the Jacobi elliptic function cn, which is why they are coined cnoidal waves. They are used to describe surface gravity waves of fairly long wavelength, as compared to the water depth.

In thermal quantum field theory, the Matsubara frequency summation is the summation over discrete imaginary frequencies. It takes the following form

Mean-field particle methods are a broad class of interacting type Monte Carlo algorithms for simulating from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability measures can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depends on the distributions of the current random states. A natural way to simulate these sophisticated nonlinear Markov processes is to sample a large number of copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and Markov chain Monte Carlo methods these mean-field particle techniques rely on sequential interacting samples. The terminology mean-field reflects the fact that each of the samples interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes. In other words, starting with a chaotic configuration based on independent copies of initial state of the nonlinear Markov chain model, the chaos propagates at any time horizon as the size the system tends to infinity; that is, finite blocks of particles reduces to independent copies of the nonlinear Markov process. This result is called the propagation of chaos property. The terminology "propagation of chaos" originated with the work of Mark Kac in 1976 on a colliding mean-field kinetic gas model.

References

↑ Boyd, J. P. (2013). "Finding the Zeros of a Univariate Equation: Proxy Rootfinders, Chebyshev Interpolation, and the Companion Matrix". SIAM Review. 55 (2): 375–396. doi:10.1137/110838297.
↑ Scavo, T. R.; Thoo, J. B. (1995). "On the geometry of Halley's method". American Mathematical Monthly. 102 (5): 417–426. doi:10.2307/2975033. JSTOR 2975033.
↑ Alefeld, G. (1981). "On the convergence of Halley's method". American Mathematical Monthly. 88 (7): 530–536. doi:10.2307/2321760. JSTOR 2321760.
↑ Proinov, Petko D.; Ivanov, Stoil I. (2015). "On the convergence of Halley's method for simultaneous computation of polynomial zeros". J. Numer. Math. 23 (4): 379–394. doi:10.1515/jnma-2015-0026. S2CID 10356202.

External links

Weisstein, Eric W. "Halley's method". MathWorld .
Newton's method and high order iterations , Pascal Sebah and Xavier Gourdon, 2001 (the site has a link to a Postscript version for better formula display)

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Boyd, J. P. (2013). "Finding the Zeros of a Univariate Equation: Proxy Rootfinders, Chebyshev Interpolation, and the Companion Matrix". SIAM Review. 55 (2): 375–396. doi:10.1137/110838297.

[2] Scavo, T. R.; Thoo, J. B. (1995). "On the geometry of Halley's method". American Mathematical Monthly. 102 (5): 417–426. doi:10.2307/2975033. JSTOR 2975033.

[3] Alefeld, G. (1981). "On the convergence of Halley's method". American Mathematical Monthly. 88 (7): 530–536. doi:10.2307/2321760. JSTOR 2321760.

[4] Proinov, Petko D.; Ivanov, Stoil I. (2015). "On the convergence of Halley's method for simultaneous computation of polynomial zeros". J. Numer. Math. 23 (4): 379–394. doi:10.1515/jnma-2015-0026. S2CID 10356202.

[1]

[2]

[3]

[4]

v t e Root-finding algorithms
Bracketing (no derivative)	Bisection method Regula falsi ITP Method
Newton	Newton's method
Quasi-Newton	Halley's method Muller's method Secant method
Hybrid methods	Brent's method Ridders' method
Polynomial methods	Bairstow's method Jenkins–Traub method Laguerre's method