Householder's method

In mathematics, and more specifically in numerical analysis, Householder's methods are a class of root-finding algorithms that are used for functions of one real variable with continuous derivatives up to some order d + 1. Each of these methods is characterized by the number d, which is known as the order of the method. The algorithm is iterative and has a rate of convergence of d + 1.

These methods are named after the American mathematician Alston Scott Householder.

Method

Householder's method is a numerical algorithm for solving the equation f(x) = 0. In this case, the function f has to be a function of one real variable. The method consists of a sequence of iterations

$$ x_{n+1} = x_n + d\,\frac{\left(1/f\right)^{(d-1)}(x_n)}{\left(1/f\right)^{(d)}(x_n)} $$

beginning with an initial guess $x_0$.[1]
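
A minimal sketch of this iteration, assuming SymPy for the derivatives of 1/f and mpmath for high-precision arithmetic (the helper name and example call are illustrative, not from the source):

```python
import sympy as sp
from mpmath import mp, mpf

mp.dps = 50  # working precision in decimal digits

def householder(f_expr, x, x0, d, steps=5):
    """Iterate x_{n+1} = x_n + d * (1/f)^(d-1)(x_n) / (1/f)^(d)(x_n)."""
    g = 1 / f_expr
    num = sp.lambdify(x, sp.diff(g, x, d - 1), "mpmath")  # (1/f)^(d-1)
    den = sp.lambdify(x, sp.diff(g, x, d), "mpmath")      # (1/f)^(d)
    xn = mpf(x0)
    for _ in range(steps):
        xn += d * num(xn) / den(xn)
    return xn

x = sp.symbols("x")
# Newton's classical example, shifted so the root lies near 0 (see the
# Example section below): f(x) = x^3 + 6x^2 + 10x - 1.
print(householder(x**3 + 6*x**2 + 10*x - 1, x, 0, d=2))
# -> 0.0945514815423265914823865405793... after a few steps
```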

If f is a d + 1 times continuously differentiable function and a is a zero of f but not of its derivative, then, in a neighborhood of a, the iterates $x_n$ satisfy:[citation needed]

$$ |x_{n+1} - a| \le K\,|x_n - a|^{d+1}, \quad \text{for some } K > 0. $$

This means that the iterates converge to the zero if the initial guess is sufficiently close, and that the convergence has order d + 1 or better. Furthermore, when close enough to a, it commonly is the case that

$$ x_{n+1} - a = C\,(x_n - a)^{d+1} + O\!\left((x_n - a)^{d+2}\right), \quad \text{for some } C \neq 0. $$

In particular, once the iterates are close to the root, each step roughly multiplies the number of correct digits by d + 1.

Despite their order of convergence, these methods are not widely used because the gain in precision is not commensurate with the rise in effort for large d. The Ostrowski index expresses the error reduction in the number of function evaluations instead of the iteration count.[2]

Motivation

First approach

Suppose f is analytic in a neighborhood of a and f(a) = 0. Then f has a Taylor series at a and its constant term is zero. Because this constant term is zero, the function f(x)/(x − a) will have a Taylor series at a and, when f′(a) ≠ 0, its constant term will not be zero. Because that constant term is not zero, it follows that the reciprocal (x − a)/f(x) has a Taylor series at a, which we will write as $\sum_{k=0}^{\infty} c_k (x-a)^k$, and its constant term $c_0$ will not be zero. Using that Taylor series we can write

$$ \frac{1}{f(x)} = \frac{c_0}{x-a} + \sum_{k=1}^{\infty} c_k\,(x-a)^{k-1}. $$

When we compute its d-th derivative, we note that the terms for k = 1, ..., d conveniently vanish:

$$ \left(\frac{1}{f}\right)^{(d)}(x) = \frac{(-1)^d\,d!\,c_0}{(x-a)^{d+1}} + \sum_{k=d+1}^{\infty} c_k\,\frac{(k-1)!}{(k-1-d)!}\,(x-a)^{k-1-d} = \frac{(-1)^d\,d!\,c_0}{(x-a)^{d+1}}\left(1 + O\!\left((x-a)^{d+1}\right)\right), $$

using big O notation. We thus get that the ratio

$$ d\,\frac{(1/f)^{(d-1)}(x)}{(1/f)^{(d)}(x)} = d\,\frac{\dfrac{(-1)^{d-1}(d-1)!\,c_0}{(x-a)^{d}}\left(1 + O\!\left((x-a)^{d}\right)\right)}{\dfrac{(-1)^d\,d!\,c_0}{(x-a)^{d+1}}\left(1 + O\!\left((x-a)^{d+1}\right)\right)} = -(x-a)\,\frac{1 + O\!\left((x-a)^{d}\right)}{1 + O\!\left((x-a)^{d+1}\right)}. $$

If a is the zero of f that is closest to x, then the second factor goes to 1 as d goes to infinity and $x + d\,\frac{(1/f)^{(d-1)}(x)}{(1/f)^{(d)}(x)}$ goes to a.
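
This cancellation can be checked symbolically. The following sketch (with an illustrative test function, not from the source) forms one order-d step as a single rational function and expands it around the root a = 2; the residual error starts at order d + 1:

```python
import sympy as sp

x = sp.symbols("x")
f = (x - 2) * (x**2 + 3)  # illustrative f with a simple zero at a = 2
for d in (1, 2, 3):
    g = 1 / f
    # One Householder step of order d, written as a single rational function:
    step = sp.cancel(x + d * sp.diff(g, x, d - 1) / sp.diff(g, x, d))
    # Expanding step - a around the root shows only terms of order >= d + 1:
    print(d, sp.series(step - 2, x, 2, d + 2))
```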

Second approach

Suppose x = a is a simple root. Then near x = a, (1/f)(x) is a meromorphic function. Suppose we have the Taylor expansion

$$ \frac{1}{f(x)} = \sum_{d=0}^{\infty} a_d\,(x-b)^d $$

around a point b that is closer to a than it is to any other zero of f. By König's theorem, we have:

$$ a - b = \lim_{d\to\infty} \frac{a_{d-1}}{a_d}. $$

Since the order-d Householder step taken at b equals $b + d\,\frac{(1/f)^{(d-1)}(b)}{(1/f)^{(d)}(b)} = b + \frac{a_{d-1}}{a_d}$, this suggests that Householder's iteration might be a good convergent iteration. The actual proof of the convergence is also based on these ideas.
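
A quick numerical illustration of König's theorem (with an illustrative test function, not from the source), showing the coefficient ratios of 1/f at b = 0 approaching the nearest zero:

```python
from mpmath import mp, mpf, taylor

mp.dps = 30
# Illustrative f: the zero nearest to b = 0 is a = 0.5; the other zeros are
# complex with modulus sqrt(3), so the ratios converge geometrically.
f = lambda x: (x - mpf("0.5")) * (x**2 + x + 3)
coeffs = taylor(lambda x: 1 / f(x), 0, 8)  # Taylor coefficients a_0..a_8 at b = 0
for d in range(1, 9):
    print(d, coeffs[d - 1] / coeffs[d])    # -> a - b = 0.5
```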

The methods of lower order

Householder's method of order 1 is just Newton's method, since

$$ x_{n+1} = x_n + 1\cdot\frac{(1/f)^{(0)}(x_n)}{(1/f)^{(1)}(x_n)} = x_n + \frac{1/f(x_n)}{-f'(x_n)/f(x_n)^2} = x_n - \frac{f(x_n)}{f'(x_n)}. $$
For Householder's method of order 2 one gets Halley's method, since the identities

$$ \left(\frac{1}{f}\right)'(x) = -\frac{f'(x)}{f(x)^2} $$

and

$$ \left(\frac{1}{f}\right)''(x) = -\frac{f''(x)}{f(x)^2} + 2\,\frac{f'(x)^2}{f(x)^3} $$

result in

$$ x_{n+1} = x_n + 2\,\frac{(1/f)'(x_n)}{(1/f)''(x_n)} = x_n - \frac{2\,f(x_n)\,f'(x_n)}{2\,f'(x_n)^2 - f(x_n)\,f''(x_n)} = x_n + h_n\,\frac{1}{1 + \frac{1}{2}\,\frac{f''(x_n)}{f'(x_n)}\,h_n}. $$

In the last expression, $h_n = -\frac{f(x_n)}{f'(x_n)}$ is the update of the Newton iteration at the point $x_n$; this form is included to show where the difference from simple Newton's method lies.

The third-order method is obtained from the identity for the third derivative of 1/f,

$$ \left(\frac{1}{f}\right)'''(x) = -\frac{f'''(x)}{f(x)^2} + 6\,\frac{f'(x)\,f''(x)}{f(x)^3} - 6\,\frac{f'(x)^3}{f(x)^4}, $$

and has the formula

$$ x_{n+1} = x_n + 3\,\frac{(1/f)''(x_n)}{(1/f)'''(x_n)} = x_n - \frac{6\,f(x_n)\,f'(x_n)^2 - 3\,f(x_n)^2 f''(x_n)}{6\,f'(x_n)^3 - 6\,f(x_n)\,f'(x_n)\,f''(x_n) + f(x_n)^2 f'''(x_n)}, $$

and so on.
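
In code, the closed-form updates of orders 1 through 3 read as follows (a sketch in terms of f and its first three derivatives; the function names are illustrative, not from the source):

```python
def newton_step(f, f1, x):
    """Order 1: x - f/f'."""
    return x - f(x) / f1(x)

def halley_step(f, f1, f2, x):
    """Order 2: x - 2ff' / (2f'^2 - ff'')."""
    fx, f1x, f2x = f(x), f1(x), f2(x)
    return x - 2 * fx * f1x / (2 * f1x**2 - fx * f2x)

def householder3_step(f, f1, f2, f3, x):
    """Order 3: x - (6ff'^2 - 3f^2 f'') / (6f'^3 - 6ff'f'' + f^2 f''')."""
    fx, f1x, f2x, f3x = f(x), f1(x), f2(x), f3(x)
    return x - (6 * fx * f1x**2 - 3 * fx**2 * f2x) / \
               (6 * f1x**3 - 6 * fx * f1x * f2x + fx**2 * f3x)

# One order-3 step on f(x) = x^3 + 6x^2 + 10x - 1 from x = 0 (see the
# Example section below) gives 106/1121 = 0.0945584...:
print(householder3_step(lambda x: x**3 + 6*x**2 + 10*x - 1,
                        lambda x: 3*x**2 + 12*x + 10,
                        lambda x: 6*x + 12,
                        lambda x: 6.0, 0.0))
```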

Example

The first problem solved by Newton with the Newton–Raphson–Simpson method was the polynomial equation $y^3 - 2y - 5 = 0$. He observed that there should be a solution close to 2. Replacing y = x + 2 transforms the equation into

$$ 0 = f(x) = -1 + 10x + 6x^2 + x^3. $$

The Taylor series of the reciprocal function starts with

$$ \frac{1}{f(x)} = -1 - 10x - 106x^2 - 1121x^3 - 11856x^4 - 125392x^5 - 1326177x^6 - 14025978x^7 - 148342234x^8 - 1568904385x^9 - 16593123232x^{10} + O(x^{11}). $$

The result of applying Householder's methods of various orders at x = 0 is also obtained by dividing neighboring coefficients of this power series. For the first orders one gets the following values after just one iteration step; for example, in the case of the third order, $x_1 = \frac{106}{1121} = 0.09455842997\ldots$

 d   x_1
 1   0.100000000000000000000000000000000
 2   0.094339622641509433962264150943396
 3   0.094558429973238180196253345227475
 4   0.094551282051282051282051282051282
 5   0.094551486538216154140615031261962
 6   0.094551481438752142436492263099118
 7   0.094551481543746895938379484125812
 8   0.094551481542336756233561913325371
 9   0.094551481542324837086869382419375
10   0.094551481542326678478801765822985

As one can see, each order d yields a little more than d correct decimal places after this single step. The first one hundred digits of the correct solution are 0.0945514815423265914823865405793029638573061056282391803041285290453121899834836671462672817771577578.
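
The coefficients of 1/f, and hence the x_1 column above, can be reproduced by inverting the power series of f with exact rational arithmetic (a minimal sketch, not from the source):

```python
from fractions import Fraction

# f(x) = -1 + 10x + 6x^2 + x^3; compute the series c_0 + c_1 x + ... of 1/f
# from the convolution identity sum_k f_k c_{n-k} = [n == 0].
f = [Fraction(-1), Fraction(10), Fraction(6), Fraction(1)]
c = [Fraction(1) / f[0]]
for n in range(1, 11):
    s = sum(f[k] * c[n - k] for k in range(1, min(n, 3) + 1))
    c.append(-s / f[0])

for d in range(1, 11):
    # One Householder step of order d at x = 0 equals c_{d-1} / c_d:
    print(d, c[d - 1] / c[d], float(c[d - 1] / c[d]))
```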

Let us compute a few iterates for the lowest orders of this example. The derivatives are

$$ f'(x) = 10 + 12x + 3x^2, \qquad f''(x) = 12 + 6x, \qquad f'''(x) = 6, $$

and the update relations are:

1st order: $x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$ (Newton)
2nd order: $x_{n+1} = x_n - \dfrac{2\,f(x_n)\,f'(x_n)}{2\,f'(x_n)^2 - f(x_n)\,f''(x_n)}$ (Halley)
3rd order: $x_{n+1} = x_n - \dfrac{6\,f(x_n)\,f'(x_n)^2 - 3\,f(x_n)^2 f''(x_n)}{6\,f'(x_n)^3 - 6\,f(x_n)\,f'(x_n)\,f''(x_n) + f(x_n)^2 f'''(x_n)}$
x      1st (Newton)                           2nd (Halley)                           3rd order                              4th order
x_1    0.100000000000000000000000000000000    0.094339622641509433962264150943395    0.094558429973238180196253345227475    0.09455128205128
x_2    0.094568121104185218165627782724844    0.094551481540164214717107966227500    0.094551481542326591482567319958483
x_3    0.094551481698199302883823703544266    0.094551481542326591482386540579303    0.094551481542326591482386540579303
x_4    0.094551481542326591496064847153714    0.094551481542326591482386540579303    0.094551481542326591482386540579303
x_5    0.094551481542326591482386540579303
x_6    0.094551481542326591482386540579303
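
These columns can be reproduced at 33 significant digits with arbitrary-precision arithmetic, for instance with mpmath (a sketch, not from the source):

```python
from mpmath import mp, mpf

mp.dps = 33
f  = lambda x: x**3 + 6*x**2 + 10*x - 1
f1 = lambda x: 3*x**2 + 12*x + 10
f2 = lambda x: 6*x + 12
# f'''(x) = 6 is inlined as 6*f(x)**2 in the order-3 denominator below.

steps = {
    "1st (Newton)": lambda x: x - f(x) / f1(x),
    "2nd (Halley)": lambda x: x - 2*f(x)*f1(x) / (2*f1(x)**2 - f(x)*f2(x)),
    "3rd order":    lambda x: x - (6*f(x)*f1(x)**2 - 3*f(x)**2*f2(x))
                                / (6*f1(x)**3 - 6*f(x)*f1(x)*f2(x) + 6*f(x)**2),
}
for name, step in steps.items():
    x = mpf(0)
    for n in range(1, 5):
        x = step(x)
        print(name, n, x)
```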


Derivation

An exact derivation of Householder's methods starts from the Padé approximation of order d + 1 of the function, where the approximant with linear numerator is chosen. Once this has been achieved, the update for the next approximation results from computing the unique zero of the numerator.

The Padé approximation has the form

$$ f(x_n + h) = \frac{a_0 + h}{b_0 + b_1 h + \cdots + b_{d-1} h^{d-1}} + O\!\left(h^{d+1}\right). $$

The rational function has a zero at $h = -a_0$.

Just as the Taylor polynomial of degree d has d + 1 coefficients that depend on the function f, the Padé approximation also has d + 1 coefficients dependent on f and its derivatives. More precisely, in any Padé approximant, the degrees of the numerator and denominator polynomials have to add to the order of the approximant. Therefore, $b_d = 0$ has to hold.

One could determine the Padé approximant starting from the Taylor polynomial of f using Euclid's algorithm. However, starting from the Taylor polynomial of 1/f is shorter and leads directly to the given formula. Since

$$ \frac{1}{f(x_n + h)} = (1/f)(x_n) + (1/f)'(x_n)\,h + \cdots + (1/f)^{(d-1)}(x_n)\,\frac{h^{d-1}}{(d-1)!} + (1/f)^{(d)}(x_n)\,\frac{h^d}{d!} + O\!\left(h^{d+1}\right) $$

has to be equal to the inverse of the desired rational function, we get, after multiplying by $a_0 + h$ and comparing the coefficients of $h^d$, the equation

$$ 0 = b_d = a_0\,\frac{(1/f)^{(d)}(x_n)}{d!} + \frac{(1/f)^{(d-1)}(x_n)}{(d-1)!}. $$

Now, solving the last equation for the zero $h = -a_0$ of the numerator results in

$$ h = -a_0 = \frac{\dfrac{(1/f)^{(d-1)}(x_n)}{(d-1)!}}{\dfrac{(1/f)^{(d)}(x_n)}{d!}} = d\,\frac{(1/f)^{(d-1)}(x_n)}{(1/f)^{(d)}(x_n)}. $$

This implies the iteration formula

$$ x_{n+1} = x_n + d\,\frac{\left(1/f\right)^{(d-1)}(x_n)}{\left(1/f\right)^{(d)}(x_n)}. $$
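
This derivation can be checked numerically: mpmath's pade routine builds the approximant with linear numerator and denominator of degree d − 1 from the Taylor coefficients, and the zero of that numerator is the Householder update (a sketch using the example polynomial from above; not from the source):

```python
from mpmath import mp, mpf, taylor, pade

mp.dps = 30
f = lambda x: x**3 + 6*x**2 + 10*x - 1   # example polynomial from above
d = 3
xn = mpf(0)
a = taylor(f, xn, d)          # Taylor coefficients of f(x_n + h), orders 0..d
p, q = pade(a, 1, d - 1)      # numerator p[0] + p[1]*h, denominator degree d-1
print(xn - p[0] / p[1])       # zero of the numerator: 106/1121 = 0.0945584...
```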

Relation to Newton's method

Householder's method applied to the real-valued function f(x) is the same as Newton's method applied to the function g(x):

$$ x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)} $$

with

$$ g(x) = \left|\left(1/f\right)^{(d-1)}(x)\right|^{-1/d}. $$

In particular, d = 1 gives Newton's method unmodified and d = 2 gives Halley's method.
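
A numerical spot-check of this equivalence for d = 3 (illustrative, not from the source), comparing one Newton step on g with the direct Householder step:

```python
from mpmath import mp, mpf, diff

mp.dps = 30
f = lambda t: t**3 + 6*t**2 + 10*t - 1           # example polynomial from above
d = 3
u = lambda t: diff(lambda s: 1/f(s), t, d - 1)   # u = (1/f)^(d-1)
g = lambda t: abs(u(t))**(mpf(-1)/d)             # g = |u|^(-1/d)

x0 = mpf("0.05")                                 # an arbitrary nearby point
newton_on_g = x0 - g(x0) / diff(g, x0)
householder = x0 + d * u(x0) / diff(lambda s: 1/f(s), x0, d)
print(newton_on_g)  # the two updates agree, up to numerical-differentiation error
print(householder)
```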

References

  1. Householder, Alston Scott (1970). The Numerical Treatment of a Single Nonlinear Equation. McGraw-Hill. p. 169. ISBN 0-07-030465-3.
  2. Ostrowski, A. M. (1966). Solution of Equations and Systems of Equations. Pure and Applied Mathematics. Vol. 9 (2nd ed.). New York: Academic Press.