Root-finding algorithm

Last updated

In numerical analysis, a root-finding algorithm is an algorithm for finding zeros, also called "roots", of continuous functions. A zero of a function f is a number x such that f(x) = 0. As, generally, the zeros of a function cannot be computed exactly nor expressed in closed form, root-finding algorithms provide approximations to zeros. For functions from the real numbers to real numbers or from the complex numbers to the complex numbers, these are expressed either as floating-point numbers without error bounds or as floating-point values together with error bounds. The latter, approximations with error bounds, are equivalent to small isolating intervals for real roots or disks for complex roots. [1]

Contents

Solving an equation f(x) = g(x) is the same as finding the roots of the function h(x) = f(x) – g(x). Thus root-finding algorithms can be used to solve any equation of continuous functions. However, most root-finding algorithms do not guarantee that they will find all roots of a function, and if such an algorithm does not find any root, that does not necessarily mean that no root exists.

Most numerical root-finding methods are iterative methods, producing a sequence of numbers that ideally converges towards a root as a limit. They require one or more initial guesses of the root as starting values, then each iteration of the algorithm produces a successively more accurate approximation to the root. Since the iteration must be stopped at some point, these methods produce an approximation to the root, not an exact solution. Many methods compute subsequent values by evaluating an auxiliary function on the preceding values. The limit is thus a fixed point of the auxiliary function, which is chosen for having the roots of the original equation as fixed points and for converging rapidly to these fixed points.

The behavior of general root-finding algorithms is studied in numerical analysis. However, for polynomials specifically, the study of root-finding algorithms belongs to computer algebra, since algebraic properties of polynomials are fundamental for the most efficient algorithms. The efficiency and applicability of an algorithm may depend sensitively on the characteristics of the given functions. For example, many algorithms use the derivative of the input function, while others work on every continuous function. In general, numerical algorithms are not guaranteed to find all the roots of a function, so failing to find a root does not prove that there is no root. However, for polynomials, there are specific algorithms that use algebraic properties for certifying that no root is missed and for locating the roots in separate intervals (or disks for complex roots) that are small enough to ensure the convergence of numerical methods (typically Newton's method) to the unique root within each interval (or disk).

Bracketing methods

Bracketing methods determine successively smaller intervals (brackets) that contain a root. When the interval is small enough, then a root is considered found. These generally use the intermediate value theorem, which asserts that if a continuous function has values of opposite signs at the end points of an interval, then the function has at least one root in the interval. Therefore, they require starting with an interval such that the function takes opposite signs at the end points of the interval. However, in the case of polynomials there are other methods such as Descartes' rule of signs, Budan's theorem and Sturm's theorem for bounding or determining the number of roots in an interval. They lead to efficient algorithms for real-root isolation of polynomials, which find all real roots with a guaranteed accuracy.

Bisection method

The simplest root-finding algorithm is the bisection method. Let f be a continuous function for which one knows an interval [a, b] such that f(a) and f(b) have opposite signs (a bracket). Let c = (a +b)/2 be the middle of the interval (the midpoint or the point that bisects the interval). Then either f(a) and f(c), or f(c) and f(b) have opposite signs, and one has divided by two the size of the interval. Although the bisection method is robust, it gains one and only one bit of accuracy with each iteration. Therefore, the number of function evaluations required for finding an ε-approximate root is . Other methods, under appropriate conditions, can gain accuracy faster.

False position (regula falsi)

The false position method, also called the regula falsi method, is similar to the bisection method, but instead of using bisection search's middle of the interval it uses the x-intercept of the line that connects the plotted function values at the endpoints of the interval, that is

False position is similar to the secant method, except that, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method can be faster than the bisection method and will never diverge like the secant method. However, it may fail to converge in some naive implementations due to roundoff errors that may lead to a wrong sign for f(c). Typically, this may occur if the derivative of f is large in the neighborhood of the root.

ITP method

The ITP method is the only known method to bracket the root with the same worst case guarantees of the bisection method while guaranteeing a superlinear convergence to the root of smooth functions as the secant method. It is also the only known method guaranteed to outperform the bisection method on the average for any continuous distribution on the location of the root (see ITP Method#Analysis). It does so by keeping track of both the bracketing interval as well as the minmax interval in which any point therein converges as fast as the bisection method. The construction of the queried point c follows three steps: interpolation (similar to the regula falsi), truncation (adjusting the regula falsi similar to Regula falsi § Improvements in regula falsi) and then projection onto the minmax interval. The combination of these steps produces a simultaneously minmax optimal method with guarantees similar to interpolation based methods for smooth functions, and in practice will outperform both the bisection method and interpolation based methods applied to both smooth and non-smooth functions.

Interpolation

Many root-finding processes work by interpolation. This consists in using the last computed approximate values of the root for approximating the function by a polynomial of low degree, which takes the same values at these approximate roots. Then the root of the polynomial is computed and used as a new approximate value of the root of the function, and the process is iterated.

Interpolating two values yields a line: a polynomial of degree one. This is the basis of the secant method. Regula falsi is also an interpolation method that interpolates two points at a time but it differs from the secant method by using two points that are not necessarily the last two computed points. Three values define a parabolic curve: a quadratic function. This is the basis of Muller's method.

Iterative methods

Although all root-finding algorithms proceed by iteration, an iterative root-finding method generally uses a specific type of iteration, consisting of defining an auxiliary function, which is applied to the last computed approximations of a root for getting a new approximation. The iteration stops when a fixed point of the auxiliary function is reached to the desired precision, i.e., when a new computed value is sufficiently close to the preceding ones.

Newton's method (and similar derivative-based methods)

Newton's method assumes the function f to have a continuous derivative. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method; its order of convergence is usually quadratic whereas the bisection method's is linear. Newton's method is also important because it readily generalizes to higher-dimensional problems. Householder's methods are a class of Newton-like methods with higher orders of convergence. The first one after Newton's method is Halley's method with cubic order of convergence.

Secant method

Replacing the derivative in Newton's method with a finite difference, we get the secant method. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order of convergence is the golden ratio, approximately 1.62 [2] ). A generalization of the secant method in higher dimensions is Broyden's method.

Steffensen's method

If we use a polynomial fit to remove the quadratic part of the finite difference used in the secant method, so that it better approximates the derivative, we obtain Steffensen's method, which has quadratic convergence, and whose behavior (both good and bad) is essentially the same as Newton's method but does not require a derivative.

Fixed point iteration method

We can use the fixed-point iteration to find the root of a function. Given a function which we have set to zero to find the root (), we rewrite the equation in terms of so that becomes (note, there are often many functions for each function). Next, we relabel each side of the equation as so that we can perform the iteration. Next, we pick a value for and perform the iteration until it converges towards a root of the function. If the iteration converges, it will converge to a root. The iteration will only converge if .

As an example of converting to , if given the function , we will rewrite it as one of the following equations.

,
,
,
, or
.

Inverse interpolation

The appearance of complex values in interpolation methods can be avoided by interpolating the inverse of f, resulting in the inverse quadratic interpolation method. Again, convergence is asymptotically faster than the secant method, but inverse quadratic interpolation often behaves poorly when the iterates are not close to the root.

Combinations of methods

Brent's method

Brent's method is a combination of the bisection method, the secant method and inverse quadratic interpolation. At every iteration, Brent's method decides which method out of these three is likely to do best, and proceeds by doing a step according to that method. This gives a robust and fast method, which therefore enjoys considerable popularity.

Ridders' method

Ridders' method is a hybrid method that uses the value of function at the midpoint of the interval to perform an exponential interpolation to the root. This gives a fast convergence with a guaranteed convergence of at most twice the number of iterations as the bisection method.

Roots of polynomials

Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that up until the 19th century, algebra meant essentially theory of polynomial equations.

Finding roots in higher dimensions

The bisection method has been generalized to higher dimensions; these methods are called generalized bisection methods. [3] [4] At each iteration, the domain is partitioned into two parts, and the algorithm decides - based on a small number of function evaluations - which of these two parts must contain a root. In one dimension, the criterion for decision is that the function has opposite signs. The main challenge in extending the method to multiple dimensions is to find a criterion that can be computed easily and guarantees the existence of a root.

The Poincaré–Miranda theorem gives a criterion for the existence of a root in a rectangle, but it is hard to verify because it requires evaluating the function on the entire boundary of the rectangle.

Another criterion is given by a theorem of Kronecker. [5] It says that, if the topological degree of a function f on a rectangle is non-zero, then the rectangle must contain at least one root of f. This criterion is the basis for several root-finding methods, such as those of Stenger [6] and Kearfott. [7] However, computing the topological degree can be time-consuming.

A third criterion is based on a characteristic polyhedron. This criterion is used by a method called Characteristic Bisection. [3] :19-- It does not require computing the topological degree; it only requires computing the signs of function values. The number of required evaluations is at least , where D is the length of the longest edge of the characteristic polyhedron. [8] :11,Lemma.4.7 Note that Vrahatis and Iordanidis [8] prove a lower bound on the number of evaluations, and not an upper bound.

A fourth method uses an intermediate value theorem on simplices. [9] Again, no upper bound on the number of queries is given.

See also

Broyden's method  – Quasi-Newton root-finding method for the multivariable case

Related Research Articles

<span class="mw-page-title-main">Newton's method</span> Algorithm for finding zeros of functions

In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a real-valued function f, its derivative f, and an initial guess x0 for a root of f. If f satisfies certain assumptions and the initial guess is close, then

<span class="mw-page-title-main">Secant method</span> Root-finding method

In numerical analysis, the secant method is a root-finding algorithm that uses a succession of roots of secant lines to better approximate a root of a function f. The secant method can be thought of as a finite-difference approximation of Newton's method, so it is considered a quasi-Newton method. Historically, it is as an evolution of the method of false position, which predates Newton's method by over 3000 years.

<span class="mw-page-title-main">Bisection method</span> Algorithm for finding a zero of a function

In mathematics, the bisection method is a root-finding method that applies to any continuous function for which one knows two values with opposite signs. The method consists of repeatedly bisecting the interval defined by these values and then selecting the subinterval in which the function changes sign, and therefore must contain a root. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods. The method is also called the interval halving method, the binary search method, or the dichotomy method.

In mathematics, the regula falsi, method of false position, or false position method is a very old method for solving an equation with one unknown; this method, in modified form, is still in use. In simple terms, the method is the trial and error technique of using test ("false") values for the variable and then adjusting the test value according to the outcome. This is sometimes also referred to as "guess and check". Versions of the method predate the advent of algebra and the use of equations.

In numerical analysis, Brent's method is a hybrid root-finding algorithm combining the bisection method, the secant method and inverse quadratic interpolation. It has the reliability of bisection but it can be as quick as some of the less-reliable methods. The algorithm tries to use the potentially fast-converging secant method or inverse quadratic interpolation if possible, but it falls back to the more robust bisection method if necessary. Brent's method is due to Richard Brent and builds on an earlier algorithm by Theodorus Dekker. Consequently, the method is also known as the Brent–Dekker method.

In numerical analysis, inverse quadratic interpolation is a root-finding algorithm, meaning that it is an algorithm for solving equations of the form f(x) = 0. The idea is to use quadratic interpolation to approximate the inverse of f. This algorithm is rarely used on its own, but it is important because it forms part of the popular Brent's method.

In optimization, line search is a basic iterative approach to find a local minimum of an objective function . It first finds a descent direction along which the objective function will be reduced, and then computes a step size that determines how far should move along that direction. The descent direction can be computed by various methods, such as gradient descent or quasi-Newton method. The step size can be determined either exactly or inexactly.

Muller's method is a root-finding algorithm, a numerical method for solving equations of the form f(x) = 0. It was first presented by David E. Muller in 1956.

In numerical analysis, the Weierstrass method or Durand–Kerner method, discovered by Karl Weierstrass in 1891 and rediscovered independently by Durand in 1960 and Kerner in 1966, is a root-finding algorithm for solving polynomial equations. In other words, the method can be used to solve numerically the equation

In numerical analysis, Bairstow's method is an efficient algorithm for finding the roots of a real polynomial of arbitrary degree. The algorithm first appeared in the appendix of the 1920 book Applied Aerodynamics by Leonard Bairstow. The algorithm finds the roots in complex conjugate pairs using only real arithmetic.

The Aberth method, or Aberth–Ehrlich method or Ehrlich–Aberth method, named after Oliver Aberth and Louis W. Ehrlich, is a root-finding algorithm developed in 1967 for simultaneous approximation of all the roots of a univariate polynomial.

In numerical analysis, a quasi-Newton method is an iterative numerical method used either to find zeroes or to find local maxima and minima of functions via an iterative recurrence formula much like the one for Newton's method, except using approximations of the derivatives of the functions in place of exact derivatives. Newton's method requires the Jacobian matrix of all partial derivatives of a multivariate function when used to search for zeros or the Hessian matrix when used for finding extrema. Quasi-Newton methods, on the other hand, can be used when the Jacobian matrices or Hessian matrices are unavailable or are impractical to compute at every iteration.

The Jenkins–Traub algorithm for polynomial zeros is a fast globally convergent iterative polynomial root-finding method published in 1970 by Michael A. Jenkins and Joseph F. Traub. They gave two variants, one for general polynomials with complex coefficients, commonly known as the "CPOLY" algorithm, and a more complicated variant for the special case of polynomials with real coefficients, commonly known as the "RPOLY" algorithm. The latter is "practically a standard in black-box polynomial root-finders".

In numerical analysis, Steffensen's method is an iterative method for numerical root-finding named after Johan Frederik Steffensen that is similar to the secant method and to Newton's method. Steffensen's method achieves a quadratic order of convergence without using derivatives, whereas Newton's method converges quadratically but requires derivatives and the secant method does not require derivatives but also converges less quickly than quadratically. Steffensen's method has the drawback that it requires two function evaluations per step, however, whereas Newton's method and the secant method require only one evaluation per step, so it is not necessarily most efficient in terms of computational cost.

Sidi's generalized secant method is a root-finding algorithm, that is, a numerical method for solving equations of the form . The method was published by Avram Sidi.

In mathematics, and, more specifically in numerical analysis and computer algebra, real-root isolation of a polynomial consist of producing disjoint intervals of the real line, which contain each one real root of the polynomial, and, together, contain all the real roots of the polynomial.

Numerical certification is the process of verifying the correctness of a candidate solution to a system of equations. In (numerical) computational mathematics, such as numerical algebraic geometry, candidate solutions are computed algorithmically, but there is the possibility that errors have corrupted the candidates. For instance, in addition to the inexactness of input data and candidate solutions, numerical errors or errors in the discretization of the problem may result in corrupted candidate solutions. The goal of numerical certification is to provide a certificate which proves which of these candidates are, indeed, approximate solutions.

In numerical analysis, the ITP method, short for Interpolate Truncate and Project, is the first root-finding algorithm that achieves the superlinear convergence of the secant method while retaining the optimal worst-case performance of the bisection method. It is also the first method with guaranteed average performance strictly better than the bisection method under any continuous distribution. In practice it performs better than traditional interpolation and hybrid based strategies, since it not only converges super-linearly over well behaved functions but also guarantees fast performance under ill-behaved functions where interpolations fail.

Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that up until the 19th century, algebra meant essentially theory of polynomial equations.

References

  1. Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. (2007). "Chapter 9. Root Finding and Nonlinear Sets of Equations". Numerical Recipes: The Art of Scientific Computing (3rd ed.). New York: Cambridge University Press. ISBN   978-0-521-88068-8.
  2. Chanson, Jeffrey R. (October 3, 2024). "Order of Convergence". LibreTexts Mathematics. Retrieved October 3, 2024.
  3. 1 2 Mourrain, B.; Vrahatis, M. N.; Yakoubsohn, J. C. (2002-06-01). "On the Complexity of Isolating Real Roots and Computing with Certainty the Topological Degree". Journal of Complexity. 18 (2): 612–640. doi: 10.1006/jcom.2001.0636 . ISSN   0885-064X.
  4. Vrahatis, Michael N. (2020). "Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions". In Sergeyev, Yaroslav D.; Kvasov, Dmitri E. (eds.). Numerical Computations: Theory and Algorithms. Lecture Notes in Computer Science. Vol. 11974. Cham: Springer International Publishing. pp. 223–238. doi:10.1007/978-3-030-40616-5_17. ISBN   978-3-030-40616-5. S2CID   211160947.
  5. "Iterative solution of nonlinear equations in several variables". Guide books. Retrieved 2023-04-16.
  6. Stenger, Frank (1975-03-01). "Computing the topological degree of a mapping inRn". Numerische Mathematik. 25 (1): 23–38. doi:10.1007/BF01419526. ISSN   0945-3245. S2CID   122196773.
  7. Kearfott, Baker (1979-06-01). "An efficient degree-computation method for a generalized method of bisection". Numerische Mathematik. 32 (2): 109–127. doi:10.1007/BF01404868. ISSN   0029-599X. S2CID   122058552.
  8. 1 2 Vrahatis, M. N.; Iordanidis, K. I. (1986-03-01). "A Rapid Generalized Method of Bisection for Solving Systems of Non-linear Equations". Numerische Mathematik. 49 (2): 123–138. doi:10.1007/BF01389620. ISSN   0945-3245. S2CID   121771945.
  9. Vrahatis, Michael N. (2020-04-15). "Intermediate value theorem for simplices for simplicial approximation of fixed points and zeros". Topology and Its Applications. 275: 107036. doi: 10.1016/j.topol.2019.107036 . ISSN   0166-8641. S2CID   213249321.

Further reading