ITP method

In numerical analysis, the ITP method (short for Interpolate, Truncate and Project) is the first root-finding algorithm that achieves the superlinear convergence of the secant method [1] while retaining the optimal [2] worst-case performance of the bisection method. [3] It is also the first method with guaranteed average performance strictly better than the bisection method under any continuous distribution. [3] In practice it performs better than traditional interpolation-based and hybrid strategies (Brent's method, Ridders' method, the Illinois algorithm), since it not only converges superlinearly over well-behaved functions but also guarantees fast performance under ill-behaved functions where interpolations fail. [3]

The ITP method follows the same structure as standard bracketing strategies, which keep track of upper and lower bounds for the location of the root; but it also keeps track of the region where worst-case performance is kept upper-bounded. As a bracketing strategy, in each iteration the ITP method queries the value of the function at one point and discards the part of the interval between two points where the function values share the same sign. The queried point is calculated in three steps: it interpolates, finding the regula falsi estimate; then it perturbs/truncates the estimate (similar to Regula falsi § Improvements in regula falsi); and then it projects the perturbed estimate onto an interval in the neighbourhood of the bisection midpoint. The neighbourhood around the bisection point is calculated in each iteration in order to guarantee minmax optimality (Theorem 2.1 of [3]). The method depends on three hyper-parameters $\kappa_1 \in (0, \infty)$, $\kappa_2 \in [1, 1+\phi)$ and $n_0 \in [0, \infty)$, where $\phi$ is the golden ratio $\tfrac{1}{2}(1+\sqrt{5})$: the first two control the size of the truncation and the third is a slack variable that controls the size of the interval for the projection step. [a]

Root finding problem

Given a continuous function $f$ defined from $[a_0, b_0]$ to $\mathbb{R}$ such that $f(a_0) f(b_0) \leq 0$, where at the cost of one query one can access the value of $f(x)$ at any given $x$, and given a pre-specified target precision $\epsilon > 0$, a root-finding algorithm is designed to solve the following problem with as few queries as possible:

Problem Definition: Find $\hat{x}$ such that $|\hat{x} - x^*| \leq \epsilon$, where $x^*$ satisfies $f(x^*) = 0$.

This problem is very common in numerical analysis, computer science and engineering, and root-finding algorithms are the standard approach to solving it. Often, the root-finding procedure is called by more complex parent algorithms within a larger context, and, for this reason, solving root problems efficiently is of extreme importance, since an inefficient approach might come at a high computational cost when the larger context is taken into account. This is what the ITP method attempts to do by simultaneously exploiting interpolation guarantees as well as the minmax optimal guarantees of the bisection method, which terminates in at most $n_{1/2} \equiv \left\lceil \log_2 \frac{b_0 - a_0}{2\epsilon} \right\rceil$ iterations when initiated on an interval $[a_0, b_0]$.
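
For example, with $[a_0, b_0] = [1, 2]$ and $\epsilon = 0.0005$ (the setting of the example further below), bisection is guaranteed to terminate within $n_{1/2} = \lceil \log_2 \frac{1}{0.001} \rceil = \lceil \log_2 1000 \rceil = 10$ iterations.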

The method

Given $\kappa_1 \in (0, \infty)$, $\kappa_2 \in [1, 1+\phi)$ and $n_0 \in [0, \infty)$, where $\phi$ is the golden ratio $\tfrac{1}{2}(1+\sqrt{5})$, in each iteration $k = 0, 1, 2, \ldots$ the ITP method calculates the point $x_{\mathrm{ITP}}$ following three steps:

[Figure: Step 1 of the ITP method.]
[Figure: Step 2 of the ITP method.]
[Figure: Step 3 of the ITP method.]
[Figure: All three steps combined form the ITP method. The thick blue line represents the "projected-truncated-interpolation" of the method.]
  1. [Interpolation Step] Calculate the bisection and the regula falsi points: $x_{1/2} \equiv \frac{a + b}{2}$ and $x_f \equiv \frac{b f(a) - a f(b)}{f(a) - f(b)}$;
  2. [Truncation Step] Perturb the estimator towards the center: $x_t \equiv x_f + \sigma \delta$, where $\sigma \equiv \operatorname{sign}(x_{1/2} - x_f)$ and $\delta \equiv \min\{ \kappa_1 |b - a|^{\kappa_2},\ |x_{1/2} - x_f| \}$;
  3. [Projection Step] Project the estimator to the minmax interval: $x_{\mathrm{ITP}} \equiv x_{1/2} - \sigma \rho_k$, where $\rho_k \equiv \min\{ \epsilon 2^{n_{1/2} + n_0 - k} - \frac{b - a}{2},\ |x_t - x_{1/2}| \}$.

The value of the function $f(x_{\mathrm{ITP}})$ at this point is queried, and the interval is then reduced to bracket the root by keeping the sub-interval with function values of opposite sign on each end.
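
As a concrete illustration of the three steps, consider the first iteration for the polynomial $f(x) = x^3 - x - 2$ on $[a, b] = [1, 2]$, with $\kappa_1 = 0.1$, $\kappa_2 = 2$, $n_0 = 1$ and $\epsilon = 0.0005$ (the setting of the example below), so that $n_{1/2} = 10$: the interpolation step gives $x_{1/2} = 1.5$ and $x_f = \frac{2 \cdot (-2) - 1 \cdot 4}{(-2) - 4} = \frac{4}{3} \approx 1.3333$; the truncation step gives $\sigma = +1$ and $\delta = 0.1 \cdot 1^2 = 0.1 < |x_{1/2} - x_f| \approx 0.1667$, so $x_t = \frac{4}{3} + 0.1 \approx 1.4333$; and the projection step gives $\rho_0 = \min\{ 0.0005 \cdot 2^{11} - 0.5,\ |x_t - x_{1/2}| \} = \min\{ 0.524,\ 0.0667 \} \approx 0.0667$, which leaves $x_t$ unchanged, so $x_{\mathrm{ITP}} \approx 1.4333$, in agreement with the first row of the table below.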

The algorithm

The following algorithm (written in pseudocode) assumes the initial values of $y_a \equiv f(a)$ and $y_b \equiv f(b)$ are given and satisfy $y_a < 0 < y_b$, where $a < b$; it returns an estimate $\hat{x}$ that satisfies $|\hat{x} - x^*| \leq \epsilon$ in at most $n_{1/2} + n_0$ function evaluations.

Input: $a$, $b$, $\epsilon$, $\kappa_1$, $\kappa_2$, $n_0$ and $f$
Preprocessing: $n_{1/2} = \left\lceil \log_2 \frac{b - a}{2\epsilon} \right\rceil$, $n_{\max} = n_{1/2} + n_0$, and $k = 0$;
While ($b - a > 2\epsilon$)
    Calculating Parameters:
        $x_{1/2} = \frac{a + b}{2}$, $r = \epsilon 2^{n_{\max} - k} - \frac{b - a}{2}$, $\delta = \kappa_1 (b - a)^{\kappa_2}$;
    Interpolation:
        $x_f = \frac{y_b a - y_a b}{y_b - y_a}$;
    Truncation:
        $\sigma = \operatorname{sign}(x_{1/2} - x_f)$;
        If $\delta \leq |x_{1/2} - x_f|$ then $x_t = x_f + \sigma \delta$,
        Else $x_t = x_{1/2}$;
    Projection:
        If $|x_t - x_{1/2}| \leq r$ then $x_{\mathrm{ITP}} = x_t$,
        Else $x_{\mathrm{ITP}} = x_{1/2} - \sigma r$;
    Updating Interval:
        $y_{\mathrm{ITP}} = f(x_{\mathrm{ITP}})$;
        If $y_{\mathrm{ITP}} > 0$ then $b = x_{\mathrm{ITP}}$ and $y_b = y_{\mathrm{ITP}}$,
        Elseif $y_{\mathrm{ITP}} < 0$ then $a = x_{\mathrm{ITP}}$ and $y_a = y_{\mathrm{ITP}}$,
        Else $a = x_{\mathrm{ITP}}$ and $b = x_{\mathrm{ITP}}$;
        $k = k + 1$;
Output: $\hat{x} = \frac{a + b}{2}$
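
To make the pseudocode concrete, the following is a minimal Python sketch of the same procedure. The function name itp and its default hyper-parameter values are illustrative choices, not prescribed by the original reference; unlike the pseudocode, it also handles the case $f(a) > 0 > f(b)$ by internally flipping the sign of $f$.

import math

def itp(f, a, b, eps=1e-10, kappa1=0.1, kappa2=2.0, n0=1):
    """Root of f in [a, b] to within eps, via the ITP method.

    A sketch following the pseudocode above; the defaults for
    kappa1, kappa2 and n0 are illustrative, not prescribed.
    """
    ya, yb = f(a), f(b)
    if ya * yb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # Normalize to ya < 0 < yb by flipping the sign of f if needed.
    flip = -1.0 if ya > 0 else 1.0
    ya, yb = flip * ya, flip * yb
    # Preprocessing: n_max bounds the number of iterations.
    n_half = math.ceil(math.log2((b - a) / (2 * eps)))
    n_max = n_half + n0
    k = 0
    while b - a > 2 * eps:
        # Calculating parameters.
        x_half = (a + b) / 2                        # bisection point
        r = eps * 2 ** (n_max - k) - (b - a) / 2    # projection radius
        delta = kappa1 * (b - a) ** kappa2          # truncation size
        # Interpolation: regula falsi point.
        x_f = (yb * a - ya * b) / (yb - ya)
        # Truncation: nudge x_f towards the midpoint by at most delta.
        sigma = 1.0 if x_half >= x_f else -1.0
        x_t = x_f + sigma * delta if delta <= abs(x_half - x_f) else x_half
        # Projection: keep the estimate within r of the midpoint.
        x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma * r
        # Updating interval: keep the sub-interval that brackets the root.
        y_itp = flip * f(x_itp)
        if y_itp > 0:
            b, yb = x_itp, y_itp
        elif y_itp < 0:
            a, ya = x_itp, y_itp
        else:
            a = b = x_itp
        k += 1
    return (a + b) / 2

The sign normalization at the top keeps the interval-update rule identical to the pseudocode's Updating Interval block, regardless of the orientation of $f$ on the initial bracket.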

Example: Finding the root of a polynomial

Suppose that the ITP method is used to find a root of the polynomial $f(x) = x^3 - x - 2$. Using $\epsilon = 0.0005$, $\kappa_1 = 0.1$, $\kappa_2 = 2$ and $n_0 = 1$, we find that:

Iteration   $a_k$              $b_k$              $x_{\mathrm{ITP}}$   $f(x_{\mathrm{ITP}})$
1           1                  2                  1.43333333333333     -0.488629629629630
2           1.43333333333333   2                  1.52713145056966     0.0343383329048983
3           1.43333333333333   1.52713145056966   1.52009281150978     -0.00764147709265051
4           1.52009281150978   1.52713145056966   1.52137899116052     -4.25363464540141e-06
5           1.52137899116052   1.52713145056966   1.52138301273268     1.96497878177659e-05
6           1.52137899116052   1.52138301273268   Stopping criterion satisfied

This example can be compared with Bisection method § Example: Finding the root of a polynomial. The ITP method required fewer than half the iterations of the bisection method to obtain a more precise estimate of the root, at no cost to the minmax guarantees. Other methods might attain a similar speed of convergence (such as Ridders' method or Brent's method), but without the minmax guarantees of the ITP method.
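
As a quick sanity check, the Python sketch from the algorithm section above (with its hypothetical itp function) reproduces this result:

# Prints the midpoint of the final bracket, approximately 1.52138.
print(itp(lambda x: x**3 - x - 2, 1.0, 2.0, eps=0.0005, kappa1=0.1, kappa2=2.0, n0=1))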

Analysis

The main advantage of the ITP method is that it is guaranteed to require no more iterations than the bisection method when $n_0 = 0$. Thus its average performance is guaranteed to be better than the bisection method even when interpolation fails. Furthermore, if interpolations do not fail (as with smooth functions), then it is guaranteed to enjoy the high order of convergence of interpolation-based methods.

Worst case performance

Because the ITP method projects the estimator onto the minmax interval with a slack of $n_0$, it will require at most $n_{1/2} + n_0$ iterations (Theorem 2.1 of [3]). This is minmax optimal, like the bisection method, when $n_0$ is chosen to be $n_0 = 0$.
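
In the polynomial example above, for instance, $n_{1/2} = 10$ and $n_0 = 1$, so the ITP method was guaranteed to terminate within 11 function evaluations; it needed only five.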

Average performance

Because it does not take more than $n_{1/2} + n_0$ iterations, the average number of iterations will always be less than that of the bisection method for any continuous distribution considered when $n_0 = 0$ (Corollary 2.2 of [3]).

Asymptotic performance

If the function $f$ is twice differentiable and the root $x^*$ is simple, then the intervals produced by the ITP method converge to zero with an order of convergence of $\sqrt{\kappa_2}$ if $n_0 \neq 0$, or if $n_0 = 0$ and $(b_0 - a_0)/\epsilon$ is not a power of 2, with the distance from $x^*$ to the nearest point of the corresponding bisection grid not too close to zero (Theorem 2.3 of [3]).
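
Since $\kappa_2$ is restricted to $[1, 1 + \phi)$, the attainable order of convergence ranges up to $\sqrt{1 + \phi} = \phi \approx 1.618$, the same order as the secant method; the choice $\kappa_2 = 2$ used in the example above gives order $\sqrt{2} \approx 1.414$.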

Software

The ITP method is implemented in the R package itp. [4] The Rust library kurbo also provides an implementation (see the note below).

Notes

  a. For a more in-depth discussion of the hyper-parameters, see the documentation for ITP in the kurbo library.

References

  1. Argyros, I. K.; Hernández-Verón, M. A.; Rubio, M. J. (2019). "On the Convergence of Secant-Like Methods". Current Trends in Mathematical Analysis and Its Interdisciplinary Applications. pp. 141–183. doi:10.1007/978-3-030-15242-0_5. ISBN 978-3-030-15241-3.
  2. Sikorski, K. (1982). "Bisection is optimal". Numerische Mathematik. 40 (1): 111–117. doi:10.1007/BF01459080.
  3. Oliveira, I. F. D.; Takahashi, R. H. C. (2020). "An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality". ACM Transactions on Mathematical Software. 47 (1): 5:1–5:24. doi:10.1145/3423597.
  4. Northrop, P. J. (2023). itp: The Interpolate, Truncate, Project (ITP) Root-Finding Algorithm. R package.