ITP method

In numerical analysis, the ITP method, short for Interpolate, Truncate and Project, is the first root-finding algorithm that achieves the superlinear convergence of the secant method [1] while retaining the optimal [2] worst-case performance of the bisection method. [3] It is also the first method with guaranteed average performance strictly better than the bisection method under any continuous distribution. [3] In practice it performs better than traditional interpolation-based and hybrid strategies (Brent's method, Ridders' method, the Illinois algorithm), since it not only converges superlinearly over well-behaved functions but also guarantees fast performance under ill-behaved functions where interpolations fail. [3]

The ITP method follows the same structure as standard bracketing strategies, keeping track of upper and lower bounds for the location of the root; but it also keeps track of the region where worst-case performance remains upper-bounded. As a bracketing strategy, in each iteration the ITP method queries the value of the function at one point and discards the part of the interval between two points where the function values share the same sign. The queried point is calculated in three steps: it interpolates, finding the regula falsi estimate, then it perturbs/truncates the estimate (similar to Regula falsi § Improvements in regula falsi) and then projects the perturbed estimate onto an interval in the neighbourhood of the bisection midpoint. The neighbourhood around the bisection point is calculated in each iteration in order to guarantee minmax optimality (Theorem 2.1 of [3]). The method depends on three hyper-parameters $\kappa_1 \in (0,\infty)$, $\kappa_2 \in [1, 1+\phi)$ and $n_0 \in [0,\infty)$ where $\phi$ is the golden ratio $\tfrac{1}{2}(1+\sqrt{5})$: the first two control the size of the truncation and the third is a slack variable that controls the size of the interval for the projection step. [lower-alpha 1]

Root finding problem

Given a continuous function $f$ defined from $[a,b]$ to $\mathbb{R}$ such that $f(a)f(b) \leq 0$, where at the cost of one query one can access the value of $f(x)$ at any given $x$, and given a pre-specified target precision $\epsilon > 0$, a root-finding algorithm is designed to solve the following problem with as few queries as possible:

Problem Definition: Find $\hat{x}$ such that $|\hat{x} - x^*| \leq \epsilon$, where $x^*$ satisfies $f(x^*) = 0$.

This problem is very common in numerical analysis, computer science and engineering, and root-finding algorithms are the standard approach to solving it. Often, the root-finding procedure is called by more complex parent algorithms within a larger context, and for this reason solving root problems efficiently is of extreme importance, since an inefficient approach might come at a high computational cost when the larger context is taken into account. This is what the ITP method attempts to do by simultaneously exploiting interpolation guarantees as well as the minmax optimal guarantees of the bisection method, which terminates in at most $n_{1/2} \equiv \lceil \log_2\left(\tfrac{b_0 - a_0}{2\epsilon}\right) \rceil$ iterations when initiated on an interval $[a_0, b_0]$.
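The bound $n_{1/2}$ depends only on the initial interval and the target precision. A minimal Python illustration follows (the function name bisection_iteration_bound is not from the source):

    import math

    def bisection_iteration_bound(a0: float, b0: float, eps: float) -> int:
        """Number of bisection iterations that always suffices to locate a root
        of a continuous sign-changing function on [a0, b0] to within eps."""
        return math.ceil(math.log2((b0 - a0) / (2 * eps)))

    print(bisection_iteration_bound(1.0, 2.0, 0.0005))  # prints 10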

The method

Given $\kappa_1 \in (0, \infty)$, $\kappa_2 \in [1, 1+\phi)$ and $n_0 \in [0, \infty)$, where $\phi$ is the golden ratio $\tfrac{1}{2}(1+\sqrt{5})$, in each iteration $j = 0, 1, 2, \ldots$ the ITP method calculates the point $x_{\mathrm{ITP}}$ following three steps:

[Figure: Step 1 of the ITP method.]
[Figure: Step 2 of the ITP method.]
[Figure: Step 3 of the ITP method.]
[Figure: All three steps combined form the ITP method. The thick blue line represents the "projected-truncated-interpolation" of the method.]
  1. [Interpolation Step] Calculate the bisection and the regula falsi points: $x_{1/2} \equiv \frac{a+b}{2}$ and $x_f \equiv \frac{b f(a) - a f(b)}{f(a) - f(b)}$;
  2. [Truncation Step] Perturb the estimator towards the center: $x_t \equiv x_f + \sigma \delta$ where $\sigma \equiv \mathrm{sign}(x_{1/2} - x_f)$ and $\delta \equiv \min\{\kappa_1 |b-a|^{\kappa_2}, |x_{1/2} - x_f|\}$;
  3. [Projection Step] Project the estimator to the minmax interval: $x_{\mathrm{ITP}} \equiv x_{1/2} - \sigma \rho_k$ where $\rho_k \equiv \min\{\epsilon 2^{n_{1/2} + n_0 - j} - \frac{b-a}{2}, |x_t - x_{1/2}|\}$.

The value of the function $f(x_{\mathrm{ITP}})$ at this point is queried, and the interval is then reduced to bracket the root by keeping the sub-interval with function values of opposite sign on each end.
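A minimal Python sketch of how the query point of a single iteration might be computed is given below; the function name itp_point and the explicit arguments (including the iteration index j, with n_half standing for $n_{1/2}$) are illustrative assumptions rather than part of the original description:

    import math

    def itp_point(a, b, ya, yb, eps, k1, k2, n0, n_half, j):
        """Compute the ITP query point for one iteration (sketch), where
        (a, b) is the current bracket with ya = f(a) < 0 < f(b) = yb."""
        # Step 1 (interpolation): bisection point and regula falsi point.
        x_half = (a + b) / 2
        x_f = (yb * a - ya * b) / (yb - ya)
        # Step 2 (truncation): perturb x_f towards the bisection point by at
        # most kappa_1 * |b - a|^kappa_2.
        sigma = math.copysign(1.0, x_half - x_f)
        delta = min(k1 * (b - a) ** k2, abs(x_half - x_f))
        x_t = x_f + sigma * delta
        # Step 3 (projection): keep the estimate within the minmax radius
        # rho = eps * 2^(n_half + n0 - j) - (b - a)/2 of the midpoint.
        rho = min(eps * 2 ** (n_half + n0 - j) - (b - a) / 2, abs(x_t - x_half))
        return x_half - sigma * rho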

The algorithm

The following algorithm (written in pseudocode) assumes the initial values of $a$ and $b$ are given and satisfy $a < b$ where $f(a) < 0$ and $f(b) > 0$; and it returns an estimate $\hat{x}$ that satisfies $|\hat{x} - x^*| \leq \epsilon$ in at most $n_{1/2} + n_0$ function evaluations.

Input: $f$, $a$, $b$, $y_a = f(a)$, $y_b = f(b)$, $\epsilon$, $\kappa_1$, $\kappa_2$, $n_0$
Preprocessing: $n_{1/2} = \lceil \log_2\tfrac{b-a}{2\epsilon} \rceil$, $n_{\max} = n_{1/2} + n_0$, and $j = 0$;
While ( $b - a > 2\epsilon$ )
    Calculating Parameters: $x_{1/2} = \frac{a+b}{2}$, $r = \epsilon 2^{n_{\max} - j} - \frac{b-a}{2}$, $\delta = \kappa_1 (b-a)^{\kappa_2}$;
    Interpolation: $x_f = \frac{y_b a - y_a b}{y_b - y_a}$;
    Truncation: $\sigma = \mathrm{sign}(x_{1/2} - x_f)$;
        If $\delta \leq |x_{1/2} - x_f|$ then $x_t = x_f + \sigma\delta$,
        Else $x_t = x_{1/2}$;
    Projection:
        If $|x_t - x_{1/2}| \leq r$ then $x_{\mathrm{ITP}} = x_t$,
        Else $x_{\mathrm{ITP}} = x_{1/2} - \sigma r$;
    Updating Interval: $y_{\mathrm{ITP}} = f(x_{\mathrm{ITP}})$;
        If $y_{\mathrm{ITP}} > 0$ then $b = x_{\mathrm{ITP}}$ and $y_b = y_{\mathrm{ITP}}$,
        Elseif $y_{\mathrm{ITP}} < 0$ then $a = x_{\mathrm{ITP}}$ and $y_a = y_{\mathrm{ITP}}$,
        Else $a = x_{\mathrm{ITP}}$ and $b = x_{\mathrm{ITP}}$;
        $j = j + 1$;
Output: $\hat{x} = \frac{a+b}{2}$
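The pseudocode can be transcribed into a short, self-contained Python sketch; the function name itp_root and the default parameter values below are illustrative choices, not part of the original specification:

    import math

    def itp_root(f, a, b, eps=1e-10, k1=0.1, k2=2.0, n0=1):
        """Sketch of the ITP method following the pseudocode above; assumes f is
        continuous on [a, b] with f(a) < 0 < f(b) and returns an estimate within
        eps of a root in at most n_half + n0 evaluations beyond the endpoints."""
        ya, yb = f(a), f(b)
        if not (ya < 0 < yb):
            raise ValueError("itp_root requires f(a) < 0 < f(b)")
        # Preprocessing: bisection bound and total iteration budget.
        n_half = math.ceil(math.log2((b - a) / (2 * eps)))
        n_max = n_half + n0
        j = 0
        while b - a > 2 * eps:
            # Parameters of this iteration.
            x_half = (a + b) / 2
            r = eps * 2 ** (n_max - j) - (b - a) / 2
            delta = k1 * (b - a) ** k2
            # Interpolation: regula falsi point.
            x_f = (yb * a - ya * b) / (yb - ya)
            # Truncation: perturb x_f towards the bisection point by at most delta.
            sigma = math.copysign(1.0, x_half - x_f)
            x_t = x_f + sigma * delta if delta <= abs(x_half - x_f) else x_half
            # Projection: keep the query point within distance r of the midpoint.
            x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma * r
            # Update the bracketing interval with the newly queried value.
            y_itp = f(x_itp)
            if y_itp > 0:
                b, yb = x_itp, y_itp
            elif y_itp < 0:
                a, ya = x_itp, y_itp
            else:
                a = b = x_itp
            j += 1
        return (a + b) / 2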

Example: Finding the root of a polynomial

Suppose that the ITP method is used to find a root of the polynomial $f(x) = x^3 - x - 2$. Using $\epsilon = 0.0005$, $\kappa_1 = 0.1$, $\kappa_2 = 2$ and $n_0 = 1$ we find that:

Iteration | $a_n$            | $b_n$            | $x_{\mathrm{ITP}}$ | $f(x_{\mathrm{ITP}})$
1         | 1                | 2                | 1.43333333333333   | -0.488629629629630
2         | 1.43333333333333 | 2                | 1.52713145056966   | 0.0343383329048983
3         | 1.43333333333333 | 1.52713145056966 | 1.52009281150978   | -0.00764147709265051
4         | 1.52009281150978 | 1.52713145056966 | 1.52137899116052   | -4.25363464540141e-06
5         | 1.52137899116052 | 1.52713145056966 | 1.52138301273268   | 1.96497878177659e-05
6         | 1.52137899116052 | 1.52138301273268 | Stopping Criteria Satisfied
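For illustration, the run shown in the table can be reproduced (up to rounding of the final output) with the hypothetical itp_root sketch given in the previous section:

    f = lambda x: x**3 - x - 2

    root = itp_root(f, 1.0, 2.0, eps=0.0005, k1=0.1, k2=2.0, n0=1)
    print(root)  # about 1.521381, the midpoint of the final interval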

This example can be compared to Bisection method § Example: Finding the root of a polynomial. The ITP method required fewer than half as many iterations as the bisection method to obtain a more precise estimate of the root, with no cost to the minmax guarantees. Other methods might attain a similar speed of convergence (such as Ridders' method or Brent's method) but without the minmax guarantees given by the ITP method.

Analysis

The main advantage of the ITP method is that it is guaranteed to require no more iterations than the bisection method when $n_0 = 0$. And so its average performance is guaranteed to be better than the bisection method even when interpolation fails. Furthermore, if interpolations do not fail (smooth functions), then it is guaranteed to enjoy the high order of convergence of interpolation based methods.

Worst case performance

Because the ITP method projects the estimator onto the minmax interval with a slack controlled by $n_0$, it will require at most $n_{1/2} + n_0$ iterations (Theorem 2.1 of [3]). This is minmax optimal like the bisection method when $n_0$ is chosen to be $n_0 = 0$.
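As a concrete illustration with the parameters of the example above: on $[a_0, b_0] = [1, 2]$ with $\epsilon = 0.0005$ and $n_0 = 1$, the bound is $n_{1/2} + n_0 = \lceil \log_2 \tfrac{1}{0.001} \rceil + 1 = 10 + 1 = 11$ iterations, whereas the run shown above terminated after only 6.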

Average performance

Because it does not take more than $n_{1/2} + n_0$ iterations, the average number of iterations will always be less than that of the bisection method for any distribution considered when $n_0 = 0$ (Corollary 2.2 of [3]).

Asymptotic performance

If the function $f$ is twice differentiable and the root $x^*$ is simple, then the intervals produced by the ITP method converge to zero with an order of convergence of $\sqrt{\kappa_2}$ if $n_0 \neq 0$, or if $n_0 = 0$ and $(b_0 - a_0)/\epsilon$ is not a power of 2, with the term $\kappa_1 \epsilon^{\kappa_2 - 1}$ not too close to zero (Theorem 2.3 of [3]).

Software
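Implementations of the ITP method include the itp package for the R language [4] and the root finder in the Rust kurbo library. [lower-alpha 1]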

See also

Notes

  1. For a more in-depth discussion of the hyper-parameters, see the documentation for ITP in the kurbo library.

References

  1. Argyros, I. K.; Hernández-Verón, M. A.; Rubio, M. J. (2019). "On the Convergence of Secant-Like Methods". Current Trends in Mathematical Analysis and Its Interdisciplinary Applications. pp. 141–183. doi:10.1007/978-3-030-15242-0_5. ISBN 978-3-030-15241-3. S2CID 202156085.
  2. Sikorski, K. (1982-02-01). "Bisection is optimal". Numerische Mathematik. 40 (1): 111–117. doi:10.1007/BF01459080. ISSN 0945-3245. S2CID 119952605.
  3. Oliveira, I. F. D.; Takahashi, R. H. C. (2020-12-06). "An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality". ACM Transactions on Mathematical Software. 47 (1): 5:1–5:24. doi:10.1145/3423597. ISSN 0098-3500. S2CID 230586635.
  4. Northrop, P. J. (2023). itp: The Interpolate, Truncate, Project (ITP) Root-Finding Algorithm.