Brent's method

Last updated July 16, 2024

In numerical analysis, Brent's method is a hybrid root-finding algorithm combining the bisection method, the secant method and inverse quadratic interpolation. It has the reliability of bisection but it can be as quick as some of the less-reliable methods. The algorithm tries to use the potentially fast-converging secant method or inverse quadratic interpolation if possible, but it falls back to the more robust bisection method if necessary. Brent's method is due to Richard Brent ^[1] and builds on an earlier algorithm by Theodorus Dekker.^[2] Consequently, the method is also known as the Brent–Dekker method.

Modern improvements on Brent's method include Chandrupatla's method, which is simpler and faster for functions that are flat around their roots;^[3]^[4] Ridders' method, which performs exponential interpolations instead of quadratic providing a simpler closed formula for the iterations; and the ITP method which is a hybrid between regula-falsi and bisection that achieves optimal worst-case and asymptotic guarantees.

Dekker's method

The idea to combine the bisection method with the secant method goes back to Dekker (1969).

Suppose that we want to solve the equation f(x) = 0. As with the bisection method, we need to initialize Dekker's method with two points, say a₀ and b₀, such that f(a₀) and f(b₀) have opposite signs. If f is continuous on [a₀, b₀], the intermediate value theorem guarantees the existence of a solution between a₀ and b₀.

Three points are involved in every iteration:

b_k is the current iterate, i.e., the current guess for the root of f.
a_k is the "contrapoint," i.e., a point such that f(a_k) and f(b_k) have opposite signs, so the interval [a_k, b_k] contains the solution. Furthermore, |f(b_k)| should be less than or equal to |f(a_k)|, so that b_k is a better guess for the unknown solution than a_k.
b_k−1 is the previous iterate (for the first iteration, we set b_k−1 = a₀).

Two provisional values for the next iterate are computed. The first one is given by linear interpolation, also known as the secant method:

s={\begin{cases}b_{k}-{\frac {b_{k}-b_{k-1}}{f(b_{k})-f(b_{k-1})}}f(b_{k}),&{\mbox{if }}f(b_{k})\neq f(b_{k-1})\\m&{\mbox{otherwise }}\end{cases}}

and the second one is given by the bisection method

m={\frac {a_{k}+b_{k}}{2}}.

If the result of the secant method, s, lies strictly between b_k and m, then it becomes the next iterate (b_k+1 = s), otherwise the midpoint is used (b_k+1 = m).

Then, the value of the new contrapoint is chosen such that f(a_k+1) and f(b_k+1) have opposite signs. If f(a_k) and f(b_k+1) have opposite signs, then the contrapoint remains the same: a_k+1 = a_k. Otherwise, f(b_k+1) and f(b_k) have opposite signs, so the new contrapoint becomes a_k+1 = b_k.

Finally, if |f(a_k+1)| < |f(b_k+1)|, then a_k+1 is probably a better guess for the solution than b_k+1, and hence the values of a_k+1 and b_k+1 are exchanged.

This ends the description of a single iteration of Dekker's method.

Dekker's method performs well if the function f is reasonably well-behaved. However, there are circumstances in which every iteration employs the secant method, but the iterates b_k converge very slowly (in particular, |b_k−b_k−1| may be arbitrarily small). Dekker's method requires far more iterations than the bisection method in this case.

Brent's method

Brent (1973) proposed a small modification to avoid the problem with Dekker's method. He inserts an additional test which must be satisfied before the result of the secant method is accepted as the next iterate. Two inequalities must be simultaneously satisfied:

Given a specific numerical tolerance $\delta$ , if the previous step used the bisection method, the inequality ${\textstyle |\delta |<|b_{k}-b_{k-1}|}$ must hold to perform interpolation, otherwise the bisection method is performed and its result used for the next iteration.

If the previous step performed interpolation, then the inequality ${\textstyle |\delta |<|b_{k-1}-b_{k-2}|}$ is used instead to perform the next action (to choose) interpolation (when inequality is true) or bisection method (when inequality is not true).

Also, if the previous step used the bisection method, the inequality ${\textstyle |s-b_{k}|<{\begin{matrix}{\frac {1}{2}}\end{matrix}}|b_{k}-b_{k-1}|}$ must hold, otherwise the bisection method is performed and its result used for the next iteration. If the previous step performed interpolation, then the inequality ${\textstyle |s-b_{k}|<{\begin{matrix}{\frac {1}{2}}\end{matrix}}|b_{k-1}-b_{k-2}|}$ is used instead.

This modification ensures that at the kth iteration, a bisection step will be performed in at most $2\log _{2}(|b_{k-1}-b_{k-2}|/\delta )$ additional iterations, because the above conditions force consecutive interpolation step sizes to halve every two iterations, and after at most $2\log _{2}(|b_{k-1}-b_{k-2}|/\delta )$ iterations, the step size will be smaller than $\delta$ , which invokes a bisection step. Brent proved that his method requires at most N² iterations, where N denotes the number of iterations for the bisection method. If the function f is well-behaved, then Brent's method will usually proceed by either inverse quadratic or linear interpolation, in which case it will converge superlinearly.

Furthermore, Brent's method uses inverse quadratic interpolation instead of linear interpolation (as used by the secant method). If f(b_k), f(a_k) and f(b_k−1) are distinct, it slightly increases the efficiency. As a consequence, the condition for accepting s (the value proposed by either linear interpolation or inverse quadratic interpolation) has to be changed: s has to lie between (3a_k + b_k) / 4 and b_k.

Algorithm

inputa, b, and (a pointer to) a function for f calculate f(a) calculate f(b) iff(a)f(b) ≥ 0 then      exit function because the root is not bracketed. end ifif |f(a)| < |f(b)| then     swap (a,b) end ifc := aset mflag repeat untilf(b or s) = 0 or |b−a| is small enough (convergence)iff(a) ≠ f(c) andf(b) ≠ f(c) then ${\textstyle s:={\frac {af(b)f(c)}{(f(a)-f(b))(f(a)-f(c))}}+{\frac {bf(a)f(c)}{(f(b)-f(a))(f(b)-f(c))}}+{\frac {cf(a)f(b)}{(f(c)-f(a))(f(c)-f(b))}}}$ (inverse quadratic interpolation)else ${\textstyle s:=b-f(b){\frac {b-a}{f(b)-f(a)}}}$ (secant method)end ifif(condition 1)sis notbetween  $(3a+b)/4$  and bor(condition 2) (mflag is set and|s−b| ≥ |b−c|/2)or(condition 3) (mflag is cleared and|s−b| ≥ |c−d|/2)or(condition 4) (mflag is set and|b−c| < | $δ$ |)or(condition 5) (mflag is cleared and|c−d| < | $δ$ |)then ${\textstyle s:={\frac {a+b}{2}}}$ (bisection method)set mflag     elseclear mflag     end if     calculate f(s)     d := c(d is assigned for the first time here; it won't be used above on the first iteration because mflag is set)c := biff(a)f(s) < 0 thenb := selsea := send ifif |f(a)| < |f(b)| then         swap (a,b)      end ifend repeatoutputbor s (return the root)

Example

Suppose that we are seeking a zero of the function defined by f(x) = (x + 3)(x− 1)².

We take [a₀, b₀] = [−4, 4/3] as our initial interval.

We have f(a₀) = −25 and f(b₀) = 0.48148 (all numbers in this section are rounded), so the conditions f(a₀) f(b₀) < 0 and |f(b₀)| ≤ |f(a₀)| are satisfied.

In the first iteration, we use linear interpolation between (b₋₁, f(b₋₁)) = (a₀, f(a₀)) = (−4, −25) and (b₀, f(b₀)) = (1.33333, 0.48148), which yields s = 1.23256. This lies between (3a₀ + b₀) / 4 and b₀, so this value is accepted. Furthermore, f(1.23256) = 0.22891, so we set a₁ = a₀ and b₁ = s = 1.23256.
In the second iteration, we use inverse quadratic interpolation between (a₁, f(a₁)) = (−4, −25) and (b₀, f(b₀)) = (1.33333, 0.48148) and (b₁, f(b₁)) = (1.23256, 0.22891). This yields 1.14205, which lies between (3a₁ + b₁) / 4 and b₁. Furthermore, the inequality |1.14205 −b₁| ≤ |b₀−b₋₁| / 2 is satisfied, so this value is accepted. Furthermore, f(1.14205) = 0.083582, so we set a₂ = a₁ and b₂ = 1.14205.
In the third iteration, we use inverse quadratic interpolation between (a₂, f(a₂)) = (−4, −25) and (b₁, f(b₁)) = (1.23256, 0.22891) and (b₂, f(b₂)) = (1.14205, 0.083582). This yields 1.09032, which lies between (3a₂ + b₂) / 4 and b₂. But here Brent's additional condition kicks in: the inequality |1.09032 −b₂| ≤ |b₁−b₀| / 2 is not satisfied, so this value is rejected. Instead, the midpoint m = −1.42897 of the interval [a₂, b₂] is computed. We have f(m) = 9.26891, so we set a₃ = a₂ and b₃ = −1.42897.
In the fourth iteration, we use inverse quadratic interpolation between (a₃, f(a₃)) = (−4, −25) and (b₂, f(b₂)) = (1.14205, 0.083582) and (b₃, f(b₃)) = (−1.42897, 9.26891). This yields 1.15448, which is not in the interval between (3a₃ + b₃) / 4 and b₃). Hence, it is replaced by the midpoint m = −2.71449. We have f(m) = 3.93934, so we set a₄ = a₃ and b₄ = −2.71449.
In the fifth iteration, inverse quadratic interpolation yields −3.45500, which lies in the required interval. However, the previous iteration was a bisection step, so the inequality |−3.45500 −b₄| ≤ |b₄−b₃| / 2 need to be satisfied. This inequality is false, so we use the midpoint m = −3.35724. We have f(m) = −6.78239, so m becomes the new contrapoint (a₅ = −3.35724) and the iterate remains the same (b₅ = b₄).
In the sixth iteration, we cannot use inverse quadratic interpolation because b₅ = b₄. Hence, we use linear interpolation between (a₅, f(a₅)) = (−3.35724, −6.78239) and (b₅, f(b₅)) = (−2.71449, 3.93934). The result is s = −2.95064, which satisfies all the conditions. But since the iterate did not change in the previous step, we reject this result and fall back to bisection. We update s = -3.03587, and f(s) = -0.58418.
In the seventh iteration, we can again use inverse quadratic interpolation. The result is s = −3.00219, which satisfies all the conditions. Now, f(s) = −0.03515, so we set a₇ = b₆ and b₇ = −3.00219 (a₇ and b₇ are exchanged so that the condition |f(b₇)| ≤ |f(a₇)| is satisfied). (Correct : linear interpolation ⁠ $s=-2.99436,f(s)=0.089961$ ⁠)
In the eighth iteration, we cannot use inverse quadratic interpolation because a₇ = b₆. Linear interpolation yields s = −2.99994, which is accepted. (Correct : ⁠ $s=-2.9999,f(s)=0.0016$ ⁠)
In the following iterations, the root x = −3 is approached rapidly: b₉ = −3 + 6·10⁻⁸ and b₁₀ = −3 − 3·10⁻¹⁵. (Correct : Iter 9 : f(s) = −1.4 × 10⁻⁷, Iter 10 : f(s) = 6.96 × 10⁻¹²)

Implementations

Brent (1973) published an Algol 60 implementation.
Netlib contains a Fortran translation of this implementation with slight modifications.
The PARI/GP method solve implements the method.
Other implementations of the algorithm (in C++, C, and Fortran) can be found in the Numerical Recipes books.
The Apache Commons Math library implements the algorithm in Java.
The SciPy optimize module implements the algorithm in Python (programming language)
The Modelica Standard Library implements the algorithm in Modelica.
The uniroot function implements the algorithm in R (software).
The fzero function implements the algorithm in MATLAB.
The Boost (C++ libraries) implements two algorithms based on Brent's method in C++ in the Math toolkit:
1. Function minimization at minima.hpp with an example locating function minima.
2. Root finding implements the newer TOMS748, a more modern and efficient algorithm than Brent's original, at TOMS748, and Boost.Math rooting finding that uses TOMS748 internally with examples.
The Optim.jl package implements the algorithm in Julia (programming language)
The Emmy computer algebra system (written in Clojure (programming language)) implements a variant of the algorithm designed for univariate function minimization.
Root-Finding in C# library hosted in Code Project.

Related Research Articles

The Gauss–Legendre algorithm is an algorithm to compute the digits of $π$ . It is notable for being rapidly convergent, with only 25 iterations producing 45 million correct digits of $π$ . However, it has some drawbacks and therefore all record-breaking calculations for many years have used other methods, almost always the Chudnovsky algorithm. For details, see Chronology of computation of $π$ .

In mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" when reaching a certain value, called the modulus. The modern approach to modular arithmetic was developed by Carl Friedrich Gauss in his book Disquisitiones Arithmeticae, published in 1801.

<span class="mw-page-title-main">Newton's method</span> Algorithm for finding zeros of functions

In numerical analysis, Newton's method, also known as the Newton–Raphson method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a real-valued function $f$ , its derivative $f'$ , and an initial guess $x 0$ for a root of $f$ . If $f$ satisfies certain assumptions and the initial guess is close, then

In mathematics, a continued fraction is an expression obtained through an iterative process of representing a number as the sum of its integer part and the reciprocal of another number, then writing this other number as the sum of its integer part and another reciprocal, and so on. In a finite continued fraction, the iteration/recursion is terminated after finitely many steps by using an integer in lieu of another continued fraction. In contrast, an infinite continued fraction is an infinite expression. In either case, all integers in the sequence, other than the first, must be positive. The integers $are called the coefficients or terms of the continued fraction.$

In numerical analysis, a root-finding algorithm is an algorithm for finding zeros, also called "roots", of continuous functions. A zero of a function $f$ , from the real numbers to real numbers or from the complex numbers to the complex numbers, is a number $x$ such that $f (x) = 0$ . As, generally, the zeros of a function cannot be computed exactly nor expressed in closed form, root-finding algorithms provide approximations to zeros, expressed either as floating-point numbers or as small isolating intervals, or disks for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).

In numerical analysis, inverse iteration is an iterative eigenvalue algorithm. It allows one to find an approximate eigenvector when an approximation to a corresponding eigenvalue is already known. The method is conceptually similar to the power method. It appears to have originally been developed to compute resonance frequencies in the field of structural mechanics.

In numerical analysis, the secant method is a root-finding algorithm that uses a succession of roots of secant lines to better approximate a root of a function f. The secant method can be thought of as a finite-difference approximation of Newton's method. However, the secant method predates Newton's method by over 3000 years.

In mathematics, the bisection method is a root-finding method that applies to any continuous function for which one knows two values with opposite signs. The method consists of repeatedly bisecting the interval defined by these values and then selecting the subinterval in which the function changes sign, and therefore must contain a root. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods. The method is also called the interval halving method, the binary search method, or the dichotomy method.

In mathematics, the regula falsi, method of false position, or false position method is a very old method for solving an equation with one unknown; this method, in modified form, is still in use. In simple terms, the method is the trial and error technique of using test ("false") values for the variable and then adjusting the test value according to the outcome. This is sometimes also referred to as "guess and check". Versions of the method predate the advent of algebra and the use of equations.

In calculus, Newton's method (also called Newton–Raphson) is an iterative method for finding the roots of a differentiable function $F$ , which are solutions to the equation $F (x) = 0$ . As such, Newton's method can be applied to the derivative $f'$ of a twice-differentiable function $f$ to find the roots of the derivative (solutions to $f'(x) = 0$ ), also known as the critical points of $f$ . These solutions may be minima, maxima, or saddle points; see section "Several variables" in Critical point (mathematics) and also section "Geometric interpretation" in this article. This is relevant in optimization, which aims to find (global) minima of the function $f$ .

In numerical analysis, inverse quadratic interpolation is a root-finding algorithm, meaning that it is an algorithm for solving equations of the form f(x) = 0. The idea is to use quadratic interpolation to approximate the inverse of f. This algorithm is rarely used on its own, but it is important because it forms part of the popular Brent's method.

In optimization, line search is a basic iterative approach to find a local minimum $of an objective function . It first finds a descent direction along which the objective function will be reduced, and then computes a step size that determines how far should move along that direction. The descent direction can be computed by various methods, such as gradient descent or quasi-Newton method. The step size can be determined either exactly or inexactly.$

Muller's method is a root-finding algorithm, a numerical method for solving equations of the form f(x) = 0. It was first presented by David E. Muller in 1956.

In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems. Like the related Davidon–Fletcher–Powell method, BFGS determines the descent direction by preconditioning the gradient with curvature information. It does so by gradually improving an approximation to the Hessian matrix of the loss function, obtained only from gradient evaluations via a generalized secant method.

Limited-memory BFGS is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. The algorithm's target problem is to minimize $over unconstrained values of the real-vector where is a differentiable scalar function.$

Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization which may be considered a quasi-Newton method. SQP methods are used on mathematical problems for which the objective function and the constraints are twice continuously differentiable, but not necessarily convex.

Quasi-Newton methods are methods used to find either zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The "full" Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema. Some iterative methods that reduce to Newton's method, such as SLSQP, may be considered quasi-Newtonian.

Sidi's generalized secant method is a root-finding algorithm, that is, a numerical method for solving equations of the form $. The method was published by Avram Sidi.$

In numerical analysis, the ITP method, short for Interpolate Truncate and Project, is the first root-finding algorithm that achieves the superlinear convergence of the secant method while retaining the optimal worst-case performance of the bisection method. It is also the first method with guaranteed average performance strictly better than the bisection method under any continuous distribution. In practice it performs better than traditional interpolation and hybrid based strategies, since it not only converges super-linearly over well behaved functions but also guarantees fast performance under ill-behaved functions where interpolations fail.

References

↑ Brent 1973
↑ Dekker 1969
↑ Chandrupatla, Tirupathi R. (1997). "A new hybrid quadratic/Bisection algorithm for finding the zero of a nonlinear function without using derivatives". Advances in Engineering Software. 28 (3): 145–149. doi:10.1016/S0965-9978(96)00051-8.
↑ "Ten Little Algorithms, Part 5: Quadratic Extremum Interpolation and Chandrupatla's Method - Jason Sachs".

Brent, R. P. (1973), "Chapter 4: An Algorithm with Guaranteed Convergence for Finding a Zero of a Function", Algorithms for Minimization without Derivatives, Englewood Cliffs, NJ: Prentice-Hall, ISBN 0-13-022335-2
Dekker, T. J. (1969), "Finding a zero by means of successive linear interpolation", in Dejon, B.; Henrici, P. (eds.), Constructive Aspects of the Fundamental Theorem of Algebra, London: Wiley-Interscience, ISBN 978-0-471-20300-1

External links

zeroin.f at Netlib.
module brent in C++ (also C, Fortran, Matlab) Archived 2018-04-05 at the Wayback Machine by John Burkardt
GSL implementation.
Boost C++ implementation.
Python (Scipy) implementation

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Brent 1973

[2] Dekker 1969

[3] Chandrupatla, Tirupathi R. (1997). "A new hybrid quadratic/Bisection algorithm for finding the zero of a nonlinear function without using derivatives". Advances in Engineering Software. 28 (3): 145–149. doi:10.1016/S0965-9978(96)00051-8.

[4] "Ten Little Algorithms, Part 5: Quadratic Extremum Interpolation and Chandrupatla's Method - Jason Sachs".

[1]

[2]

[3]

[4]

v t e Root-finding algorithms
Bracketing (no derivative)	Bisection method Regula falsi ITP method
Newton	Newton's method
Quasi-Newton	Halley's method Muller's method Secant method
Hybrid methods	Brent's method Ridders' method
Polynomial methods	Bairstow's method Jenkins–Traub method Laguerre's method

Brent's method

Contents

Dekker's method

Brent's method

Algorithm

Example

Implementations

Related Research Articles

References

Further reading

External links