Real-root isolation

Last updated

In mathematics, and, more specifically in numerical analysis and computer algebra, real-root isolation of a polynomial consist of producing disjoint intervals of the real line, which contain each one (and only one) real root of the polynomial, and, together, contain all the real roots of the polynomial.

Contents

Real-root isolation is useful because usual root-finding algorithms for computing the real roots of a polynomial may produce some real roots, but, cannot generally certify having found all real roots. In particular, if such an algorithm does not find any root, one does not know whether it is because there is no real root. Some algorithms compute all complex roots, but, as there are generally much fewer real roots than complex roots, most of their computation time is generally spent for computing non-real roots (in the average, a polynomial of degree n has n complex roots, and only log n real roots; see Geometrical properties of polynomial roots § Real roots). Moreover, it may be difficult to distinguish the real roots from the non-real roots with small imaginary part (see the example of Wilkinson's polynomial in next section).

The first complete real-root isolation algorithm results from Sturm's theorem (1829). However, when real-root-isolation algorithms began to be implemented on computers it appeared that algorithms derived from Sturm's theorem are less efficient than those derived from Descartes' rule of signs (1637).

Since the beginning of 20th century there is an active research activity for improving the algorithms derived from Descartes' rule of signs, getting very efficient implementations, and computing their computational complexity. The best implementations can routinely isolate real roots of polynomials of degree more than 1,000. [1] [2]

Specifications and general strategy

For finding real roots of a polynomial, the common strategy is to divide the real line (or an interval of it where root are searched) into disjoint intervals until having at most one root in each interval. Such a procedure is called root isolation, and a resulting interval that contains exactly one root is an isolating interval for this root.

Wilkinson's polynomial shows that a very small modification of one coefficient of a polynomial may change dramatically not only the value of the roots, but also their nature (real or complex). Also, even with a good approximation, when one evaluates a polynomial at an approximate root, one may get a result that is far to be close to zero. For example, if a polynomial of degree 20 (the degree of Wilkinson's polynomial) has a root close to 10, the derivative of the polynomial at the root may be of the order of this implies that an error of on the value of the root may produce a value of the polynomial at the approximate root that is of the order of It follows that, except maybe for very low degrees, a root-isolation procedure cannot give reliable results without using exact arithmetic. Therefore, if one wants to isolate roots of a polynomial with floating-point coefficients, it is often better to convert them to rational numbers, and then take the primitive part of the resulting polynomial, for having a polynomial with integer coefficients.

For this reason, although the methods that are described below work theoretically with real numbers, they are generally used in practice with polynomials with integer coefficients, and intervals ending with rational numbers. Also, the polynomials are always supposed to be square free. There are two reasons for that. Firstly Yun's algorithm for computing the square-free factorization is less costly than twice the cost of the computation of the greatest common divisor of the polynomial and its derivative. As this may produce factors of lower degrees, it is generally advantageous to apply root-isolation algorithms only on polynomials without multiple roots, even when this is not required by the algorithm. The second reason for considering only square-free polynomials is that the fastest root-isolation algorithms do not work in the case of multiple roots.

For root isolation, one requires a procedure for counting the real roots of a polynomial in an interval without having to compute them, or, at least a procedure for deciding whether an interval contains zero, one or more roots. With such a decision procedure, one may work with a working list of intervals that may contain real roots. At the beginning, the list contains a single interval containing all roots of interest, generally the whole real line or its positive part. Then each interval of the list is divided into two smaller intervals. If one of the new intervals does not contain any root, it is removed from the list. If it contains one root, it is put in an output list of isolating intervals. Otherwise, it is kept in the working list for further divisions, and the process may continue until all roots are eventually isolated

Sturm's theorem

The first complete root-isolation procedure results of Sturm's theorem (1829), which expresses the number of real roots in an interval in terms of the number of sign variations of the values of a sequence of polynomials, called Sturm's sequence, at the ends of the interval. Sturm's sequence is the sequence of remainders that occur in a variant of Euclidean algorithm applied to the polynomial and its derivatives. When implemented on computers, it appeared that root isolation with Sturm's theorem is less efficient than the other methods that are described below. [3] Consequently, Sturm's theorem is rarely used for effective computations, although it remains useful for theoretical purposes.

Descartes' rule of signs and its generalizations

Descartes' rule of signs asserts that the difference between the number of sign variations in the sequence of the coefficients of a polynomial and the number of its positive real roots is a nonnegative even integer. It results that if this number of sign variations is zero, then the polynomial does not have any positive real roots, and, if this number is one, then the polynomial has a unique positive real root, which is a single root. Unfortunately the converse is not true, that is, a polynomial which has either no positive real root or has a single positive simple root may have a number of sign variations greater than 1.

This has been generalized by Budan's theorem (1807), into a similar result for the real roots in a half-open interval (a, b]: If f(x) is a polynomial, and v is the difference between of the numbers of sign variations of the sequences of the coefficients of f(x + a) and f(x + b), then v minus the number of real roots in the interval, counted with their multiplicities, is a nonnegative even integer. This is a generalization of Descartes' rule of signs, because, for b sufficiently large, there is no sign variation in the coefficients of f(x + b), and all real roots are smaller than b.

Budan's may provide a real-root-isolation algorithm for a square-free polynomial (a polynomial without multiple root): from the coefficients of polynomial, one may compute an upper bound M of the absolute values of the roots and a lower bound m on the absolute values of the differences of two roots (see Properties of polynomial roots). Then, if one divides the interval [–M, M] into intervals of length less than m, then every real root is contained in some interval, and no interval contains two roots. The isolating intervals are thus the intervals for which Budan's theorem asserts an odd number of roots.

However, this algorithm is very inefficient, as one cannot use a coarser partition of the interval [–M, M], because, if Budan's theorem gives a result larger than 1 for an interval of larger size, there is no way for insuring that it does not contain several roots.

Vincent's theorem (1834) [4] provides a method for real-root isolation, which is at the basis of the most efficient real-root-isolation algorithms. It concerns the positive real roots of a square-free polynomial (that is a polynomial without multiple roots). If is a sequence of positive real numbers, let

be the kth convergent of the continued fraction

Vincent's theorem  Let be a square-free polynomial of degree n, and be a sequence of real numbers. For i = 1, 2,..., consider the polynomial

Then, there is an integer k such that either or the sequence of the coefficients of has at most one sign variation.

In the first case, the convergent ck is a positive root of Otherwise, this number of sign variations (either 0 or 1) is the number of real roots of in the interval defined by and

For proving his theorem, Vincent proved a result that is useful on its own: [4]

Vincent's auxiliary theorem  If p(x) is a square-free polynomial of degree n, and a, b, c, d are nonnegative real numbers such that is small enough (but not 0), then there is at most one sign variation in the coefficients of the polynomial

and this number of sign variations is the number of real roots of p(x) in the open interval defined by and

For working with real numbers, one may always choose c = d = 1, but, as effective computations are done with rational numbers, it is generally convenient to suppose that a, b, c, d are integers.

The "small enough" condition has been quantified independently by Nikola Obreshkov, [5] and Alexander Ostrowski: [6]

Obreschkoff-Ostrowski theorem: in blue and yellow, the regions of the complex plane where there should be no root for having 0 or 1 sign variation; on the left the regions excluded for the roots of p, on the right, the regions excluded for the roots of the transformed polynomial q; in blue, the regions that are excluded for having one sign variation, but allowed for zero sign variations. Sketch of proof.jpg
Obreschkoff–Ostrowski theorem: in blue and yellow, the regions of the complex plane where there should be no root for having 0 or 1 sign variation; on the left the regions excluded for the roots of p, on the right, the regions excluded for the roots of the transformed polynomial q; in blue, the regions that are excluded for having one sign variation, but allowed for zero sign variations.

Theorem (Obreschkoff–Ostrowski)  The conclusion of Vincent's auxiliary result holds if the polynomial p(x) has at most one root α + such that

In particular the conclusion holds if

where sep(p) is the minimal distance between two roots of p.

For polynomials with integer coefficients, the minimum distance sep(p) may be lower bounded in terms of the degree of the polynomial and the maximal absolute value of its coefficients; see Properties of polynomial roots § Root separation. This allows the analysis of worst-case complexity of algorithms based on Vincent's theorems. However, Obreschkoff–Ostrowski theorem shows that the number of iterations of these algorithms depend on the distances between roots in the neighborhood of the working interval; therefore, the number of iterations may vary dramatically for different roots of the same polynomial.

James V. Uspensky gave a bound on the length of the continued fraction (the integer k in Vincent's theorem), for getting zero or one sign variations: [1] [7]

Theorem (Uspensky)   Let p(x) be a polynomial of degree n, and sep(p) be the minimal distance between two roots of p. Let

Then the integer k, whose existence is asserted in Vincent's theorem, is not greater than the smallest integer h such that

where is the hth Fibonacci number.

Continued fraction method

The use of continued fractions for real-root isolation has been introduced by Vincent, although he credited Joseph-Louis Lagrange for this idea, without providing a reference. [4] For making an algorithm of Vincent's theorem, one must provide a criterion for choosing the that occur in his theorem. Vincent himself provided some choice (see below). Some other choices are possible, and the efficiency of the algorithm may depend dramatically on these choices. Below is presented an algorithm, in which these choices result from an auxiliary function that will be discussed later.

For running this algorithm one must work with a list of intervals represented by a specific data structure. The algorithm works by choosing an interval, removing it from the list, adding zero, one or two smaller intervals to the list, and possibly outputs an isolation interval.

For isolating the real roots of a polynomial p(x) of degree n, each interval is represented by a pair where A(x) is a polynomial of degree n and is a Möbius transformation with integer coefficients. One has

and the interval represented by this data structure is the interval that has and as end points. The Möbius transformation maps the roots of p in this interval to the roots of A in (0, +∞).

The algorithm works with a list of intervals that, at the beginning, contains the two intervals and corresponding to the partition of the reals into the positive and the negative ones (one may suppose that zero is not a root, as, if it were, it suffices to apply the algorithm to p(x)/x). Then for each interval (A(x), M(x)) in the list, the algorithm remove it from the list; if the number of sign variations of the coefficients of A is zero, there is no root in the interval, and one passes to the next interval. If the number of sign variations is one, the interval defined by and is an isolating interval. Otherwise, one chooses a positive real number b for dividing the interval (0, +∞) into (0, b) and (b, +∞), and, for each subinterval, one composes M with a Möbius transformation that maps the interval onto (0, +∞), for getting two new intervals to be added to the list. In pseudocode, this gives the following, where var(A) denotes the number of sign variations of the coefficients of the polynomial A.

function continued fraction isinput: P(x), a square-free polynomial,     output: a list of pairs of rational numbers defining isolating intervals     /* Initialization */         L := [(P(x), x), (P(–x), –x)]                /* two starting intervals */         Isol := [ ]     /* Computation */     while L  [ ] doChoose (A(x), M(x)) in L, and remove it from L         v := var(A)         if v = 0 then exit                /* no root in the interval */         if v = 1 then                     /* isolating interval found */             add (M(0), M(∞)) to Isol             exit         b := some positive integer         B(x) = A(x + b)         w := v – var(B)         if B(0) = 0 then                         /* rational root found */             add (M(b), M(b)) to Isol             B(x) := B(x)/x         add (B(x),  M(b + x)) to L           /* roots in (M(b), M(+∞)) */         if w = 0 then exit                  /* Budan's theorem */          if w = 1 then                       /* Budan's theorem again */              add (M(0), M(b)) to Isol         if w > 1 thenadd ( A(b/(1 + x)),  M(b/(1 + x)) )to L      /* roots in (M(0), M(b)) */

The different variants of the algorithm depend essentially on the choice of b. In Vincent's papers, and in Uspensky's book, one has always b = 1, with the difference that Uspensky did not use Budan's theorem for avoiding further bisections of the interval associated to (0, b)

The drawback of always choosing b = 1 is that one has to do many successive changes of variable of the form x → 1 + x. These may be replaced by a single change of variable xn + x, but, nevertheless, one has to do the intermediate changes of variables for applying Budan's theorem.

A way for improving the efficiency of the algorithm is to take for b a lower bound of the positive real roots, computed from the coefficients of the polynomial (see Properties of polynomial roots for such bounds). [8] [1]

Bisection method

The bisection method consists roughly of starting from an interval containing all real roots of a polynomial, and divides it recursively into two parts until getting eventually intervals that contain either zero or one root. The starting interval may be of the form (-B, B), where B is an upper bound on the absolute values of the roots, such as those that are given in Properties of polynomial roots § Bounds on (complex) polynomial roots. For technical reasons (simpler changes of variable, simpler complexity analysis, possibility of taking advantage of the binary analysis of computers), the algorithms are generally presented as starting with the interval [0, 1]. There is no loss of generality, as the changes of variables x = By and x = –By move respectively the positive and the negative roots in the interval [0, 1]. (The single changes variable x = (2ByB) may also be used.)

The method requires an algorithm for testing whether an interval has zero, one, or possibly several roots, and for warranting termination, this testing algorithm must exclude the possibility of getting infinitely many times the output "possibility of several roots". Sturm's theorem and Vincent's auxiliary theorem provide such convenient tests. As the use Descartes' rule of signs and Vincent's auxiliary theorem is much more computationally efficient than the use of Sturm's theorem, only the former is described in this section.

The bisection method based on Descartes' rules of signs and Vincent's auxiliary theorem has been introduced in 1976 by Akritas and Collins under the name of Modified Uspensky algorithm, [3] and has been referred to as the Uspensky algorithm, the Vincent–Akritas–Collins algorithm, or Descartes method, although Descartes, Vincent and Uspensky never described it.

The method works as follows. For searching the roots in some interval, one changes first the variable for mapping the interval onto [0, 1] giving a new polynomial q(x). For searching the roots of q in [0, 1], one maps the interval [0, 1] onto [0, +∞]) by the change of variable giving a polynomial r(x). Descartes' rule of signs applied to the polynomial r gives indications on the number of real roots of q in the interval [0, 1], and thus on the number of roots of the initial polynomial in the interval that has been mapped on [0, 1]. If there is no sign variation in the sequence of the coefficients of r, then there is no real root in the considered intervals. If there is one sign variation, then one has an isolation interval. Otherwise, one splits the interval [0, 1] into [0, 1/2] and [1/2, 1], one maps them onto [0, 1] by the changes of variable x = y/2 and x = (y + 1)/2. Vincent's auxiliary theorem insures the termination of this procedure.

Except for the initialization, all these changes of variable consists of the composition of at most two very simple changes of variable which are the scalings by two xx/2 , the translation xx + 1, and the inversion x → 1/x, the latter consisting simply of reverting the order of the coefficients of the polynomial. As most of the computing time is devoted to changes of variable, the method consisting of mapping every interval to [0, 1] is fundamental for insuring a good efficiency.

Pseudocode

The following notation is used in the pseudocode that follows.

function bisection isinput: p(x), a square-free polynomial, such that p(0) p(1) ≠ 0,                        for which the roots in the interval [0, 1] are searched     output: a list of triples (c, k, h),                        representing isolating intervals of the form      /* Initialization */     L := [(0, 0, p(x))] /* a single element in the working list L */     Isol := [ ]     n := degree(p)     /* Computation */     while L  [ ] doChoose (c, k, q(x))in L, and remove it from L         ifq(0) = 0thenq(x) := q(x)/x             n := n – 1                /* A rational root found */             add (c, k, 0) to Isol         v := if v = 1 then                /* An isolating interval found */             add (c, k, 1) to Isol         if v > 1 then                /* Bisecting */             add (2c, k + 1, to L             add (2c + 1, k + 1, to L     end

This procedure is essentially the one that has been described by Collins and Akritas. [3] The running time depends mainly on the number of intervals that have to be considered, and on the changes of variables. There are ways for improving the efficiency, which have been an active subject of research since the publication of the algorithm, and mainly since the beginning of the 21st century.

Recent improvements

Various ways for improving Akritas–Collins bisection algorithm have been proposed. They include a method for avoiding storing a long list of polynomials without losing the simplicity of the changes of variables, [9] the use of approximate arithmetic (floating point and interval arithmetic) when it allows getting the right value for the number of sign variations, [9] the use of Newton's method when possible, [9] the use of fast polynomial arithmetic, [10] shortcuts for long chains of bisections in case of clusters of close roots, [10] bisections in unequal parts for limiting instability problems in polynomial evaluation. [10]

All these improvement lead to an algorithm for isolating all real roots of a polynomial with integer coefficients, which has the complexity (using soft O notation, Õ, for omitting logarithmic factors)

where n is the degree of the polynomial, k is the number of nonzero terms, t is the maximum of digits of the coefficients. [10]

The implementation of this algorithm appears to be more efficient than any other implemented method for computing the real roots of a polynomial, even in the case of polynomials having very close roots (the case which was previously the most difficult for the bisection method). [2]

Related Research Articles

<span class="mw-page-title-main">Algebraic number</span> Complex number that is a root of a non-zero polynomial in one variable with rational coefficients

An algebraic number is a number that is a root of a non-zero polynomial in one variable with integer coefficients. For example, the golden ratio, , is an algebraic number, because it is a root of the polynomial x2x − 1. That is, it is a value for x for which the polynomial evaluates to zero. As another example, the complex number is algebraic because it is a root of x4 + 4.

In mathematics, Bézout's identity, named after Étienne Bézout who proved it for polynomials, is the following theorem:

In complex analysis, an entire function, also called an integral function, is a complex-valued function that is holomorphic on the whole complex plane. Typical examples of entire functions are polynomials and the exponential function, and any finite sums, products and compositions of these, such as the trigonometric functions sine and cosine and their hyperbolic counterparts sinh and cosh, as well as derivatives and integrals of entire functions such as the error function. If an entire function has a root at , then , taking the limit value at , is an entire function. On the other hand, the natural logarithm, the reciprocal function, and the square root are all not entire functions, nor can they be continued analytically to an entire function.

In mathematics, a polynomial is a mathematical expression consisting of indeterminates and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An example of a polynomial of a single indeterminate x is x2 − 4x + 7. An example with three indeterminates is x3 + 2xyz2yz + 1.

The fundamental theorem of algebra, also called d'Alembert's theorem or the d'Alembert–Gauss theorem, states that every non-constant single-variable polynomial with complex coefficients has at least one complex root. This includes polynomials with real coefficients, since every real number is a complex number with its imaginary part equal to zero.

In mathematics, the discriminant of a polynomial is a quantity that depends on the coefficients and allows deducing some properties of the roots without computing them. More precisely, it is a polynomial function of the coefficients of the original polynomial. The discriminant is widely used in polynomial factoring, number theory, and algebraic geometry.

<span class="mw-page-title-main">Factorization</span> (Mathematical) decomposition into a product

In mathematics, factorization (or factorisation, see English spelling differences) or factoring consists of writing a number or another mathematical object as a product of several factors, usually smaller or simpler objects of the same kind. For example, 3 × 5 is an integer factorization of 15, and (x – 2)(x + 2) is a polynomial factorization of x2 – 4.

In algebraic number theory, an algebraic integer is a complex number which is integral over the integers. That is, an algebraic integer is a complex root of some monic polynomial whose coefficients are integers. The set of all algebraic integers A is closed under addition, subtraction and multiplication and therefore is a commutative subring of the complex numbers.

<span class="mw-page-title-main">Root of unity</span> Number that has an integer power equal to 1

In mathematics, a root of unity, occasionally called a de Moivre number, is any complex number that yields 1 when raised to some positive integer power n. Roots of unity are used in many branches of mathematics, and are especially important in number theory, the theory of group characters, and the discrete Fourier transform.

In mathematics, an irreducible polynomial is, roughly speaking, a polynomial that cannot be factored into the product of two non-constant polynomials. The property of irreducibility depends on the nature of the coefficients that are accepted for the possible factors, that is, the field to which the coefficients of the polynomial and its possible factors are supposed to belong. For example, the polynomial x2 − 2 is a polynomial with integer coefficients, but, as every integer is also a real number, it is also a polynomial with real coefficients. It is irreducible if it is considered as a polynomial with integer coefficients, but it factors as if it is considered as a polynomial with real coefficients. One says that the polynomial x2 − 2 is irreducible over the integers but not over the reals.

In mathematics, an algebraic equation or polynomial equation is an equation of the form , where P is a polynomial with coefficients in some field, often the field of the rational numbers. For example, is an algebraic equation with integer coefficients and

In mathematics, the Sturm sequence of a univariate polynomial p is a sequence of polynomials associated with p and its derivative by a variant of Euclid's algorithm for polynomials. Sturm's theorem expresses the number of distinct real roots of p located in an interval in terms of the number of changes of signs of the values of the Sturm sequence at the bounds of the interval. Applied to the interval of all the real numbers, it gives the total number of real roots of p.

In mathematics, Descartes' rule of signs, first described by René Descartes in his work La Géométrie, is a technique for getting information on the number of positive real roots of a polynomial. It asserts that the number of positive roots is at most the number of sign changes in the sequence of polynomial's coefficients, and that the difference between these two numbers is always even. This implies, in particular, that if the number of sign changes is zero or one, then there are exactly zero or one positive roots, respectively.

In mathematics and computer algebra, factorization of polynomials or polynomial factorization expresses a polynomial with coefficients in a given field or in the integers as the product of irreducible factors with coefficients in the same domain. Polynomial factorization is one of the fundamental components of computer algebra systems.

In mathematics, a univariate polynomial of degree n with real or complex coefficients has n complex roots, if counted with their multiplicities. They form a multiset of n points in the complex plane. This article concerns the geometry of these points, that is the information about their localization in the complex plane that can be deduced from the degree and the coefficients of the polynomial.

In algebra, the greatest common divisor of two polynomials is a polynomial, of the highest possible degree, that is a factor of both the two original polynomials. This concept is analogous to the greatest common divisor of two integers.

A system of polynomial equations is a set of simultaneous equations f1 = 0, ..., fh = 0 where the fi are polynomials in several variables, say x1, ..., xn, over some field k.

In mathematics, Budan's theorem is a theorem for bounding the number of real roots of a polynomial in an interval, and computing the parity of this number. It was published in 1807 by François Budan de Boislaurent.

In mathematics, Vincent's theorem—named after Alexandre Joseph Hidulphe Vincent—is a theorem that isolates the real roots of polynomials with rational coefficients.

Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that up until the 19th century, algebra meant essentially theory of polynomial equations.

References

  1. 1 2 3 Tsigaridas & Emiris 2006
  2. 1 2 Kobel, Rouillier & Sagraloff 2016
  3. 1 2 3 Collins & Akritas 1976
  4. 1 2 3 Vincent 1834
  5. Obreschkoff 1963
  6. Ostrowski 1950
  7. Uspensky 1948
  8. Akritas & Strzeboński 2005
  9. 1 2 3 Rouillier & Zimmerman 2004
  10. 1 2 3 4 Sagraloff & Mehlhorn 2016

Sources