In mathematics, a univariate polynomial of degree n with real or complex coefficients has n complex roots, if counted with their multiplicities. They form a multiset of n points in the complex plane. This article concerns the geometry of these points, that is, the information about their localization in the complex plane that can be deduced from the degree and the coefficients of the polynomial.
Some of these geometrical properties are related to a single polynomial, such as upper bounds on the absolute values of the roots, which define a disk containing all roots, or lower bounds on the distance between two roots. Such bounds are widely used for root-finding algorithms for polynomials, either for tuning them, or for bounding their computational complexity.
Some other properties are probabilistic, such as the expected number of real roots of a random polynomial of degree n with real coefficients, which grows like $\frac{2}{\pi}\ln n$ and is thus less than the natural logarithm of n for n sufficiently large.
In this article, the polynomial that is considered is always denoted
$$p(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} + a_n x^n,$$
where $a_0, \ldots, a_n$ are real or complex numbers and $a_n \ne 0$; thus n is the degree of the polynomial.
The n roots of a polynomial of degree n depend continuously on the coefficients. For simple roots, this results immediately from the implicit function theorem. This is true also for multiple roots, but some care is needed for the proof.
A small change of coefficients may induce a dramatic change of the roots, including the change of a real root into a complex root with a rather large imaginary part (see Wilkinson's polynomial). A consequence is that, for classical numeric root-finding algorithms, the problem of approximating the roots given the coefficients can be ill-conditioned for many inputs.
The complex conjugate root theorem states that if the coefficients of a polynomial are real, then the non-real roots appear in pairs of the form (a + ib, a – ib).
It follows that the roots of a polynomial with real coefficients are mirror-symmetric with respect to the real axis.
This can be extended to algebraic conjugation: the roots of a polynomial with rational coefficients are conjugate (that is, invariant) under the action of the Galois group of the polynomial. However, this symmetry can rarely be interpreted geometrically.
Upper bounds on the absolute values of polynomial roots are widely used for root-finding algorithms, either for limiting the regions where roots should be searched for, or for bounding the computational complexity of these algorithms.
Many such bounds have been given, and the sharpest one generally depends on the specific sequence of coefficients that is considered. Most bounds are greater than or equal to one, and are thus not sharp for polynomials whose roots all have absolute value lower than one. However, such polynomials are very rare, as shown below.
Any upper bound on the absolute values of roots provides a corresponding lower bound. In fact, if $a_0 \ne 0$ and U is an upper bound of the absolute values of the roots of
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
then 1/U is a lower bound of the absolute values of the roots of
$$x^n\,p(1/x) = a_n + a_{n-1} x + \cdots + a_0 x^n,$$
since the roots of either polynomial are the multiplicative inverses of the roots of the other. Therefore, in the remainder of the article, lower bounds will not be given explicitly.
Lagrange and Cauchy were the first to provide upper bounds on all complex roots.[1] Lagrange's bound is[2]
$$\max\left(1,\ \sum_{i=0}^{n-1} \left|\frac{a_i}{a_n}\right|\right),$$
and Cauchy's bound is[3]
$$1 + \max\left(\left|\frac{a_{n-1}}{a_n}\right|,\ \left|\frac{a_{n-2}}{a_n}\right|,\ \ldots,\ \left|\frac{a_0}{a_n}\right|\right).$$
Lagrange's bound is sharper (smaller) than Cauchy's bound only when 1 is larger than the sum of all the $\left|\frac{a_i}{a_n}\right|$ but the largest one. This is relatively rare in practice, and explains why Cauchy's bound is more widely used than Lagrange's.
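To make the comparison concrete, here is a minimal Python sketch (my own illustration, not part of the article; the function names and the sample polynomial are arbitrary) that evaluates both bounds from a list of coefficients given in order of increasing degree.

```python
# Minimal sketch (assumes a[-1] != 0): Lagrange's and Cauchy's bounds for
# the roots of p(x) = a[0] + a[1]*x + ... + a[n]*x^n.

def lagrange_bound(a):
    # max(1, sum of |a_i / a_n| for 0 <= i < n)
    an = a[-1]
    return max(1.0, sum(abs(c / an) for c in a[:-1]))

def cauchy_bound(a):
    # 1 + max of |a_i / a_n| for 0 <= i < n
    an = a[-1]
    return 1.0 + max(abs(c / an) for c in a[:-1])

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]     # x^4 + 5x^3 + 5x^2 - 5x - 6, roots 1, -1, -2, -3
    print(lagrange_bound(a))  # 21.0
    print(cauchy_bound(a))    # 7.0: here Cauchy's bound is the sharper one
```

On this example the largest root has absolute value 3, so both bounds hold but neither is tight.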
Both bounds result from the Gershgorin circle theorem applied to the companion matrix of the polynomial and its transpose. They can also be proved by elementary methods.
Proof of Lagrange's and Cauchy's bounds: If z is a root of the polynomial, and |z| ≥ 1, one has
$$|a_n|\,|z|^n = \left|\sum_{i=0}^{n-1} a_i z^i\right| \le \sum_{i=0}^{n-1} |a_i|\,|z|^i \le |z|^{n-1} \sum_{i=0}^{n-1} |a_i|.$$
Dividing by $|a_n|\,|z|^{n-1}$, one gets
$$|z| \le \sum_{i=0}^{n-1} \left|\frac{a_i}{a_n}\right|,$$
which is Lagrange's bound when there is at least one root of absolute value larger than 1. Otherwise, 1 is a bound on the roots, and is not larger than Lagrange's bound.
Similarly, for Cauchy's bound, one has, if |z| ≥ 1,
$$|a_n|\,|z|^n \le \max_{0 \le i < n} |a_i| \sum_{i=0}^{n-1} |z|^i = \max_{0 \le i < n} |a_i|\,\frac{|z|^n - 1}{|z| - 1} \le \max_{0 \le i < n} |a_i|\,\frac{|z|^n}{|z| - 1}.$$
Thus
$$|z| - 1 \le \max_{0 \le i < n} \left|\frac{a_i}{a_n}\right|.$$
Solving for |z|, one gets Cauchy's bound if there is a root of absolute value larger than 1. Otherwise the bound is also correct, as Cauchy's bound is larger than 1.
These bounds are not invariant by scaling. That is, the roots of the polynomial p(sx) are the quotient by s of the roots of p, and the bounds given for the roots of p(sx) are not the quotient by s of the bounds of p. Thus, one may get sharper bounds by minimizing over possible scalings. This gives
$$\min_{s > 0} \max\left(s,\ \sum_{i=0}^{n-1} \left|\frac{a_i}{a_n}\right| s^{\,i-n+1}\right)$$
and
$$\min_{s > 0} \left(s + \max_{0 \le i \le n-1} \left|\frac{a_i}{a_n}\right| s^{\,i-n+1}\right)$$
for Lagrange's and Cauchy's bounds respectively.
Another bound, originally given by Lagrange, but attributed to Zassenhaus by Donald Knuth, is[4]
$$2\,\max\left(\left|\frac{a_{n-1}}{a_n}\right|,\ \left|\frac{a_{n-2}}{a_n}\right|^{\frac{1}{2}},\ \ldots,\ \left|\frac{a_0}{a_n}\right|^{\frac{1}{n}}\right).$$
This bound is invariant by scaling.
Proof of the preceding bound: Let A be the largest $\left|\frac{a_i}{a_n}\right|^{\frac{1}{n-i}}$ for 0 ≤ i < n. Thus one has
$$|a_i| \le A^{\,n-i}\,|a_n| \quad \text{for } 0 \le i < n.$$
If z is a root of p, one has
$$-a_n z^n = \sum_{i=0}^{n-1} a_i z^i,$$
and thus, after dividing by $|a_n|$,
$$|z|^n \le \sum_{i=0}^{n-1} A^{\,n-i}\,|z|^i = \sum_{j=1}^{n} A^{\,j}\,|z|^{\,n-j}.$$
As we want to prove |z| ≤ 2A, we may suppose that |z| > A (otherwise there is nothing to prove). Thus
$$|z|^n \le \sum_{j=1}^{\infty} A^{\,j}\,|z|^{\,n-j} = \frac{A\,|z|^n}{|z| - A},$$
which gives the result, since $|z| - A \le A$.
Lagrange improved this latter bound into the sum of the two largest values (possibly equal) in the sequence[4]
$$\left[\left|\frac{a_0}{a_n}\right|^{\frac{1}{n}},\ \left|\frac{a_1}{a_n}\right|^{\frac{1}{n-1}},\ \ldots,\ \left|\frac{a_{n-1}}{a_n}\right|\right].$$
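The following sketch (mine, using the same coefficient convention as above) evaluates the scale-invariant Lagrange–Zassenhaus bound and Lagrange's improvement, the sum of the two largest elements of the same sequence.

```python
# Sketch of the Lagrange-Zassenhaus bound 2 * max_i |a_i/a_n|^(1/(n-i)) and of
# Lagrange's improvement (sum of the two largest elements of the sequence).

def bound_sequence(a):
    n = len(a) - 1
    an = a[-1]
    return [abs(a[i] / an) ** (1.0 / (n - i)) for i in range(n)]

def zassenhaus_bound(a):
    return 2.0 * max(bound_sequence(a))

def lagrange_two_largest_bound(a):
    s = sorted(bound_sequence(a), reverse=True)
    return s[0] + s[1] if len(s) > 1 else s[0]

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]                 # roots 1, -1, -2, -3
    print(zassenhaus_bound(a))            # 10.0
    print(lagrange_two_largest_bound(a))  # 5 + sqrt(5) = 7.236...
```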
Lagrange also provided the bound[citation needed]
$$\sum_i \left|\frac{b_i}{b_{i+1}}\right|,$$
where $b_i$ denotes the ith nonzero coefficient when the terms of the polynomial are sorted by increasing degrees.
Hölder's inequality allows the extension of Lagrange's and Cauchy's bounds to every h-norm. The h-norm of a sequence
$$s = (a_0, \ldots, a_n)$$
is
$$\|s\|_h = \left(\sum_{i=0}^{n} |a_i|^h\right)^{1/h}$$
for any real number h ≥ 1, and
$$\|s\|_\infty = \max_{0 \le i \le n} |a_i|.$$
If $\frac{1}{h} + \frac{1}{k} = 1$, with 1 ≤ h, k ≤ ∞ and 1/∞ = 0, an upper bound on the absolute values of the roots of p is
$$\left\|\left(1,\ \frac{\left\|(a_0, \ldots, a_{n-1})\right\|_h}{|a_n|}\right)\right\|_k.$$
For k = 1 and k = ∞, one gets respectively Cauchy's and Lagrange's bounds.
For h = k = 2, one has the bound
$$\frac{1}{|a_n|}\sqrt{\sum_{i=0}^{n} |a_i|^2}.$$
This is not only a bound of the absolute values of the roots, but also a bound of the product of their absolute values larger than 1; see § Landau's inequality, below.
Proof: Let z be a root of the polynomial
$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0.$$
Setting
$$A = \frac{\left\|(a_0, a_1, \ldots, a_{n-1})\right\|_h}{|a_n|},$$
we have to prove that every root z of p satisfies
$$|z| \le \|(1, A)\|_k.$$
If $|z| \le 1$, the inequality is true; so, one may suppose $|z| > 1$ for the remainder of the proof. Writing the equation as
$$-z^n = \frac{a_{n-1}}{a_n} z^{n-1} + \cdots + \frac{a_0}{a_n},$$
Hölder's inequality implies
$$|z|^n \le A\,\left\|\left(z^{n-1}, \ldots, z, 1\right)\right\|_k.$$
If k = ∞, this is
$$|z|^n \le A\,|z|^{n-1}.$$
Thus
$$|z| \le A \le \|(1, A)\|_\infty.$$
In the case 1 ≤ k < ∞, the summation formula for a geometric progression gives
$$\left\|\left(z^{n-1}, \ldots, z, 1\right)\right\|_k^k = 1 + |z|^k + \cdots + |z|^{(n-1)k} = \frac{|z|^{kn} - 1}{|z|^k - 1} \le \frac{|z|^{kn}}{|z|^k - 1}.$$
Thus
$$|z|^{kn} \le A^k\,\frac{|z|^{kn}}{|z|^k - 1},$$
which simplifies to
$$|z|^k \le 1 + A^k.$$
Thus, in all cases,
$$|z| \le \|(1, A)\|_k,$$
which finishes the proof.
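As a sanity check, the Hölder-norm family can be evaluated numerically; the following sketch (my own, not from the article) recovers Lagrange's bound at h = 1 and Cauchy's bound at h = ∞.

```python
# Sketch of the Hoelder-norm root bound ||(1, A)||_k with
# A = ||(a_0, ..., a_{n-1})||_h / |a_n| and 1/h + 1/k = 1.
import math

def norm(v, h):
    if math.isinf(h):
        return max(abs(x) for x in v)
    return sum(abs(x) ** h for x in v) ** (1.0 / h)

def holder_bound(a, h):
    # conjugate exponent: k = inf for h = 1, k = 1 for h = inf
    k = math.inf if h == 1 else (1.0 if math.isinf(h) else h / (h - 1.0))
    A = norm(a[:-1], h) / abs(a[-1])
    return norm([1.0, A], k)

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]
    print(holder_bound(a, 1))         # 21.0, Lagrange's bound
    print(holder_bound(a, math.inf))  # 7.0, Cauchy's bound
    print(holder_bound(a, 2))         # sqrt(112) ≈ 10.58, the h = k = 2 bound
```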
Many other upper bounds for the magnitudes of all roots have been given. [5]
Fujiwara's bound[6]
$$2\,\max\left(\left|\frac{a_{n-1}}{a_n}\right|,\ \left|\frac{a_{n-2}}{a_n}\right|^{\frac{1}{2}},\ \ldots,\ \left|\frac{a_1}{a_n}\right|^{\frac{1}{n-1}},\ \left|\frac{a_0}{2a_n}\right|^{\frac{1}{n}}\right)$$
slightly improves the bound given above by dividing the last argument of the maximum by two.
Kojima's bound is[7][verification needed]
$$2\,\max\left(\left|\frac{b_{m-1}}{b_m}\right|,\ \left|\frac{b_{m-2}}{b_{m-1}}\right|,\ \ldots,\ \left|\frac{b_0}{2 b_1}\right|\right),$$
where $b_0, \ldots, b_m$ denote the nonzero coefficients when the terms of the polynomial are sorted by increasing degrees. If all coefficients are nonzero, Fujiwara's bound is sharper, since each element in Fujiwara's bound is the geometric mean of the first elements in Kojima's bound.
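Both bounds are straightforward to transcribe into code. The sketch below (mine) follows the displayed formulas, including the halved last argument; for Kojima's bound the verification-needed caveat above applies, so this merely transcribes the stated formula.

```python
# Sketch of Fujiwara's bound and of Kojima's bound as displayed above.

def fujiwara_bound(a):
    n = len(a) - 1
    an = a[-1]
    terms = [abs(a[i] / an) ** (1.0 / (n - i)) for i in range(1, n)]
    terms.append(abs(a[0] / (2 * an)) ** (1.0 / n))    # last argument halved
    return 2.0 * max(terms)

def kojima_bound(a):
    b = [c for c in a if c != 0]     # nonzero coefficients, increasing degree
    ratios = [abs(b[i] / b[i + 1]) for i in range(1, len(b) - 1)]
    ratios.append(abs(b[0] / (2 * b[1])))              # lowest-degree ratio halved
    return 2.0 * max(ratios)

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]      # roots 1, -1, -2, -3
    print(fujiwara_bound(a))   # 10.0
    print(kojima_bound(a))     # 10.0
```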
Sun and Hsieh obtained another improvement on Cauchy's bound.[8] Assume the polynomial is monic with general term $a_i x^i$. Sun and Hsieh showed that upper bounds $1 + d_1$ and $1 + d_2$ can be obtained from auxiliary equations in the coefficients; in particular, $d_2$ is the positive root of a cubic equation. They also noted that $d_2 \le d_1$.
The previous bounds are upper bounds for each root separately. Landau's inequality provides an upper bound for the absolute values of the product of the roots that have an absolute value greater than one. This inequality, discovered in 1905 by Edmund Landau, [9] has been forgotten and rediscovered at least three times during the 20th century. [10] [11] [12]
This bound of the product of roots is not much greater than the best preceding bounds of each root separately.[13] Let $z_1, \ldots, z_n$ be the n roots of the polynomial p. If
$$M(p) = |a_n| \prod_{i=1}^{n} \max(1, |z_i|)$$
is the Mahler measure of p, then
$$M(p) \le \sqrt{\sum_{i=0}^{n} |a_i|^2}.$$
Surprisingly, this bound on the product of the absolute values larger than 1 of the roots is even exactly equal to the bound on a single root obtained above from Hölder's inequality with h = k = 2.
This bound is also useful to bound the coefficients of a divisor of a polynomial with integer coefficients:[14] if
$$q = \sum_{i=0}^{m} b_i x^i$$
is a divisor of p, then
$$M(q) \le \frac{|b_m|}{|a_n|}\,M(p),$$
and, by Vieta's formulas,
$$|b_i| \le \binom{m}{i} M(q)$$
for i = 0, ..., m, where $\binom{m}{i}$ is a binomial coefficient. Thus
$$|b_i| \le \binom{m}{i} \frac{|b_m|}{|a_n|}\,M(p)$$
and
$$\sum_{i=0}^{m} |b_i| \le 2^m\,\frac{|b_m|}{|a_n|}\,M(p).$$
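A quick numerical check (my own sketch; the polynomial and its divisor are arbitrary, and the roots, hence the Mahler measure, are only approximated) of Landau's inequality and of the resulting coefficient bound for a divisor:

```python
# Check M(p) <= ||p||_2 and |b_i| <= C(m,i) * |b_m|/|a_n| * M(p) numerically.
from math import comb
import numpy as np

def mahler_measure(a):
    # M(p) = |a_n| * prod max(1, |z|) over the roots z, approximated by np.roots
    z = np.roots(a[::-1])              # np.roots expects highest degree first
    return abs(a[-1]) * float(np.prod(np.maximum(1.0, np.abs(z))))

p = [-6, -5, 5, 5, 1]                  # (x - 1)(x + 1)(x + 2)(x + 3)
q = [3, 4, 1]                          # divisor (x + 1)(x + 3) = x^2 + 4x + 3
Mp = mahler_measure(p)
print(Mp, float(np.linalg.norm(p)))    # 6.0 <= sqrt(112) = 10.58...
m = len(q) - 1
for i, b in enumerate(q):              # each |b_i| against its bound
    print(abs(b), comb(m, i) * abs(q[-1]) / abs(p[-1]) * Mp)
```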
Rouché's theorem allows defining discs centered at zero and containing a given number of roots. More precisely, if there is a positive real number R and an integer 0 ≤ k ≤ n such that
$$|a_k|\,R^k > \sum_{i \ne k} |a_i|\,R^i,$$
then there are exactly k roots, counted with multiplicity, of absolute value less than R.
Proof: If
$$|a_k|\,R^k > \sum_{i \ne k} |a_i|\,R^i,$$
then, for |z| = R,
$$\left|p(z) - a_k z^k\right| \le \sum_{i \ne k} |a_i|\,R^i < |a_k|\,R^k = \left|a_k z^k\right|.$$
By Rouché's theorem, this implies directly that $p(z)$ and $a_k z^k$ have the same number of roots of absolute value less than R, counted with multiplicities. As this number is k, the result is proved.
The above result may be applied if the polynomial
$$h_k(x) = \sum_{i \ne k} |a_i|\,x^i - |a_k|\,x^k$$
takes a negative value for some positive real value of x.
In the remainder of this section, suppose that $a_0 \ne 0$. If this is not the case, zero is a root, and the localization of the other roots may be studied by dividing the polynomial by a power of the indeterminate, getting a polynomial with a nonzero constant term.
For k = 0 and k = n, Descartes' rule of signs shows that the polynomial $h_k$ has exactly one positive real root. If $R_0$ and $R_n$ are these roots, the above result shows that all the roots z of p satisfy
$$R_0 \le |z| \le R_n.$$
As these inequalities apply also to $h_0$ and $h_n$, these bounds are optimal for polynomials with a given sequence of the absolute values of their coefficients. They are thus sharper than all bounds given in the preceding sections.
For 0 < k < n, Descartes' rule of signs implies that $h_k$ either has two positive real roots that are not multiple, or is nonnegative for every positive value of x. So, the above result may be applied only in the first case. If $R_{k,1} < R_{k,2}$ are these two roots, the above result implies that
$$|z| \le R_{k,1}$$
for k roots of p, and that
$$|z| \ge R_{k,2}$$
for the n – k other roots.
Instead of explicitly computing $R_{k,1}$ and $R_{k,2}$, it is generally sufficient to compute a value $R_k$ such that $h_k(R_k) < 0$ (necessarily $R_{k,1} < R_k < R_{k,2}$). These $R_k$ have the property of separating the roots in terms of their absolute values: if, for h < k, both $R_h$ and $R_k$ exist, there are exactly k – h roots z such that $R_h < |z| < R_k$.
For computing $R_k$, one can use the fact that $h_k(x)/x^k$ is a convex function of x on the positive reals (its second derivative is positive there). Thus $R_k$ exists if and only if $h_k(x)/x^k$ is negative at its unique minimum. For computing this minimum, one can use any optimization method, or, alternatively, Newton's method for computing the unique positive zero of the derivative of $h_k(x)/x^k$ (it converges rapidly, as the derivative is a monotonic function).
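A rough sketch of this computation (mine; it uses a fixed search bracket and golden-section search on the convex function instead of the Newton iteration mentioned above, purely for brevity):

```python
# Search for R_k with h_k(R_k) < 0 by minimizing the convex g(x) = h_k(x)/x^k.
import math

def g(a, k, x):
    # h_k(x)/x^k = sum_{i != k} |a_i| x^(i-k) - |a_k|, convex for x > 0
    return sum(abs(a[i]) * x ** (i - k) for i in range(len(a)) if i != k) - abs(a[k])

def find_Rk(a, k, lo=1e-9, hi=1e9, iters=200):
    # golden-section search; returns a point where g < 0 (a valid R_k),
    # or None when the minimum of g over the bracket is nonnegative
    phi = (math.sqrt(5) - 1) / 2
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    for _ in range(iters):
        if g(a, k, x1) < g(a, k, x2):
            hi, x2 = x2, x1
            x1 = hi - phi * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + phi * (hi - lo)
    x = (lo + hi) / 2
    return x if g(a, k, x) < 0 else None

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]     # root moduli 1, 1, 2, 3
    for k in range(len(a)):
        print(k, find_Rk(a, k))
```

For this example the intermediate values of k yield None (the minimum of the convex function is positive there), while k = 0 and k = n return crude but valid radii near the ends of the bracket.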
One can increase the number of existing $R_k$'s by applying the root-squaring operation of the Dandelin–Graeffe iteration. If the roots have distinct absolute values, one can eventually completely separate the roots in terms of their absolute values, that is, compute n + 1 positive numbers $R_0 < R_1 < \cdots < R_n$ such that there is exactly one root with an absolute value in the open interval $(R_{k-1}, R_k)$, for k = 1, ..., n.
The Gershgorin circle theorem, applied to the companion matrix of the polynomial expressed on a basis related to Lagrange interpolation, defines discs centered at the interpolation points, each containing a root of the polynomial; see Durand–Kerner method § Root inclusion via Gerschgorin's circles for details.
If the interpolation points are close to the roots of the polynomial, the radii of the discs are small, and this is a key ingredient of the Durand–Kerner method for computing polynomial roots.
For polynomials with real coefficients, it is often useful to bound only the real roots. It suffices to bound the positive roots, as the negative roots of p(x) are the positive roots of p(–x).
Clearly, every bound of all roots applies also to real roots. But in some contexts, tighter bounds of real roots are useful. For example, the efficiency of the method of continued fractions for real-root isolation strongly depends on the tightness of a bound of positive roots. This has led to establishing new bounds that are tighter than the general bounds of all roots. These bounds are generally expressed not only in terms of the absolute values of the coefficients, but also in terms of their signs.
Other bounds apply only to polynomials whose roots are all real (see below).
To give a bound of the positive roots, one can assume $a_n > 0$ without loss of generality, as changing the signs of all coefficients does not change the roots.
Every upper bound of the positive roots of
$$q(x) = a_n x^n + \sum_{i=0}^{n-1} \min(0, a_i)\,x^i$$
is also a bound of the real zeros of
$$p(x) = a_n x^n + \sum_{i=0}^{n-1} a_i x^i.$$
In fact, if B is such a bound, for all x > B, one has p(x) ≥ q(x) > 0.
Applied to Cauchy's bound, this gives the upper bound
$$1 + \max_{0 \le i \le n-1} \frac{\max(0, -a_i)}{a_n}$$
for the real roots of a polynomial with real coefficients. If this bound is not greater than 1, this means that all nonzero coefficients have the same sign, and that there is no positive root.
Similarly, another upper bound of the positive roots is
$$2\,\max_{\{i \,:\, a_i < 0\}} \left(-\frac{a_i}{a_n}\right)^{\frac{1}{n-i}}.$$
If all nonzero coefficients have the same sign, there is no positive root, and the maximum must be defined as being zero.
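Both positive-root bounds are easy to evaluate. In this sketch (mine, assuming $a_n > 0$ as above), the maximum over an empty set of negative coefficients is taken to be zero.

```python
# Upper bounds for the positive roots of p (coefficients by increasing degree,
# a[-1] > 0): Cauchy's bound applied to q, and the power-ratio variant.

def positive_root_bound_cauchy(a):
    an = a[-1]
    return 1.0 + max(max(0.0, -c) / an for c in a[:-1])

def positive_root_bound_power(a):
    n = len(a) - 1
    an = a[-1]
    neg = [(i, c) for i, c in enumerate(a[:-1]) if c < 0]
    if not neg:
        return 0.0           # all coefficients of one sign: no positive root
    return 2.0 * max((-c / an) ** (1.0 / (n - i)) for i, c in neg)

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]                 # the only positive root of p is 1
    print(positive_root_bound_cauchy(a))  # 7.0
    print(positive_root_bound_power(a))   # 2 * 5^(1/3) = 3.42...
```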
Other bounds have been recently developed, mainly for the method of continued fractions for real-root isolation. [15] [16]
If all roots of a polynomial are real, Laguerre proved the following lower and upper bounds of the roots, by using what is now called Samuelson's inequality. [17]
Let
$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$
be a polynomial whose roots are all real. Then its roots are located in the interval with endpoints
$$x_\pm = -\frac{a_{n-1}}{n a_n} \pm \frac{n-1}{n a_n}\sqrt{a_{n-1}^2 - \frac{2n}{n-1}\,a_n a_{n-2}}.$$
For example, the roots of the polynomial $x^4 + 5x^3 + 5x^2 - 5x - 6 = (x + 3)(x + 2)(x + 1)(x - 1)$ satisfy
$$-3.8118 \le x \le 1.3118.$$
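The endpoints are straightforward to compute; here is a sketch (mine, assuming $a_n > 0$ and a polynomial of degree at least 2 whose roots are all real):

```python
# Laguerre's interval [x_minus, x_plus] from Samuelson's inequality.
import math

def laguerre_interval(a):
    n = len(a) - 1
    an, an1, an2 = a[-1], a[-2], a[-3]
    center = -an1 / (n * an)
    half = (n - 1) / (n * an) * math.sqrt(an1 ** 2 - 2.0 * n / (n - 1) * an * an2)
    return center - half, center + half

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]        # x^4 + 5x^3 + 5x^2 - 5x - 6, roots 1, -1, -2, -3
    print(laguerre_interval(a))  # approximately (-3.8118, 1.3118)
```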
The root separation of a polynomial is the minimal distance between two roots, that is, the minimum of the absolute values of the differences of two distinct roots:
$$\operatorname{sep}(p) = \min\{\,|z_i - z_j| \;:\; z_i \ne z_j,\ p(z_i) = p(z_j) = 0\,\}.$$
The root separation is a fundamental parameter of the computational complexity of root-finding algorithms for polynomials. In fact, the root separation determines the precision of number representation that is needed for being certain of distinguishing distinct roots. Also, for real-root isolation, it allows bounding the number of interval divisions that are needed for isolating all roots.
For polynomials with real or complex coefficients, it is not possible to express a lower bound of the root separation in terms of the degree and the absolute values of the coefficients only, because a small change of a single coefficient transforms a polynomial with multiple roots into a square-free polynomial with a small root separation, and essentially the same absolute values of the coefficients. However, involving the discriminant of the polynomial allows a lower bound.
For square-free polynomials with integer coefficients, the discriminant is an integer, and has thus an absolute value that is not smaller than 1. This allows lower bounds for root separation that are independent of the discriminant.
Mignotte's separation bound is[18][19][20]
$$\operatorname{sep}(p) > \frac{\sqrt{3\,|\Delta(p)|}}{n^{(n+2)/2}\,\|p\|_2^{\,n-1}},$$
where $\Delta(p)$ is the discriminant, and $\|p\|_2 = \sqrt{a_0^2 + a_1^2 + \cdots + a_n^2}$.
For a square-free polynomial with integer coefficients, this implies
$$\operatorname{sep}(p) > \frac{\sqrt{3}}{n^{(n+2)/2}\,\|p\|_2^{\,n-1}} > \frac{\sqrt{3}}{n^{(n+2)/2}\,2^{\,s(n-1)}},$$
where s is the bit size of p, that is, the sum of the bit sizes of its coefficients.
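For illustration, the bound can be evaluated numerically. In this sketch (mine), the discriminant is computed from numerically approximated roots, whereas an exact computation would use a resultant.

```python
# Mignotte's lower bound sqrt(3|D|) / (n^((n+2)/2) * ||p||_2^(n-1)).
import numpy as np

def separation_lower_bound(a):
    n = len(a) - 1
    z = np.roots(a[::-1])              # roots, highest-degree coefficient first
    diffs = [z[i] - z[j] for i in range(n) for j in range(i + 1, n)]
    disc = abs(a[-1]) ** (2 * n - 2) * float(np.prod(np.abs(diffs))) ** 2
    return np.sqrt(3 * disc) / (n ** ((n + 2) / 2) * np.linalg.norm(a) ** (n - 1))

if __name__ == "__main__":
    a = [-6, -5, 5, 5, 1]              # roots 1, -1, -2, -3, so sep(p) = 1
    print(separation_lower_bound(a))   # about 1.1e-3: valid but pessimistic
```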
The Gauss–Lucas theorem states that the convex hull of the roots of a polynomial contains the roots of the derivative of the polynomial.
A sometimes useful corollary is that, if all roots of a polynomial have positive real part, then so do the roots of all derivatives of the polynomial.
A related result is Bernstein's inequality. It states that for a polynomial P of degree n with derivative P′ we have
$$\max_{|z| \le 1} |P'(z)| \le n\,\max_{|z| \le 1} |P(z)|.$$
If the coefficients ai of a random polynomial are independently and identically distributed with a mean of zero, most complex roots are on the unit circle or close to it. In particular, the real roots are mostly located near ±1, and, moreover, their expected number is, for a large degree, less than the natural logarithm of the degree.
If the coefficients are Gaussian distributed with a mean of zero and variance of σ, then the mean density of real roots is given by the Kac formula[21][22]
$$m(x) = \frac{\sqrt{A(x)\,C(x) - B(x)^2}}{\pi\,A(x)},$$
where
$$A(x) = \sum_{i=0}^{n-1} x^{2i} = \frac{x^{2n} - 1}{x^2 - 1},\qquad B(x) = \frac{1}{2}\,A'(x),\qquad C(x) = \frac{1}{4}\,A''(x) + \frac{1}{4x}\,A'(x).$$
When the coefficients are Gaussian distributed with a non-zero mean and variance of σ, a similar but more complex formula is known.[ citation needed ]
For large n, the mean density of real roots near x is asymptotically
$$m(x) \approx \frac{1}{\pi\,|1 - x^2|}$$
if $x^2 \ne 1$, and
$$m(\pm 1) \approx \frac{1}{\pi}\sqrt{\frac{n^2 - 1}{12}}.$$
It follows that the expected number of real roots is, using big O notation,
$$N_n = \frac{2}{\pi}\ln n + C + \frac{2}{n\pi} + O\!\left(n^{-2}\right),$$
where C is a constant approximately equal to 0.6257358072.[23]
In other words, the expected number of real roots of a random polynomial of high degree is lower than the natural logarithm of the degree.
Kac, Erdős and others have shown that these results are insensitive to the distribution of the coefficients, if they are independent and have the same distribution with mean zero. However, if the variance of the ith coefficient is equal to $\binom{n}{i}$, the expected number of real roots is $\sqrt{n}$.[23]
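These asymptotics are easy to probe experimentally. The following Monte Carlo sketch (my own; the tolerance test for deciding whether a computed root is real is a heuristic) compares the observed mean number of real roots with the value $\frac{2}{\pi}\ln n + C$.

```python
# Mean number of real roots of random degree-n polynomials with i.i.d.
# standard Gaussian coefficients, versus the asymptotic (2/pi) ln n + C.
import numpy as np

def count_real_roots(coeffs, tol=1e-8):
    z = np.roots(coeffs)
    return int(np.sum(np.abs(z.imag) <= tol * np.maximum(1.0, np.abs(z.real))))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, trials = 50, 2000
    counts = [count_real_roots(rng.standard_normal(n + 1)) for _ in range(trials)]
    print(np.mean(counts))                          # empirical mean
    print(2 / np.pi * np.log(n) + 0.6257358072)     # about 3.12 for n = 50
```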
A polynomial of degree n can be written in the form
$$p(x) = a_n (x - z_1)^{m_1} \cdots (x - z_k)^{m_k}$$
with distinct roots $z_1, \ldots, z_k$ and corresponding multiplicities $m_1, \ldots, m_k$. A root $z_j$ is a simple root if $m_j = 1$, or a multiple root if $m_j \ge 2$. Simple roots are Lipschitz continuous with respect to the coefficients, but multiple roots are not. In other words, simple roots have bounded sensitivities, but a multiple root is infinitely sensitive under arbitrary perturbations of the coefficients. As a result, most root-finding algorithms suffer a substantial loss of accuracy on multiple roots in numerical computation.
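A small numerical experiment (mine) makes the contrast visible: perturbing one coefficient by 10⁻¹⁰ moves a triple root by roughly the cube root of the perturbation, while a simple root moves by an amount of the order of the perturbation itself.

```python
# Sensitivity of a triple root versus a simple root under a tiny perturbation.
import numpy as np

p = np.polynomial.polynomial.polyfromroots([1.0, 1.0, 1.0, 2.0])  # (x-1)^3 (x-2)
q = p.copy()
q[0] += 1e-10                        # perturb the constant coefficient

for poly in (p, q):
    print(np.sort_complex(np.roots(poly[::-1])))
# The triple root at 1 splits and moves by about (1e-10)^(1/3), roughly 5e-4,
# while the simple root at 2 moves by about 1e-10.
```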
In 1972, William Kahan proved that there is an inherent stability of multiple roots. [24] Kahan discovered that polynomials with a particular set of multiplicities form what he called a pejorative manifold and proved that a multiple root is Lipschitz continuous if the perturbation maintains its multiplicity.
This geometric property of multiple roots is crucial in numerical computation of multiple roots.