Wilkinson's polynomial

[Figure: Plot of Wilkinson's polynomial]
[Figure: Plot of the logarithm of Wilkinson's polynomial]

In numerical analysis, Wilkinson's polynomial is a specific polynomial which was used by James H. Wilkinson in 1963 to illustrate a difficulty when finding the roots of a polynomial: the location of the roots can be very sensitive to perturbations in the coefficients of the polynomial.


The polynomial is

$$w(x) = \prod_{i=1}^{20} (x - i) = (x-1)(x-2)\cdots(x-20).$$

Sometimes, the term Wilkinson's polynomial is also used to refer to some other polynomials appearing in Wilkinson's discussion.

Background

Wilkinson's polynomial arose in the study of algorithms for finding the roots of a polynomial

$$p(x) = \sum_{i=0}^{n} c_i x^i.$$

It is a natural question in numerical analysis to ask whether the problem of finding the roots of p from the coefficients c_i is well-conditioned. That is, we hope that a small change in the coefficients will lead to a small change in the roots. Unfortunately, this is not the case here.

The problem is ill-conditioned when the polynomial has a multiple root. For instance, the polynomial x^2 has a double root at x = 0. However, the polynomial x^2 − ε (a perturbation of size ε) has roots at ±√ε, which is much bigger than ε when ε is small.
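A minimal numerical check of this effect (a sketch using NumPy, with an arbitrarily chosen small ε):

```python
import numpy as np

eps = 1e-10
# Roots of x^2 - eps: a coefficient perturbation of size 1e-10 moves the
# double root at x = 0 by 1e-5, i.e. five orders of magnitude more.
print(np.roots([1.0, 0.0, -eps]))   # approximately [ 1.e-05, -1.e-05]
```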

It is therefore natural to expect that ill-conditioning also occurs when the polynomial has zeros which are very close. However, the problem may also be extremely ill-conditioned for polynomials with well-separated zeros. Wilkinson used the polynomial w(x) to illustrate this point (Wilkinson 1963).

In 1984, he described the personal impact of this discovery:

Speaking for myself I regard it as the most traumatic experience in my career as a numerical analyst. [1]

Wilkinson's polynomial is often used to illustrate the undesirability of naively computing eigenvalues of a matrix by first calculating the coefficients of the matrix's characteristic polynomial and then finding its roots, since using the coefficients as an intermediate step may introduce extreme ill-conditioning even if the original problem was well-conditioned. [2]
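The point can be illustrated with a sketch along the following lines (assuming NumPy; the exact error sizes depend on the underlying LAPACK build): a matrix whose eigenvalues are exactly 1, ..., 20 is handled accurately by a direct eigenvalue solver, while the detour through the coefficients of its characteristic polynomial loses many digits.

```python
import numpy as np

A = np.diag(np.arange(1.0, 21.0))        # eigenvalues are exactly 1, 2, ..., 20
exact = np.arange(1.0, 21.0)

direct = np.sort(np.linalg.eigvalsh(A))            # direct eigenvalue computation
char_poly = np.poly(A)                              # coefficients of the characteristic polynomial
via_coeffs = np.sort(np.roots(char_poly).real)      # roots recovered from those coefficients

print(np.max(np.abs(direct - exact)))      # essentially machine precision
print(np.max(np.abs(via_coeffs - exact)))  # many orders of magnitude larger
```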

Conditioning of Wilkinson's polynomial

Wilkinson's polynomial

$$w(x) = \prod_{i=1}^{20} (x - i)$$

clearly has 20 roots, located at x = 1, 2, ..., 20. These roots are far apart. However, the polynomial is still very ill-conditioned.

Expanding the polynomial, one finds

$$w(x) = x^{20} - 210x^{19} + 20615x^{18} - \cdots + 20!\,.$$

If the coefficient of x^19 is decreased from −210 by 2^−23 to −210.0000001192, then the polynomial value w(20) decreases from 0 to −2^−23 · 20^19 = −6.25×10^17, and the root at x = 20 grows to x ≈ 20.8. The roots at x = 18 and x = 19 collide into a double root at x ≈ 18.62, which turns into a pair of complex conjugate roots at x ≈ 19.5 ± 1.9i as the perturbation increases further. Wilkinson tabulated all 20 perturbed roots to five decimals.
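The experiment can be repeated approximately in double precision (a sketch using NumPy). Note that the exact integer coefficients of w are too large to be represented exactly as doubles, so merely forming them already perturbs the polynomial; the printed roots therefore only qualitatively reproduce Wilkinson's five-decimal values.

```python
import numpy as np

# Monomial coefficients of w(x) = (x - 1)(x - 2)...(x - 20), highest degree first.
coeffs = np.poly(np.arange(1, 21))

perturbed = coeffs.copy()
perturbed[1] -= 2.0**-23      # coefficient of x^19: -210  ->  -210 - 2^-23

roots = np.sort_complex(np.roots(perturbed))
print(np.round(roots, 5))     # compare with the unperturbed roots 1, 2, ..., 20
```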

Some of the roots are greatly displaced, even though the change to the coefficient is tiny and the original roots seem widely spaced. Wilkinson showed by the stability analysis discussed in the next section that this behavior is related to the fact that some roots α (such as α = 15) have many roots β that are "close" in the sense that |α − β| is smaller than |α|.

Wilkinson chose the perturbation of 2^−23 because his Pilot ACE computer had 30-bit floating-point significands, so for numbers around 210, 2^−23 was an error in the first bit position not represented in the computer. The two real numbers, −210 and −210 − 2^−23, are represented by the same floating-point number, which means that 2^−23 is the unavoidable error in representing a real coefficient close to −210 by a floating-point number on that computer. The perturbation analysis shows that 30-bit coefficient precision is insufficient for separating the roots of Wilkinson's polynomial.
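As a toy model of this effect (a sketch only; the Pilot ACE's actual number format and rounding rules differ in detail), one can round values to a 30-bit significand with a hypothetical helper round_significand and observe that −210 and −210 − 2^−23 collapse to the same machine number:

```python
import math

def round_significand(x, bits):
    """Round x to the nearest value with a `bits`-bit significand
    (round half to even) -- a crude model of fixed-precision hardware."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(round(m * 2.0**bits), e - bits)

print(round_significand(-210.0, 30))              # -210.0
print(round_significand(-210.0 - 2.0**-23, 30))   # also -210.0: the perturbation is invisible
```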

Stability analysis

Suppose that we perturb a polynomial p(x) = Π (x − α_j) with roots α_j by adding a small multiple t·c(x) of a polynomial c(x), and ask how this affects the roots α_j. To first order, the change in the roots will be controlled by the derivative

$$\frac{\partial \alpha_j}{\partial t} = -\frac{c(\alpha_j)}{p'(\alpha_j)} = -\frac{c(\alpha_j)}{\prod_{k \ne j} (\alpha_j - \alpha_k)}.$$

When this derivative is large, the roots will be less stable under variations of t; conversely, if it is small, the roots will be stable. In particular, if α_j is a multiple root, then the denominator vanishes. In this case, α_j is usually not differentiable with respect to t (unless c happens to vanish there), and the roots will be extremely unstable.

For small values of t, the perturbed root is given by the power series expansion in t

$$\alpha_j(t) = \alpha_j + t\,\frac{\partial \alpha_j}{\partial t} + \frac{t^2}{2}\,\frac{\partial^2 \alpha_j}{\partial t^2} + \cdots$$

and one expects problems when |t| is larger than the radius of convergence of this power series, which is given by the smallest value of |t| such that the root αj becomes multiple. A very crude estimate for this radius takes half the distance from αj to the nearest root, and divides by the derivative above.
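For instance, applied to the root α_20 = 20 of Wilkinson's polynomial (nearest root at distance 1, derivative 20^19/19! as computed below), this crude estimate gives

$$|t| \approx \frac{1/2}{20^{19}/19!} \approx 1.2\times 10^{-8},$$

which is the value 0.00000001 quoted in the second example below.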

In the example of Wilkinson's polynomial of degree 20, the roots are given by α_j = j for j = 1, ..., 20, and c(x) is equal to x^19. So the derivative is given by

$$\frac{\partial \alpha_j}{\partial t} = -\frac{\alpha_j^{19}}{\prod_{k \ne j} (\alpha_j - \alpha_k)} = -\frac{j^{19}}{\prod_{k \ne j} (j - k)}.$$

This shows that the root α_j will be less stable if there are many roots α_k close to α_j, in the sense that the distance |α_j − α_k| between them is smaller than |α_j|.
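These sensitivities can be tabulated exactly with integer arithmetic (a short sketch, using the fact that Π_{k≠j} |j − k| = (j − 1)!(20 − j)!):

```python
from math import factorial

# |d(alpha_j)/dt| = j**19 / ((j-1)! * (20-j)!)  for the perturbation c(x) = x^19.
for j in range(1, 21):
    sensitivity = j**19 / (factorial(j - 1) * factorial(20 - j))
    print(f"{j:2d}  {sensitivity:.3e}")
```

The values grow from about 8.2×10^−18 at j = 1 to about 4.3×10^7 at j = 20, matching the two examples that follow.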

Example. For the root α_1 = 1, the derivative is equal to 1/19!, which is very small; this root is stable even for large changes in t. This is because all the other roots β are a long way from it, in the sense that |α_1 − β| = 1, 2, 3, ..., 19 is larger than |α_1| = 1. For example, even if t is as large as −10000000000, the root α_1 only changes from 1 to about 0.99999991779380 (which is very close to the first-order approximation 1 + t/19! ≈ 0.99999991779365). Similarly, the other small roots of Wilkinson's polynomial are insensitive to changes in t.

Example. On the other hand, for the root α_20 = 20, the derivative is equal to 20^19/19!, which is huge (about 43000000), so this root is very sensitive to small changes in t. The other roots β are close to α_20, in the sense that |β − α_20| = 1, 2, 3, ..., 19 is less than |α_20| = 20. For t = −2^−23, the first-order approximation 20 − t·20^19/19! = 25.137... to the perturbed root 20.84... is terrible; this is even more obvious for the root α_19, where the perturbed root has a large imaginary part but the first-order approximation (and for that matter all higher-order approximations) are real. The reason for this discrepancy is that |t| ≈ 0.000000119 is greater than the radius of convergence of the power series mentioned above (which is about 0.0000000029, somewhat smaller than the value 0.00000001 given by the crude estimate), so the linearized theory does not apply. For a value such as t = 0.000000001 that is significantly smaller than this radius of convergence, the first-order approximation 19.9569... is reasonably close to the root 19.9509...

At first sight the roots α_1 = 1 and α_20 = 20 of Wilkinson's polynomial appear to be similar, as they are at opposite ends of a symmetric line of roots and have the same set of distances 1, 2, 3, ..., 19 from the other roots. However, the analysis above shows that this is grossly misleading: the root α_20 = 20 is less stable than α_1 = 1 (to small perturbations in the coefficient of x^19) by a factor of 20^19 = 5242880000000000000000000.

Wilkinson's second example

The second example considered by Wilkinson is

$$w_2(x) = \prod_{i=1}^{20} (x - 2^{-i}) = (x - 2^{-1})(x - 2^{-2})\cdots(x - 2^{-20}).$$

The twenty zeros of this polynomial are in a geometric progression with common ratio 2, and hence the quotient

$$\frac{\alpha_j}{\alpha_j - \alpha_k}$$

cannot be large. Indeed, the zeros of w2 are quite stable to large relative changes in the coefficients.
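A quick check of this claim (a sketch that simply compares all pairs of roots):

```python
# Roots of Wilkinson's second example: alpha_j = 2**(-j) for j = 1, ..., 20.
roots = [2.0**-j for j in range(1, 21)]

# Largest quotient |alpha_j / (alpha_j - alpha_k)| over all pairs j != k.
worst = max(abs(a / (a - b)) for a in roots for b in roots if a != b)
print(worst)   # 2.0 -- the quotient never exceeds 2
```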

The effect of the basis

The expansion

$$p(x) = \sum_{i=0}^{20} c_i x^i$$

expresses the polynomial in a particular basis, namely that of the monomials. If the polynomial is expressed in another basis, then the problem of finding its roots may cease to be ill-conditioned. For example, in a Lagrange form, a small change in one (or several) coefficients need not change the roots too much. Indeed, the basis polynomials for interpolation at the points 0, 1, 2, ..., 20 are

$$\ell_k(x) = \prod_{\substack{0 \le j \le 20 \\ j \ne k}} \frac{x - j}{k - j}, \qquad k = 0, 1, \dots, 20.$$

Every polynomial (of degree 20 or less) can be expressed in this basis:

$$p(x) = \sum_{k=0}^{20} d_k\,\ell_k(x).$$

For Wilkinson's polynomial, we find

$$w(x) = 20!\,\ell_0(x),$$

that is, d_0 = 20! and d_k = 0 for k = 1, ..., 20.

Given the definition of the Lagrange basis polynomial ℓ_0(x), a change in the coefficient d_0 will produce no change in the roots of w. However, a perturbation in the other coefficients (all equal to zero) will slightly change the roots. Therefore, Wilkinson's polynomial is well-conditioned in this basis.
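A minimal sketch of this point, under the assumption that one works directly in the Lagrange form (evaluating ℓ_k by its product formula and locating roots by bisection, so monomial coefficients are never formed); the perturbation δ of d_1 is chosen to have the same relative size, 2^−23, as Wilkinson's perturbation above:

```python
from math import factorial

NODES = range(21)                      # interpolation points 0, 1, ..., 20

def lagrange_basis(k, x):
    """Evaluate the Lagrange basis polynomial for node k at the point x."""
    value = 1.0
    for j in NODES:
        if j != k:
            value *= (x - j) / (k - j)
    return value

def perturbed_w(x, delta):
    """w(x) in the Lagrange basis (d_0 = 20!), with d_1 changed from 0 to delta."""
    return factorial(20) * lagrange_basis(0, x) + delta * lagrange_basis(1, x)

delta = 2.0**-23 * factorial(20)       # relative perturbation of size 2^-23

# Bisection for the root near x = 1.
lo, hi = 0.5, 1.5
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if perturbed_w(lo, delta) * perturbed_w(mid, delta) <= 0.0:
        hi = mid
    else:
        lo = mid

print(0.5 * (lo + hi))    # about 1.0000024: the root moves only slightly
```

To first order, the root near 1 moves by δ/19! = 2^−23 · 20 ≈ 2.4×10^−6, and the roots at the other interpolation nodes do not move at all to first order, since ℓ_1 vanishes there.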

Notes

  1. Wilkinson, James H. (1984). "The perfidious polynomial". In Gene H. Golub (ed.). Studies in Numerical Analysis. Mathematical Association of America. p. 3. ISBN 978-0-88385-126-5.
  2. Trefethen, Lloyd N.; Bau, David (1997), Numerical Linear Algebra, SIAM


References

Wilkinson discussed "his" polynomial in

Wilkinson, J. H. (1963). Rounding Errors in Algebraic Processes. Englewood Cliffs, New Jersey: Prentice Hall.

It is mentioned in standard text books in numerical analysis, like

Other references:

A high-precision numerical computation is presented in: