Polynomial evaluation


In mathematics and computer science, polynomial evaluation refers to computing the value of a polynomial when values are substituted for its indeterminates. In other words, evaluating the polynomial $P(x_1, \dots, x_m)$ at the point $(a_1, \dots, a_m)$ consists of computing $P(a_1, \dots, a_m)$. See also Polynomial ring § Polynomial evaluation.


For evaluating the univariate polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, the most naive method would use $n$ multiplications to compute $a_n x^n$, use $n-1$ multiplications to compute $a_{n-1} x^{n-1}$, and so on, for a total of $\tfrac{n(n+1)}{2}$ multiplications and $n$ additions. Using better methods, such as Horner's rule, this can be reduced to $n$ multiplications and $n$ additions. If some preprocessing is allowed, even more savings are possible.

Background

This problem arises frequently in practice. In computational geometry, polynomials are used to compute function approximations using Taylor polynomials. In cryptography and hash tables, polynomials are used to compute k-independent hashing.

In the former case, polynomials are evaluated using floating-point arithmetic, which is not exact. Thus different schemes for the evaluation will, in general, give slightly different answers. In the latter case, the polynomials are usually evaluated in a finite field, in which case the answers are always exact.

General methods

Horner's rule

Horner's method evaluates a polynomial using repeated bracketing:

$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = a_0 + x\bigl(a_1 + x\bigl(a_2 + \cdots + x(a_{n-1} + x\,a_n)\cdots\bigr)\bigr).$

This method reduces the number of multiplications and additions to just $n$ of each.

Horner's method is so common that a "multiply–accumulate" instruction has been added to many computer processors; it performs the multiplication and the addition in one combined step.
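To make this concrete, here is a minimal Python sketch of Horner's rule (the function name and coefficient ordering are illustrative choices, not from any cited source):

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by Horner's rule.

    coeffs holds a_0, ..., a_n; for a degree-n polynomial the loop
    performs exactly n multiplications and n additions.
    """
    result = coeffs[-1]
    for a in reversed(coeffs[:-1]):
        result = result * x + a
    return result

# p(x) = 4 + 3x + 2x^2 at x = 5: 4 + 15 + 50 = 69
assert horner([4, 3, 2], 5) == 69
```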

Multivariate

If the polynomial is multivariate, Horner's rule can be applied recursively over some ordering of the variables. For example,

$P(x, y) = 4 + x + 2xy + 2x^2 y + x^2 y^2$

can be written as

$P(x, y) = 4 + x\bigl(1 + 2y + xy(2 + y)\bigr).$

An efficient version of this approach was described by Carnicer and Gasca.[1]
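The basic recursion can be sketched in Python with a nested-list representation (an illustrative encoding of our own; this is not Carnicer and Gasca's algorithm):

```python
def horner_multi(coeffs, point):
    """Apply Horner's rule recursively over an ordering of the variables.

    A polynomial in x_1, ..., x_k is a nested list: the outer list holds
    the coefficients of the powers of x_1, each of which is itself a
    polynomial in x_2, ..., x_k (or a plain number).
    """
    if not isinstance(coeffs, list):      # base case: a constant
        return coeffs
    x, rest = point[0], point[1:]
    result = 0
    for c in reversed(coeffs):
        result = result * x + horner_multi(c, rest)
    return result

# p(x, y) = (1 + 2y) + x*(3 + 4y) at x = 2, y = 3: 7 + 2*15 = 37
assert horner_multi([[1, 2], [3, 4]], (2, 3)) == 37
```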

Estrin's scheme

While it is not possible to do less computation than Horner's rule (without preprocessing), on modern computers the order of evaluation can matter a lot for computational efficiency. A method known as Estrin's scheme computes a (univariate) polynomial in a tree-like pattern:

$P(x) = (a_0 + a_1 x) + (a_2 + a_3 x)\,x^2 + \bigl((a_4 + a_5 x) + (a_6 + a_7 x)\,x^2\bigr)\,x^4.$

Combined with exponentiation by squaring (to produce the powers $x^2, x^4, x^8, \dots$), this allows parallelizing the computation, since the bracketed pairs at each level are independent of one another.
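A sketch of the scheme in Python (illustrative only; in a parallel setting the pair reductions at each level would run concurrently):

```python
def estrin(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by Estrin's scheme.

    Adjacent pairs a_{2i} + a_{2i+1}*x are combined, then the halved
    coefficient list is reduced the same way with x replaced by x^2
    (computed by squaring). Pairs within a level are independent.
    """
    coeffs = list(coeffs)
    while len(coeffs) > 1:
        if len(coeffs) % 2:
            coeffs.append(0)               # pad to an even length
        coeffs = [coeffs[i] + coeffs[i + 1] * x
                  for i in range(0, len(coeffs), 2)]
        x = x * x                          # exponentiation by squaring
    return coeffs[0]

# matches Horner's rule: 4 + 3x + 2x^2 + x^3 at x = 5 is 194
assert estrin([4, 3, 2, 1], 5) == 194
```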

Evaluation with preprocessing

Arbitrary polynomials can be evaluated with fewer operations than Horner's rule requires if we first "preprocess" the coefficients $a_n, \dots, a_0$.

An example was first given by Motzkin,[2] who noted that

$P(x) = x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$

can be written as

$y = (x + b_0)\,x + b_1, \qquad P(x) = (y + x + b_2)\,y + b_3,$

where the values $b_0, b_1, b_2, b_3$ are computed in advance, based on $a_0, a_1, a_2, a_3$. Motzkin's method uses just 3 multiplications compared to Horner's 4 (counting the multiplication by a general leading coefficient; for the monic form shown, it is 2 versus 3).

The values for each $b_i$ can be easily computed by expanding $P(x)$ and equating the coefficients:

$b_0 = \tfrac{1}{2}(a_3 - 1), \quad c = a_2 - b_0(b_0 + 1), \quad b_1 = a_1 - b_0\,c, \quad b_2 = c - 2b_1, \quad b_3 = a_0 - b_1(b_1 + b_2),$

where $c$ is a convenience intermediate.
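In Python, the preprocessing and the subsequent cheap evaluation of the monic quartic look as follows (a sketch; the helper value $c$ and function names are ours). The evaluation itself uses only 2 multiplications here because the quartic is monic; a general leading coefficient costs one more:

```python
def motzkin_preprocess(a0, a1, a2, a3):
    """Solve for b0..b3 of x^4 + a3 x^3 + a2 x^2 + a1 x + a0
    via the coefficient comparison above."""
    b0 = (a3 - 1) / 2
    c = a2 - b0 * (b0 + 1)       # helper from the x^2 coefficient
    b1 = a1 - b0 * c
    b2 = c - 2 * b1
    b3 = a0 - b1 * (b1 + b2)
    return b0, b1, b2, b3

def motzkin_eval(x, b):
    """Evaluate the preprocessed monic quartic with 2 multiplications."""
    b0, b1, b2, b3 = b
    y = (x + b0) * x + b1
    return (y + x + b2) * y + b3

# x^4 + 3x^3 + 9x^2 + 9x + 14 at x = 2: 16 + 24 + 36 + 18 + 14 = 108
b = motzkin_preprocess(14, 9, 9, 3)
assert motzkin_eval(2, b) == 108
```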

Example

To compute the Taylor expansion $\exp(x) \approx 1 + x + \tfrac{x^2}{2} + \tfrac{x^3}{6} + \tfrac{x^4}{24}$, we can upscale by a factor of 24, apply the above steps to $x^4 + 4x^3 + 12x^2 + 24x + 24$, and scale back down. That gives us the three-multiplication computation

$y = (x + 1.5)\,x + 11.625, \qquad \exp(x) \approx \bigl((y + x - 15)\,y + 63.234375\bigr)/24.$

This improves over the equivalent Horner form, $\exp(x) \approx \bigl(\bigl((x/24 + 1/6)\,x + 1/2\bigr)x + 1\bigr)x + 1$, by 1 multiplication.
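A quick numeric check of this scheme (the constants follow from applying the preprocessing above to $x^4 + 4x^3 + 12x^2 + 24x + 24$; the function name is ours):

```python
import math

def exp_taylor4(x):
    """Degree-4 Taylor polynomial of exp with 3 multiplications."""
    y = (x + 1.5) * x + 11.625           # multiplication 1
    q = (y + x - 15.0) * y + 63.234375   # multiplication 2
    return q / 24.0                      # multiplication 3 (scale back)

# agrees with exp to the accuracy of the degree-4 Taylor truncation
assert abs(exp_taylor4(0.1) - math.exp(0.1)) < 1e-6
```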

Some general methods include the Knuth–Eve algorithm and the Rabin–Winograd algorithm.

Multipoint evaluation

Evaluation of a degree-$n$ polynomial at multiple points $x_1, \dots, x_m$ can be done with $mn$ multiplications by using Horner's method $m$ times. Using the above preprocessing approach, this can be reduced by a factor of two, that is, to $mn/2$ multiplications. However, it is possible to do better.

It is possible to reduce the time requirement to just $O\bigl((n + m)\log^2(n + m)\bigr)$.[3] The idea is to define two polynomials that are zero in respectively the first and second half of the points: $m_0(x) = (x - x_1)\cdots(x - x_{n/2})$ and $m_1(x) = (x - x_{n/2+1})\cdots(x - x_n)$. We then compute $R_0 = P \bmod m_0$ and $R_1 = P \bmod m_1$ using the polynomial remainder theorem, which can be done in $O(n \log n)$ time using a fast Fourier transform. This means $P(x) = Q_0(x)\,m_0(x) + R_0(x)$ and $P(x) = Q_1(x)\,m_1(x) + R_1(x)$ by construction, where $R_0$ and $R_1$ are polynomials of degree at most $n/2$. Because of how $m_0$ and $m_1$ were defined, we have

$R_0(x_i) = P(x_i) \text{ for } i \le n/2 \qquad \text{and} \qquad R_1(x_i) = P(x_i) \text{ for } i > n/2.$

Thus, to compute $P$ on all $n$ of the $x_i$, it suffices to compute the smaller polynomials $R_0$ and $R_1$ on each half of the points. This gives us a divide-and-conquer algorithm with $T(n) = 2\,T(n/2) + O(n \log n)$, which implies $T(n) = O(n \log^2 n)$ by the master theorem.
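The recursion can be sketched in Python with schoolbook polynomial arithmetic (illustrative only; achieving the stated bound requires FFT-based division where noted):

```python
def vanishing(points):
    """m(X) = product over the points of (X - x_i), low degree first."""
    m = [1.0]
    for x in points:
        out = [0.0] * (len(m) + 1)
        for i, c in enumerate(m):
            out[i + 1] += c       # c * X^(i+1)
            out[i] -= x * c       # -x * c * X^i
        m = out
    return m

def poly_mod(p, m):
    """Remainder of p modulo m. Schoolbook division; the stated
    O(n log^2 n) bound needs fast (FFT-based) division here instead."""
    p = list(p)
    while len(p) >= len(m):
        q = p[-1] / m[-1]
        for i in range(len(m)):
            p[len(p) - len(m) + i] -= q * m[i]
        p.pop()
    return p or [0.0]

def multipoint_eval(p, points):
    """P mod m0 agrees with P on the first half of the points and
    P mod m1 on the second half, so recurse on the two halves."""
    if len(points) == 1:
        return [sum(a * points[0] ** i for i, a in enumerate(p))]
    half = len(points) // 2
    r0 = poly_mod(p, vanishing(points[:half]))
    r1 = poly_mod(p, vanishing(points[half:]))
    return (multipoint_eval(r0, points[:half])
            + multipoint_eval(r1, points[half:]))

# p(x) = 1 + 2x + 3x^2 at the points 0, 1, 2, 3
assert multipoint_eval([1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]) == [1.0, 6.0, 17.0, 34.0]
```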


In the case where the points at which we wish to evaluate the polynomial have some structure, simpler methods exist. For example, Knuth[4] section 4.6.4 gives a method for tabulating polynomial values of the type $P(x_0), P(x_0 + h), P(x_0 + 2h), \ldots$

Dynamic evaluation

In the case where the evaluation points $x_1, \dots, x_m$ are not known in advance, Kedlaya and Umans[5] gave a data structure for evaluating polynomials over a finite field of size $q$ in time polylogarithmic in $n$ and $q$ per evaluation, after some initial preprocessing. This was shown by Larsen[6] to be essentially optimal.

The idea is to transform $P(x)$ of degree $n$ into a multivariate polynomial $f(x_1, x_2, \dots, x_m)$, such that $P(x) = f(x, x^d, x^{d^2}, \dots, x^{d^{m-1}})$ and the individual degrees of $f$ are at most $d$. Since this is over $\mathbb{Z}/p$, the largest value $f$ can take (over $\mathbb{Z}$) is $M = p^{O(dm)}$. Using the Chinese remainder theorem, it suffices to evaluate $f$ modulo different primes $p_1, \dots, p_\ell$ with a product at least $M$. Each prime can be taken to be roughly $\log M = O(dm \log p)$, and the number of primes needed, $\ell$, is roughly the same. Doing this process recursively, we can get the primes as small as $\log\log p$. That means we can compute and store $f$ on all the possible inputs in time and space $T = (\log\log p)^m$. If we take $d = \log p$, we get $m = \tfrac{\log n}{\log\log p}$, so the time/space requirement is just $T = n^{\frac{\log\log\log p}{\log\log p}} = n^{o(1)}$.
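The first step, rewriting $P$ as a low-degree multivariate polynomial, can be illustrated in Python (a sketch of just this transformation under our own naming, not of the full data structure):

```python
from math import prod

def to_multivariate(coeffs, d, m):
    """Regroup coefficients a_0, ..., a_{d^m - 1} of P into f so that
    P(x) = f(x, x^d, x^(d^2), ..., x^(d^(m-1))): the base-d digits of
    each exponent index the variables, so each individual degree < d."""
    f = {}
    for e, a in enumerate(coeffs):
        digits = []
        for _ in range(m):
            digits.append(e % d)
            e //= d
        f[tuple(digits)] = a
    return f

def eval_multivariate(f, xs):
    """Evaluate the multivariate polynomial f at the point xs."""
    return sum(a * prod(x ** e for x, e in zip(xs, exps))
               for exps, a in f.items())

# degree-7 polynomial with d = 2, m = 3: f(x, x^2, x^4) reproduces P(x)
coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
f = to_multivariate(coeffs, 2, 3)
x = 5
assert eval_multivariate(f, [x, x**2, x**4]) == sum(a * x**i for i, a in enumerate(coeffs))
```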

Kedlaya and Umans further show how to combine this preprocessing with fast (FFT) multipoint evaluation. This allows optimal algorithms for many important algebraic problems, such as polynomial modular composition.

Specific polynomials

While general polynomials require $\Omega(n)$ operations to evaluate, some polynomials can be computed much faster. For example, the polynomial $P(x) = x^2 + 2x + 1$ can be computed using just one multiplication and one addition, since $P(x) = (x + 1)^2$.

Evaluation of powers

A particularly interesting type of polynomial is powers like $x^n$. Such polynomials can always be computed in $O(\log n)$ operations. Suppose, for example, that we need to compute $x^{16}$; we could simply start with $x$ and multiply by $x$ to get $x^2$. We can then multiply that by itself to get $x^4$, and so on to get $x^8$ and $x^{16}$ in just four multiplications. Other powers like $x^5$ can similarly be computed efficiently by first computing $x^4$ by 2 multiplications and then multiplying by $x$.
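This is exponentiation by squaring; a standard Python sketch:

```python
def power(x, n):
    """Compute x**n with O(log n) multiplications: square for each bit
    of n, and fold the current square into the result on the 1-bits."""
    result = 1
    while n:
        if n & 1:
            result *= x
        x *= x
        n >>= 1
    return result

assert power(3, 16) == 3 ** 16   # x^16 by repeated squaring
assert power(2, 5) == 32         # x^5 as x^4 * x
```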

The most efficient way to compute a given power is provided by addition-chain exponentiation. However, this requires designing a specific algorithm for each exponent, and the computation needed to design these algorithms is difficult (NP-complete[7]), so exponentiation by squaring is generally preferred in practice.

Polynomial families

Often polynomials show up in a different form than the familiar monomial basis $a_n x^n + \cdots + a_1 x + a_0$. For polynomials in Chebyshev form we can use the Clenshaw algorithm. For polynomials in Bézier form we can use De Casteljau's algorithm, and for B-splines there is De Boor's algorithm.
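For illustration, a minimal sketch of Clenshaw's recurrence for a Chebyshev series $\sum_k a_k T_k(x)$ (one common variant; some formulations halve the $a_0$ term):

```python
def clenshaw_chebyshev(a, x):
    """Evaluate sum_k a[k] * T_k(x) by the Clenshaw recurrence
    b_k = a_k + 2x*b_{k+1} - b_{k+2}, finishing with a_0 + x*b_1 - b_2."""
    b1 = b2 = 0.0
    for ak in reversed(a[1:]):
        b1, b2 = ak + 2 * x * b1 - b2, b1
    return a[0] + x * b1 - b2

# T_0 = 1, T_1 = x, T_2 = 2x^2 - 1: 1 + 2*T_1(0.5) + 3*T_2(0.5) = 0.5
assert abs(clenshaw_chebyshev([1.0, 2.0, 3.0], 0.5) - 0.5) < 1e-12
```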

Hard polynomials

The fact that some polynomials can be computed significantly faster than "general polynomials" suggests the question: can we give an example of a simple polynomial that cannot be computed in time much smaller than its degree? Volker Strassen has shown[8] that the polynomial

$P(x) = \sum_{k=0}^{n} 2^{2^{k^3}} x^k$

cannot be evaluated with fewer than roughly $n/2$ multiplications and $n$ additions. At least this bound holds if only arithmetic operations of those types are allowed, giving rise to a so-called "polynomial chain" whose length bounds the evaluation cost.

The polynomial given by Strassen has very large coefficients, but by probabilistic methods one can show that there must exist even polynomials with coefficients just 0's and 1's whose evaluation requires at least $\Omega\bigl(\sqrt{n/\log n}\bigr)$ multiplications.[9]

For other simple polynomials, the complexity is unknown. The polynomial $(x + 1)(x + 2)\cdots(x + n)$ is conjectured to not be computable in time $(\log n)^{O(1)}$. This is supported by the fact that, if it can be computed fast, then integer factorization can be computed in polynomial time, breaking the RSA cryptosystem.[10]

Matrix polynomials

Sometimes the computational cost of scalar multiplications (like $ax$) is less than the computational cost of "non-scalar" multiplications (like $x^2$). The typical example of this is matrices. If $M$ is an $m \times m$ matrix, a scalar multiplication takes about $m^2$ arithmetic operations, while computing $M^2$ takes about $m^3$ (or $m^{2.3}$ using fast matrix multiplication).

Matrix polynomials are important, for example, for computing the matrix exponential.

Paterson and Stockmeyer[11] showed how to compute a degree-$n$ polynomial using only $O(\sqrt{n})$ non-scalar multiplications and $O(n)$ scalar multiplications. Thus a matrix polynomial of degree $n$ in an $m \times m$ matrix can be evaluated in $O(\sqrt{n}\,m^3 + n\,m^2)$ time. For fixed degree $n$ this is $O(m^3)$: as fast as one matrix multiplication with the standard algorithm.

This method works as follows: for a polynomial

$P(M) = a_0 I + a_1 M + a_2 M^2 + \cdots + a_n M^n,$

let $k$ be the least integer not smaller than $\sqrt{n}$. The powers $M, M^2, \dots, M^k$ are computed with $k - 1$ matrix multiplications, and $M^{2k}, M^{3k}, \dots$ are then computed by repeated multiplication by $M^k$. Now,

$P(M) = \sum_{i=0}^{\lceil n/k \rceil} \Bigl( \sum_{j=0}^{k-1} a_{ik+j}\,M^j \Bigr) M^{ik},$

where $a_i = 0$ for $i > n$. This requires just $\lceil n/k \rceil \approx \sqrt{n}$ more non-scalar multiplications.

We can write this succinctly using the Kronecker product:

$P(M) = \begin{bmatrix} I & M^k & M^{2k} & \cdots \end{bmatrix} \left( \begin{bmatrix} a_0 & a_1 & \cdots & a_{k-1} \\ a_k & a_{k+1} & \cdots & a_{2k-1} \\ \vdots & & & \vdots \end{bmatrix} \otimes I \right) \begin{bmatrix} I \\ M \\ \vdots \\ M^{k-1} \end{bmatrix}.$
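A sketch of the scheme with NumPy (illustrative; block boundaries handled naively, without the preprocessing refinement discussed below):

```python
import math
import numpy as np

def paterson_stockmeyer(coeffs, M):
    """Evaluate a_0 I + a_1 M + ... + a_n M^n with about 2*sqrt(n)
    non-scalar (matrix-matrix) multiplications."""
    n = len(coeffs) - 1
    k = max(1, math.isqrt(n))
    if k * k < n:
        k += 1                         # k = ceil(sqrt(n))
    I = np.eye(M.shape[0])
    pows = [I]                         # I, M, ..., M^k: k matrix products
    for _ in range(k):
        pows.append(pows[-1] @ M)
    Mk = pows[k]
    result = np.zeros_like(I)
    for i in range(n // k, -1, -1):    # Horner in M^k over coefficient blocks
        block = sum(coeffs[i * k + j] * pows[j]
                    for j in range(k) if i * k + j <= n)
        result = result @ Mk + block
    return result

M = np.array([[1.0, 2.0], [3.0, 4.0]])
# p(M) = 2I + 3M + M^2
assert np.allclose(paterson_stockmeyer([2.0, 3.0, 1.0], M),
                   2 * np.eye(2) + 3 * M + M @ M)
```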

The direct application of this method uses $2\sqrt{n}$ non-scalar multiplications, but by combining it with the evaluation-with-preprocessing ideas above, Paterson and Stockmeyer showed this can be reduced to about $\sqrt{2n}$.

Methods based on matrix polynomial multiplications and additions have been proposed that allow saving non-scalar matrix multiplications with respect to the Paterson–Stockmeyer method.[12]


References

1. Carnicer, J.; Gasca, M. (1990). "Evaluation of Multivariate Polynomials and Their Derivatives". Mathematics of Computation. 54 (189): 231–243. doi:10.2307/2008692. JSTOR 2008692.
2. Motzkin, T. S. (1955). "Evaluation of polynomials and evaluation of rational functions". Bulletin of the American Mathematical Society. 61 (163): 10.
3. von zur Gathen, Joachim; Gerhard, Jürgen (2013). Modern Computer Algebra. Cambridge University Press. Chapter 10. ISBN 9781139856065.
4. Knuth, Donald (2005). The Art of Computer Programming. Vol. 2: Seminumerical Algorithms. Addison-Wesley. ISBN 9780201853926.
5. Kedlaya, Kiran S.; Umans, Christopher (2011). "Fast Polynomial Factorization and Modular Composition". SIAM Journal on Computing. 40 (6): 1767–1802. doi:10.1137/08073408x. hdl:1721.1/71792. S2CID 412751.
6. Larsen, K. G. (2012). "Higher Cell Probe Lower Bounds for Evaluating Polynomials". 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science. IEEE. pp. 293–301. doi:10.1109/FOCS.2012.21. ISBN 978-0-7695-4874-6. S2CID 7906483.
7. Downey, Peter; Leong, Benton; Sethi, Ravi (1981). "Computing Sequences with Addition Chains". SIAM Journal on Computing. 10 (3).
8. Strassen, Volker (1974). "Polynomials with Rational Coefficients Which are Hard to Compute". SIAM Journal on Computing. 3 (2): 128–149. doi:10.1137/0203010.
9. Schnorr, C. P. (1979). "On the additive complexity of polynomials and some new lower bounds". Theoretical Computer Science, Lecture Notes in Computer Science, vol. 67. Springer. pp. 286–297. doi:10.1007/3-540-09118-1_30. ISBN 978-3-540-09118-9.
10. Chen, Xi; Kayal, Neeraj; Wigderson, Avi (2011). Partial Derivatives in Arithmetic Complexity and Beyond. Now Publishers.
11. Paterson, Michael S.; Stockmeyer, Larry J. (1973). "On the Number of Nonscalar Multiplications Necessary to Evaluate Polynomials". SIAM Journal on Computing. 2 (1): 60–66. doi:10.1137/0202007.
12. Fasi, Massimiliano (2019). "Optimality of the Paterson–Stockmeyer method for evaluating matrix polynomials and rational matrix functions". Linear Algebra and its Applications. 574: 185. doi:10.1016/j.laa.2019.04.001. ISSN 0024-3795.