In numerical analysis, polynomial interpolation is the interpolation of a given bivariate data set by the polynomial of lowest possible degree that passes through the points of the dataset. [1]
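Given a set of n + 1 data points $(x_0, y_0), \ldots, (x_n, y_n)$, with no two $x_j$ the same, a polynomial function $p(x) = a_0 + a_1 x + \cdots + a_n x^n$ is said to interpolate the data if $p(x_j) = y_j$ for each $j \in \{0, 1, \ldots, n\}$.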
There is always a unique such polynomial, commonly given by two explicit formulas, the Lagrange polynomials and Newton polynomials.
The original use of interpolation polynomials was to approximate values of important transcendental functions such as the natural logarithm and trigonometric functions. Starting with a few accurately computed data points, the corresponding interpolation polynomial will approximate the function at an arbitrary nearby point. Polynomial interpolation also forms the basis for algorithms in numerical quadrature (Simpson's rule) and numerical ordinary differential equations (multistep methods).
In computer graphics, polynomials can be used to approximate complicated plane curves given a few specified points, for example the shapes of letters in typography. This is usually done with Bézier curves, which are a simple generalization of interpolation polynomials (having specified tangents as well as specified points).
In numerical analysis, polynomial interpolation is essential to perform sub-quadratic multiplication and squaring, such as Karatsuba multiplication and Toom–Cook multiplication, where interpolation through points on a product polynomial yields the specific product required. For example, given $a = f(x) = a_0 x^0 + a_1 x^1 + \cdots$ and $b = g(x) = b_0 x^0 + b_1 x^1 + \cdots$, the product $ab$ is a specific value of $W(x) = f(x)g(x)$. One may easily find points along $W(x)$ at small values of $x$, and interpolation based on those points will yield the terms of $W(x)$ and the specific product $ab$. As formulated in Karatsuba multiplication, this technique is substantially faster than quadratic multiplication, even for modest-sized inputs, especially on parallel hardware.
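The evaluate-then-interpolate idea can be illustrated with a minimal Python sketch. It is not Karatsuba or Toom–Cook itself (it omits the recursive splitting that makes those algorithms sub-quadratic); it only shows that the coefficients of $W(x) = f(x)g(x)$ can be recovered from samples at small points. The function names and the choice of sample points 0, 1, 2, ... are assumptions made for the illustration.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation; coefficients are stored lowest degree first."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def interpolate_coeffs(xs, ys):
    """Monomial coefficients (lowest degree first) of the interpolant through (xs, ys)."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for j in range(n):
        basis = [Fraction(1)]      # running product of (x - x_i), i != j
        denom = Fraction(1)        # running product of (x_j - x_i), i != j
        for i in range(n):
            if i == j:
                continue
            basis = [Fraction(0)] + basis          # multiply by x
            for k in range(len(basis) - 1):
                basis[k] -= xs[i] * basis[k + 1]   # subtract x_i times the old polynomial
            denom *= xs[j] - xs[i]
        for k in range(n):
            coeffs[k] += ys[j] * basis[k] / denom
    return coeffs

def multiply_by_interpolation(f, g):
    """Coefficients of W = f * g, recovered from deg(W) + 1 samples of W."""
    n_points = len(f) + len(g) - 1
    xs = [Fraction(k) for k in range(n_points)]    # small sample points (illustrative choice)
    ws = [evaluate(f, x) * evaluate(g, x) for x in xs]
    return [int(c) for c in interpolate_coeffs(xs, ws)]

# 123 * 45, using base-10 digits as polynomial coefficients (lowest digit first).
f_digits = [3, 2, 1]
g_digits = [5, 4]
w = multiply_by_interpolation(f_digits, g_digits)
product = sum(c * 10**k for k, c in enumerate(w))
print(w, product)
```

Evaluating the recovered polynomial at the base (here 10) gives the integer product; exact rational arithmetic is used so that no rounding enters the interpolation step.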
In computer science, polynomial interpolation also leads to algorithms for secure multi-party computation and secret sharing.
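As an illustration of the secret-sharing connection, the sketch below follows the idea of Shamir's scheme, one well-known interpolation-based construction: a secret is stored as the constant term of a random polynomial over a finite field, and any `threshold` shares determine that polynomial through Lagrange interpolation at $x = 0$. The prime, the function names and the parameters are assumptions chosen for the example; this is not a production implementation.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime used as the field size (illustrative choice)

def make_shares(secret, threshold, count):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold - 1 with constant term equal to the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, count + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the integers modulo PRIME."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for i, (xi, _) in enumerate(shares):
            if i != j:
                num = num * (-xi) % PRIME
                den = den * (xj - xi) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, threshold=3, count=5)
print(recover_secret(shares[:3]), recover_secret(shares[2:]))
```

Because the interpolating polynomial of degree at most `threshold - 1` is unique, any subset of `threshold` shares reconstructs the same constant term.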
For any $n+1$ bivariate data points $(x_0, y_0), \dotsc, (x_n, y_n) \in \mathbb{R}^2$, where no two $x_j$ are the same, there exists a unique polynomial $p(x)$ of degree at most $n$ that interpolates these points, i.e. $p(x_0) = y_0, \ldots, p(x_n) = y_n$. [2]
Equivalently, for a fixed choice of interpolation nodes $x_j$, polynomial interpolation defines a linear bijection $L_n$ between the (n+1)-tuples of real-number values $(y_0, \ldots, y_n) \in \mathbb{R}^{n+1}$ and the vector space $\Pi_n$ of real polynomials of degree at most n:
$$L_n : \mathbb{R}^{n+1} \stackrel{\sim}{\longrightarrow} \Pi_n.$$
This is a type of unisolvence theorem. The theorem is also valid over any infinite field in place of the real numbers $\mathbb{R}$, for example the rational or complex numbers.
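Consider the Lagrange basis functions $L_0(x), \ldots, L_n(x)$ given by:
$$L_j(x) = \prod_{i \neq j} \frac{x - x_i}{x_j - x_i} = \frac{(x - x_0) \cdots (x - x_{j-1})(x - x_{j+1}) \cdots (x - x_n)}{(x_j - x_0) \cdots (x_j - x_{j-1})(x_j - x_{j+1}) \cdots (x_j - x_n)}.$$
Notice that $L_j(x)$ is a polynomial of degree $n$, and we have $L_j(x_k) = 0$ for each $j \neq k$, while $L_k(x_k) = 1$. It follows that the linear combination:
$$p(x) = \sum_{j=0}^{n} y_j L_j(x)$$
has $p(x_k) = \sum_j y_j L_j(x_k) = y_k$, so $p(x)$ is an interpolating polynomial of degree at most $n$.
To prove uniqueness, assume that there exists another interpolating polynomial $q(x)$ of degree at most $n$, so that $p(x_k) = q(x_k)$ for all $k = 0, \dotsc, n$. Then $p(x) - q(x)$ is a polynomial of degree at most $n$ which has $n+1$ distinct zeros (the $x_k$). But a non-zero polynomial of degree at most $n$ can have at most $n$ zeros, [lower-alpha 1] so $p(x) - q(x)$ must be the zero polynomial, i.e. $p(x) = q(x)$. [3]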
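The constructive existence argument translates directly into code. The following Python sketch evaluates the Lagrange basis functions and the resulting interpolant in exact rational arithmetic; the function names are chosen for this illustration only.

```python
from fractions import Fraction

def lagrange_basis(xs, j, x):
    """Value at x of the j-th Lagrange basis function for the nodes xs."""
    value = Fraction(1)
    for i, xi in enumerate(xs):
        if i != j:
            value *= Fraction(x - xi) / Fraction(xs[j] - xi)
    return value

def lagrange_interpolate(xs, ys, x):
    """Value at x of the unique interpolant of degree at most len(xs) - 1."""
    return sum(y * lagrange_basis(xs, j, x) for j, y in enumerate(ys))

# Samples of x^2 + x + 1 at the nodes 0, 1, 2.
nodes = [0, 1, 2]
values = [1, 3, 7]
print(lagrange_interpolate(nodes, values, 3))
```

Since the data come from a quadratic and three nodes are used, uniqueness implies that the interpolant reproduces that quadratic at every point, not only at the nodes.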
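Write out the interpolation polynomial in the form
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n. \qquad (1)$$
Substituting this into the interpolation equations $p(x_j) = y_j$, we get a system of linear equations in the coefficients $a_j$, which reads in matrix-vector form as the following multiplication:
$$\begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}.$$
An interpolant $p(x)$ corresponds to a solution $A = (a_0, \ldots, a_n)$ of the above matrix equation $X \cdot A = Y$. The matrix X on the left is a Vandermonde matrix, whose determinant is known to be $\det(X) = \prod_{0 \le i < j \le n} (x_j - x_i)$, which is non-zero since the nodes $x_j$ are all distinct. This ensures that the matrix is invertible and the equation has the unique solution $A = X^{-1} \cdot Y$; that is, $p(x)$ exists and is unique.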
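A small Python sketch of this second proof, assuming exact rational arithmetic and plain Gauss–Jordan elimination (the function name is illustrative): it builds the augmented Vandermonde system and solves for the monomial coefficients, lowest degree first.

```python
from fractions import Fraction

def vandermonde_coefficients(xs, ys):
    """Solve X A = Y for the monomial coefficients a_0, ..., a_n (lowest degree first)."""
    m = len(xs)
    # Augmented matrix [X | Y]; row i is (1, x_i, x_i^2, ..., x_i^n | y_i).
    rows = [[Fraction(x) ** k for k in range(m)] + [Fraction(y)]
            for x, y in zip(xs, ys)]
    for col in range(m):
        # Distinct nodes make X invertible, so a non-zero pivot always exists.
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(m):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Samples of x^2 + x + 1 at the nodes 0, 1, 2.
print(vandermonde_coefficients([0, 1, 2], [1, 3, 7]))
```

Exact fractions sidestep the conditioning problems of the Vandermonde matrix; in floating-point arithmetic the same elimination can lose accuracy as the number of nodes grows.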
If $f$ is a polynomial of degree at most $n$, then the interpolating polynomial of $f(x)$ at $n+1$ distinct points is $f(x)$ itself.
We may write down the polynomial immediately in terms of Lagrange polynomials as:
$$p(x) = \sum_{i=0}^{n} \left( \prod_{\substack{0 \le j \le n \\ j \neq i}} \frac{x - x_j}{x_i - x_j} \right) y_i.$$
For matrix arguments, this formula is called Sylvester's formula and the matrix-valued Lagrange polynomials are the Frobenius covariants.
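Let $p_n$ be the polynomial of degree less than or equal to n that interpolates $f$ at the nodes $x_i$ where $i = 0, 1, 2, \ldots, n$. Let $p_{n+1}$ be the polynomial of degree less than or equal to n+1 that interpolates $f$ at the nodes $x_i$ where $i = 0, 1, 2, \ldots, n, n+1$. Then $p_{n+1}$ is given by:
$$p_{n+1}(x) = p_n(x) + a_{n+1} w_n(x),$$
where $w_n(x) := \prod_{i=0}^{n} (x - x_i)$, also known as the Newton basis, and $a_{n+1} := \dfrac{f(x_{n+1}) - p_n(x_{n+1})}{w_n(x_{n+1})}$.
Proof:
This can be shown for the case where $i = 0, 1, \ldots, n$:
$$p_{n+1}(x_i) = p_n(x_i) + a_{n+1} \prod_{j=0}^{n} (x_i - x_j) = p_n(x_i) = f(x_i),$$
and when $i = n+1$:
$$p_{n+1}(x_{n+1}) = p_n(x_{n+1}) + \frac{f(x_{n+1}) - p_n(x_{n+1})}{w_n(x_{n+1})} w_n(x_{n+1}) = f(x_{n+1}).$$
By the uniqueness of interpolating polynomials of degree at most $n+1$, $p_{n+1}(x)$ is the required polynomial interpolation. The function can thus be expressed as:
$$p_n(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1) + \cdots + a_n (x - x_0) \cdots (x - x_{n-1}).$$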
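To find $a_i$, we have to solve the lower triangular system formed by arranging $p_n(x_i) = f(x_i) = y_i$ from the above equation in matrix form:
$$\begin{bmatrix} 1 & & & & 0 \\ 1 & x_1 - x_0 & & & \\ 1 & x_2 - x_0 & (x_2 - x_0)(x_2 - x_1) & & \\ \vdots & \vdots & \vdots & \ddots & \\ 1 & x_n - x_0 & (x_n - x_0)(x_n - x_1) & \cdots & \prod_{j=0}^{n-1} (x_n - x_j) \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}.$$
The coefficients are derived as
$$a_j := [y_0, \ldots, y_j],$$
where
$$[y_0, \ldots, y_j]$$
is the notation for divided differences. Thus, Newton polynomials are used to provide a polynomial interpolation formula of n points. [3]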
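Proof:
The first few coefficients can be calculated using the system of equations. The form of the n-th coefficient is assumed for proof by mathematical induction:
$$a_0 = y_0 = [y_0], \qquad a_1 = \frac{y_1 - y_0}{x_1 - x_0} = [y_0, y_1], \qquad \ldots, \qquad a_n = [y_0, \ldots, y_n].$$
Let Q be the polynomial interpolation of the points $(x_1, y_1), \ldots, (x_n, y_n)$. Adding $(x_0, y_0)$ to the polynomial Q:
$$Q(x) + a'_n (x - x_1) \cdots (x - x_n) = P_n(x),$$
where $a'_n (x_0 - x_1) \cdots (x_0 - x_n) = y_0 - Q(x_0)$. By uniqueness of the interpolating polynomial of the points $(x_0, y_0), \ldots, (x_n, y_n)$, equating the coefficients of $x^n$ we get $a'_n = [y_0, \ldots, y_n]$. Hence the polynomial can be expressed as:
$$P_n(x) = Q(x) + [y_0, \ldots, y_n] (x - x_1) \cdots (x - x_n).$$
Adding $(x_{n+1}, y_{n+1})$ to the polynomial Q, it has to satisfy:
$$[y_1, \ldots, y_{n+1}] (x_{n+1} - x_1) \cdots (x_{n+1} - x_n) = y_{n+1} - Q(x_{n+1}),$$
where the formula for $a_n$ and the interpolating polynomial are used. The $a_{n+1}$ term for the polynomial $P_{n+1}$ can be found by calculating:
$$[y_0, \ldots, y_{n+1}] (x_{n+1} - x_0) \cdots (x_{n+1} - x_n) = y_{n+1} - P_n(x_{n+1}) = \left( [y_1, \ldots, y_{n+1}] - [y_0, \ldots, y_n] \right) (x_{n+1} - x_1) \cdots (x_{n+1} - x_n),$$
which implies that $a_{n+1} = \dfrac{[y_1, \ldots, y_{n+1}] - [y_0, \ldots, y_n]}{x_{n+1} - x_0} = [y_0, \ldots, y_{n+1}]$. Hence it is proved by the principle of mathematical induction.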
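The divided-difference recursion and the nested evaluation of the Newton form can be sketched in Python as follows. The in-place table update and the function names are choices made for this illustration.

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return the Newton coefficients a_j = [y_0, ..., y_j] for j = 0..n."""
    coeffs = [Fraction(y) for y in ys]
    n = len(xs)
    for order in range(1, n):
        # After this pass, coeffs[i] holds [y_{i-order}, ..., y_i] for i >= order.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_evaluate(xs, coeffs, x):
    """Evaluate a_0 + a_1 (x - x_0) + ... + a_n (x - x_0)...(x - x_{n-1}) by nesting."""
    result = Fraction(0)
    for a, node in zip(reversed(coeffs), reversed(xs)):
        result = result * (x - node) + a
    return result

# Samples of x^3 at the (unequally spaced) nodes 0, 1, 2, 4.
nodes = [0, 1, 2, 4]
values = [0, 1, 8, 64]
a = divided_differences(nodes, values)
print(a, newton_evaluate(nodes, a, 3))
```

For this data the coefficients are 0, 1, 3, 1, and the interpolant reproduces $x^3$ exactly, as uniqueness requires.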
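The Newton polynomial can be expressed in a simplified form when $x_0, x_1, \dots, x_k$ are arranged consecutively with equal spacing.
If $x_0, x_1, \dots, x_k$ are consecutively arranged and equally spaced with $x_i = x_0 + ih$ for i = 0, 1, ..., k and some variable x is expressed as $x = x_0 + sh$, then the difference $x - x_i$ can be written as $(s - i)h$. So the Newton polynomial becomes
$$N(x) = [y_0] + [y_0, y_1] s h + \cdots + [y_0, \ldots, y_k] s (s-1) \cdots (s-k+1) h^k = \sum_{i=0}^{k} s (s-1) \cdots (s-i+1) h^i [y_0, \ldots, y_i] = \sum_{i=0}^{k} \binom{s}{i} i! \, h^i [y_0, \ldots, y_i].$$
Since the relationship between divided differences and forward differences is given as: [4]
$$[y_j, y_{j+1}, \ldots, y_{j+n}] = \frac{1}{n! \, h^n} \Delta^{(n)} y_j,$$
taking $y_i = f(x_i)$, if the representation of x in the previous sections was instead taken to be $x = x_j + sh$, the Newton forward interpolation formula is expressed as:
$$f(x) \approx N(x) = N(x_j + sh) = \sum_{i=0}^{k} \binom{s}{i} \Delta^{(i)} f(x_j),$$
which is the interpolation of all points after $x_j$. It is expanded as:
$$f(x_j + sh) = f(x_j) + \frac{s}{1!} \Delta f(x_j) + \frac{s(s-1)}{2!} \Delta^2 f(x_j) + \frac{s(s-1)(s-2)}{3!} \Delta^3 f(x_j) + \frac{s(s-1)(s-2)(s-3)}{4!} \Delta^4 f(x_j) + \cdots$$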
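The Newton forward formula can be sketched in Python as below, assuming equally spaced samples starting at `x0` with step `h`; the function names are illustrative.

```python
from fractions import Fraction

def forward_differences(ys):
    """Return the leading differences [y_0, Δy_0, Δ^2 y_0, ..., Δ^k y_0]."""
    diffs = []
    row = list(ys)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def newton_forward(x0, h, ys, x):
    """Evaluate sum_i binom(s, i) Δ^i y_0 with s = (x - x0) / h."""
    s = Fraction(x - x0) / Fraction(h)
    total = Fraction(0)
    binom = Fraction(1)                  # binom(s, 0)
    for i, d in enumerate(forward_differences(ys)):
        total += binom * d
        binom *= (s - i) / (i + 1)       # binom(s, i + 1) from binom(s, i)
    return total

# Samples of x^2 + x + 1 at x = 0, 1, 2, 3; evaluate at x = 5/2.
print(newton_forward(0, 1, [1, 3, 7, 13], Fraction(5, 2)))
```

The generalized binomial coefficient is updated incrementally, so each additional term costs a constant number of operations once the difference table is known.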
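If the nodes are reordered as $x_k, x_{k-1}, \dots, x_0$, the Newton polynomial becomes
$$N(x) = [y_k] + [y_k, y_{k-1}] (x - x_k) + \cdots + [y_k, \ldots, y_0] (x - x_k)(x - x_{k-1}) \cdots (x - x_1).$$
If $x_k, x_{k-1}, \dots, x_0$ are equally spaced with $x_i = x_k - (k - i)h$ for i = 0, 1, ..., k and $x = x_k + sh$, then,
$$N(x) = [y_k] + [y_k, y_{k-1}] s h + \cdots + [y_k, \ldots, y_0] s (s+1) \cdots (s+k-1) h^k = \sum_{i=0}^{k} (-1)^i \binom{-s}{i} i! \, h^i [y_k, \ldots, y_{k-i}].$$
Since the relationship between divided differences and backward differences is given as:
$$[y_j, y_{j-1}, \ldots, y_{j-n}] = \frac{1}{n! \, h^n} \nabla^{(n)} y_j,$$
taking $y_i = f(x_i)$, if the representation of x in the previous sections was instead taken to be $x = x_j + sh$, the Newton backward interpolation formula is expressed as:
$$f(x) \approx N(x) = N(x_j + sh) = \sum_{i=0}^{k} (-1)^i \binom{-s}{i} \nabla^{(i)} f(x_j),$$
which is the interpolation of all points before $x_j$. It is expanded as:
$$f(x_j + sh) = f(x_j) + \frac{s}{1!} \nabla f(x_j) + \frac{s(s+1)}{2!} \nabla^2 f(x_j) + \frac{s(s+1)(s+2)}{3!} \nabla^3 f(x_j) + \frac{s(s+1)(s+2)(s+3)}{4!} \nabla^4 f(x_j) + \cdots$$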
A Lozenge diagram is a diagram that is used to describe different interpolation formulas that can be constructed for a given data set. A line starting on the left edge and tracing across the diagram to the right can be used to represent an interpolation formula if the following rules are followed: [5]
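The factors are expressed using the formula:
$$C(u + k, n) = \frac{(u + k)(u + k - 1) \cdots (u + k - n + 1)}{n!}.$$
If a path goes from $\Delta^{n-1} y_s$ to $\Delta^{n+1} y_{s-1}$, it can connect through three intermediate steps, (a) through $\Delta^n y_{s-1}$, (b) through $\Delta^n y_s$ or (c) through the factor $C(u - s, n)$. Proving the equivalence of these three two-step paths should prove that all (n-step) paths with the same starting and ending points can be morphed into one another, all of which represent the same formula.
Path (a):
$$C(u - s, n) \Delta^n y_{s-1} + C(u - s + 1, n + 1) \Delta^{n+1} y_{s-1}$$
Path (b):
$$C(u - s, n) \Delta^n y_s + C(u - s, n + 1) \Delta^{n+1} y_{s-1}$$
Path (c):
$$C(u - s, n) \frac{\Delta^n y_{s-1} + \Delta^n y_s}{2} + \frac{C(u - s + 1, n + 1) + C(u - s, n + 1)}{2} \Delta^{n+1} y_{s-1}$$
Subtracting contributions from path a and b:
$$\text{Path a} - \text{Path b} = C(u - s, n) \left( \Delta^n y_{s-1} - \Delta^n y_s \right) + \left( C(u - s + 1, n + 1) - C(u - s, n + 1) \right) \Delta^{n+1} y_{s-1} = - C(u - s, n) \Delta^{n+1} y_{s-1} + C(u - s, n) \Delta^{n+1} y_{s-1} = 0,$$
using $\Delta^n y_s - \Delta^n y_{s-1} = \Delta^{n+1} y_{s-1}$ and $C(u - s + 1, n + 1) - C(u - s, n + 1) = C(u - s, n)$.
Thus, the contribution of either path (a) or path (b) is the same. Since path (c) is the average of path (a) and (b), it also contributes an identical function to the polynomial. Hence the equivalence of paths with the same starting and ending points is shown. To check if the paths can be shifted to different values in the leftmost corner, taking only two-step paths is sufficient: (a) $y_{s+1}$ to $y_s$ through $\Delta y_s$ or (b) the factor between $y_{s+1}$ and $y_s$, to $y_s$ through $\Delta y_s$ or (c) starting from $y_s$.
Path (a)
$$y_{s+1} + C(u - s - 1, 1) \Delta y_s - C(u - s, 1) \Delta y_s$$
Path (b)
$$\frac{y_{s+1} + y_s}{2} + \frac{C(u - s - 1, 1) + C(u - s, 1)}{2} \Delta y_s - C(u - s, 1) \Delta y_s$$
Path (c)
$$y_s$$
Since $\Delta y_s = y_{s+1} - y_s$, substituting in the above equations shows that all the above terms reduce to $y_s$ and are hence equivalent. Hence these paths can be morphed to start from the leftmost corner and end in a common point. [5]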
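Taking the negative slope transversal from $y_0$ to $\Delta^n y_0$ gives the interpolation formula of all the $n+1$ consecutively arranged points, equivalent to Newton's forward interpolation formula:
$$y(u) = y_0 + C(u, 1) \Delta y_0 + C(u, 2) \Delta^2 y_0 + C(u, 3) \Delta^3 y_0 + \cdots = y_0 + u \Delta y_0 + \frac{u(u-1)}{2} \Delta^2 y_0 + \frac{u(u-1)(u-2)}{3!} \Delta^3 y_0 + \cdots$$
whereas, taking the positive slope transversal from $y_k$ to $\nabla^n y_k = \Delta^n y_{k-n}$, gives the interpolation formula of all the $n+1$ consecutively arranged points, equivalent to Newton's backward interpolation formula:
$$y(u) = y_k + C(u - k, 1) \Delta y_{k-1} + C(u - k + 1, 2) \Delta^2 y_{k-2} + C(u - k + 2, 3) \Delta^3 y_{k-3} + \cdots$$
$$y(k + s) = y_k + s \nabla y_k + \frac{(s+1)s}{2} \nabla^2 y_k + \frac{(s+2)(s+1)s}{3!} \nabla^3 y_k + \cdots$$
where $s = u - k$ is the number corresponding to that introduced in Newton interpolation.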
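Taking a zigzag line towards the right starting from $y_0$ with negative slope, we get the Gauss forward formula:
$$y(s) = y_0 + s \Delta y_0 + \frac{s(s-1)}{2} \Delta^2 y_{-1} + \frac{(s+1)s(s-1)}{3!} \Delta^3 y_{-1} + \frac{(s+1)s(s-1)(s-2)}{4!} \Delta^4 y_{-2} + \cdots$$
whereas starting from $y_0$ with positive slope, we get the Gauss backward formula:
$$y(s) = y_0 + s \Delta y_{-1} + \frac{(s+1)s}{2} \Delta^2 y_{-1} + \frac{(s+1)s(s-1)}{3!} \Delta^3 y_{-2} + \frac{(s+2)(s+1)s(s-1)}{4!} \Delta^4 y_{-2} + \cdots$$
By taking a horizontal path towards the right starting from $y_0$, we get the Stirling formula:
$$y(s) = y_0 + s \frac{\Delta y_0 + \Delta y_{-1}}{2} + \frac{s^2}{2} \Delta^2 y_{-1} + \frac{s(s^2 - 1)}{3!} \frac{\Delta^3 y_{-1} + \Delta^3 y_{-2}}{2} + \frac{s^2 (s^2 - 1)}{4!} \Delta^4 y_{-2} + \cdots$$
The Stirling formula is the average of the Gauss forward and Gauss backward formulas.
By taking a horizontal path towards the right starting from the factor between $y_0$ and $y_1$, we get the Bessel formula:
$$y(s) = \frac{y_0 + y_1}{2} + \left( s - \tfrac{1}{2} \right) \Delta y_0 + \frac{s(s-1)}{2} \frac{\Delta^2 y_0 + \Delta^2 y_{-1}}{2} + \frac{\left( s - \tfrac{1}{2} \right) s (s-1)}{3!} \Delta^3 y_{-1} + \frac{(s+1)s(s-1)(s-2)}{4!} \frac{\Delta^4 y_{-1} + \Delta^4 y_{-2}}{2} + \cdots$$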
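The relation between the central-difference formulas can be checked numerically. The Python sketch below writes out the Gauss forward, Gauss backward and Stirling formulas through fourth differences (in their standard textbook form, with unit spacing and $y_0$ at index 0) and applies them to samples of a quartic, for which all three should be exact. The sample function and all names are assumptions made for the illustration.

```python
from fractions import Fraction

def delta(y, order, index):
    """Forward difference Δ^order y_index, with y a dict {index: value}."""
    if order == 0:
        return Fraction(y[index])
    return delta(y, order - 1, index + 1) - delta(y, order - 1, index)

def gauss_forward(y, s):
    return (y[0] + s * delta(y, 1, 0)
            + s * (s - 1) / 2 * delta(y, 2, -1)
            + (s + 1) * s * (s - 1) / 6 * delta(y, 3, -1)
            + (s + 1) * s * (s - 1) * (s - 2) / 24 * delta(y, 4, -2))

def gauss_backward(y, s):
    return (y[0] + s * delta(y, 1, -1)
            + (s + 1) * s / 2 * delta(y, 2, -1)
            + (s + 1) * s * (s - 1) / 6 * delta(y, 3, -2)
            + (s + 2) * (s + 1) * s * (s - 1) / 24 * delta(y, 4, -2))

def stirling(y, s):
    return (y[0] + s * (delta(y, 1, 0) + delta(y, 1, -1)) / 2
            + s**2 / 2 * delta(y, 2, -1)
            + s * (s**2 - 1) / 6 * (delta(y, 3, -1) + delta(y, 3, -2)) / 2
            + s**2 * (s**2 - 1) / 24 * delta(y, 4, -2))

def quartic(x):
    # Arbitrary degree-4 test function (an assumption for this example).
    return x**4 - 2 * x + 1

samples = {k: quartic(k) for k in range(-2, 3)}   # y_{-2}, ..., y_2
s = Fraction(1, 2)                                # exact rational argument
print(gauss_forward(samples, s), gauss_backward(samples, s), stirling(samples, s))
```

All three formulas use the same five samples, so by uniqueness of the interpolating polynomial they agree with each other and with the quartic itself.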
The Vandermonde matrix in the second proof above may have large condition number, [6] causing large errors when computing the coefficients $a_i$ if the system of equations is solved using Gaussian elimination.
Several authors have therefore proposed algorithms which exploit the structure of the Vandermonde matrix to compute numerically stable solutions in $O(n^2)$ operations instead of the $O(n^3)$ required by Gaussian elimination. [7] [8] [9] These methods rely on constructing first a Newton interpolation of the polynomial and then converting it to a monomial form.
To find the interpolation polynomial p(x) in the vector space P(n) of polynomials of degree at most n, we may use the usual monomial basis for P(n) and invert the Vandermonde matrix by Gaussian elimination, giving a computational cost of $O(n^3)$ operations. To improve this algorithm, a more convenient basis for P(n) can simplify the calculation of the coefficients, which must then be translated back in terms of the monomial basis.
One method is to write the interpolation polynomial in the Newton form (i.e. using the Newton basis) and use the method of divided differences to construct the coefficients, e.g. Neville's algorithm. The cost is $O(n^2)$ operations. Furthermore, only $O(n)$ extra work is needed if an extra point is added to the data set, while for the other methods the whole computation has to be redone.
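The $O(n)$ update can be sketched as follows, assuming the most recent diagonal of the divided-difference table is kept between insertions. The class and method names are hypothetical and serve only to illustrate the idea.

```python
from fractions import Fraction

class NewtonInterpolant:
    """Newton-form interpolant that accepts new data points one at a time."""

    def __init__(self):
        self.xs = []       # nodes in insertion order
        self.coeffs = []   # Newton coefficients a_j = [y_0, ..., y_j]
        self._diag = []    # last table diagonal: [y_k], [y_{k-1}, y_k], ..., [y_0, ..., y_k]

    def add_point(self, x, y):
        """Add (x, y) using O(n) arithmetic operations."""
        new_diag = [Fraction(y)]
        for order, prev in enumerate(self._diag, start=1):
            # New entry of this order from the one just computed and the old diagonal.
            new_diag.append((new_diag[-1] - prev) / (x - self.xs[-order]))
        self.xs.append(x)
        self._diag = new_diag
        self.coeffs.append(new_diag[-1])

    def evaluate(self, x):
        """Nested evaluation of the Newton form."""
        result = Fraction(0)
        for a, node in zip(reversed(self.coeffs), reversed(self.xs)):
            result = result * (x - node) + a
        return result

p = NewtonInterpolant()
for node, value in [(0, 0), (1, 1), (2, 8)]:
    p.add_point(node, value)
quadratic_guess = p.evaluate(3)   # interpolant through three samples of x^3
p.add_point(4, 64)                # one more sample; earlier coefficients are unchanged
print(quadratic_guess, p.evaluate(3))
```

Adding a point appends one coefficient and leaves the existing ones untouched, which is what makes the Newton form convenient when data arrive incrementally.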
Another method is preferred when the aim is not to compute the coefficients of p(x), but only a single value $p(a)$ at a point $x = a$ not in the original data set. The Lagrange form computes the value $p(a)$ with complexity $O(n^2)$. [10]
The Bernstein form was used in a constructive proof of the Weierstrass approximation theorem by Bernstein and has gained great importance in computer graphics in the form of Bézier curves.
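Given a set of (position, value) data points $(x_0, y_0), \ldots, (x_n, y_n)$ where no two positions $x_j$ are the same, the interpolating polynomial $y(x)$ may be considered as a linear combination of the values $y_j$, using coefficients which are polynomials in $x$ depending on the $x_j$. For example, the interpolation polynomial in the Lagrange form is the linear combination
$$y(x) := \sum_{j=0}^{n} y_j c_j(x)$$
with each coefficient $c_j(x)$ given by the corresponding Lagrange basis polynomial on the given positions $x_j$:
$$c_j(x) = L_j(x_0, \ldots, x_n; x) = \prod_{\substack{0 \le i \le n \\ i \neq j}} \frac{x - x_i}{x_j - x_i}.$$
Since the coefficients depend only on the positions $x_j$, not the values $y_j$, we can use the same coefficients to find the interpolating polynomial for a second set of data points $(x_0, v_0), \ldots, (x_n, v_n)$ at the same positions:
$$v(x) := \sum_{j=0}^{n} v_j c_j(x).$$
Furthermore, the coefficients $c_j(x)$ only depend on the relative spaces $x_i - x_j$ between the positions. Thus, given a third set of data whose points are given by the new variable $t = ax + b$ (an affine transformation of $x$, inverted by $x = \tfrac{t - b}{a}$):
$$(t_0, w_0), \ldots, (t_n, w_n) \qquad \text{with} \qquad t_j = a x_j + b,$$
we can use a transformed version of the previous coefficient polynomials:
$$\tilde{c}_j(t) := c_j\!\left( \frac{t - b}{a} \right) = \prod_{\substack{0 \le i \le n \\ i \neq j}} \frac{t - t_i}{t_j - t_i},$$
and write the interpolation polynomial as:
$$w(t) := \sum_{j=0}^{n} w_j \tilde{c}_j(t).$$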
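Data points $(x_j, y_j)$ often have equally spaced positions, which may be normalized by an affine transformation to $x_j = j$. For example, consider the data points
$$(0, y_0), (1, y_1), (2, y_2).$$
The interpolation polynomial in the Lagrange form is the linear combination
$$y(x) := \sum_{j=0}^{2} y_j c_j(x) = y_0 \frac{(x-1)(x-2)}{(0-1)(0-2)} + y_1 \frac{(x-0)(x-2)}{(1-0)(1-2)} + y_2 \frac{(x-0)(x-1)}{(2-0)(2-1)} = \tfrac{1}{2} y_0 (x-1)(x-2) - y_1 (x-0)(x-2) + \tfrac{1}{2} y_2 (x-0)(x-1).$$
For example, $y(3) = y_3 = y_0 - 3 y_1 + 3 y_2$ and $y(1.5) = y_{1.5} = \tfrac{1}{8} (-y_0 + 6 y_1 + 3 y_2)$.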
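The coefficients in this example, and their invariance under an affine change of variable, can be verified with a short Python sketch. The function name and the particular affine map $t = 5x + 10$ are assumptions made for the illustration.

```python
from fractions import Fraction

def lagrange_weights(xs, x):
    """Coefficients c_j(x) such that the interpolant at x equals sum_j c_j(x) * y_j."""
    weights = []
    for j, xj in enumerate(xs):
        w = Fraction(1)
        for i, xi in enumerate(xs):
            if i != j:
                w *= Fraction(x - xi) / Fraction(xj - xi)
        weights.append(w)
    return weights

# Nodes 0, 1, 2: weights for extrapolating to x = 3 and interpolating at x = 1.5.
print(lagrange_weights([0, 1, 2], 3))
print(lagrange_weights([0, 1, 2], Fraction(3, 2)))

# The same weights appear after the affine map t = 5x + 10 of nodes and argument.
print(lagrange_weights([10, 15, 20], 25))
```

The weights depend only on the node positions and the evaluation point, so they can be computed once and reused for any set of values at those positions.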
The case of equally spaced points can also be treated by the method of finite differences. The first difference of a sequence of values $y = (y_i)$ is the sequence $\Delta y$ defined by $(\Delta y)_i = y_{i+1} - y_i$. Iterating this operation gives the nth difference operation $\Delta^n y$, defined explicitly by:
$$(\Delta^n y)_i = \sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j} y_{i+j} = (-1)^n \sum_{j=0}^{n} c_j^n \, y_{i+j}, \qquad c_j^n = (-1)^j \binom{n}{j},$$
where the coefficients $c_j^n$ form a signed version of Pascal's triangle, the triangle of binomial transform coefficients:
Row n = 0: 1
Row n = 1 (d = 0): 1, −1
Row n = 2 (d = 1): 1, −2, 1
Row n = 3 (d = 2): 1, −3, 3, −1
Row n = 4 (d = 3): 1, −4, 6, −4, 1
Row n = 5 (d = 4): 1, −5, 10, −10, 5, −1
Row n = 6 (d = 5): 1, −6, 15, −20, 15, −6, 1
Row n = 7 (d = 6): 1, −7, 21, −35, 35, −21, 7, −1
A polynomial $y(x)$ of degree d defines a sequence of values at positive integer points, $y_j = y(j)$, and the $(d+1)$th difference of this sequence is identically zero:
$$\Delta^{d+1} y = 0.$$
Thus, given values $y_0, \ldots, y_n$ at equally spaced points, where $n = d + 1$, we have:
$$(-1)^n y_n = - \sum_{j=0}^{n-1} c_j^n \, y_j.$$
For example, 4 equally spaced data points $y_0, y_1, y_2, y_3$ of a quadratic $y(x)$ obey $0 = y_0 - 3 y_1 + 3 y_2 - y_3$, and solving for $y_3$ gives the same interpolation equation obtained above using the Lagrange method.
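The vanishing of the $(d+1)$th difference gives a simple extrapolation rule, sketched below in Python using the signed Pascal coefficients $(-1)^j \binom{n}{j}$. The function names are illustrative.

```python
from math import comb

def signed_pascal_row(n):
    """Row n of the signed Pascal triangle: (-1)^j * C(n, j) for j = 0..n."""
    return [(-1) ** j * comb(n, j) for j in range(n + 1)]

def next_value(ys):
    """Given y_0..y_{n-1} sampled from a polynomial of degree n - 1 at equally
    spaced points, return the next sample y_n."""
    n = len(ys)
    row = signed_pascal_row(n)
    partial = sum(c * y for c, y in zip(row, ys))   # uses the first n coefficients only
    # The full sum (including the y_n term with coefficient (-1)^n) must vanish.
    return -partial * (-1) ** n

print(signed_pascal_row(3))
print(next_value([1, 3, 7]))   # samples of x^2 + x + 1 at x = 0, 1, 2
```

For three samples of a quadratic this reduces to $y_3 = y_0 - 3 y_1 + 3 y_2$.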
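When interpolating a given function f by a polynomial $p_n$ of degree n at the nodes $x_0, \ldots, x_n$ we get the error
$$f(x) - p_n(x) = f[x_0, \ldots, x_n, x] \prod_{i=0}^{n} (x - x_i),$$
where $f[x_0, \ldots, x_n, x]$ is the (n+1)st divided difference of the data points
$$(x_0, f(x_0)), \ldots, (x_n, f(x_n)), (x, f(x)).$$
Furthermore, there is a Lagrange remainder form of the error, for a function f which is n + 1 times continuously differentiable on a closed interval $I$, and a polynomial $p_n(x)$ of degree at most n that interpolates f at n + 1 distinct points $x_0, \ldots, x_n \in I$. For each $x \in I$ there exists $\xi \in I$ such that
$$f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i).$$
This error bound suggests choosing the interpolation points $x_i$ to minimize the product $\left| \prod_{i=0}^{n} (x - x_i) \right|$, which is achieved by the Chebyshev nodes.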
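The effect of the node choice on the product term can be illustrated numerically. The sketch below assumes the interval $[-1, 1]$, Chebyshev nodes of the first kind, and a uniform sampling grid for estimating the maximum; these choices and the function names are assumptions for the example.

```python
import math

def chebyshev_nodes(n):
    """n + 1 Chebyshev nodes (roots of the Chebyshev polynomial T_{n+1}) on [-1, 1]."""
    return [math.cos((2 * i + 1) * math.pi / (2 * n + 2)) for i in range(n + 1)]

def equispaced_nodes(n):
    """n + 1 equally spaced nodes on [-1, 1]."""
    return [-1 + 2 * i / n for i in range(n + 1)]

def max_node_product(nodes, samples=2001):
    """Estimate the maximum of |prod_i (x - x_i)| over [-1, 1] on a uniform grid."""
    best = 0.0
    for k in range(samples):
        x = -1 + 2 * k / (samples - 1)
        best = max(best, abs(math.prod(x - xi for xi in nodes)))
    return best

n = 10
print(max_node_product(chebyshev_nodes(n)), max_node_product(equispaced_nodes(n)))
```

For Chebyshev nodes the product equals $T_{n+1}(x) / 2^n$, so its maximum on $[-1, 1]$ is $2^{-n}$; the equispaced product is noticeably larger near the ends of the interval.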
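Set the error term as $R_n(x) = f(x) - p_n(x)$, and define an auxiliary function:
$$Y(t) = R_n(t) - \frac{R_n(x)}{W(x)} W(t) \qquad \text{where} \qquad W(t) = \prod_{i=0}^{n} (t - x_i).$$
Thus:
$$Y^{(n+1)}(t) = R_n^{(n+1)}(t) - \frac{R_n(x)}{W(x)} (n+1)!$$
But since $p_n(x)$ is a polynomial of degree at most n, we have $R_n^{(n+1)}(t) = f^{(n+1)}(t)$, and:
$$Y^{(n+1)}(t) = f^{(n+1)}(t) - \frac{R_n(x)}{W(x)} (n+1)!$$
Now, since the $x_i$ are roots of $R_n(t)$ and $W(t)$, we have $Y(x) = Y(x_j) = 0$, which means Y has at least n + 2 roots. From Rolle's theorem, $Y'(t)$ has at least n + 1 roots, and iteratively $Y^{(n+1)}(t)$ has at least one root ξ in the interval I. Thus:
$$Y^{(n+1)}(\xi) = f^{(n+1)}(\xi) - \frac{R_n(x)}{W(x)} (n+1)! = 0$$
and:
$$R_n(x) = f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i).$$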
This parallels the reasoning behind the Lagrange remainder term in the Taylor theorem; in fact, the Taylor remainder is a special case of interpolation error when all interpolation nodes $x_i$ are identical. [11] Note that the error will be zero when $x = x_i$ for any i. Thus, the maximum error will occur at some point in the interval between two successive nodes.
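In the case of equally spaced interpolation nodes where $x_i = a + ih$, for $i = 0, 1, \ldots, n$, and where $h = (b - a)/n$, the product term in the interpolation error formula can be bound as [12]
$$\left| \prod_{i=0}^{n} (x - x_i) \right| = \prod_{i=0}^{n} |x - x_i| \le \frac{n!}{4} h^{n+1}.$$
Thus the error bound can be given as
$$|f(x) - p_n(x)| \le \frac{h^{n+1}}{4(n+1)} \max_{\xi \in [a, b]} \left| f^{(n+1)}(\xi) \right|.$$
However, this assumes that $f^{(n+1)}(\xi)$ is dominated by $h^{n+1}$, i.e. $f^{(n+1)}(\xi) h^{n+1} \ll 1$. In several cases, this is not true and the error actually increases as n → ∞ (see Runge's phenomenon). That question is treated in the section Convergence properties.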
We fix the interpolation nodes x0, ..., xn and an interval [a, b] containing all the interpolation nodes. The process of interpolation maps the function f to a polynomial p. This defines a mapping X from the space C([a, b]) of all continuous functions on [a, b] to itself. The map X is linear and it is a projection on the subspace of polynomials of degree n or less.
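The Lebesgue constant L is defined as the operator norm of X. One has (a special case of Lebesgue's lemma):
$$\| f - X(f) \| \le (L + 1) \, \| f - p^* \|,$$
where $p^*$ denotes the best approximation of f among the polynomials of degree n or less.
In other words, the interpolation polynomial is at most a factor (L + 1) worse than the best possible approximation. This suggests that we look for a set of interpolation nodes that makes L small. In particular, we have for Chebyshev nodes:
$$L \le \frac{2}{\pi} \log(n + 1) + 1.$$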
We conclude again that Chebyshev nodes are a very good choice for polynomial interpolation, as the growth in n is exponential for equidistant nodes. However, those nodes are not optimal.
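The difference in growth can be observed numerically. The following Python sketch estimates the Lebesgue constant on $[-1, 1]$ as the maximum over a uniform grid of $\sum_j |L_j(x)|$, where $L_j$ are the Lagrange basis functions; the grid size, the interval and the function names are assumptions for the illustration, and the grid maximum is only a lower estimate of the true constant.

```python
import math

def chebyshev_nodes(n):
    return [math.cos((2 * i + 1) * math.pi / (2 * n + 2)) for i in range(n + 1)]

def equispaced_nodes(n):
    return [-1 + 2 * i / n for i in range(n + 1)]

def lebesgue_function(nodes, x):
    """Sum of the absolute values of the Lagrange basis functions at x."""
    total = 0.0
    for j, xj in enumerate(nodes):
        term = 1.0
        for i, xi in enumerate(nodes):
            if i != j:
                term *= (x - xi) / (xj - xi)
        total += abs(term)
    return total

def lebesgue_constant(nodes, samples=2001):
    """Grid estimate of the Lebesgue constant on [-1, 1]."""
    return max(lebesgue_function(nodes, -1 + 2 * k / (samples - 1))
               for k in range(samples))

n = 10
bound = 2 / math.pi * math.log(n + 1) + 1
print(lebesgue_constant(chebyshev_nodes(n)), bound, lebesgue_constant(equispaced_nodes(n)))
```

For moderate n the Chebyshev estimate stays below the logarithmic bound, while the equispaced estimate is already an order of magnitude larger.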
It is natural to ask, for which classes of functions and for which interpolation nodes the sequence of interpolating polynomials converges to the interpolated function as n → ∞? Convergence may be understood in different ways, e.g. pointwise, uniform or in some integral norm.
The situation is rather bad for equidistant nodes, in that uniform convergence is not even guaranteed for infinitely differentiable functions. One classical example, due to Carl Runge, is the function $f(x) = 1 / (1 + x^2)$ on the interval [−5, 5]. The interpolation error $\| f - p_n \|_\infty$ grows without bound as n → ∞. Another example is the function f(x) = |x| on the interval [−1, 1], for which the interpolating polynomials do not even converge pointwise except at the three points x = ±1, 0. [13]
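Runge's example can be reproduced with a few lines of Python. The sketch below interpolates $1 / (1 + x^2)$ at equidistant nodes on [−5, 5] in floating-point arithmetic and estimates the maximum error on a uniform grid; the grid size and function names are assumptions for the illustration.

```python
def runge(x):
    return 1.0 / (1.0 + x * x)

def interpolate(xs, ys, x):
    """Lagrange-form evaluation of the interpolant through (xs, ys) at x."""
    total = 0.0
    for j, xj in enumerate(xs):
        term = ys[j]
        for i, xi in enumerate(xs):
            if i != j:
                term *= (x - xi) / (xj - xi)
        total += term
    return total

def max_error(n, samples=1001):
    """Grid estimate of the maximum interpolation error for n + 1 equidistant nodes."""
    xs = [-5 + 10 * i / n for i in range(n + 1)]
    ys = [runge(x) for x in xs]
    grid = [-5 + 10 * k / (samples - 1) for k in range(samples)]
    return max(abs(runge(x) - interpolate(xs, ys, x)) for x in grid)

for degree in (5, 10, 20):
    print(degree, max_error(degree))
```

The estimated error increases with the degree, with the largest deviations occurring near the ends of the interval.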
One might think that better convergence properties may be obtained by choosing different interpolation nodes. The following result seems to give a rather encouraging answer:
Theorem — For any function f(x) continuous on an interval [a,b] there exists a table of nodes for which the sequence of interpolating polynomials converges to f(x) uniformly on [a,b].
It is clear that the sequence of polynomials of best approximation $p_n^*(x)$ converges to f(x) uniformly (due to the Weierstrass approximation theorem). Now we have only to show that each $p_n^*(x)$ may be obtained by means of interpolation on certain nodes. But this is true due to a special property of polynomials of best approximation known from the equioscillation theorem. Specifically, we know that such polynomials should intersect f(x) at least n + 1 times. Choosing the points of intersection as interpolation nodes we obtain the interpolating polynomial coinciding with the best approximation polynomial.
The defect of this method, however, is that the interpolation nodes have to be calculated anew for each new function f(x), and the algorithm is hard to implement numerically. Does there exist a single table of nodes for which the sequence of interpolating polynomials converges to any continuous function f(x)? The answer is unfortunately negative:
Theorem — For any table of nodes there is a continuous function f(x) on an interval [a, b] for which the sequence of interpolating polynomials diverges on [a,b]. [14]
The proof essentially uses the lower bound estimation of the Lebesgue constant, which we defined above to be the operator norm of $X_n$ (where $X_n$ is the projection operator on $\Pi_n$). Now we seek a table of nodes for which
$$\lim_{n \to \infty} X_n f = f \quad \text{for every } f \in C([a, b]).$$
Due to the Banach–Steinhaus theorem, this is only possible when norms of $X_n$ are uniformly bounded, which cannot be true since we know that
$$\| X_n \| \ge \tfrac{2}{\pi} \log(n + 1) + C.$$
For example, if equidistant points are chosen as interpolation nodes, the function from Runge's phenomenon demonstrates divergence of such interpolation. Note that this function is not only continuous but even infinitely differentiable on [−1, 1]. For better Chebyshev nodes, however, such an example is much harder to find due to the following result:
Theorem — For every absolutely continuous function on [−1, 1] the sequence of interpolating polynomials constructed on Chebyshev nodes converges to f(x) uniformly. [15]
Runge's phenomenon shows that for high values of n, the interpolation polynomial may oscillate wildly between the data points. This problem is commonly resolved by the use of spline interpolation. Here, the interpolant is not a polynomial but a spline: a chain of several polynomials of a lower degree.
Interpolation of periodic functions by harmonic functions is accomplished by Fourier transform. This can be seen as a form of polynomial interpolation with harmonic base functions, see trigonometric interpolation and trigonometric polynomial.
Hermite interpolation problems are those where not only the values of the polynomial p at the nodes are given, but also all derivatives up to a given order. This turns out to be equivalent to a system of simultaneous polynomial congruences, and may be solved by means of the Chinese remainder theorem for polynomials. Birkhoff interpolation is a further generalization where only derivatives of some orders are prescribed, not necessarily all orders from 0 to a k.
Collocation methods for the solution of differential and integral equations are based on polynomial interpolation.
The technique of rational function modeling is a generalization that considers ratios of polynomial functions.
Finally, multivariate interpolation extends the problem to higher dimensions.