Newton polynomial

Last updated

In the mathematical field of numerical analysis, a Newton polynomial, named after its inventor Isaac Newton, [1] is an interpolation polynomial for a given set of data points. The Newton polynomial is sometimes called Newton's divided differences interpolation polynomial because the coefficients of the polynomial are calculated using Newton's divided differences method.



Given a set of k + 1 data points

where no two xj are the same, the Newton interpolation polynomial is a linear combination of Newton basis polynomials

with the Newton basis polynomials defined as

for j > 0 and .

The coefficients are defined as


is the notation for divided differences.

Thus the Newton polynomial can be written as

Newton forward divided difference formula

The Newton polynomial can be expressed in a simplified form when are arranged consecutively with equal spacing. Introducing the notation for each and , the difference can be written as . So the Newton polynomial becomes

This is called the Newton forward divided difference formula[ citation needed ].

Newton backward divided difference formula

If the nodes are reordered as , the Newton polynomial becomes

If are equally spaced with and for i = 0, 1, ..., k, then,

is called the Newton backward divided difference formula[ citation needed ].


Newton's formula is of interest because it is the straightforward and natural differences-version of Taylor's polynomial. Taylor's polynomial tells where a function will go, based on its y value, and its derivatives (its rate of change, and the rate of change of its rate of change, etc.) at one particular x value. Newton's formula is Taylor's polynomial based on finite differences instead of instantaneous rates of change.

Addition of new points

As with other difference formulas, the degree of a Newton interpolating polynomial can be increased by adding more terms and points without discarding existing ones. Newton's form has the simplicity that the new points are always added at one end: Newton's forward formula can add new points to the right, and Newton's backward formula can add new points to the left.

The accuracy of polynomial interpolation depends on how close the interpolated point is to the middle of the x values of the set of points used. Obviously, as new points are added at one end, that middle becomes farther and farther from the first data point. Therefore, if it isn't known how many points will be needed for the desired accuracy, the middle of the x-values might be far from where the interpolation is done.

Gauss, Stirling, and Bessel all developed formulae to remedy that problem. [2]

Gauss's formula alternately adds new points at the left and right ends, thereby keeping the set of points centered near the same place (near the evaluated point). When so doing, it uses terms from Newton's formula, with data points and x values renamed in keeping with one's choice of what data point is designated as the x0 data point.

Stirling's formula remains centered about a particular data point, for use when the evaluated point is nearer to a data point than to a middle of two data points.

Bessel's formula remains centered about a particular middle between two data points, for use when the evaluated point is nearer to a middle than to a data point.

Bessel and Stirling achieve that by sometimes using the average of two differences, and sometimes using the average of two products of binomials in x, where Newton's or Gauss's would use just one difference or product. Stirling's uses an average difference in odd-degree terms (whose difference uses an even number of data points); Bessel's uses an average difference in even-degree terms (whose difference uses an odd number of data points).

Strengths and weaknesses of various formulae

For any given finite set of data points, there is only one polynomial of least possible degree that passes through all of them. Thus, it is appropriate to speak of the "Newton form", or Lagrange form, etc., of the interpolation polynomial. However, the way the polynomial is obtained matters. There are several similar methods, such as those of Gauss, Bessel and Stirling. They can be derived from Newton's by renaming the x-values of the data points, but in practice they are important.

Bessel vs. Stirling

The choice between Bessel and Stirling depends on whether the interpolated point is closer to a data point, or closer to a middle between two data points.

A polynomial interpolation's error approaches zero, as the interpolation point approaches a data-point. Therefore, Stirling's formula brings its accuracy improvement where it is least needed and Bessel brings its accuracy improvement where it is most needed.

So, Bessel's formula could be said to be the most consistently accurate difference formula, and, in general, the most consistently accurate of the familiar polynomial interpolation formulas.

Divided-Difference Methods vs. Lagrange

Lagrange is sometimes said to require less work, and is sometimes recommended for problems in which it is known, in advance, from previous experience, how many terms are needed for sufficient accuracy.

The divided difference methods have the advantage that more data points can be added, for improved accuracy. The terms based on the previous data points can continue to be used. With the ordinary Lagrange formula, to do the problem with more data points would require re-doing the whole problem.

There is a "barycentric" version of Lagrange that avoids the need to re-do the entire calculation when adding a new data point. But it requires that the values of each term be recorded.

But the ability, of Gauss, Bessel and Stirling, to keep the data points centered close to the interpolated point gives them an advantage over Lagrange, when it isn't known, in advance, how many data points will be needed.

Additionally, suppose that one wants to find out if, for some particular type of problem, linear interpolation is sufficiently accurate. That can be determined by evaluating the quadratic term of a divided difference formula. If the quadratic term is negligible—meaning that the linear term is sufficiently accurate without adding the quadratic term—then linear interpolation is sufficiently accurate. If the problem is sufficiently important, or if the quadratic term is nearly big enough to matter, then one might want to determine whether the _sum_ of the quadratic and cubic terms is large enough to matter in the problem.

Of course, only a divided-difference method can be used for such a determination.

For that purpose, the divided-difference formula and/or its x0 point should be chosen so that the formula will use, for its linear term, the two data points between which the linear interpolation of interest would be done.

The divided difference formulas are more versatile, useful in more kinds of problems.

The Lagrange formula is at its best when all the interpolation will be done at one x value, with only the data points' y values varying from one problem to another, and when it is known, from past experience, how many terms are needed for sufficient accuracy.

With the Newton form of the interpolating polynomial a compact and effective algorithm exists for combining the terms to find the coefficients of the polynomial. [3]


When, with Stirling's or Bessel's, the last term used includes the average of two differences, then one more point is being used than Newton's or other polynomial interpolations would use for the same polynomial degree. So, in that instance, Stirling's or Bessel's is not putting an N−1 degree polynomial through N points, but is, instead, trading equivalence with Newton's for better centering and accuracy, giving those methods sometimes potentially greater accuracy, for a given polynomial degree, than other polynomial interpolations.

General case

For the special case of xi = i, there is a closely related set of polynomials, also called the Newton polynomials, that are simply the binomial coefficients for general argument. That is, one also has the Newton polynomials given by

In this form, the Newton polynomials generate the Newton series. These are in turn a special case of the general difference polynomials which allow the representation of analytic functions through generalized difference equations.

Main idea

Solving an interpolation problem leads to a problem in linear algebra where we have to solve a system of linear equations. Using a standard monomial basis for our interpolation polynomial we get the very complicated Vandermonde matrix. By choosing another basis, the Newton basis, we get a system of linear equations with a much simpler lower triangular matrix which can be solved faster.

For k + 1 data points we construct the Newton basis as

Using these polynomials as a basis for we have to solve

to solve the polynomial interpolation problem.

This system of equations can be solved iteratively by solving


While the interpolation formula can be found by solving a linear system of equations, there is a loss of intuition in what the formula is showing and why Newton's interpolation formula works is not readily apparent. To begin, we will need the following result:

. This equality means that reversing the terms of the divided difference has no effect on the end result. We will prove this result with induction.


Induction: Suppose the result holds for a divided difference involving fewer than terms. Using the induction hypothesis, we see that where the induction hypothesis was used at the 2nd equality.

In order to derive the interpolation formula, we will now use the following result which will also be proven with induction:

where is the unique polynomial of degree (at most) passing through the points . With this result, we can now exactly quantify the error between the interpolation polynomial at and the true value .

Basis: where is the unique polynomial of degree 0 passing through .

Induction: Suppose the result holds for when there are fewer than points. Let be the polynomial of degree (at most) passing through

where is the unique polynomial of degree (at most) passing through the points . The second to last equality comes from the induction hypothesis as involves points and thus . Getting close to the desired result, we now claim that as both polynomials pass through and are of degree (at most) . Both of these criteria uniquely define a polynomial. The fact that the left hand side passes through is readily apparent by how is defined to be. To demonstrate the left hand side passes through , we will use the first result proved above along with the induction hypothesis:

where the 2nd equality follows from the fact that is the polynomial of degree (at most) passing through satisfying the induction hypothesis. Continuing the induction step above, we now see that where is the polynomial of degree passing through Thus the proof is complete.

All this work now leads to where Newton's interpolation formula comes from. Rearranging the result above, we note that is the polynomial of degree (at most) passing through , and thus we see that "extending" a polynomial to the next point requires adding the term giving us Newton's interpolation formula.

Taylor polynomial

The limit of the Newton polynomial if all nodes coincide is a Taylor polynomial, because the divided differences become derivatives.


As can be seen from the definition of the divided differences new data points can be added to the data set to create a new interpolation polynomial without recalculating the old coefficients. And when a data point changes we usually do not have to recalculate all coefficients. Furthermore, if the xi are distributed equidistantly the calculation of the divided differences becomes significantly easier. Therefore, the divided-difference formulas are usually preferred over the Lagrange form for practical purposes.


The divided differences can be written in the form of a table. For example, for a function f is to be interpolated on points . Write

Then the interpolating polynomial is formed as above using the topmost entries in each column as coefficients.

For example, suppose we are to construct the interpolating polynomial to f(x) = tan(x) using divided differences, at the points

Using six digits of accuracy, we construct the table

Thus, the interpolating polynomial is

Given more digits of accuracy in the table, the first and third coefficients will be found to be zero.

Another example:

The sequence such that and , i.e., they are from to .

You obtain the slope of order in the following way:

As we have the slopes of order , it is possible to obtain the next order:

Finally, we define the slope of order :

Once we have the slope, we can define the consequent polynomials:

See also

Related Research Articles

Binomial coefficient

In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers nk ≥ 0 and is written It is the coefficient of the xk term in the polynomial expansion of the binomial power (1 + x)n, and is given by the formula

Discrete Fourier transform

In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

Taylor series Expression of a function as an infinite sum

In mathematics, the Taylor series of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor's series are named after Brook Taylor who introduced them in 1715.

A finite difference is a mathematical expression of the form f (x + b) − f (x + a). If a finite difference is divided by ba, one gets a difference quotient. The approximation of derivatives by finite differences plays a central role in finite difference methods for the numerical solution of differential equations, especially boundary value problems.

In mathematics, a recurrence relation is an equation that recursively defines a sequence or multidimensional array of values, once one or more initial terms are given; each further term of the sequence or array is defined as a function of the preceding terms.

In numerical analysis, polynomial interpolation is the interpolation of a given data set by the polynomial of lowest possible degree that passes through the points of the dataset.

Lagrange polynomial

In numerical analysis, Lagrange polynomials are used for polynomial interpolation. For a given set of points with no two values equal, the Lagrange polynomial is the polynomial of lowest degree that assumes at each value the corresponding value , so that the functions coincide at each point.

Simpsons rule Method for numerical integration

In numerical analysis, Simpson's rule is a method for numerical integration, the numerical approximation of definite integrals. Specifically, it is the following approximation for values bounding equally spaced subdivisions :

In mathematics, a generalized hypergeometric series is a power series in which the ratio of successive coefficients indexed by n is a rational function of n. The series, if convergent, defines a generalized hypergeometric function, which may then be defined over a wider domain of the argument by analytic continuation. The generalized hypergeometric series is sometimes just called the hypergeometric series, though this term also sometimes just refers to the Gaussian hypergeometric series. Generalized hypergeometric functions include the (Gaussian) hypergeometric function and the confluent hypergeometric function as special cases, which in turn have many particular special functions as special cases, such as elementary functions, Bessel functions, and the classical orthogonal polynomials.

Spline (mathematics)

In mathematics, a spline is a special function defined piecewise by polynomials. In interpolating problems, spline interpolation is often preferred to polynomial interpolation because it yields similar results, even when using low degree polynomials, while avoiding Runge's phenomenon for higher degrees.

In numerical analysis, a cubic Hermite spline or cubic Hermite interpolator is a spline where each piece is a third-degree polynomial specified in Hermite form, that is, by its values and first derivatives at the end points of the corresponding domain interval.

In mathematics, divided differences is an algorithm, historically used for computing tables of logarithms and trigonometric functions. Charles Babbage's difference engine, an early mechanical calculator, was designed to use this algorithm in its operation.

In numerical analysis, Hermite interpolation, named after Charles Hermite, is a method of interpolating data points as a polynomial function. The generated Hermite interpolating polynomial is closely related to the Newton polynomial, in that both are derived from the calculation of divided differences. However, the Hermite interpolating polynomial may also be computed without using divided differences, see Chinese remainder theorem § Hermite interpolation.

Linear multistep methods are used for the numerical solution of ordinary differential equations. Conceptually, a numerical method starts from an initial point and then takes a short step forward in time to find the next solution point. The process continues with subsequent steps to map out the solution. Single-step methods refer to only one previous point and its derivative to determine the current value. Methods such as Runge–Kutta take some intermediate steps to obtain a higher order method, but then discard all previous information before taking a second step. Multistep methods attempt to gain efficiency by keeping and using the information from previous steps rather than discarding it. Consequently, multistep methods refer to several previous points and derivative values. In the case of linear multistep methods, a linear combination of the previous points and derivative values is used.

In mathematics, Newton's identities, also known as the Girard-Newton formulae, give relations between two types of symmetric polynomials, namely between power sums and elementary symmetric polynomials. Evaluated at the roots of a monic polynomial P in one variable, they allow expressing the sums of the k-th powers of all roots of P in terms of the coefficients of P, without actually finding those roots. These identities were found by Isaac Newton around 1666, apparently in ignorance of earlier work (1629) by Albert Girard. They have applications in many areas of mathematics, including Galois theory, invariant theory, group theory, combinatorics, as well as further applications outside mathematics, including general relativity.

Newton fractal

The Newton fractal is a boundary set in the complex plane which is characterized by Newton's method applied to a fixed polynomial or transcendental function. It is the Julia set of the meromorphic function which is given by Newton's method. When there are no attractive cycles, it divides the complex plane into regions , each of which is associated with a root of the polynomial, . In this way the Newton fractal is similar to the Mandelbrot set, and like other fractals it exhibits an intricate appearance arising from a simple description. It is relevant to numerical analysis because it shows that the Newton method can be very sensitive to its choice of start point.

In applied mathematics, the nonuniform discrete Fourier transform of a signal is a type of Fourier transform, related to a discrete Fourier transform or discrete-time Fourier transform, but in which the input signal is not sampled at equally spaced points or frequencies. It is a generalization of the shifted DFT. It has important applications in signal processing, magnetic resonance imaging, and the numerical solution of partial differential equations.

In coding theory, folded Reed–Solomon codes are like Reed–Solomon codes, which are obtained by mapping Reed–Solomon codewords over a larger alphabet by careful bundling of codeword symbols.

Nonuniform sampling is a branch of sampling theory involving results related to the Nyquist–Shannon sampling theorem. Nonuniform sampling is based on Lagrange interpolation and the relationship between itself and the (uniform) sampling theorem. Nonuniform sampling is a generalisation of the Whittaker–Shannon–Kotelnikov (WSK) sampling theorem.

The Bernoulli polynomials of the second kindψn(x), also known as the Fontana-Bessel polynomials, are the polynomials defined by the following generating function:


  1. Dunham, William (1990). "7". Journey Through Genius: The Great Theorems of Mathematics. Kanak Agrawal, Inc. pp.  155–183. ISBN   9780140147391 . Retrieved 24 October 2019.
  2. Numerical Methods for Scientists and Engineers, R.W. Hamming
  3. Stetekluh, Jeff. "Algorithm for the Newton Form of the Interpolating Polynomial".