Bernstein polynomial

Last updated
Bernstein polynomials approximating a curve Bernstein Approximation.gif
Bernstein polynomials approximating a curve

In the mathematical field of numerical analysis, a Bernstein polynomial is a polynomial expressed as a linear combination of Bernstein basis polynomials. The idea is named after mathematician Sergei Natanovich Bernstein.

Contents

Polynomials in Bernstein form were first used by Bernstein in a constructive proof for the Weierstrass approximation theorem. With the advent of computer graphics, Bernstein polynomials, restricted to the interval [0, 1], became important in the form of Bézier curves.

A numerically stable way to evaluate polynomials in Bernstein form is de Casteljau's algorithm.

Bernstein basis polynomials for 4th degree curve blending Bernstein Polynomials.svg
Bernstein basis polynomials for 4th degree curve blending

Definition

Bernstein basis polynomials

The n+1 Bernstein basis polynomials of degree n are defined as

where is a binomial coefficient.

So, for example,

The first few Bernstein basis polynomials for blending 1, 2, 3 or 4 values together are:

The Bernstein basis polynomials of degree n form a basis for the vector space of polynomials of degree at most n with real coefficients.

Bernstein polynomials

A linear combination of Bernstein basis polynomials

is called a Bernstein polynomial or polynomial in Bernstein form of degree n. [1] The coefficients are called Bernstein coefficients or Bézier coefficients.

The first few Bernstein basis polynomials from above in monomial form are:

Properties

The Bernstein basis polynomials have the following properties:

Approximating continuous functions

Let ƒ be a continuous function on the interval [0, 1]. Consider the Bernstein polynomial

It can be shown that

uniformly on the interval [0, 1]. [4] [1] [5] [6]

Bernstein polynomials thus provide one way to prove the Weierstrass approximation theorem that every real-valued continuous function on a real interval [a, b] can be uniformly approximated by polynomial functions over . [7]

A more general statement for a function with continuous kth derivative is

where additionally

is an eigenvalue of Bn; the corresponding eigenfunction is a polynomial of degree k.

Probabilistic proof

This proof follows Bernstein's original proof of 1912. [8] See also Feller (1966) or Koralov & Sinai (2007). [9] [5]

Motivation

We will first give intuition for Bernstein's original proof. A continuous function on a compact interval must be uniformly continuous. Thus, the value of any continuous function can be uniformly approximated by its value on some finite net of points in the interval. This consideration renders the approximation theorem intuitive, given that polynomials should be flexible enough to match (or nearly match) a finite number of pairs . To do so, we might (1) construct a function close to on a lattice, and then (2) smooth out the function outside the lattice to make a polynomial.

The probabilistic proof below simply provides a constructive method to create a polynomial which is approximately equal to on such a point lattice, given that "smoothing out" a function is not always trivial. Taking the expectation of a random variable with a simple distribution is a common way to smooth. Here, we take advantage of the fact that Bernstein polynomials look like Binomial expectations. We split the interval into a lattice of n discrete values. Then, to evaluate any f(x), we evaluate f at one of the n lattice points close to x, randomly chosen by the Binomial distribution. The expectation of this approximation technique is polynomial, as it is the expectation of a function of a binomial RV. The proof below illustrates that this achieves a uniform approximation of f. The crux of the proof is to (1) justify replacing an arbitrary point with a binomially chosen lattice point by concentration properties of a Binomial distribution, and (2) justify the inference from to by uniform continuity.

Bernstein's proof

Suppose K is a random variable distributed as the number of successes in n independent Bernoulli trials with probability x of success on each trial; in other words, K has a binomial distribution with parameters n and x. Then we have the expected value and

By the weak law of large numbers of probability theory,

for every δ > 0. Moreover, this relation holds uniformly in x, which can be seen from its proof via Chebyshev's inequality, taking into account that the variance of 1n K, equal to 1n x(1x), is bounded from above by 1(4n) irrespective of x.

Because ƒ, being continuous on a closed bounded interval, must be uniformly continuous on that interval, one infers a statement of the form

uniformly in x for each . Taking into account that ƒ is bounded (on the given interval) one finds that

uniformly in x. To justify this statement, we use a common method in probability theory to convert from closeness in probability to closeness in expectation. One splits the expectation of into two parts split based on whether or not . In the interval where the difference does not exceed ε, the expectation clearly cannot exceed ε. In the other interval, the difference still cannot exceed 2M, where M is an upper bound for |ƒ(x)| (since uniformly continuous functions are bounded). However, by our 'closeness in probability' statement, this interval cannot have probability greater than ε. Thus, this part of the expectation contributes no more than 2M times ε. Then the total expectation is no more than , which can be made arbitrarily small by choosing small ε.

Finally, one observes that the absolute value of the difference between expectations never exceeds the expectation of the absolute value of the difference, a consequence of Holder's Inequality. Thus, using the above expectation, we see that (uniformly in x)

Noting that our randomness was over K while x is constant, the expectation of f(x) is just equal to f(x). But then we have shown that converges to f(x). Then we will be done if is a polynomial in x (the subscript reminding us that x controls the distribution of K). Indeed it is:

Uniform convergence rates between functions

In the above proof, recall that convergence in each limit involving f depends on the uniform continuity of f, which implies a rate of convergence dependent on f 's modulus of continuity It also depends on 'M', the absolute bound of the function, although this can be bypassed if one bounds and the interval size. Thus, the approximation only holds uniformly across x for a fixed f, but one can readily extend the proof to uniformly approximate a set of functions with a set of Bernstein polynomials in the context of equicontinuity.

Elementary proof

The probabilistic proof can also be rephrased in an elementary way, using the underlying probabilistic ideas but proceeding by direct verification: [10] [6] [11] [12] [13]

The following identities can be verified:

  1. ("probability")
  2. ("mean")
  3. ("variance")

In fact, by the binomial theorem

and this equation can be applied twice to . The identities (1), (2), and (3) follow easily using the substitution .

Within these three identities, use the above basis polynomial notation

and let

Thus, by identity (1)

so that

Since f is uniformly continuous, given , there is a such that whenever . Moreover, by continuity, . But then

The first sum is less than ε. On the other hand, by identity (3) above, and since , the second sum is bounded by times

(Chebyshev's inequality)

It follows that the polynomials fn tend to f uniformly.

Generalizations to higher dimension

Bernstein polynomials can be generalized to k dimensions – the resulting polynomials have the form Bi1(x1) Bi2(x2) ... Bik(xk). [1] In the simplest case only products of the unit interval [0,1] are considered; but, using affine transformations of the line, Bernstein polynomials can also be defined for products [a1, b1] × [a2, b2] × ... × [ak, bk]. For a continuous function f on the k-fold product of the unit interval, the proof that f(x1, x2, ... , xk) can be uniformly approximated by

is a straightforward extension of Bernstein's proof in one dimension. [14]

See also

Notes

  1. 1 2 3 Lorentz 1953
  2. Mathar, R. J. (2018). "Orthogonal basis function over the unit circle with the minimax property". Appendix B. arXiv: 1802.09518 [math.NA].
  3. Rababah, Abedallah (2003). "Transformation of Chebyshev-Bernstein Polynomial Basis". Comp. Meth. Appl. Math. 3 (4): 608–622. doi: 10.2478/cmam-2003-0038 . S2CID   120938358.
  4. Natanson (1964) p. 6
  5. 1 2 Feller 1966
  6. 1 2 Beals 2004
  7. Natanson (1964) p. 3
  8. Bernstein 1912
  9. Koralov, L.; Sinai, Y. (2007). ""Probabilistic proof of the Weierstrass theorem"". Theory of probability and random processes (2nd ed.). Springer. p. 29.
  10. Lorentz 1953 , pp. 5–6
  11. Goldberg 1964
  12. Akhiezer 1956
  13. Burkill 1959
  14. Hildebrandt, T. H.; Schoenberg, I. J. (1933), "On linear functional operations and the moment problem for a finite interval in one or several dimensions", Annals of Mathematics , 34 (2): 327, doi:10.2307/1968205, JSTOR   1968205

Related Research Articles

<span class="mw-page-title-main">Bessel function</span> Families of solutions to related differential equations

Bessel functions, first defined by the mathematician Daniel Bernoulli and then generalized by Friedrich Bessel, are canonical solutions y(x) of Bessel's differential equation for an arbitrary complex number , which represents the order of the Bessel function. Although and produce the same differential equation, it is conventional to define different Bessel functions for these two values in such a way that the Bessel functions are mostly smooth functions of .

<span class="mw-page-title-main">Pauli matrices</span> Matrices important in quantum mechanics and the study of spin

In mathematical physics and mathematics, the Pauli matrices are a set of three 2 × 2 complex matrices that are traceless, Hermitian, involutory and unitary. Usually indicated by the Greek letter sigma, they are occasionally denoted by tau when used in connection with isospin symmetries.

A finite difference is a mathematical expression of the form f (x + b) − f (x + a). If a finite difference is divided by ba, one gets a difference quotient. The approximation of derivatives by finite differences plays a central role in finite difference methods for the numerical solution of differential equations, especially boundary value problems.

<span class="mw-page-title-main">Bremsstrahlung</span> Electromagnetic radiation due to deceleration of charged particles

In particle physics, bremsstrahlung is electromagnetic radiation produced by the deceleration of a charged particle when deflected by another charged particle, typically an electron by an atomic nucleus. The moving particle loses kinetic energy, which is converted into radiation, thus satisfying the law of conservation of energy. The term is also used to refer to the process of producing the radiation. Bremsstrahlung has a continuous spectrum, which becomes more intense and whose peak intensity shifts toward higher frequencies as the change of the energy of the decelerated particles increases.

<span class="mw-page-title-main">Noether's theorem</span> Statement relating differentiable symmetries to conserved quantities

Noether's theorem states that every continuous symmetry of the action of a physical system with conservative forces has a corresponding conservation law. This is the first of two theorems proven by mathematician Emmy Noether in 1915 and published in 1918. The action of a physical system is the integral over time of a Lagrangian function, from which the system's behavior can be determined by the principle of least action. This theorem only applies to continuous and smooth symmetries of physical space.

In numerical analysis, polynomial interpolation is the interpolation of a given bivariate data set by the polynomial of lowest possible degree that passes through the points of the dataset.

<span class="mw-page-title-main">Bernoulli polynomials</span> Polynomial sequence

In mathematics, the Bernoulli polynomials, named after Jacob Bernoulli, combine the Bernoulli numbers and binomial coefficients. They are used for series expansion of functions, and with the Euler–MacLaurin formula.

In calculus and real analysis, absolute continuity is a smoothness property of functions that is stronger than continuity and uniform continuity. The notion of absolute continuity allows one to obtain generalizations of the relationship between the two central operations of calculus—differentiation and integration. This relationship is commonly characterized in the framework of Riemann integration, but with absolute continuity it may be formulated in terms of Lebesgue integration. For real-valued functions on the real line, two interrelated notions appear: absolute continuity of functions and absolute continuity of measures. These two notions are generalized in different directions. The usual derivative of a function is related to the Radon–Nikodym derivative, or density, of a measure. We have the following chains of inclusions for functions over a compact subset of the real line:

In mathematics, the Poisson summation formula is an equation that relates the Fourier series coefficients of the periodic summation of a function to values of the function's continuous Fourier transform. Consequently, the periodic summation of a function is completely defined by discrete samples of the original function's Fourier transform. And conversely, the periodic summation of a function's Fourier transform is completely defined by discrete samples of the original function. The Poisson summation formula was discovered by Siméon Denis Poisson and is sometimes called Poisson resummation.

In mathematics, nonstandard calculus is the modern application of infinitesimals, in the sense of nonstandard analysis, to infinitesimal calculus. It provides a rigorous justification for some arguments in calculus that were previously considered merely heuristic.

<span class="mw-page-title-main">Propagator</span> Function in quantum field theory showing probability amplitudes of moving particles

In quantum mechanics and quantum field theory, the propagator is a function that specifies the probability amplitude for a particle to travel from one place to another in a given period of time, or to travel with a certain energy and momentum. In Feynman diagrams, which serve to calculate the rate of collisions in quantum field theory, virtual particles contribute their propagator to the rate of the scattering event described by the respective diagram. Propagators may also be viewed as the inverse of the wave operator appropriate to the particle, and are, therefore, often called (causal) Green's functions.

<span class="mw-page-title-main">Lamb shift</span> Difference in energy of hydrogenic atom electron states not predicted by the Dirac equation

In physics, the Lamb shift, named after Willis Lamb, is an anomalous difference in energy between two electron orbitals in a hydrogen atom. The difference was not predicted by theory and it cannot be derived from the Dirac equation, which predicts identical energies. Hence the Lamb shift is a deviation from theory seen in the differing energies contained by the 2S1/2 and 2P1/2 orbitals of the hydrogen atom.

<span class="mw-page-title-main">Moment problem</span> Trying to map moments to a measure that generates them

In mathematics, a moment problem arises as the result of trying to invert the mapping that takes a measure to the sequence of moments

In theoretical physics, massive gravity is a theory of gravity that modifies general relativity by endowing the graviton with a nonzero mass. In the classical theory, this means that gravitational waves obey a massive wave equation and hence travel at speeds below the speed of light.

In combinatorics, the binomial transform is a sequence transformation that computes its forward differences. It is closely related to the Euler transform, which is the result of applying the binomial transform to the sequence associated with its ordinary generating function.

<span class="mw-page-title-main">Anatoly Karatsuba</span> Russian mathematician (1937–2008)

Anatoly Alexeyevich Karatsuba was a Russian mathematician working in the field of analytic number theory, p-adic numbers and Dirichlet series.

In probability theory and statistics, the generalized multivariate log-gamma (G-MVLG) distribution is a multivariate distribution introduced by Demirhan and Hamurkaroglu in 2011. The G-MVLG is a flexible distribution. Skewness and kurtosis are well controlled by the parameters of the distribution. This enables one to control dispersion of the distribution. Because of this property, the distribution is effectively used as a joint prior distribution in Bayesian analysis, especially when the likelihood is not from the location-scale family of distributions such as normal distribution.

In mathematics, van der Corput's method generates estimates for exponential sums. The method applies two processes, the van der Corput processes A and B which relate the sums into simpler sums which are easier to estimate.

In mathematics, the Whitney inequality gives an upper bound for the error of best approximation of a function by polynomials in terms of the moduli of smoothness. It was first proved by Hassler Whitney in 1957, and is an important tool in the field of approximation theory for obtaining upper estimates on the errors of best approximation.

Nabarro–Herring creep is a mode of deformation of crystalline materials that occurs at low stresses and held at elevated temperatures in fine-grained materials. In Nabarro–Herring creep, atoms diffuse through the crystals, and the creep rate varies inversely with the square of the grain size so fine-grained materials creep faster than coarser-grained ones. NH creep is solely controlled by diffusional mass transport. This type of creep results from the diffusion of vacancies from regions of high chemical potential at grain boundaries subjected to normal tensile stresses to regions of lower chemical potential where the average tensile stresses across the grain boundaries are zero. Self-diffusion within the grains of a polycrystalline solid can cause the solid to yield to an applied shearing stress, the yielding being caused by a diffusional flow of matter within each crystal grain away from boundaries where there is a normal pressure and toward those where there is a normal tension. Atoms migrating in the opposite direction account for the creep strain. The creep strain rate is derived in the next section. NH creep is more important in ceramics than metals as dislocation motion is more difficult to effect in ceramics.

References