In calculus, Taylor's theorem gives an approximation of a k-times differentiable function around a given point by a polynomial of degree k, called the kth-order Taylor polynomial. For a smooth function, the Taylor polynomial is the truncation at the order k of the Taylor series of the function. The first-order Taylor polynomial is the linear approximation of the function, and the second-order Taylor polynomial is often referred to as the quadratic approximation. There are several versions of Taylor's theorem, some giving explicit estimates of the approximation error of the function by its Taylor polynomial.
Taylor's theorem is named after the mathematician Brook Taylor, who stated a version of it in 1715, although an earlier version of the result was already mentioned in 1671 by James Gregory.
Taylor's theorem is taught in introductory-level calculus courses and is one of the central elementary tools in mathematical analysis. It gives simple arithmetic formulas to accurately compute values of many transcendental functions such as the exponential function and trigonometric functions. It is the starting point of the study of analytic functions, and is fundamental in various areas of mathematics, as well as in numerical analysis and mathematical physics. Taylor's theorem also generalizes to multivariate and vector valued functions.
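If a real-valued function $f(x)$ is differentiable at the point $x = a$, then it has a linear approximation near this point. This means that there exists a function $h_1(x)$ such that
$$f(x) = f(a) + f'(a)(x - a) + h_1(x)(x - a), \qquad \lim_{x \to a} h_1(x) = 0.$$
Here
$$P_1(x) = f(a) + f'(a)(x - a)$$
is the linear approximation of $f(x)$ for $x$ near the point $a$, whose graph $y = P_1(x)$ is the tangent line to the graph $y = f(x)$ at $x = a$. The error in the approximation is
$$R_1(x) = f(x) - P_1(x) = h_1(x)(x - a).$$
As $x$ tends to $a$, this error goes to zero much faster than $f'(a)(x - a)$, making $f(x) \approx P_1(x)$ a useful approximation.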
For a better approximation to $f(x)$, we can fit a quadratic polynomial instead of a linear function:
$$P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2.$$
Instead of just matching one derivative of $f(x)$ at $x = a$, this polynomial has the same first and second derivatives, as is evident upon differentiation.
Taylor's theorem ensures that the quadratic approximation is, in a sufficiently small neighborhood of $x = a$, more accurate than the linear approximation. Specifically,
$$f(x) = P_2(x) + h_2(x)(x - a)^2, \qquad \lim_{x \to a} h_2(x) = 0.$$
Here the error in the approximation is
$$R_2(x) = f(x) - P_2(x) = h_2(x)(x - a)^2,$$
which, given the limiting behavior of $h_2$, goes to zero faster than $(x - a)^2$ as $x$ tends to $a$.
Similarly, we might get still better approximations to f if we use polynomials of higher degree, since then we can match even more derivatives with f at the selected base point.
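As a rough numerical illustration of the linear and quadratic approximations, the following sketch compares their errors for f = sin at the base point a = 1. The function names (`p1`, `p2`, `errors_for_sin`) and the choice of sin are assumptions made only for this example.

```python
import math

def p1(f0, f1, a, x):
    """Linear approximation P1(x) = f(a) + f'(a)(x - a)."""
    return f0 + f1 * (x - a)

def p2(f0, f1, f2, a, x):
    """Quadratic approximation P2(x) = P1(x) + f''(a)/2 * (x - a)^2."""
    return p1(f0, f1, a, x) + 0.5 * f2 * (x - a) ** 2

def errors_for_sin(a, x):
    """Return (|R1(x)|, |R2(x)|) for f = sin expanded at a."""
    f0, f1, f2 = math.sin(a), math.cos(a), -math.sin(a)
    r1 = abs(math.sin(x) - p1(f0, f1, a, x))
    r2 = abs(math.sin(x) - p2(f0, f1, f2, a, x))
    return r1, r2

if __name__ == "__main__":
    a = 1.0
    for step in (0.1, 0.01, 0.001):
        r1, r2 = errors_for_sin(a, a + step)
        # R1 shrinks roughly like step^2 and R2 roughly like step^3.
        print(f"step={step:<6} |R1|={r1:.3e} |R2|={r2:.3e}")
```

Shrinking the step by a factor of 10 should reduce the linear error by roughly 100 and the quadratic error by roughly 1000, in line with the error terms above.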
In general, the error in approximating a function by a polynomial of degree $k$ will go to zero much faster than $(x - a)^k$ as $x$ tends to $a$. However, there are functions, even infinitely differentiable ones, for which increasing the degree of the approximating polynomial does not increase the accuracy of approximation. We say such a function fails to be analytic at $x = a$: it is not (locally) determined by its derivatives at this point.
Taylor's theorem is of asymptotic nature: it only tells us that the error Rk in an approximation by a k-th order Taylor polynomial Pk tends to zero faster than any nonzero k-th degree polynomial as x → a. It does not tell us how large the error is in any concrete neighborhood of the center of expansion, but for this purpose there are explicit formulas for the remainder term (given below) which are valid under some additional regularity assumptions on f. These enhanced versions of Taylor's theorem typically lead to uniform estimates for the approximation error in a small neighborhood of the center of expansion, but the estimates do not necessarily hold for neighborhoods which are too large, even if the function f is analytic. In that situation one may have to select several Taylor polynomials with different centers of expansion to have reliable Taylor-approximations of the original function (see animation on the right.)
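There are several ways we might use the remainder term:
1. Estimate the error of a polynomial $P_k(x)$ of degree $k$ approximating $f(x)$ on a given interval $(a - r, a + r)$. (Given the interval and degree, we find the error.)
2. Find the smallest degree $k$ for which the polynomial $P_k(x)$ approximates $f(x)$ to within a given error tolerance on a given interval $(a - r, a + r)$. (Given the interval and error tolerance, we find the degree.)
3. Find the largest interval $(a - r, a + r)$ on which $P_k(x)$ approximates $f(x)$ to within a given error tolerance. (Given the degree and error tolerance, we find the interval.)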
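The precise statement of the most basic version of Taylor's theorem is as follows:
Taylor's theorem. Let $k \ge 1$ be an integer and let the function $f : \mathbb{R} \to \mathbb{R}$ be $k$ times differentiable at the point $a \in \mathbb{R}$. Then there exists a function $h_k : \mathbb{R} \to \mathbb{R}$ such that
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(a)}{i!}(x - a)^i + h_k(x)(x - a)^k, \qquad \lim_{x \to a} h_k(x) = 0.$$
This is called the Peano form of the remainder.
The polynomial appearing in Taylor's theorem is the $k$-th order Taylor polynomial
$$P_k(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k$$
of the function $f$ at the point $a$. The Taylor polynomial is the unique "asymptotic best fit" polynomial in the sense that if there exists a function $h_k : \mathbb{R} \to \mathbb{R}$ and a $k$-th order polynomial $p$ such that
$$f(x) = p(x) + h_k(x)(x - a)^k, \qquad \lim_{x \to a} h_k(x) = 0,$$
then $p = P_k$. Taylor's theorem describes the asymptotic behavior of the remainder term
$$R_k(x) = f(x) - P_k(x),$$
which is the approximation error when approximating $f$ with its Taylor polynomial. Using the little-o notation, the statement in Taylor's theorem reads as
$$R_k(x) = o\big(|x - a|^k\big), \qquad x \to a.$$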
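Under stronger regularity assumptions on $f$ there are several precise formulas for the remainder term $R_k$ of the Taylor polynomial, the most common ones being the following.
Mean-value forms of the remainder. Let $f : \mathbb{R} \to \mathbb{R}$ be $k + 1$ times differentiable on the open interval between $a$ and $x$, with $f^{(k)}$ continuous on the closed interval between $a$ and $x$. Then
$$R_k(x) = \frac{f^{(k+1)}(\xi_L)}{(k+1)!}(x - a)^{k+1}$$
for some real number $\xi_L$ between $a$ and $x$. This is the Lagrange form of the remainder. Similarly,
$$R_k(x) = \frac{f^{(k+1)}(\xi_C)}{k!}(x - \xi_C)^k (x - a)$$
for some real number $\xi_C$ between $a$ and $x$. This is the Cauchy form of the remainder.
These refinements of Taylor's theorem are usually proved using the mean value theorem, whence the name. Other similar expressions can also be found. For example, if $G(t)$ is continuous on the closed interval and differentiable with a non-vanishing derivative on the open interval between $a$ and $x$, then
$$R_k(x) = \frac{f^{(k+1)}(\xi)}{k!}(x - \xi)^k\,\frac{G(x) - G(a)}{G'(\xi)}$$
for some number $\xi$ between $a$ and $x$. This version covers the Lagrange and Cauchy forms of the remainder as special cases, and is proved below using Cauchy's mean value theorem.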
The statement for the integral form of the remainder is more advanced than the previous ones, and requires understanding of Lebesgue integration theory for the full generality. However, it also holds in the sense of the Riemann integral provided the $(k + 1)$th derivative of $f$ is continuous on the closed interval $[a, x]$.
Integral form of the remainder. Let $f^{(k)}$ be absolutely continuous on the closed interval between $a$ and $x$. Then
$$R_k(x) = \int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k\,dt.$$
Due to absolute continuity of $f^{(k)}$ on the closed interval between $a$ and $x$, its derivative $f^{(k+1)}$ exists as an $L^1$-function, and the result can be proven by a formal calculation using the fundamental theorem of calculus and integration by parts.
It is often useful in practice to be able to estimate the remainder term appearing in the Taylor approximation, rather than having an exact formula for it. Suppose that $f$ is $(k + 1)$-times continuously differentiable in an interval $I$ containing $a$. Suppose that there are real constants $q$ and $Q$ such that
$$q \le f^{(k+1)}(x) \le Q$$
throughout $I$. Then the remainder term satisfies the inequality
$$q\,\frac{(x - a)^{k+1}}{(k+1)!} \le R_k(x) \le Q\,\frac{(x - a)^{k+1}}{(k+1)!}$$
if $x > a$, and a similar estimate if $x < a$. This is a simple consequence of the Lagrange form of the remainder. In particular, if
$$|f^{(k+1)}(x)| \le M$$
on an interval $I = (a - r, a + r)$ with some $r > 0$, then
$$|R_k(x)| \le M\,\frac{|x - a|^{k+1}}{(k+1)!} \le M\,\frac{r^{k+1}}{(k+1)!}$$
for all $x \in (a - r, a + r)$. The second inequality is called a uniform estimate, because it holds uniformly for all $x$ on the interval $(a - r, a + r)$.
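The uniform estimate can be checked numerically. The sketch below assumes f = sin expanded at a = 0, where every derivative is bounded by M = 1, and compares the largest error observed on a grid with the bound M r^(k+1)/(k+1)!. The helper names are illustrative, and a grid maximum only approximates the true supremum.

```python
import math

def sin_taylor(k, x):
    """k-th order Taylor polynomial of sin at a = 0."""
    total = 0.0
    for j in range(1, k + 1, 2):          # only odd powers are nonzero
        total += (-1) ** (j // 2) * x ** j / math.factorial(j)
    return total

def uniform_bound(M, r, k):
    """Uniform estimate M * r^(k+1) / (k+1)! on (a - r, a + r)."""
    return M * r ** (k + 1) / math.factorial(k + 1)

def max_error_on_grid(k, r, n=2001):
    """Largest observed |sin(x) - P_k(x)| on a grid in [-r, r]."""
    xs = (-r + 2 * r * i / (n - 1) for i in range(n))
    return max(abs(math.sin(x) - sin_taylor(k, x)) for x in xs)
```

For instance, `max_error_on_grid(5, 2.0)` should stay below `uniform_bound(1.0, 2.0, 5)`, and the same comparison should hold for any other order k.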
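Suppose that we wish to find the approximate value of the function $f(x) = e^x$ on the interval $[-1, 1]$ while ensuring that the error in the approximation is no more than $10^{-5}$. In this example we pretend that we only know the following properties of the exponential function:
$$e^0 = 1, \qquad \frac{d}{dx}e^x = e^x, \qquad e^x > 0, \qquad x \in \mathbb{R}. \tag{$\ast$}$$
From these properties it follows that $f^{(k)}(x) = e^x$ for all $k$, and in particular, $f^{(k)}(0) = 1$. Hence the $k$-th order Taylor polynomial of $f$ at $0$ and its remainder term in the Lagrange form are given by
$$P_k(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^k}{k!}, \qquad R_k(x) = \frac{e^{\xi}}{(k+1)!}x^{k+1},$$
where $\xi$ is some number between $0$ and $x$. Since $e^x$ is increasing by ($\ast$), we can simply use $e^x \le 1$ for $x \in [-1, 0]$ to estimate the remainder on the subinterval $[-1, 0]$. To obtain an upper bound for the remainder on $[0, 1]$, we use the property $e^{\xi} < e^x$ for $0 < \xi < x$ to estimate
$$e^x = 1 + x + \frac{e^{\xi}}{2}x^2 < 1 + x + \frac{e^x}{2}x^2, \qquad 0 < x \le 1,$$
using the Taylor expansion of order $k = 1$ with its Lagrange remainder. Then we solve for $e^x$ to deduce that
$$e^x \le \frac{1 + x}{1 - \frac{x^2}{2}} = 2\,\frac{1 + x}{2 - x^2} \le 4, \qquad 0 \le x \le 1,$$
simply by maximizing the numerator and minimizing the denominator. Combining these estimates for $e^x$ we see that
$$|R_k(x)| \le \frac{4|x|^{k+1}}{(k+1)!} \le \frac{4}{(k+1)!}, \qquad -1 \le x \le 1,$$
so the required precision is certainly reached when
$$\frac{4}{(k+1)!} < 10^{-5} \iff 4 \cdot 10^5 < (k+1)! \iff k \ge 9.$$
(See factorial or compute by hand the values 9! = 362880 and 10! = 3628800.) As a conclusion, Taylor's theorem leads to the approximation
$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^9}{9!} + R_9(x), \qquad |R_9(x)| < 10^{-5}, \qquad -1 \le x \le 1.$$
For instance, this approximation provides a decimal expression $e \approx 2.71828$, correct up to five decimal places.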
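The arithmetic of this example can be reproduced in a few lines. The sketch below searches for the smallest order k with 4/(k+1)! below a tolerance and evaluates the corresponding Taylor polynomial of the exponential function at 0. The function names are illustrative, and `math.exp` is used only to check the result.

```python
import math

def smallest_order(tol):
    """Smallest k with 4/(k+1)! < tol, the bound derived for |R_k| on [-1, 1]."""
    k = 0
    while 4 / math.factorial(k + 1) >= tol:
        k += 1
    return k

def exp_taylor(k, x):
    """k-th order Taylor polynomial of e^x at 0: sum of x^j / j! for j = 0..k."""
    term, total = 1.0, 1.0
    for j in range(1, k + 1):
        term *= x / j            # builds x^j / j! incrementally
        total += term
    return total

if __name__ == "__main__":
    k = smallest_order(1e-5)
    print(k, exp_taylor(k, 1.0), abs(exp_taylor(k, 1.0) - math.e))
```

Building each term from the previous one avoids recomputing factorials and powers, which is a common way to evaluate such partial sums.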
Let $I \subset \mathbb{R}$ be an open interval. By definition, a function $f : I \to \mathbb{R}$ is real analytic if it is locally defined by a convergent power series. This means that for every $a \in I$ there exists some $r > 0$ and a sequence of coefficients $c_k \in \mathbb{R}$ such that $(a - r, a + r) \subset I$ and
$$f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \cdots, \qquad |x - a| < r.$$
In general, the radius of convergence of a power series can be computed from the Cauchy–Hadamard formula
$$\frac{1}{R} = \limsup_{k \to \infty} |c_k|^{1/k}.$$
This result is based on comparison with a geometric series, and the same method shows that if the power series based on $a$ converges for some $b \in \mathbb{R}$, it must converge uniformly on every closed interval $[a - r, a + r]$ with $0 < r < r_b$, where $r_b = |b - a|$. Here only the convergence of the power series is considered, and it might well be that $(a - R, a + R)$ extends beyond the domain $I$ of $f$.
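The Cauchy–Hadamard formula can be explored numerically, with the caveat that a limsup cannot be computed from finitely many terms. The sketch below is only a heuristic: it approximates the limsup by the largest value of |c_k|^(1/k) over a finite window, and the window bounds and both coefficient sequences are assumptions chosen for illustration.

```python
def radius_estimate(coeff, k_min=50, k_max=200):
    """Crude estimate of R from 1/R = limsup |c_k|^(1/k).

    The limsup is approximated by the largest value of |c_k|^(1/k)
    over the window k_min..k_max, which is only a heuristic.
    """
    roots = [abs(coeff(k)) ** (1.0 / k) for k in range(k_min, k_max + 1)]
    largest = max(roots)
    return float("inf") if largest == 0 else 1.0 / largest

def geometric_like(k):
    """Coefficients of 1/(1 + x^2) at 0: 1, 0, -1, 0, 1, ..."""
    return 0.0 if k % 2 else (-1.0) ** (k // 2)

def scaled(k):
    """Coefficients c_k = k / 3^k, whose series has radius 3."""
    return k / 3.0 ** k
```

The first sequence shows why a limsup is needed rather than a limit: the odd coefficients vanish, so |c_k|^(1/k) alternates between 0 and 1. For the second sequence the estimate approaches the true radius 3 only slowly, because k^(1/k) tends to 1 slowly.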
The Taylor polynomials of the real analytic function $f$ at $a$ are simply the finite truncations
$$P_k(x) = \sum_{j=0}^{k} c_j (x - a)^j, \qquad c_j = \frac{f^{(j)}(a)}{j!}$$
of its locally defining power series, and the corresponding remainder terms are locally given by the analytic functions
$$R_k(x) = \sum_{j=k+1}^{\infty} c_j (x - a)^j = (x - a)^k h_k(x), \qquad |x - a| < r.$$
Here the functions
$$h_k : (a - r, a + r) \to \mathbb{R}, \qquad h_k(x) = (x - a)\sum_{j=0}^{\infty} c_{k+1+j}(x - a)^j$$
are also analytic, since their defining power series have the same radius of convergence as the original series. Assuming that $[a - r, a + r] \subset I$ and $r < R$, all these series converge uniformly on $(a - r, a + r)$. Naturally, in the case of analytic functions one can estimate the remainder term $R_k(x)$ by the tail of the sequence of the derivatives $f^{(j)}(a)$ at the center of the expansion, but using complex analysis another possibility also arises, which is described below.
The Taylor series of f will converge in some interval in which all its derivatives are bounded and do not grow too fast as k goes to infinity. (However, even if the Taylor series converges, it might not converge to f, as explained below; f is then said to be non-analytic.)
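One might think of the Taylor series
$$f(x) \approx \sum_{k=0}^{\infty} c_k (x - a)^k = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \cdots$$
of an infinitely many times differentiable function $f : \mathbb{R} \to \mathbb{R}$ as its "infinite order Taylor polynomial" at $a$. Now the estimates for the remainder imply that if, for any $r$, the derivatives of $f$ are known to be bounded over $(a - r, a + r)$, then for any order $k$ and for any $r > 0$ there exists a constant $M_{k,r} > 0$ such that
$$|R_k(x)| \le M_{k,r}\,\frac{|x - a|^{k+1}}{(k+1)!} \tag{$\ast\ast$}$$
for every $x \in (a - r, a + r)$. Sometimes the constants $M_{k,r}$ can be chosen in such a way that $M_{k,r}$ is bounded above, for fixed $r$ and all $k$. Then the Taylor series of $f$ converges uniformly to some analytic function
$$T_f : (a - r, a + r) \to \mathbb{R}, \qquad T_f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k.$$
(One also gets convergence even if $M_{k,r}$ is not bounded above as long as it grows slowly enough.)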
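The limit function $T_f$ is by definition always analytic, but it is not necessarily equal to the original function $f$, even if $f$ is infinitely differentiable. In this case, we say $f$ is a non-analytic smooth function, for example a flat function:
$$f : \mathbb{R} \to \mathbb{R}, \qquad f(x) = \begin{cases} e^{-1/x^2}, & x > 0, \\ 0, & x \le 0. \end{cases}$$
Using the chain rule repeatedly by mathematical induction, one shows that for any order $k$,
$$f^{(k)}(x) = \begin{cases} \dfrac{p_k(x)}{x^{3k}}\,e^{-1/x^2}, & x > 0, \\ 0, & x \le 0, \end{cases}$$
for some polynomial $p_k$ of degree $2(k - 1)$. The function $e^{-1/x^2}$ tends to zero faster than any polynomial as $x \to 0$, so $f$ is infinitely many times differentiable and $f^{(k)}(0) = 0$ for every positive integer $k$. The above results all hold in this case:
1. The Taylor series of $f$ converges uniformly to the zero function $T_f(x) = 0$, which is analytic with all coefficients equal to zero.
2. The function $f$ is unequal to this Taylor series, and hence non-analytic.
3. For any order $k$ and radius $r > 0$ there exists $M_{k,r} > 0$ satisfying the remainder bound $|R_k(x)| \le M_{k,r}\,|x - a|^{k+1}/(k+1)!$.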
However, as $k$ increases for fixed $r$, the value of $M_{k,r}$ grows more quickly than $r^k$, and the error does not go to zero.
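A small numerical sketch makes the flat-function example concrete. It assumes the standard flat function f(x) = exp(-1/x^2) for x > 0 and f(x) = 0 otherwise, for which every Taylor polynomial at 0 is identically zero. The helper names are illustrative.

```python
import math

def flat(x):
    """f(x) = exp(-1/x^2) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / (x * x)) if x > 0 else 0.0

def taylor_at_zero(k, x):
    """Every derivative of f vanishes at 0, so P_k is the zero polynomial."""
    return 0.0

def remainder(k, x):
    """R_k(x) = f(x) - P_k(x); independent of k for this function."""
    return flat(x) - taylor_at_zero(k, x)

if __name__ == "__main__":
    for k in (1, 10, 100):
        # Raising the order does not reduce the error at a fixed x > 0.
        print(k, remainder(k, 0.5))
    # Yet f(x) / x^k still tends to 0 as x -> 0, as Taylor's theorem requires.
    print(flat(0.1) / 0.1 ** 10)
```

The remainder at a fixed x > 0 is the same for every order, while the ratio f(x)/x^k still becomes tiny near 0, so both the asymptotic statement and the failure of analyticity are visible.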
Taylor's theorem generalizes to functions f : C → C which are complex differentiable in an open subset U ⊂ C of the complex plane. However, its usefulness is dwarfed by other general theorems in complex analysis. Namely, stronger versions of related results can be deduced for complex differentiable functions f : U → C using Cauchy's integral formula as follows.
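Let $r > 0$ be such that the closed disk $B(z, r) \cup S(z, r)$ is contained in $U$. Then Cauchy's integral formula with a positive parametrization $\gamma(t) = z + re^{it}$ of the circle $S(z, r)$ with $t \in [0, 2\pi]$ gives
$$f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{w - z}\,dw, \qquad f'(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{(w - z)^2}\,dw, \qquad \ldots, \qquad f^{(k)}(z) = \frac{k!}{2\pi i}\int_\gamma \frac{f(w)}{(w - z)^{k+1}}\,dw.$$
Here all the integrands are continuous on the circle $S(z, r)$, which justifies differentiation under the integral sign. In particular, if $f$ is once complex differentiable on the open set $U$, then it is actually infinitely many times complex differentiable on $U$. One also obtains Cauchy's estimates
$$|f^{(k)}(z)| \le \frac{k!}{2\pi}\int_\gamma \frac{M_r}{|w - z|^{k+1}}\,|dw| = \frac{k!\,M_r}{r^k}, \qquad M_r = \max_{|w - z| = r}|f(w)|$$
for any $z \in U$ and $r > 0$ such that $B(z, r) \cup S(z, r) \subset U$. These estimates imply that the complex Taylor series
$$T_f(z) = \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!}(z - c)^k$$
of $f$ converges uniformly on any open disk $B(c, r) \subset U$ with $S(c, r) \subset U$ into some function $T_f$. Furthermore, using the contour integral formulas for the derivatives $f^{(k)}(c)$,
$$T_f(z) = \sum_{k=0}^{\infty} \frac{(z - c)^k}{2\pi i}\int_\gamma \frac{f(w)}{(w - c)^{k+1}}\,dw = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{w - c}\sum_{k=0}^{\infty}\left(\frac{z - c}{w - c}\right)^k dw = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{w - c}\cdot\frac{1}{1 - \frac{z - c}{w - c}}\,dw = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{w - z}\,dw = f(z),$$
so any complex differentiable function $f$ in an open set $U \subset \mathbb{C}$ is in fact complex analytic. All that is said for real analytic functions here holds also for complex analytic functions with the open interval $I$ replaced by an open subset $U \subset \mathbb{C}$ and $a$-centered intervals $(a - r, a + r)$ replaced by $c$-centered disks $B(c, r)$. In particular, the Taylor expansion holds in the form
$$f(z) = P_k(z) + R_k(z), \qquad P_k(z) = \sum_{j=0}^{k} \frac{f^{(j)}(c)}{j!}(z - c)^j,$$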
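where the remainder term $R_k$ is complex analytic. Methods of complex analysis provide some powerful results regarding Taylor expansions. For example, using Cauchy's integral formula for any positively oriented Jordan curve $\gamma$ which parametrizes the boundary $\partial W \subset U$ of a region $W \subset U$, one obtains expressions for the derivatives $f^{(j)}(c)$ as above, and modifying slightly the computation for $T_f(z) = f(z)$, one arrives at the exact formula
$$R_k(z) = \sum_{j=k+1}^{\infty} \frac{(z - c)^j}{2\pi i}\int_\gamma \frac{f(w)}{(w - c)^{j+1}}\,dw = \frac{(z - c)^{k+1}}{2\pi i}\int_\gamma \frac{f(w)\,dw}{(w - c)^{k+1}(w - z)}, \qquad z \in W.$$
The important feature here is that the quality of the approximation by a Taylor polynomial on the region $W \subset U$ is dominated by the values of the function $f$ itself on the boundary $\partial W \subset U$. Similarly, applying Cauchy's estimates to the series expression for the remainder, with $M_r = \max_{|w - c| = r}|f(w)|$, one obtains the uniform estimates
$$|R_k(z)| \le \sum_{j=k+1}^{\infty} \frac{M_r |z - c|^j}{r^j} = \frac{M_r}{r^{k+1}}\cdot\frac{|z - c|^{k+1}}{1 - \frac{|z - c|}{r}} \le \frac{M_r\,\beta^{k+1}}{1 - \beta}, \qquad \frac{|z - c|}{r} \le \beta < 1.$$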
For example, the function
$$f : \mathbb{R} \to \mathbb{R}, \qquad f(x) = \frac{1}{1 + x^2}$$
is real analytic, that is, locally determined by its Taylor series. This function was plotted above to illustrate the fact that some elementary functions cannot be approximated by Taylor polynomials in neighborhoods of the center of expansion which are too large. This kind of behavior is easily understood in the framework of complex analysis. Namely, the function $f$ extends into a meromorphic function
$$f : \mathbb{C} \cup \{\infty\} \to \mathbb{C} \cup \{\infty\}, \qquad f(z) = \frac{1}{1 + z^2}$$
on the compactified complex plane. It has simple poles at $z = i$ and $z = -i$, and it is analytic elsewhere. If its Taylor series centered at $z_0$ converges at a point $z \in \mathbb{C}$, then it converges on every disc $B(z_0, r)$ with $r < |z - z_0|$. Therefore, the Taylor series of $f$ centered at $0$ converges on $B(0, 1)$ and does not converge for any $z \in \mathbb{C}$ with $|z| > 1$, due to the poles at $i$ and $-i$. For the same reason the Taylor series of $f$ centered at $1$ converges on $B(1, \sqrt{2})$ and does not converge for any $z \in \mathbb{C}$ with $|z - 1| > \sqrt{2}$.
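The radii of convergence 1 and √2 can be observed numerically. The sketch below assumes the partial fraction decomposition 1/(1 + z^2) = (1/(z − i) − 1/(z + i))/(2i), from which the Taylor coefficients at any center c other than ±i follow in closed form. The helper names are illustrative.

```python
def coeff(k, c):
    """k-th Taylor coefficient of 1/(1 + z^2) at center c (c != +-i),
    from the partial fractions 1/(1+z^2) = (1/(z-i) - 1/(z+i)) / (2i)."""
    sign = (-1) ** k
    return sign * (1 / (c - 1j) ** (k + 1) - 1 / (c + 1j) ** (k + 1)) / 2j

def partial_sum(n, c, z):
    """Taylor polynomial of order n at center c, evaluated at z."""
    return sum(coeff(k, c) * (z - c) ** k for k in range(n + 1))

def f(z):
    """The function 1/(1 + z^2)."""
    return 1 / (1 + z * z)

if __name__ == "__main__":
    # Inside the disc B(1, sqrt(2)): partial sums approach f(2.3).
    print(abs(partial_sum(300, 1, 2.3) - f(2.3)))
    # Outside the disc: partial sums blow up even though f(2.6) is finite.
    print(abs(partial_sum(200, 1, 2.6) - f(2.6)))
```

At z = 2.3 the distance to the center 1 is 1.3, just below √2, so the partial sums converge slowly. At z = 2.6 the distance is 1.6 and the terms grow geometrically.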
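A function $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $a \in \mathbb{R}^n$ if and only if there exists a linear functional $L : \mathbb{R}^n \to \mathbb{R}$ and a function $h : \mathbb{R}^n \to \mathbb{R}$ such that
$$f(x) = f(a) + L(x - a) + h(x)\,\|x - a\|, \qquad \lim_{x \to a} h(x) = 0.$$
If this is the case, then $L = df(a)$ is the (uniquely defined) differential of $f$ at the point $a$. Furthermore, the partial derivatives of $f$ then exist at $a$ and the differential of $f$ at $a$ is given by
$$df(a)(v) = \frac{\partial f}{\partial x_1}(a)\,v_1 + \cdots + \frac{\partial f}{\partial x_n}(a)\,v_n.$$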
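Introduce the multi-index notation
$$|\alpha| = \alpha_1 + \cdots + \alpha_n, \qquad \alpha! = \alpha_1! \cdots \alpha_n!, \qquad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$$
for $\alpha \in \mathbb{N}^n$ and $x \in \mathbb{R}^n$. If all the $k$-th order partial derivatives of $f : \mathbb{R}^n \to \mathbb{R}$ are continuous at $a \in \mathbb{R}^n$, then by Clairaut's theorem, one can change the order of mixed derivatives at $a$, so the notation
$$D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \qquad |\alpha| \le k$$
for the higher order partial derivatives is justified in this situation. The same is true if all the $(k - 1)$-th order partial derivatives of $f$ exist in some neighborhood of $a$ and are differentiable at $a$. Then we say that $f$ is $k$ times differentiable at the point $a$.
Multivariate version of Taylor's theorem. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $k$-times continuously differentiable function at the point $a \in \mathbb{R}^n$. Then there exist functions $h_\alpha : \mathbb{R}^n \to \mathbb{R}$, where $|\alpha| = k$, such that
$$f(x) = \sum_{|\alpha| \le k} \frac{D^\alpha f(a)}{\alpha!}(x - a)^\alpha + \sum_{|\alpha| = k} h_\alpha(x)(x - a)^\alpha, \qquad \lim_{x \to a} h_\alpha(x) = 0.$$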
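The multi-index notation translates directly into code. The sketch below enumerates multi-indices and evaluates the multivariate Taylor polynomial, written in multi-index form as the sum of D^α f(a)/α! · (x − a)^α over |α| ≤ k. It assumes the test function f(x) = exp(w · x), chosen because its partial derivatives D^α f(a) = w^α f(a) are known in closed form. All function names are hypothetical.

```python
import math
from itertools import product

def multi_indices(n, k):
    """All multi-indices alpha in N^n with |alpha| <= k."""
    return [alpha for alpha in product(range(k + 1), repeat=n) if sum(alpha) <= k]

def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!"""
    return math.prod(math.factorial(a) for a in alpha)

def multi_power(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n"""
    return math.prod(xi ** a for xi, a in zip(x, alpha))

def taylor_poly_exp_linear(w, a, x, k):
    """Order-k Taylor polynomial of f(x) = exp(w . x) at a.

    For this particular f, D^alpha f(a) = w^alpha * f(a), so no symbolic
    differentiation is needed.
    """
    fa = math.exp(sum(wi * ai for wi, ai in zip(w, a)))
    v = [xi - ai for xi, ai in zip(x, a)]
    return sum(multi_power(w, al) * fa / multi_factorial(al) * multi_power(v, al)
               for al in multi_indices(len(a), k))
```

For example, `taylor_poly_exp_linear((1.0, 2.0), (0.0, 0.0), (0.1, 0.05), 6)` should be close to `math.exp(0.2)`. For a general function the derivatives D^α f(a) would have to be supplied by the caller or computed by automatic differentiation.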
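If the function $f : \mathbb{R}^n \to \mathbb{R}$ is $k + 1$ times continuously differentiable in a closed ball $B = \{y \in \mathbb{R}^n : \|a - y\| \le r\}$ for some $r > 0$, then one can derive an exact formula for the remainder in terms of $(k+1)$-th order partial derivatives of $f$ in this neighborhood. Namely,
$$f(x) = \sum_{|\alpha| \le k} \frac{D^\alpha f(a)}{\alpha!}(x - a)^\alpha + \sum_{|\beta| = k+1} R_\beta(x)(x - a)^\beta,$$
$$R_\beta(x) = \frac{|\beta|}{\beta!}\int_0^1 (1 - t)^{|\beta| - 1}\,D^\beta f\big(a + t(x - a)\big)\,dt.$$
In this case, due to the continuity of $(k+1)$-th order partial derivatives in the compact set $B$, one immediately obtains the uniform estimates
$$|R_\beta(x)| \le \frac{1}{\beta!}\max_{|\alpha| = |\beta|}\max_{y \in B}|D^\alpha f(y)|, \qquad x \in B.$$
For example, the third-order Taylor polynomial of a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ is, denoting $x - a = v$,
$$\begin{aligned} P_3(x) = f(a) &+ \frac{\partial f}{\partial x_1}(a)\,v_1 + \frac{\partial f}{\partial x_2}(a)\,v_2 + \frac{\partial^2 f}{\partial x_1^2}(a)\,\frac{v_1^2}{2!} + \frac{\partial^2 f}{\partial x_1 \partial x_2}(a)\,v_1 v_2 + \frac{\partial^2 f}{\partial x_2^2}(a)\,\frac{v_2^2}{2!} \\ &+ \frac{\partial^3 f}{\partial x_1^3}(a)\,\frac{v_1^3}{3!} + \frac{\partial^3 f}{\partial x_1^2 \partial x_2}(a)\,\frac{v_1^2 v_2}{2!} + \frac{\partial^3 f}{\partial x_1 \partial x_2^2}(a)\,\frac{v_1 v_2^2}{2!} + \frac{\partial^3 f}{\partial x_2^3}(a)\,\frac{v_2^3}{3!}. \end{aligned}$$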
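Let
$$h_k(x) = \begin{cases} \dfrac{f(x) - P(x)}{(x - a)^k}, & x \ne a, \\ 0, & x = a, \end{cases}$$
where, as in the statement of Taylor's theorem,
$$P(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k.$$
It is sufficient to show that
$$\lim_{x \to a} h_k(x) = 0.$$
The proof here is based on repeated application of L'Hôpital's rule. Note that, for each $j = 0, 1, \ldots, k - 1$, $f^{(j)}(a) = P^{(j)}(a)$. Hence each of the first $k - 1$ derivatives of the numerator in $h_k(x)$ vanishes at $x = a$, and the same is true of the denominator. Also, since the condition that the function $f$ be $k$ times differentiable at a point requires differentiability up to order $k - 1$ in a neighborhood of said point (this is true, because differentiability requires a function to be defined in a whole neighborhood of a point), the numerator and its $k - 2$ derivatives are differentiable in a neighborhood of $a$. Clearly, the denominator also satisfies said condition, and additionally does not vanish unless $x = a$. Therefore all conditions necessary for L'Hôpital's rule are fulfilled, and its use is justified. So
$$\lim_{x \to a} \frac{f(x) - P(x)}{(x - a)^k} = \lim_{x \to a} \frac{\frac{d}{dx}\big(f(x) - P(x)\big)}{\frac{d}{dx}(x - a)^k} = \cdots = \lim_{x \to a} \frac{\frac{d^{k-1}}{dx^{k-1}}\big(f(x) - P(x)\big)}{\frac{d^{k-1}}{dx^{k-1}}(x - a)^k} = \frac{1}{k!}\lim_{x \to a} \frac{f^{(k-1)}(x) - P^{(k-1)}(x)}{x - a} = \frac{1}{k!}\big(f^{(k)}(a) - P^{(k)}(a)\big) = 0,$$
where the second-to-last equality follows by the definition of the derivative at $x = a$, and the last one from $P^{(k)}(a) = f^{(k)}(a)$.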
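The limit established in this proof can be observed numerically. The sketch below assumes f = ln expanded at a = 1, where f^(j)(1) = (−1)^(j−1) (j−1)! for j ≥ 1, and evaluates the quotient (f(x) − P_k(x))/(x − a)^k for shrinking distances. The helper names are illustrative, and floating-point cancellation limits how small the distance can usefully be made.

```python
import math

def log_taylor(k, x):
    """k-th order Taylor polynomial of ln at a = 1."""
    return sum((-1) ** (j - 1) * (x - 1) ** j / j for j in range(1, k + 1))

def peano_h(k, x):
    """h_k(x) = (f(x) - P_k(x)) / (x - a)^k for f = ln, a = 1, x != 1."""
    return (math.log(x) - log_taylor(k, x)) / (x - 1) ** k

if __name__ == "__main__":
    for step in (0.1, 0.01, 0.001):
        # The quotient shrinks roughly in proportion to the step.
        print(step, peano_h(3, 1 + step))
```

The printed values should decrease in magnitude roughly linearly with the step, consistent with the quotient tending to zero.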
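Let $G$ be any real-valued function, continuous on the closed interval between $a$ and $x$ and differentiable with a non-vanishing derivative on the open interval between $a$ and $x$, and define
$$F(t) = f(t) + f'(t)(x - t) + \frac{f''(t)}{2!}(x - t)^2 + \cdots + \frac{f^{(k)}(t)}{k!}(x - t)^k$$
for $t$ in the closed interval between $a$ and $x$. Then, by Cauchy's mean value theorem,
$$\frac{F'(\xi)}{G'(\xi)} = \frac{F(x) - F(a)}{G(x) - G(a)} \tag{$\ast\ast\ast$}$$
for some $\xi$ on the open interval between $a$ and $x$. Note that here the numerator $F(x) - F(a) = R_k(x)$ is exactly the remainder of the Taylor polynomial for $f(x)$. Compute
$$\begin{aligned} F'(t) &= f'(t) + \big(f''(t)(x - t) - f'(t)\big) + \left(\frac{f^{(3)}(t)}{2!}(x - t)^2 - \frac{f^{(2)}(t)}{1!}(x - t)\right) + \cdots + \left(\frac{f^{(k+1)}(t)}{k!}(x - t)^k - \frac{f^{(k)}(t)}{(k-1)!}(x - t)^{k-1}\right) \\ &= \frac{f^{(k+1)}(t)}{k!}(x - t)^k, \end{aligned}$$
plug it into ($\ast\ast\ast$) and rearrange terms to find that
$$R_k(x) = \frac{f^{(k+1)}(\xi)}{k!}(x - \xi)^k\,\frac{G(x) - G(a)}{G'(\xi)}.$$
This is the form of the remainder term mentioned after the actual statement of Taylor's theorem with remainder in the mean value form. The Lagrange form of the remainder is found by choosing $G(t) = (x - t)^{k+1}$ and the Cauchy form by choosing $G(t) = t - a$.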
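For a function whose derivatives are easy to invert, the intermediate point in the Lagrange form can be recovered explicitly. The sketch below assumes f = exp expanded at a = 0, where the Lagrange form reads R_k(x) = e^ξ x^(k+1)/(k+1)!, and solves this equation for ξ. The function names are illustrative, and the theorem itself only asserts that such a point exists.

```python
import math

def exp_remainder(k, x):
    """R_k(x) = e^x - P_k(x) for the expansion at a = 0."""
    p = sum(x ** j / math.factorial(j) for j in range(k + 1))
    return math.exp(x) - p

def lagrange_xi(k, x):
    """Solve R_k(x) = e^xi * x^(k+1) / (k+1)! for xi (requires x != 0)."""
    return math.log(exp_remainder(k, x) * math.factorial(k + 1) / x ** (k + 1))

if __name__ == "__main__":
    print(lagrange_xi(3, 1.0))    # expected to lie strictly between 0 and 1
    print(lagrange_xi(2, -0.5))   # expected to lie strictly between -0.5 and 0
```

The recovered value should fall strictly between the center 0 and the evaluation point, as the mean value form predicts.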
Remark. Using this method one can also recover the integral form of the remainder by choosing
$$G(t) = \int_a^t \frac{f^{(k+1)}(s)}{k!}(x - s)^k\,ds,$$
but the requirements for $f$ needed for the use of the mean value theorem are too strong if one aims to prove the claim in the case that $f^{(k)}$ is only absolutely continuous. However, if one uses the Riemann integral instead of the Lebesgue integral, the assumptions cannot be weakened.
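Due to absolute continuity of $f^{(k)}$ on the closed interval between $a$ and $x$, its derivative $f^{(k+1)}$ exists as an $L^1$-function, and we can use the fundamental theorem of calculus and integration by parts. This same proof applies for the Riemann integral assuming that $f^{(k)}$ is continuous on the closed interval and differentiable on the open interval between $a$ and $x$, and this leads to the same result as using the mean value theorem.
The fundamental theorem of calculus states that
$$f(x) = f(a) + \int_a^x f'(t)\,dt.$$
Now we can integrate by parts and use the fundamental theorem of calculus again to see that
$$\begin{aligned} f(x) &= f(a) + \big(x f'(x) - a f'(a)\big) - \int_a^x t f''(t)\,dt \\ &= f(a) + x\left(f'(a) + \int_a^x f''(t)\,dt\right) - a f'(a) - \int_a^x t f''(t)\,dt \\ &= f(a) + (x - a) f'(a) + \int_a^x (x - t) f''(t)\,dt, \end{aligned}$$
which is exactly Taylor's theorem with remainder in the integral form in the case $k = 1$. The general statement is proved using induction. Suppose that
$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(k)}(a)}{k!}(x - a)^k + \int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k\,dt. \tag{$\ast\ast\ast\ast$}$$
Integrating the remainder term by parts we arrive at
$$\begin{aligned} \int_a^x \frac{f^{(k+1)}(t)}{k!}(x - t)^k\,dt &= -\left[\frac{f^{(k+1)}(t)}{(k+1)\,k!}(x - t)^{k+1}\right]_a^x + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)\,k!}(x - t)^{k+1}\,dt \\ &= \frac{f^{(k+1)}(a)}{(k+1)!}(x - a)^{k+1} + \int_a^x \frac{f^{(k+2)}(t)}{(k+1)!}(x - t)^{k+1}\,dt. \end{aligned}$$
Substituting this into the formula in ($\ast\ast\ast\ast$) shows that if it holds for the value $k$, it must also hold for the value $k + 1$. Therefore, since it holds for $k = 1$, it must hold for every positive integer $k$.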
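The integral form of the remainder can be verified numerically for a concrete function. The sketch below assumes f = sin, whose j-th derivative is sin(t + jπ/2), and compares f(x) − P_k(x) with a composite Simpson approximation of the remainder integral. The callable `deriv(j, t)` and the other names are hypothetical conventions for this example.

```python
import math

def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def integral_remainder(deriv, k, a, x):
    """Integral of f^(k+1)(t) / k! * (x - t)^k over [a, x].

    `deriv(j, t)` is assumed to return the j-th derivative of f at t.
    """
    return simpson(lambda t: deriv(k + 1, t) / math.factorial(k) * (x - t) ** k, a, x)

def taylor_poly(deriv, k, a, x):
    """P_k(x) = sum of f^(j)(a) / j! * (x - a)^j for j = 0..k."""
    return sum(deriv(j, a) / math.factorial(j) * (x - a) ** j for j in range(k + 1))

def sin_deriv(j, t):
    """j-th derivative of sin, using sin^(j)(t) = sin(t + j*pi/2)."""
    return math.sin(t + j * math.pi / 2)
```

With `k = 3`, `a = 0.5` and `x = 2.0`, the difference `math.sin(2.0) - taylor_poly(sin_deriv, 3, 0.5, 2.0)` should agree with `integral_remainder(sin_deriv, 3, 0.5, 2.0)` up to the quadrature error.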
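We prove the special case where $f : \mathbb{R}^n \to \mathbb{R}$ has continuous partial derivatives up to the order $k + 1$ in some closed ball $B$ with center $a$. The strategy of the proof is to apply the one-variable case of Taylor's theorem to the restriction of $f$ to the line segment joining $x$ and $a$. Parametrize the line segment between $a$ and $x$ by $u(t) = a + t(x - a)$. We apply the one-variable version of Taylor's theorem to the function $g(t) = f(u(t))$:
$$f(x) = g(1) = g(0) + \sum_{j=1}^{k} \frac{1}{j!}\,g^{(j)}(0) + \int_0^1 \frac{(1 - t)^k}{k!}\,g^{(k+1)}(t)\,dt.$$
Applying the chain rule for several variables gives
$$g^{(j)}(t) = \frac{d^j}{dt^j} f(u(t)) = \frac{d^j}{dt^j} f\big(a + t(x - a)\big) = \sum_{|\alpha| = j} \binom{j}{\alpha}\,(D^\alpha f)\big(a + t(x - a)\big)\,(x - a)^\alpha,$$
where $\binom{j}{\alpha}$ is the multinomial coefficient. Since $\frac{1}{j!}\binom{j}{\alpha} = \frac{1}{\alpha!}$, we get:
$$f(x) = f(a) + \sum_{1 \le |\alpha| \le k} \frac{1}{\alpha!}\,(D^\alpha f)(a)\,(x - a)^\alpha + \sum_{|\alpha| = k+1} \frac{k + 1}{\alpha!}\,(x - a)^\alpha \int_0^1 (1 - t)^k\,(D^\alpha f)\big(a + t(x - a)\big)\,dt.$$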