Mean value theorem

In mathematics, the mean value theorem (or Lagrange's mean value theorem) states, roughly, that for a given planar arc between two endpoints, there is at least one point at which the tangent to the arc is parallel to the secant through its endpoints. It is one of the most important results in real analysis. This theorem is used to prove statements about a function on an interval starting from local hypotheses about derivatives at points of the interval.

History

A special case of this theorem for inverse interpolation of the sine was first described by Parameshvara (1380–1460), from the Kerala School of Astronomy and Mathematics in India, in his commentaries on Govindasvāmi and Bhāskara II. [1] A restricted form of the theorem was proved by Michel Rolle in 1691; the result was what is now known as Rolle's theorem, and was proved only for polynomials, without the techniques of calculus. The mean value theorem in its modern form was stated and proved by Augustin Louis Cauchy in 1823. [2] Many variations of this theorem have been proved since then. [3] [4]

Statement

Figure: The function $f$ attains the slope of the secant between $a$ and $b$ as the derivative at the point $\xi \in (a,b)$.
Figure: It is also possible that there are multiple tangents parallel to the secant.

Let $f:[a,b]\to\mathbb{R}$ be a continuous function on the closed interval $[a,b]$, and differentiable on the open interval $(a,b)$, where $a<b$. Then there exists some $c$ in $(a,b)$ such that: [5]

$$f'(c)=\frac{f(b)-f(a)}{b-a}.$$

The mean value theorem is a generalization of Rolle's theorem, which assumes $f(a)=f(b)$, so that the right-hand side above is zero.
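The theorem only asserts that a point $c$ with $f'(c) = (f(b)-f(a))/(b-a)$ exists; it does not say how to find it. As a minimal numerical sketch, the following Python snippet locates such a point for the illustrative choice $f(x)=x^3$ on $[0,2]$ (this function, the interval, and the helper name `mean_value_point` are assumptions made for the example, not part of the theorem). It uses bisection, which additionally assumes that $f'$ minus the secant slope changes sign on the interval.

```python
import math


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the secant slope, by bisection.

    Hypothetical helper for illustration. It assumes f' - slope changes
    sign between a and b, which the theorem itself does not guarantee.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = f_prime(lo) - slope
    if g_lo * (f_prime(hi) - slope) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f_prime(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid                 # the root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # the root lies in [mid, hi]
    return (lo + hi) / 2


f = lambda x: x ** 3
f_prime = lambda x: 3 * x ** 2

c = mean_value_point(f, f_prime, 0.0, 2.0)
print(c, 2 / math.sqrt(3))  # the secant slope is 4, so 3 c^2 = 4
```

Here the secant slope is $(8-0)/(2-0)=4$, so the point can also be found by hand as $c=2/\sqrt{3}$, which the numerical result should approximate.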

The mean value theorem is still valid in a slightly more general setting. One only needs to assume that $f:[a,b]\to\mathbb{R}$ is continuous on $[a,b]$, and that for every $x$ in $(a,b)$ the limit

$$\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}$$

exists as a finite number or equals $\infty$ or $-\infty$. If finite, that limit equals $f'(x)$. An example where this version of the theorem applies is given by the real-valued cube root function mapping $x\mapsto x^{1/3}$, whose derivative tends to infinity at the origin.

Proof

The expression $\frac{f(b)-f(a)}{b-a}$ gives the slope of the line joining the points $(a,f(a))$ and $(b,f(b))$, which is a chord of the graph of $f$, while $f'(x)$ gives the slope of the tangent to the curve at the point $(x,f(x))$. Thus the mean value theorem says that given any chord of a smooth curve, we can find a point on the curve lying between the end-points of the chord such that the tangent of the curve at that point is parallel to the chord. The following proof illustrates this idea.

Define $g(x)=f(x)-rx$, where $r$ is a constant. Since $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$, the same is true for $g$. We now want to choose $r$ so that $g$ satisfies the conditions of Rolle's theorem. Namely

$$g(a)=g(b)\iff f(a)-ra=f(b)-rb\iff r(b-a)=f(b)-f(a)\iff r=\frac{f(b)-f(a)}{b-a}.$$

By Rolle's theorem, since $g$ is differentiable and $g(a)=g(b)$, there is some $c$ in $(a,b)$ for which $g'(c)=0$, and it follows from the equality $g(x)=f(x)-rx$ that

$$g'(c)=f'(c)-r=0\implies f'(c)=r=\frac{f(b)-f(a)}{b-a}.$$

Implications

Theorem 1: Assume that $f$ is a continuous, real-valued function, defined on an arbitrary interval $I$ of the real line. If the derivative of $f$ at every interior point of the interval $I$ exists and is zero, then $f$ is constant in the interior.

Proof: Assume the derivative of $f$ at every interior point of the interval $I$ exists and is zero. Let $(a,b)$ be an arbitrary open interval in $I$. By the mean value theorem, there exists a point $c$ in $(a,b)$ such that

$$0=f'(c)=\frac{f(b)-f(a)}{b-a}.$$

This implies that $f(a)=f(b)$. Thus, $f$ is constant on the interior of $I$ and thus is constant on $I$ by continuity. (See below for a multivariable version of this result.)

Theorem 2: If $f'(x)=g'(x)$ for all $x$ in an interval $(a,b)$ of the domain of these functions, then $f-g$ is constant, i.e. $f=g+c$ where $c$ is a constant on $(a,b)$.

Proof: Let $F(x)=f(x)-g(x)$. Then $F'(x)=f'(x)-g'(x)=0$ on the interval $(a,b)$, so Theorem 1 above shows that $F(x)=f(x)-g(x)$ is a constant $c$, that is, $f=g+c$.

Theorem 3: If $F$ is an antiderivative of $f$ on an interval $I$, then the most general antiderivative of $f$ on $I$ is $F(x)+c$ where $c$ is a constant.

Proof: This follows directly from Theorem 2 above.

Cauchy's mean value theorem

Cauchy's mean value theorem, also known as the extended mean value theorem, is a generalization of the mean value theorem. [6] [7] It states: if the functions $f$ and $g$ are both continuous on the closed interval $[a,b]$ and differentiable on the open interval $(a,b)$, then there exists some $c\in(a,b)$ such that

$$(f(b)-f(a))\,g'(c)=(g(b)-g(a))\,f'(c).$$

Figure: Geometrical meaning of Cauchy's theorem.

Of course, if $g(a)\neq g(b)$ and $g'(c)\neq 0$, this is equivalent to:

$$\frac{f'(c)}{g'(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}.$$

Geometrically, this means that there is some tangent to the graph of the curve [8]

$$t\mapsto (f(t),g(t)),\qquad t\in[a,b],$$

which is parallel to the line defined by the points $(f(a),g(a))$ and $(f(b),g(b))$. However, Cauchy's theorem does not claim the existence of such a tangent in all cases where $(f(a),g(a))$ and $(f(b),g(b))$ are distinct points, since it might be satisfied only for some value $c$ with $f'(c)=g'(c)=0$, in other words a value for which the mentioned curve is stationary; at such points no tangent to the curve is likely to be defined at all. An example of this situation is the curve given by

$$t\mapsto (t^3,\,1-t^2),$$

which on the interval $[-1,1]$ goes from the point $(-1,0)$ to $(1,0)$, yet never has a horizontal tangent; however it has a stationary point (in fact a cusp) at $t=0$.

Cauchy's mean value theorem can be used to prove L'Hôpital's rule. The mean value theorem is the special case of Cauchy's mean value theorem when $g(t)=t$.
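The stationary-point caveat can be checked numerically. The sketch below takes the curve $t\mapsto(t^3,\,1-t^2)$ on $[-1,1]$, with $f(t)=t^3$ and $g(t)=1-t^2$, and evaluates the residual $(f(b)-f(a))\,g'(c)-(g(b)-g(a))\,f'(c)$ of Cauchy's identity. The helper name `cauchy_residual` and the sample points are assumptions chosen for illustration.

```python
def cauchy_residual(f, g, df, dg, a, b, c):
    """Return (f(b) - f(a)) * g'(c) - (g(b) - g(a)) * f'(c).

    Cauchy's mean value theorem says this vanishes for some c in (a, b).
    """
    return (f(b) - f(a)) * dg(c) - (g(b) - g(a)) * df(c)


f = lambda t: t ** 3          # first coordinate of the curve
g = lambda t: 1 - t ** 2      # second coordinate of the curve
df = lambda t: 3 * t ** 2
dg = lambda t: -2 * t

a, b = -1.0, 1.0

# Here f(b) - f(a) = 2 and g(b) - g(a) = 0, so the residual is -4c.
residual_at_zero = cauchy_residual(f, g, df, dg, a, b, 0.0)
residuals_elsewhere = [cauchy_residual(f, g, df, dg, a, b, c)
                       for c in (-0.5, 0.25, 0.9)]

print(residual_at_zero == 0, df(0.0) == 0 and dg(0.0) == 0)
print(residuals_elsewhere)
```

For this curve the residual reduces to $-4c$, so the only admissible point is $c=0$, which is exactly where both derivatives vanish and the curve has its cusp.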

Proof

The proof of Cauchy's mean value theorem is based on the same idea as the proof of the mean value theorem.

Mean value theorem in several variables

The mean value theorem generalizes to real functions of multiple variables. The trick is to use parametrization to create a real function of one variable, and then apply the one-variable theorem.

Let $G$ be an open subset of $\mathbb{R}^n$, and let $f:G\to\mathbb{R}$ be a differentiable function. Fix points $x,y\in G$ such that the line segment between $x$ and $y$ lies in $G$, and define $g(t)=f((1-t)x+ty)$. Since $g$ is a differentiable function in one variable, the mean value theorem gives:

$$g(1)-g(0)=g'(c)$$

for some $c$ between 0 and 1. But since $g(1)=f(y)$ and $g(0)=f(x)$, computing $g'(c)$ explicitly we have:

$$f(y)-f(x)=\nabla f((1-c)x+cy)\cdot(y-x)$$

where $\nabla$ denotes a gradient and $\cdot$ a dot product. This is an exact analog of the theorem in one variable (in the case $n=1$ this is the theorem in one variable). By the Cauchy–Schwarz inequality, the equation gives the estimate:

$$|f(y)-f(x)|\le|\nabla f((1-c)x+cy)|\,|y-x|.$$

In particular, when $G$ is convex and the partial derivatives of $f$ are bounded, $f$ is Lipschitz continuous (and therefore uniformly continuous).
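As a small numerical illustration of the Lipschitz estimate, consider the function $f(x,y)=\sin x+\cos y$ on $\mathbb{R}^2$ (an example chosen here, not taken from the text). Its gradient $(\cos x,\,-\sin y)$ has norm at most $\sqrt{2}$, so the estimate predicts $|f(q)-f(p)|\le\sqrt{2}\,|q-p|$ for all points $p,q$. The sketch below checks this on randomly sampled pairs; a random check is only evidence, not a proof.

```python
import math
import random


def f(p):
    """Example function f(x, y) = sin(x) + cos(y) (an illustrative choice)."""
    x, y = p
    return math.sin(x) + math.cos(y)


# |grad f| = |(cos x, -sin y)| <= sqrt(2) everywhere, so M is a Lipschitz bound.
M = math.sqrt(2.0)

rng = random.Random(0)  # fixed seed so the run is reproducible


def random_point():
    return (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))


pairs = [(random_point(), random_point()) for _ in range(1000)]

# Collect any pair that violates |f(q) - f(p)| <= M * |q - p|
# (a tiny slack absorbs floating-point rounding).
violations = [(p, q) for p, q in pairs
              if abs(f(q) - f(p)) > M * math.dist(p, q) + 1e-12]

print(len(violations))
```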

As an application of the above, we prove that $f$ is constant if the open subset $G$ is connected and every partial derivative of $f$ is 0. Pick some point $x_0\in G$, and let $g(x)=f(x)-f(x_0)$. We want to show $g(x)=0$ for every $x\in G$. For that, let $E=\{x\in G: g(x)=0\}$. Then $E$ is closed in $G$ and nonempty. It is open too: for every $x\in E$,

$$|g(y)|=|g(y)-g(x)|\le(0)\,|y-x|=0$$

for every $y$ in an open ball centered at $x$ and contained in $G$. Since $G$ is connected, we conclude $E=G$.

The above arguments are made in a coordinate-free manner; hence, they generalize to the case when $G$ is a subset of a Banach space.

Mean value theorem for vector-valued functions

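There is no exact analog of the mean value theorem for vector-valued functions (see below). However, there is an inequality which can be applied to many of the same situations to which the mean value theorem is applicable in the one-dimensional case: [9]

Theorem  For a continuous vector-valued function $\mathbf{f}:[a,b]\to\mathbb{R}^k$ differentiable on $(a,b)$, there exists a number $c\in(a,b)$ such that

$$|\mathbf{f}(b)-\mathbf{f}(a)|\le(b-a)\,|\mathbf{f}'(c)|.$$

Proof

Take $\varphi(t)=(\mathbf{f}(b)-\mathbf{f}(a))\cdot\mathbf{f}(t)$. Then $\varphi$ is real-valued and thus, by the mean value theorem,

$$\varphi(b)-\varphi(a)=\varphi'(c)(b-a)$$

for some $c\in(a,b)$. Now, $\varphi(b)-\varphi(a)=|\mathbf{f}(b)-\mathbf{f}(a)|^2$ and $\varphi'(c)=(\mathbf{f}(b)-\mathbf{f}(a))\cdot\mathbf{f}'(c)$. Hence, using the Cauchy–Schwarz inequality, from the above equation, we get:

$$|\mathbf{f}(b)-\mathbf{f}(a)|^2\le|\mathbf{f}(b)-\mathbf{f}(a)|\,|\mathbf{f}'(c)|\,(b-a).$$

If $\mathbf{f}(b)=\mathbf{f}(a)$, the theorem holds trivially. Otherwise, dividing both sides by $|\mathbf{f}(b)-\mathbf{f}(a)|$ yields the theorem.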

Mean value inequality

Jean Dieudonné, in his classic treatise Foundations of Modern Analysis, discards the mean value theorem and replaces it with the mean value inequality, on the grounds that the proof is not constructive, the mean value itself cannot be found, and applications only need the inequality. Serge Lang, in Analysis I, uses the mean value theorem in integral form as an instant reflex, but this use requires the continuity of the derivative. If one uses the Henstock–Kurzweil integral, one can have the mean value theorem in integral form without the additional assumption that the derivative be continuous, as every derivative is Henstock–Kurzweil integrable.

The reason why there is no analog of mean value equality is the following: If $f:U\to\mathbb{R}^m$ is a differentiable function (where $U\subset\mathbb{R}^n$ is open) and if $x+th$, with $x,h\in\mathbb{R}^n$ and $t\in[0,1]$, is the line segment in question (lying inside $U$), then one can apply the above parametrization procedure to each of the component functions $f_i$ ($i=1,\dots,m$) of $f$ (in the above notation set $y=x+h$). In doing so one finds points $x+t_ih$ on the line segment satisfying

$$f_i(x+h)-f_i(x)=\nabla f_i(x+t_ih)\cdot h.$$

But generally there will not be a single point $x+t^*h$ on the line segment satisfying

$$f_i(x+h)-f_i(x)=\nabla f_i(x+t^*h)\cdot h$$

for all $i$ simultaneously. For example, define:

$$f:[0,2\pi]\to\mathbb{R}^2,\qquad f(x)=(\cos(x),\sin(x)).$$

Then $f(2\pi)-f(0)=\mathbf{0}\in\mathbb{R}^2$, but $f_1'(x)=-\sin(x)$ and $f_2'(x)=\cos(x)$ are never simultaneously zero as $x$ ranges over $[0,2\pi]$.

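The above theorem implies the following:

Mean value inequality [10]   For a continuous function $\mathbf{f}:[a,b]\to\mathbb{R}^k$, if $\mathbf{f}$ is differentiable on $(a,b)$, then

$$|\mathbf{f}(b)-\mathbf{f}(a)|\le(b-a)\sup_{(a,b)}|\mathbf{f}'|.$$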
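To make the mean value inequality concrete, the following sketch evaluates both sides of $|\mathbf{f}(b)-\mathbf{f}(a)|\le(b-a)\sup_{(a,b)}|\mathbf{f}'|$ for the plane curve $\mathbf{f}(t)=(t,\,t^2)$ on $[0,1]$. The curve is an illustrative assumption, and the supremum is only estimated by sampling interior points, so the printed right-hand side is a slight underestimate of the true bound $\sqrt{5}$.

```python
import math


def f(t):
    """Example curve f(t) = (t, t^2) in the plane (an illustrative choice)."""
    return (t, t * t)


def f_prime(t):
    """Derivative of the example curve: f'(t) = (1, 2t)."""
    return (1.0, 2.0 * t)


a, b = 0.0, 1.0

# Left-hand side: length of the chord between the endpoints.
lhs = math.dist(f(b), f(a))

# Estimate sup |f'| over (a, b) from 999 interior sample points.
samples = [a + (b - a) * k / 1000 for k in range(1, 1000)]
sup_speed = max(math.hypot(*f_prime(t)) for t in samples)

# Right-hand side of the inequality, using the sampled supremum.
rhs = (b - a) * sup_speed

print(lhs, rhs)  # chord length sqrt(2) against roughly sqrt(5)
```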

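In fact, the above statement suffices for many applications and can be proved directly as follows. (We shall write $f$ for $\mathbf{f}$ for readability.)

Proof

First assume $f$ is differentiable at $a$ too. If $f'$ is unbounded on $(a,b)$, there is nothing to prove. Thus, assume $\sup_{(a,b)}|f'|<\infty$. Let $M>\sup_{(a,b)}|f'|$ be some real number. Let

$$E=\{0\le t\le 1 : |f(a+t(b-a))-f(a)|\le Mt(b-a)\}.$$

We want to show $1\in E$. By continuity of $f$, the set $E$ is closed. It is also nonempty as $0$ is in it. Hence, the set $E$ has the largest element $s$. If $s=1$, then $1\in E$ and we are done. Thus suppose otherwise. For $1>t>s$,

$$|f(a+t(b-a))-f(a)|\le|f(a+t(b-a))-f(a+s(b-a))-f'(a+s(b-a))(t-s)(b-a)|+|f'(a+s(b-a))|(t-s)(b-a)+|f(a+s(b-a))-f(a)|.$$

Let $\epsilon>0$ be such that $M-\epsilon>\sup_{(a,b)}|f'|$. By the differentiability of $f$ at $a+s(b-a)$ (note $s$ may be 0), if $t$ is sufficiently close to $s$, the first term is $\le\epsilon(t-s)(b-a)$. The second term is $\le(M-\epsilon)(t-s)(b-a)$. The third term is $\le Ms(b-a)$. Hence, summing the estimates up, we get: $|f(a+t(b-a))-f(a)|\le tM(b-a)$, a contradiction to the maximality of $s$. Hence, $1=s\in E$ and that means:

$$|f(b)-f(a)|\le M(b-a).$$

Since $M$ is arbitrary, this then implies the assertion. Finally, if $f$ is not differentiable at $a$, let $a'\in(a,b)$ and apply the first case to $f$ restricted on $[a',b]$, giving us:

$$|f(b)-f(a')|\le(b-a')\sup_{(a,b)}|f'|$$

since $(a',b)\subset(a,b)$. Letting $a'\to a$ finishes the proof.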

Cases where the theorem cannot be applied

All conditions for the mean value theorem are necessary:

  1. $f(x)$ is differentiable on $(a,b)$
  2. $f(x)$ is continuous on $[a,b]$
  3. $f(x)$ is real-valued

When one of the above conditions is not satisfied, the mean value theorem is not valid in general, and so it cannot be applied.

The necessity of the first condition can be seen by the counterexample where the function $f(x)=|x|$ on [-1,1] is not differentiable.

The necessity of the second condition can be seen by the counterexample where the function $f$ defined by $f(0)=1$ and $f(x)=0$ for $x\in(0,1]$ satisfies criterion 1, since $f'(x)=0$ on $(0,1)$, but not criterion 2: here $\frac{f(1)-f(0)}{1-0}=-1$, while $f'(x)=0\neq-1$ for all $x\in(0,1)$, so no such $c$ exists.

The theorem is false if a differentiable function is complex-valued instead of real-valued. For example, if $f(x)=e^{xi}$ for all real $x$, then $f(2\pi)-f(0)=0=0\cdot(2\pi-0)$ while $f'(x)\neq 0$ for any real $x$.
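The complex-valued counterexample is easy to verify numerically. The sketch below uses Python's `cmath` module to evaluate $f(x)=e^{ix}$ at the endpoints of $[0,2\pi]$ and to sample $|f'(x)|=|ie^{ix}|$ across the interval; the sampling grid is an arbitrary choice for illustration.

```python
import cmath
import math


def f(x):
    """f(x) = e^{ix}, a differentiable complex-valued function of a real variable."""
    return cmath.exp(1j * x)


def f_prime(x):
    """Derivative f'(x) = i e^{ix}; its modulus is always 1."""
    return 1j * cmath.exp(1j * x)


a, b = 0.0, 2 * math.pi

# The increment over a full period vanishes (up to floating-point rounding).
increment = f(b) - f(a)

# The derivative never gets close to zero on the sampled grid.
min_speed = min(abs(f_prime(a + (b - a) * k / 1000)) for k in range(1001))

print(abs(increment), min_speed)
```

Since the increment is zero but $|f'(x)|=1$ throughout, no $c$ can satisfy $f(b)-f(a)=f'(c)(b-a)$.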

Mean value theorems for definite integrals

First mean value theorem for definite integrals

Figure: Geometrically, interpreting f(c) as the height of a rectangle and b − a as the width, this rectangle has the same area as the region below the curve from a to b.

Let f : [a, b] → R be a continuous function. Then there exists c in (a, b) such that

$$\int_a^b f(x)\,dx=f(c)(b-a).$$

This follows at once from the fundamental theorem of calculus, together with the mean value theorem for derivatives. Since the mean value of f on [a, b] is defined as

$$\frac{1}{b-a}\int_a^b f(x)\,dx,$$

we can interpret the conclusion as f achieves its mean value at some c in (a, b). [12]
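A short numerical sketch of this statement: for the illustrative choice $f(x)=x^2$ on $[0,3]$ (an assumption made for the example), the integral is 9, the mean value is 3, and the point where $f$ attains it is $c=\sqrt{3}$. The code approximates the integral with composite Simpson's rule and then solves $f(c)=\text{mean}$ by bisection, which relies on $f$ being increasing on this interval. The helper names `simpson` and `solve_increasing` are hypothetical.

```python
import math


def simpson(f, a, b, n=100):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def solve_increasing(f, target, a, b, tol=1e-12):
    """Solve f(c) = target by bisection, assuming f is increasing on [a, b]."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


f = lambda x: x * x
a, b = 0.0, 3.0

integral = simpson(f, a, b)        # approximately 9
mean = integral / (b - a)          # approximately 3
c = solve_increasing(f, mean, a, b)

print(integral, mean, c, math.sqrt(3))
```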

In general, if f : [a, b] → R is continuous and g is an integrable function that does not change sign on [a, b], then there exists c in (a, b) such that

$$\int_a^b f(x)g(x)\,dx=f(c)\int_a^b g(x)\,dx.$$

Second mean value theorem for definite integrals

There are various slightly different theorems called the second mean value theorem for definite integrals. A commonly found version is as follows:

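If $G:[a,b]\to\mathbb{R}$ is a positive monotonically decreasing function and $\varphi:[a,b]\to\mathbb{R}$ is an integrable function, then there exists a number x in (a, b] such that

$$\int_a^b G(t)\varphi(t)\,dt=G(a^+)\int_a^x\varphi(t)\,dt.$$

Here $G(a^+)$ stands for $\lim_{x\to a^+}G(x)$, the existence of which follows from the conditions. Note that it is essential that the interval (a, b] contains b. A variant not having this requirement is: [13]

If $G:[a,b]\to\mathbb{R}$ is a monotonic (not necessarily decreasing and positive) function and $\varphi:[a,b]\to\mathbb{R}$ is an integrable function, then there exists a number x in (a, b) such that

$$\int_a^b G(t)\varphi(t)\,dt=G(a^+)\int_a^x\varphi(t)\,dt+G(b^-)\int_x^b\varphi(t)\,dt.$$

If the function $G$ returns a multi-dimensional vector, then the MVT for integration is not true, even if the domain of $G$ is also multi-dimensional.

For example, consider the following 2-dimensional function defined on an $n$-dimensional cube:

$$G:[0,2\pi]^n\to\mathbb{R}^2,\qquad G(x_1,\dots,x_n)=\bigl(\sin(x_1+\cdots+x_n),\ \cos(x_1+\cdots+x_n)\bigr).$$

Then, by symmetry it is easy to see that the mean value of $G$ over its domain is (0,0):

$$\int_{[0,2\pi]^n}G(x_1,\dots,x_n)\,dx_1\cdots dx_n=(0,0).$$

However, there is no point in which $G=(0,0)$, because $|G|=1$ everywhere.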

Generalizations

Linear algebra

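Assume that $f$, $g$, and $h$ are differentiable functions on $(a,b)$ that are continuous on $[a,b]$. Define

$$D(x)=\begin{vmatrix}f(x)&g(x)&h(x)\\f(a)&g(a)&h(a)\\f(b)&g(b)&h(b)\end{vmatrix}.$$

There exists $c\in(a,b)$ such that $D'(c)=0$.

Notice that

$$D'(x)=\begin{vmatrix}f'(x)&g'(x)&h'(x)\\f(a)&g(a)&h(a)\\f(b)&g(b)&h(b)\end{vmatrix}$$

and if we place $h(x)=1$, we get Cauchy's mean value theorem. If we place $h(x)=1$ and $g(x)=x$ we get Lagrange's mean value theorem.

The proof of the generalization is quite simple: each of $D(a)$ and $D(b)$ is a determinant with two identical rows, hence $D(a)=D(b)=0$. Rolle's theorem implies that there exists $c\in(a,b)$ such that $D'(c)=0$.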
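The determinant construction can be checked on a concrete case. The sketch below takes $f(x)=e^x$, $g(x)=x$, $h(x)=1$ on $[0,1]$ (illustrative choices that correspond to the Lagrange case), evaluates the determinant at both endpoints, and evaluates the derivative determinant at $c=\ln(e-1)$, the point predicted by $f'(c)=(f(1)-f(0))/(1-0)$. The helper names `det3` and `row_det` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows, by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def row_det(top, f, g, h, a, b):
    """Determinant whose first row is `top` and whose other rows are the
    values of (f, g, h) at a and at b."""
    return det3([list(top),
                 [f(a), g(a), h(a)],
                 [f(b), g(b), h(b)]])


f = math.exp
g = lambda x: x
h = lambda x: 1.0
a, b = 0.0, 1.0

# D(a) and D(b): the first row repeats the second or third row, so both vanish.
D_a = row_det((f(a), g(a), h(a)), f, g, h, a, b)
D_b = row_det((f(b), g(b), h(b)), f, g, h, a, b)

# D'(c): the first row holds the derivatives (e^c, 1, 0).
c = math.log(math.e - 1)
D_prime_c = row_det((math.exp(c), 1.0, 0.0), f, g, h, a, b)

print(D_a, D_b, c, D_prime_c)
```

With these choices $D'(x)=e-1-e^x$, so $D'(c)=0$ is the same equation as Lagrange's $f'(c)=\frac{f(b)-f(a)}{b-a}$.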

Probability theory

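Let X and Y be non-negative random variables such that E[X] < E[Y] < ∞ and $X\le_{st}Y$ (i.e. X is smaller than Y in the usual stochastic order). Then there exists an absolutely continuous non-negative random variable Z having probability density function

$$f_Z(x)=\frac{\Pr(Y>x)-\Pr(X>x)}{\operatorname{E}[Y]-\operatorname{E}[X]},\qquad x\ge 0.$$

Let g be a measurable and differentiable function such that E[g(X)], E[g(Y)] < ∞, and let its derivative g′ be measurable and Riemann-integrable on the interval [x, y] for all y ≥ x ≥ 0. Then, E[g′(Z)] is finite and [14]

$$\operatorname{E}[g(Y)]-\operatorname{E}[g(X)]=\operatorname{E}[g'(Z)]\,\bigl(\operatorname{E}[Y]-\operatorname{E}[X]\bigr).$$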

Complex analysis

As noted above, the theorem does not hold for differentiable complex-valued functions. Instead, a generalization of the theorem is stated as follows: [15]

Let f : Ω → C be a holomorphic function on the open convex set Ω, and let a and b be distinct points in Ω. Then there exist points u, v on the interior of the line segment from a to b such that

$$\operatorname{Re}(f'(u))=\operatorname{Re}\left(\frac{f(b)-f(a)}{b-a}\right),$$

$$\operatorname{Im}(f'(v))=\operatorname{Im}\left(\frac{f(b)-f(a)}{b-a}\right),$$

where Re() is the real part and Im() is the imaginary part of a complex-valued function.

Notes

  1. J. J. O'Connor and E. F. Robertson (2000). Paramesvara, MacTutor History of Mathematics archive.
  2. Ádám Besenyei. "Historical development of the mean value theorem" (PDF).
  3. Lozada-Cruz, German (2020-10-02). "Some variants of Cauchy's mean value theorem". International Journal of Mathematical Education in Science and Technology. 51 (7): 1155–1163. Bibcode:2020IJMES..51.1155L. doi:10.1080/0020739X.2019.1703150. ISSN 0020-739X. S2CID 213335491.
  4. Sahoo, Prasanna. (1998). Mean value theorems and functional equations. Riedel, T. (Thomas), 1962-. Singapore: World Scientific. ISBN 981-02-3544-5. OCLC 40951137.
  5. Rudin 1976, p. 108.
  6. Weisstein, Eric W. "Extended Mean-Value Theorem". mathworld.wolfram.com. Retrieved 2018-10-08.
  7. Rudin 1976, pp. 107–108.
  8. "Cauchy's Mean Value Theorem". Math24. Retrieved 2018-10-08.
  9. Rudin 1976, p. 113.
  10. Hörmander 2015, Theorem 1.1.1. and remark following it.
  11. "Mathwords: Mean Value Theorem for Integrals". www.mathwords.com.
  12. Michael Comenetz (2002). Calculus: The Elements. World Scientific. p. 159. ISBN 978-981-02-4904-5.
  13. Hobson, E. W. (1909). "On the Second Mean-Value Theorem of the Integral Calculus". Proc. London Math. Soc. S2–7 (1): 14–23. Bibcode:1909PLMS...27...14H. doi:10.1112/plms/s2-7.1.14. MR 1575669.
  14. Di Crescenzo, A. (1999). "A Probabilistic Analogue of the Mean Value Theorem and Its Applications to Reliability Theory". J. Appl. Probab. 36 (3): 706–719. doi:10.1239/jap/1032374628. JSTOR 3215435. S2CID 250351233.
  15. J.-Cl. Evard, F. Jafari, A Complex Rolle's Theorem, American Mathematical Monthly, Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.
