Young's inequality for products

Figure (Young.png): The area of the rectangle with sides $a$ and $b$ cannot be larger than the sum of the areas under the functions $f$ (red) and $f^{-1}$ (yellow).

In mathematics, Young's inequality for products is an inequality about the product of two numbers. [1] The inequality is named after William Henry Young and should not be confused with Young's convolution inequality.


Young's inequality for products can be used to prove Hölder's inequality. It is also widely used to estimate the norm of nonlinear terms in PDE theory, since it allows one to estimate a product of two terms by a sum of the same terms raised to a power and scaled.

Standard version for conjugate Hölder exponents

The standard form of the inequality is the following, which can be used to prove Hölder's inequality.

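Theorem  If $a \geq 0$ and $b \geq 0$ are nonnegative real numbers and if $p > 1$ and $q > 1$ are real numbers such that $\frac{1}{p} + \frac{1}{q} = 1$, then

$$ab \leq \frac{a^p}{p} + \frac{b^q}{q}.$$

Equality holds if and only if $a^p = b^q$.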
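
As a quick numerical sanity check (not a proof), the statement $ab \leq a^p/p + b^q/q$ with $1/p + 1/q = 1$ can be sketched in Python. The function name `young_gap` is a hypothetical helper introduced only for this illustration; it returns the right-hand side minus the left-hand side, which should be nonnegative and should vanish when $a^p = b^q$.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """Return a**p/p + b**q/q - a*b, where q = p/(p-1) is the conjugate exponent.

    Hypothetical helper for illustration; assumes a, b >= 0 and p > 1.
    """
    if a < 0 or b < 0 or p <= 1:
        raise ValueError("need a, b >= 0 and p > 1")
    q = p / (p - 1.0)  # chosen so that 1/p + 1/q = 1
    return a**p / p + b**q / q - a * b


if __name__ == "__main__":
    # A few sample points; every printed gap should be nonnegative.
    for a, b, p in [(2.0, 3.0, 2.0), (0.5, 4.0, 3.0), (1.5, 0.0, 1.2)]:
        print(a, b, p, young_gap(a, b, p))

    # Equality case: b = a**(p-1) is equivalent to a**p == b**q,
    # so the gap should be zero up to rounding.
    a, p = 1.7, 3.0
    print(young_gap(a, a ** (p - 1.0), p))
```

The equality case uses $b = a^{p-1}$, which is the same condition as $a^p = b^q$ because $(p-1)q = p$.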

Proof [2]

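Since $\tfrac{1}{p} + \tfrac{1}{q} = 1$, we have $p - 1 = \tfrac{1}{q - 1}$. A graph $y = x^{p-1}$ on the $xy$-plane is thus also a graph $x = y^{q-1}$. From sketching a visual representation of the integrals of the area between this curve and the axes, and the area in the rectangle bounded by the lines $x = 0$, $x = a$, $y = 0$, $y = b$, and the fact that $y$ is always increasing for increasing $x$ and vice versa, we can see that $\int_0^a x^{p-1}\,\mathrm{d}x$ upper bounds the area of the rectangle below the curve (with equality when $b \geq a^{p-1}$) and $\int_0^b y^{q-1}\,\mathrm{d}y$ upper bounds the area of the rectangle above the curve (with equality when $b \leq a^{p-1}$). Thus, $\int_0^a x^{p-1}\,\mathrm{d}x + \int_0^b y^{q-1}\,\mathrm{d}y \geq ab$, with equality when $b = a^{p-1}$ (or equivalently, $a^p = b^q$). Young's inequality follows from evaluating the integrals. (See below for a generalization.)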

A second proof is via Jensen's inequality.

Proof [3]

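The claim is certainly true if $a = 0$ or $b = 0$, so henceforth assume that $a > 0$ and $b > 0$. Put $t = 1/p$ and $(1 - t) = 1/q$. Because the logarithm function is concave,

$$\ln\left(t a^p + (1 - t) b^q\right) \geq t \ln\left(a^p\right) + (1 - t) \ln\left(b^q\right) = \ln(a) + \ln(b) = \ln(ab)$$

with the equality holding if and only if $a^p = b^q$. Young's inequality follows by exponentiating.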

Yet another proof is to first prove it with $b = 1$ and then apply the resulting inequality to $\tfrac{a}{b^{q-1}}$. The proof below also illustrates why the Hölder conjugate exponent is the only possible parameter that makes Young's inequality hold for all non-negative values. The details follow:

Proof

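Let $0 < \alpha < 1$ and $\alpha + \beta = 1$. The inequality

$$x \leq \alpha x^p + \beta, \quad \text{for all } x \geq 0$$

holds if and only if $\alpha = \tfrac{1}{p}$ (and hence $\beta = \tfrac{1}{q}$). This can be shown by convexity arguments or by simply minimizing the single-variable function.

To prove full Young's inequality, clearly we assume that $a > 0$ and $b > 0$. Now, we apply the inequality above to $x = \tfrac{a}{b^s}$ to obtain:

$$\frac{a}{b^s} \leq \frac{1}{p} \frac{a^p}{b^{sp}} + \frac{1}{q}.$$

It is easy to see that choosing $s = q - 1$ and multiplying both sides by $b^q$ yields Young's inequality.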

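Young's inequality may equivalently be written as

$$a^\alpha b^\beta \leq \alpha a + \beta b, \qquad 0 \leq \alpha, \beta \leq 1, \quad \alpha + \beta = 1.$$

This is just the concavity of the logarithm function. Equality holds if and only if $a = b$ or $\{\alpha, \beta\} = \{0, 1\}$. This also follows from the weighted AM-GM inequality.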

Generalizations

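Theorem [4]   Suppose $a > 0$ and $b > 0$. If $1 < p < \infty$ and $q$ are such that $\tfrac{1}{p} + \tfrac{1}{q} = 1$, then

$$ab = \min_{0 < t < \infty} \left( \frac{t^p a^p}{p} + \frac{t^{-q} b^q}{q} \right).$$

Using $t = 1$ and replacing $a$ with $a^{1/p}$ and $b$ with $b^{1/q}$ results in the inequality:

$$a^{1/p} b^{1/q} \leq \frac{a}{p} + \frac{b}{q},$$

which is useful for proving Hölder's inequality.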

Proof [4]

Define a real-valued function $f$ on the positive real numbers by

$$f(t) = \frac{t^p a^p}{p} + \frac{t^{-q} b^q}{q}$$

for every $t > 0$ and then calculate its minimum.
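
The minimization can be illustrated numerically. The sketch below assumes the function $f(t) = t^p a^p/p + t^{-q} b^q/q$ with $1/p + 1/q = 1$; setting $f'(t) = 0$ gives $t^{p+q} = b^q/a^p$, and since $p + q = pq$ the critical point is $t^{*} = b^{1/p} a^{-1/q}$. The names `scaled_young` and `minimizer` are hypothetical helpers for this illustration only.

```python
def scaled_young(t: float, a: float, b: float, p: float) -> float:
    """f(t) = (t*a)**p / p + (b/t)**q / q with q = p/(p-1); assumes t, a, b > 0."""
    q = p / (p - 1.0)
    return (t * a) ** p / p + (b / t) ** q / q


def minimizer(a: float, b: float, p: float) -> float:
    """Critical point t* = b**(1/p) * a**(-1/q), obtained from f'(t) = 0."""
    q = p / (p - 1.0)
    return b ** (1.0 / p) * a ** (-1.0 / q)


if __name__ == "__main__":
    a, b, p = 2.0, 3.0, 3.0
    t_star = minimizer(a, b, p)
    # The value at t* should agree with a*b up to rounding.
    print(t_star, scaled_young(t_star, a, b, p), a * b)
    # Nearby values of t should not give anything smaller.
    print(min(scaled_young(k / 100.0, a, b, p) for k in range(1, 1000)))
```

Evaluating at $t = 1$ instead of $t^{*}$ recovers the ordinary Young bound $a^p/p + b^q/q$, which is therefore never smaller than $ab$.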

Theorem  If $0 \leq p_i \leq 1$ with $\sum_i p_i = 1$, then

$$\prod_i a_i^{p_i} \leq \sum_i p_i a_i.$$

Equality holds if and only if all the $a_i$s with non-zero $p_i$s are equal.
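
A small Python sketch of this weighted form, $\prod_i a_i^{p_i} \leq \sum_i p_i a_i$ with weights summing to 1, may help make the equality condition concrete. The function name `weighted_am_gm_gap` is hypothetical, and the check assumes nonnegative values $a_i$.

```python
import math
from typing import Sequence


def weighted_am_gm_gap(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum(p_i * a_i) - prod(a_i ** p_i) for nonnegative a_i and weights summing to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(v < 0 for v in values) or any(w < 0 for w in weights):
        raise ValueError("values and weights must be nonnegative")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    arithmetic = sum(w * v for v, w in zip(values, weights))
    # A zero weight contributes a factor of 1, so that term drops out of the product.
    geometric = math.prod(v ** w for v, w in zip(values, weights))
    return arithmetic - geometric


if __name__ == "__main__":
    print(weighted_am_gm_gap([1.0, 4.0], [0.5, 0.5]))              # strictly positive gap
    print(weighted_am_gm_gap([3.0, 3.0, 7.0], [0.25, 0.75, 0.0]))  # about zero
```

In the second call the only value that differs (7.0) carries zero weight, so the gap vanishes up to rounding, matching the stated equality condition.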

Elementary case

An elementary case of Young's inequality is the inequality with exponent $2$,

$$ab \leq \frac{a^2}{2} + \frac{b^2}{2},$$

which also gives rise to the so-called Young's inequality with $\varepsilon$ (valid for every $\varepsilon > 0$), sometimes called the Peter–Paul inequality. [5] This name refers to the fact that tighter control of the second term is achieved at the cost of losing some control of the first term; one must "rob Peter to pay Paul":

$$ab \leq \frac{a^2}{2\varepsilon} + \frac{\varepsilon b^2}{2}.$$

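Proof: Young's inequality with exponent $2$ is the special case $p = q = 2$. However, it has a more elementary proof.

Start by observing that the square of every real number is zero or positive. Therefore, for every pair of real numbers $a$ and $b$ we can write:

$$0 \leq (a - b)^2$$

Work out the square of the right hand side:

$$0 \leq a^2 - 2ab + b^2$$

Add $2ab$ to both sides:

$$2ab \leq a^2 + b^2$$

Divide both sides by 2 and we have Young's inequality with exponent $2$:

$$ab \leq \frac{a^2}{2} + \frac{b^2}{2}$$

Young's inequality with $\varepsilon$ follows by substituting $a'$ and $b'$ as below into Young's inequality with exponent $2$:

$$a' = a / \sqrt{\varepsilon}, \qquad b' = \sqrt{\varepsilon}\, b.$$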
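
The trade-off behind the Peter–Paul name can be seen numerically. The sketch below assumes the bound $ab \leq a^2/(2\varepsilon) + \varepsilon b^2/2$ for $\varepsilon > 0$; the function name `peter_paul_bound` is a hypothetical helper. For $b \neq 0$, minimizing the right-hand side over $\varepsilon$ gives $\varepsilon = |a|/|b|$, at which the bound equals $|ab|$.

```python
def peter_paul_bound(a: float, b: float, eps: float) -> float:
    """Right-hand side a**2 / (2*eps) + eps * b**2 / 2 of Young's inequality with eps > 0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return a * a / (2.0 * eps) + eps * b * b / 2.0


if __name__ == "__main__":
    a, b = 3.0, 2.0
    # Small eps makes the b-term cheap and the a-term expensive, and vice versa.
    for eps in (0.1, 1.0, 1.5, 10.0):
        print(eps, peter_paul_bound(a, b, eps), ">=", a * b)
```

With $a = 3$ and $b = 2$, the choice $\varepsilon = 1.5 = |a|/|b|$ makes the bound tight, while every other listed $\varepsilon$ gives a strictly larger right-hand side.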

Matricial generalization

T. Ando proved a generalization of Young's inequality for complex matrices ordered by Loewner ordering. [6] It states that for any pair $A, B$ of complex matrices of order $n$ there exists a unitary matrix $U$ such that

$$U^{*} |AB^{*}| U \preceq \tfrac{1}{p} |A|^p + \tfrac{1}{q} |B|^q,$$

where ${}^{*}$ denotes the conjugate transpose of the matrix and $|A| = \sqrt{A^{*} A}$.

Standard version for increasing functions

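For the standard version [7] [8] of the inequality, let $f$ denote a real-valued, continuous and strictly increasing function on $[0, c]$ with $c > 0$ and $f(0) = 0$. Let $f^{-1}$ denote the inverse function of $f$. Then, for all $a \in [0, c]$ and $b \in [0, f(c)]$,

$$ab \leq \int_0^a f(x)\,dx + \int_0^b f^{-1}(x)\,dx$$

with equality if and only if $b = f(a)$.

With $f(x) = x^{p-1}$ and $f^{-1}(y) = y^{q-1}$, this reduces to the standard version for conjugate Hölder exponents.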
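
To illustrate the integral form with a function other than a power, the following sketch picks $f(x) = e^x - 1$ (an assumption made purely for illustration), whose inverse is $f^{-1}(y) = \ln(1 + y)$. Both integrals have closed forms: $\int_0^a (e^x - 1)\,dx = e^a - 1 - a$ and $\int_0^b \ln(1 + y)\,dy = (1 + b)\ln(1 + b) - b$. The function name `area_sum` is hypothetical.

```python
import math


def area_sum(a: float, b: float) -> float:
    """Sum of the areas under f(x) = exp(x) - 1 on [0, a] and under its inverse on [0, b]."""
    if a < 0 or b < 0:
        raise ValueError("need a, b >= 0")
    under_f = math.exp(a) - 1.0 - a                 # integral of exp(x) - 1 from 0 to a
    under_inverse = (1.0 + b) * math.log1p(b) - b   # integral of ln(1 + y) from 0 to b
    return under_f + under_inverse


if __name__ == "__main__":
    a = 1.0
    # Generic b: the area sum should exceed the rectangle area a*b.
    print(area_sum(a, 0.5), ">=", a * 0.5)
    # Equality case b = f(a): the two sides should agree up to rounding.
    b = math.expm1(a)
    print(area_sum(a, b), "vs", a * b)
```

The equality case can also be checked by hand: with $b = e^a - 1$ the sum simplifies to $a e^a - a = ab$.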

For details and generalizations we refer to the paper of Mitroi & Niculescu. [9]

Generalization using Fenchel–Legendre transforms

By denoting the convex conjugate of a real function $f$ by $g$, we obtain

$$ab \leq f(a) + g(b).$$

This follows immediately from the definition of the convex conjugate. For a convex function this also follows from the Legendre transformation.

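More generally, if $f$ is defined on a real vector space $X$ and its convex conjugate is denoted by $f^{\star}$ (and is defined on the dual space $X^{\star}$), then

$$\langle u, v \rangle \leq f^{\star}(u) + f(v),$$

where $\langle \cdot, \cdot \rangle : X^{\star} \times X \to \mathbb{R}$ is the dual pairing.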

Examples

The convex conjugate of $f(a) = a^p / p$ is $g(b) = b^q / q$ with $q$ such that $\tfrac{1}{p} + \tfrac{1}{q} = 1$, and thus Young's inequality for conjugate Hölder exponents mentioned above is a special case.

The Legendre transform of $f(a) = e^a - 1$ is $g(b) = 1 - b + b \ln b$, hence $ab \leq e^a - b + b \ln b$ for all non-negative $a$ and $b$. This estimate is useful in large deviations theory under exponential moment conditions, because $b \ln b$ appears in the definition of relative entropy, which is the rate function in Sanov's theorem.
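
The second example can be checked numerically by approximating the convex conjugate $g(b) = \sup_a \,(ab - f(a))$ on a finite grid. This is only a rough sketch: the names `conjugate_on_grid` and `closed_form`, the grid range $[-5, 5]$ and the step $0.001$ are all assumptions chosen for illustration, and a grid maximum can only approximate the supremum from below.

```python
import math
from typing import Callable


def conjugate_on_grid(f: Callable[[float], float], b: float,
                      lo: float = -5.0, hi: float = 5.0, step: float = 0.001) -> float:
    """Approximate sup_a (a*b - f(a)) by maximizing over an evenly spaced grid on [lo, hi]."""
    n = int(round((hi - lo) / step))
    return max((lo + k * step) * b - f(lo + k * step) for k in range(n + 1))


def closed_form(b: float) -> float:
    """Claimed conjugate of f(a) = exp(a) - 1, namely 1 - b + b*ln(b), for b > 0."""
    return 1.0 - b + b * math.log(b)


if __name__ == "__main__":
    for b in (0.5, 1.0, 2.0, 4.0):
        approx = conjugate_on_grid(math.expm1, b)
        print(b, approx, closed_form(b))
```

The maximizer is $a = \ln b$, so the grid must contain $\ln b$ for the approximation to be accurate; the chosen range covers $b$ between roughly $e^{-5}$ and $e^{5}$.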


Notes

  1. Young, W. H. (1912), "On classes of summable functions and their Fourier series", Proceedings of the Royal Society A, 87 (594): 225–229, Bibcode: 1912RSPSA..87..225Y, doi: 10.1098/rspa.1912.0076, JFM 43.1114.12, JSTOR 93236
  2. Pearse, Erin. "Math 209D - Real Analysis Summer Preparatory Seminar Lecture Notes" (PDF). Retrieved 17 September 2022.
  3. Bahouri, Chemin & Danchin 2011.
  4. Jarchow 1981, pp. 47–55.
  5. Tisdell, Chris (2013), The Peter Paul Inequality, YouTube video on Dr Chris Tisdell's YouTube channel
  6. T. Ando (1995). "Matrix Young Inequalities". In Huijsmans, C. B.; Kaashoek, M. A.; Luxemburg, W. A. J.; et al. (eds.). Operator Theory in Function Spaces and Banach Lattices. Springer. pp. 33–38. ISBN 978-3-0348-9076-2.
  7. Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) [1934], Inequalities, Cambridge Mathematical Library (2nd ed.), Cambridge: Cambridge University Press, ISBN 0-521-05206-8, MR 0046395, Zbl 0047.05302, Chapter 4.8
  8. Henstock, Ralph (1988), Lectures on the Theory of Integration, Series in Real Analysis Volume I, Singapore, New Jersey: World Scientific, ISBN 9971-5-0450-2, MR 0963249, Zbl 0668.28001, Theorem 2.9
  9. Mitroi, F. C., & Niculescu, C. P. (2011). An extension of Young's inequality. In Abstract and Applied Analysis (Vol. 2011). Hindawi.

References