Young's inequality for products

Last updated April 23, 2024

Figure (Young.png): The area of the rectangle with sides $a$ and $b$ cannot be larger than the sum of the areas under the functions $f$ (red) and $f^{-1}$ (yellow).

In mathematics, Young's inequality for products is a mathematical inequality about the product of two numbers.^[1] The inequality is named after William Henry Young and should not be confused with Young's convolution inequality.

Standard version for conjugate Hölder exponents

The standard form of the inequality is the following, which can be used to prove Hölder's inequality.

Theorem — If $a\geq 0$ and $b\geq 0$ are nonnegative real numbers and if $p>1$ and $q>1$ are real numbers such that ${\frac {1}{p}}+{\frac {1}{q}}=1,$ then

ab~\leq ~{\frac {a^{p}}{p}}+{\frac {b^{q}}{q}}.

Equality holds if and only if $a^{p}=b^{q}.$
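
As a quick numerical sanity check of the statement above (an illustration, not a proof), the following sketch evaluates the gap between the two sides on a small grid and at one equality case. The function name `young_gap` and the chosen grid are assumptions made for this example only.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """Return a**p/p + b**q/q - a*b, where q is the Hölder conjugate of p."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    q = p / (p - 1)  # conjugate exponent: 1/p + 1/q = 1
    return a**p / p + b**q / q - a * b

# The gap should be nonnegative (up to rounding) for nonnegative inputs.
grid = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5]
for p in (1.5, 2.0, 3.0, 10.0):
    for a in grid:
        for b in grid:
            assert young_gap(a, b, p) >= -1e-12

# Equality case a**p == b**q: pick a, then set b = a**(p - 1).
p, a = 3.0, 2.0
b = a ** (p - 1)
print(young_gap(a, b, p))  # close to zero, up to floating-point rounding
```

The small negative tolerance only absorbs floating-point rounding near the equality cases.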

Proof^[2]

Since ${\tfrac {1}{p}}+{\tfrac {1}{q}}=1,$ we have $p-1={\tfrac {1}{q-1}}.$ The graph of $y=x^{p-1}$ in the $xy$-plane is thus also the graph of $x=y^{q-1}.$ Sketch the region between this curve and the axes together with the rectangle bounded by the lines $x=0,x=a,y=0,y=b.$ Because $y$ is strictly increasing in $x$ and vice versa, $\int _{0}^{a}x^{p-1}\mathrm {d} x$ bounds from above the area of the part of the rectangle below the curve (with equality when $b\geq a^{p-1}$) and $\int _{0}^{b}y^{q-1}\mathrm {d} y$ bounds from above the area of the part of the rectangle above the curve (with equality when $b\leq a^{p-1}$). Thus, $\int _{0}^{a}x^{p-1}\mathrm {d} x+\int _{0}^{b}y^{q-1}\mathrm {d} y\geq ab,$ with equality when $b=a^{p-1}$ (or equivalently, $a^{p}=b^{q}$). Young's inequality follows from evaluating the integrals. (See below for a generalization.)
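
For readers who want the last step of this proof written out, the two integrals and the exponent identity can be evaluated as follows. This is a routine computation added for illustration; it uses only the symbols already introduced above.

```latex
\begin{aligned}
\frac{1}{p}+\frac{1}{q}=1 \;&\Longrightarrow\; q=\frac{p}{p-1}
  \;\Longrightarrow\; q-1=\frac{1}{p-1}, \\
\int_{0}^{a}x^{p-1}\,\mathrm{d}x &= \frac{a^{p}}{p}, \qquad
\int_{0}^{b}y^{q-1}\,\mathrm{d}y = \frac{b^{q}}{q}, \\
ab \;&\leq\; \frac{a^{p}}{p}+\frac{b^{q}}{q}.
\end{aligned}
```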

A second proof is via Jensen's inequality.

Proof^[3]

The claim is certainly true if $a=0$ or $b=0$ so henceforth assume that $a>0$ and $b>0.$ Put $t=1/p$ and $(1-t)=1/q.$ Because the logarithm function is concave,

\ln \left(ta^{p}+(1-t)b^{q}\right)~\geq ~t\ln \left(a^{p}\right)+(1-t)\ln \left(b^{q}\right)=\ln(a)+\ln(b)=\ln(ab)

with the equality holding if and only if $a^{p}=b^{q}.$ Young's inequality follows by exponentiating.

Yet another proof is to first prove the inequality with $b=1$ and then apply the result to $x={\tfrac {a}{b^{q-1}}}.$ The proof below also illustrates why the Hölder conjugate exponent is the only possible parameter that makes Young's inequality hold for all non-negative values. The details follow:

Proof

Let $0<\alpha <1$ and $\alpha +\beta =1.$ The inequality

x~\leq ~\alpha x^{p}+\beta ,\qquad {\text{for all}}\quad x~\geq ~0

holds if and only if $\alpha ={\tfrac {1}{p}}$ (and hence $\beta ={\tfrac {1}{q}}$). This can be shown by convexity arguments or by simply minimizing the single-variable function.
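
The "only if" direction can be explored numerically. The sketch below samples the slack $\alpha x^{p}+\beta -x$ on a grid and checks that it stays nonnegative for $\alpha =1/p$ but dips below zero near $x=1$ for other values of $\alpha .$ The helper name `min_slack`, the exponent $p=3$ and the sample grid are assumptions made for this illustration; a grid check is evidence, not a proof.

```python
def min_slack(alpha: float, p: float, xs) -> float:
    """Smallest value of alpha*x**p + (1 - alpha) - x over the sample points xs."""
    beta = 1.0 - alpha
    return min(alpha * x**p + beta - x for x in xs)

p = 3.0
xs = [i / 1000 for i in range(0, 5001)]  # sample points in [0, 5]

# alpha = 1/p: the slack never goes negative (it vanishes at x = 1).
assert min_slack(1.0 / p, p, xs) >= -1e-12

# Other choices of alpha in (0, 1) fail at sample points close to x = 1.
for alpha in (0.1, 0.25, 0.5, 0.9):
    assert min_slack(alpha, p, xs) < 0
```

The failure near $x=1$ reflects that both sides agree at $x=1$ for every $\alpha ,$ so the slopes there must match, which forces $\alpha p=1.$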

To prove the full Young's inequality, we may clearly assume that $a>0$ and $b>0,$ since the inequality is trivial if either is zero. Now, we apply the inequality above to $x={\tfrac {a}{b^{s}}}$ to obtain:

{\tfrac {a}{b^{s}}}~\leq ~{\tfrac {1}{p}}{\tfrac {a^{p}}{b^{sp}}}+{\tfrac {1}{q}}.

Choosing $s=q-1$ gives $sp=(q-1)p=q,$ so multiplying both sides by $b^{q}$ yields Young's inequality.

Young's inequality may equivalently be written as

a^{\alpha }b^{\beta }\leq \alpha a+\beta b,\qquad \,0\leq \alpha ,\beta \leq 1,\quad \ \alpha +\beta =1.

This is just the concavity of the logarithm function. Equality holds if and only if $a=b$ or $\{\alpha ,\beta \}=\{0,1\}.$ This also follows from the weighted AM-GM inequality.

Generalizations

Theorem^[4] — Suppose $a>0$ and $b>0.$ If $1<p<\infty$ and $q$ are such that ${\tfrac {1}{p}}+{\tfrac {1}{q}}=1$ then

ab~=~\min _{0<t<\infty }\left({\frac {t^{p}a^{p}}{p}}+{\frac {t^{-q}b^{q}}{q}}\right).

Using $t:=1$ and replacing $a$ with $a^{1/p}$ and $b$ with $b^{1/q}$ results in the inequality:

a^{1/p}\,b^{1/q}~\leq ~{\frac {a}{p}}+{\frac {b}{q}},

which is useful for proving Hölder's inequality.

Proof^[4]

Define a real-valued function $f$ on the positive real numbers by

f(t)~=~{\frac {t^{p}a^{p}}{p}}+{\frac {t^{-q}b^{q}}{q}}

for every $t>0$ and then calculate its minimum.
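
The minimization can be carried out explicitly. The computation below is a sketch added for illustration; $t_{0}$ denotes the critical point (a symbol not used in the original), and the identity $p+q=pq$ follows from ${\tfrac {1}{p}}+{\tfrac {1}{q}}=1.$

```latex
\begin{aligned}
f'(t) &= t^{p-1}a^{p}-t^{-q-1}b^{q}=0
  \iff t^{p+q}=\frac{b^{q}}{a^{p}}, \\
t_{0} &= b^{1/p}\,a^{-1/q} \qquad (\text{using } p+q=pq), \\
f(t_{0}) &= \frac{t_{0}^{p}a^{p}}{p}+\frac{t_{0}^{-q}b^{q}}{q}
  = \frac{ab}{p}+\frac{ab}{q}=ab .
\end{aligned}
```

Since $f''(t)=(p-1)t^{p-2}a^{p}+(q+1)t^{-q-2}b^{q}>0$ for all $t>0,$ the function is strictly convex and $t_{0}$ is its unique minimizer.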

Theorem — If $a_{i}\geq 0$ and $0\leq p_{i}\leq 1$ with $\sum _{i}p_{i}=1$ then

\prod _{i}{a_{i}}^{p_{i}}~\leq ~\sum _{i}p_{i}a_{i}.

Equality holds if and only if all the $a_{i}$ with non-zero $p_{i}$ are equal.
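
A small numerical illustration of this weighted form follows. It is a sketch under the assumption that the $a_{i}$ are nonnegative real numbers; the function name `weighted_am_gm_gap` and the sample values are hypothetical choices for this example.

```python
import math

def weighted_am_gm_gap(values, weights) -> float:
    """Return sum(p_i * a_i) - prod(a_i ** p_i) for nonnegative a_i and weights summing to 1."""
    if any(a < 0 for a in values):
        raise ValueError("values must be nonnegative")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    arithmetic = sum(w * a for a, w in zip(values, weights))
    geometric = math.prod(a**w for a, w in zip(values, weights))
    return arithmetic - geometric

weights = [0.2, 0.3, 0.5]
print(weighted_am_gm_gap([1.0, 4.0, 9.0], weights))  # strictly positive
print(weighted_am_gm_gap([3.0, 3.0, 3.0], weights))  # approximately zero: all a_i equal
# An a_i with zero weight does not affect the equality condition.
print(weighted_am_gm_gap([5.0, 3.0, 3.0], [0.0, 0.4, 0.6]))  # approximately zero
```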

Elementary case

An elementary case of Young's inequality is the inequality with exponent $2,$

ab\leq {\frac {a^{2}}{2}}+{\frac {b^{2}}{2}},

which also gives rise to the so-called Young's inequality with $\varepsilon$ (valid for every $\varepsilon >0$), sometimes called the Peter–Paul inequality.^[5] This name refers to the fact that tighter control of the second term is achieved at the cost of losing some control of the first term; one must "rob Peter to pay Paul":

ab~\leq ~{\frac {a^{2}}{2\varepsilon }}+{\frac {\varepsilon b^{2}}{2}}.

Proof: Young's inequality with exponent $2$ is the special case $p=q=2.$ However, it has a more elementary proof.

Start by observing that the square of every real number is zero or positive. Therefore, for every pair of real numbers $a$ and $b$ we can write:

0\leq (a-b)^{2}

Work out the square of the right hand side:

0\leq a^{2}-2ab+b^{2}

Add $2ab$ to both sides:

2ab\leq a^{2}+b^{2}

Divide both sides by 2 and we have Young's inequality with exponent $2:$

ab\leq {\frac {a^{2}}{2}}+{\frac {b^{2}}{2}}

Young's inequality with $\varepsilon$ follows by substituting $a'$ and $b'$ as below into Young's inequality with exponent $2:$

a'=a/{\sqrt {\varepsilon }},\;b'={\sqrt {\varepsilon }}b.
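
The trade-off behind the Peter–Paul name can be seen numerically. The sketch below evaluates the right-hand side for several values of $\varepsilon$ and checks that, for positive $a$ and $b,$ the choice $\varepsilon =a/b$ makes the bound equal to $ab$ (this follows from minimizing the right-hand side over $\varepsilon$). The function name `peter_paul_bound` and the sample values are assumptions for this example.

```python
import math

def peter_paul_bound(a: float, b: float, eps: float) -> float:
    """Right-hand side a**2/(2*eps) + eps*b**2/2 of Young's inequality with epsilon."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return a * a / (2 * eps) + eps * b * b / 2

a, b = 3.0, 0.5
for eps in (0.01, 0.1, 1.0, 6.0, 100.0):
    bound = peter_paul_bound(a, b, eps)
    assert a * b <= bound + 1e-12
    print(f"eps={eps:<6} ab={a * b:.3f} bound={bound:.3f}")

# For a, b > 0 the bound is tightest at eps = a / b, where it equals ab.
assert math.isclose(peter_paul_bound(a, b, a / b), a * b)
```

Small $\varepsilon$ inflates the $a^{2}$ term and large $\varepsilon$ inflates the $b^{2}$ term, which is the trade-off the name describes.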

Matricial generalization

T. Ando proved a generalization of Young's inequality for complex matrices ordered by Loewner ordering.^[6] It states that for any pair $A,B$ of complex matrices of order $n$ there exists a unitary matrix $U$ such that

U^{*}|AB^{*}|U\preceq {\tfrac {1}{p}}|A|^{p}+{\tfrac {1}{q}}|B|^{q},

where ${}^{*}$ denotes the conjugate transpose of the matrix and $|A|={\sqrt {A^{*}A}}.$

Standard version for increasing functions

For the standard version^[7]^[8] of the inequality, let $f$ denote a real-valued, continuous and strictly increasing function on $[0,c]$ with $c>0$ and $f(0)=0.$ Let $f^{-1}$ denote the inverse function of $f.$ Then, for all $a\in [0,c]$ and $b\in [0,f(c)],$

ab~\leq ~\int _{0}^{a}f(x)\,dx+\int _{0}^{b}f^{-1}(x)\,dx

with equality if and only if $b=f(a).$

With $f(x)=x^{p-1}$ and $f^{-1}(y)=y^{q-1},$ this reduces to the standard version for conjugate Hölder exponents.
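
The inequality for increasing functions can be checked numerically for a function other than a power. The sketch below uses $f(x)=e^{x}-1$ with inverse $f^{-1}(y)=\ln(1+y),$ which is an illustrative choice not taken from the text, and approximates both integrals with a midpoint rule. The helper names `integral` and `young_rhs` are hypothetical.

```python
import math

def integral(func, upper: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of func over [0, upper]."""
    h = upper / steps
    return h * sum(func((i + 0.5) * h) for i in range(steps))

def young_rhs(f, f_inv, a: float, b: float) -> float:
    """Approximate integral of f over [0, a] plus integral of f_inv over [0, b]."""
    return integral(f, a) + integral(f_inv, b)

# Illustrative choice (an assumption): f is continuous, strictly increasing, f(0) = 0.
def f(x: float) -> float:
    return math.expm1(x)      # exp(x) - 1

def f_inv(y: float) -> float:
    return math.log1p(y)      # log(1 + y), the inverse of f

a = 1.2
for b in (0.5, f(a), 4.0):
    rhs = young_rhs(f, f_inv, a, b)
    print(f"b={b:.4f}  ab={a * b:.4f}  rhs={rhs:.4f}")
```

The middle case $b=f(a)$ is the equality case, so the two printed values agree there up to the quadrature error.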

For details and generalizations we refer to the paper of Mitroi & Niculescu.^[9]

Generalization using Fenchel–Legendre transforms

By denoting the convex conjugate of a real function $f$ by $g,$ we obtain

ab~\leq ~f(a)+g(b).

This follows immediately from the definition of the convex conjugate. For a convex function $f$ this also follows from the Legendre transformation.
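
Spelled out, the step from the definition is a single line. The sketch below assumes the usual definition of the convex conjugate as a supremum, with $x$ ranging over the domain of $f.$

```latex
g(b) \;=\; \sup_{x}\,\bigl(bx-f(x)\bigr) \;\geq\; ba-f(a)
\quad\Longrightarrow\quad
ab \;\leq\; f(a)+g(b).
```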

More generally, if $f$ is defined on a real vector space $X$ and its convex conjugate is denoted by $f^{\star }$ (and is defined on the dual space $X^{\star }$), then

\langle u,v\rangle \leq f^{\star }(u)+f(v),

where $\langle \cdot ,\cdot \rangle :X^{\star }\times X\to \mathbb {R}$ is the dual pairing.

Examples

The convex conjugate of $f(a)=a^{p}/p$ is $g(b)=b^{q}/q$ with $q$ such that ${\tfrac {1}{p}}+{\tfrac {1}{q}}=1,$ and thus Young's inequality for conjugate Hölder exponents mentioned above is a special case.

The Legendre transform of $f(a)=e^{a}-1$ is $g(b)=1-b+b\ln b,$ hence $ab\leq e^{a}-b+b\ln b$ for all non-negative $a$ and $b$ (with the convention $0\ln 0=0$). This estimate is useful in large deviations theory under exponential moment conditions, because $b\ln b$ appears in the definition of relative entropy, which is the rate function in Sanov's theorem.
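
The exponential example can be checked numerically. The sketch below evaluates $f(a)+g(b)-ab$ for $f(a)=e^{a}-1$ and $g(b)=1-b+b\ln b$ on a few sample points, taking $0\ln 0=0.$ The function names `conjugate_exp` and `fenchel_gap` and the sample points are assumptions for this illustration.

```python
import math

def conjugate_exp(b: float) -> float:
    """g(b) = 1 - b + b*ln(b), with the convention 0*ln(0) = 0."""
    if b < 0:
        raise ValueError("b must be nonnegative")
    return 1.0 - b + (b * math.log(b) if b > 0 else 0.0)

def fenchel_gap(a: float, b: float) -> float:
    """f(a) + g(b) - a*b for f(a) = exp(a) - 1; the gap should never be negative."""
    return math.expm1(a) + conjugate_exp(b) - a * b

for a in (0.0, 0.5, 1.0, 3.0):
    for b in (0.0, 0.5, 1.0, 5.0, 40.0):
        assert fenchel_gap(a, b) >= -1e-9

# The gap closes when b = exp(a), where the supremum defining g is attained.
print(fenchel_gap(1.5, math.exp(1.5)))  # approximately zero
```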

Notes

[1] Young, W. H. (1912), "On classes of summable functions and their Fourier series", Proceedings of the Royal Society A, 87 (594): 225–229, Bibcode:1912RSPSA..87..225Y, doi:10.1098/rspa.1912.0076, JFM 43.1114.12, JSTOR 93236
[2] Pearse, Erin. "Math 209D - Real Analysis Summer Preparatory Seminar Lecture Notes" (PDF). Retrieved 17 September 2022.
[3] Bahouri, Chemin & Danchin 2011.
[4] Jarchow 1981, pp. 47–55.
[5] Tisdell, Chris (2013), The Peter Paul Inequality, YouTube video on Dr Chris Tisdell's YouTube channel
[6] T. Ando (1995). "Matrix Young Inequalities". In Huijsmans, C. B.; Kaashoek, M. A.; Luxemburg, W. A. J.; et al. (eds.). Operator Theory in Function Spaces and Banach Lattices. Springer. pp. 33–38. ISBN 978-3-0348-9076-2.
[7] Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) [1934], Inequalities, Cambridge Mathematical Library (2nd ed.), Cambridge: Cambridge University Press, ISBN 0-521-05206-8, MR 0046395, Zbl 0047.05302, Chapter 4.8
[8] Henstock, Ralph (1988), Lectures on the Theory of Integration, Series in Real Analysis Volume I, Singapore, New Jersey: World Scientific, ISBN 9971-5-0450-2, MR 0963249, Zbl 0668.28001, Theorem 2.9
[9] Mitroi, F. C., & Niculescu, C. P. (2011). An extension of Young's inequality. In Abstract and Applied Analysis (Vol. 2011). Hindawi.

References

Jarchow, Hans (1981). Locally convex spaces. Stuttgart: B.G. Teubner. ISBN 978-3-519-02224-4. OCLC 8210342.
Bahouri, Hajer; Chemin, Jean-Yves; Danchin, Raphaël (2011). Fourier Analysis and Nonlinear Partial Differential Equations. Grundlehren der mathematischen Wissenschaften. Vol. 343. Berlin, Heidelberg: Springer. ISBN 978-3-642-16830-7. OCLC 704397128.
