Young's inequality for products

[Figure: The area of the rectangle with sides $a$ and $b$ cannot be larger than the sum of the areas under the graphs of $f$ (red) and $f^{-1}$ (yellow).]

In mathematics, Young's inequality for products is a mathematical inequality about the product of two numbers.[1] The inequality is named after William Henry Young and should not be confused with Young's convolution inequality.


Young's inequality for products can be used to prove Hölder's inequality. It is also widely used to estimate the norm of nonlinear terms in PDE theory, since it allows one to estimate a product of two terms by a sum of the same terms raised to a power and scaled.

Standard version for conjugate Hölder exponents

The standard form of the inequality is the following, which can be used to prove Hölder's inequality.

Theorem. If $a \ge 0$ and $b \ge 0$ are nonnegative real numbers and if $p > 1$ and $q > 1$ are real numbers such that $\frac{1}{p} + \frac{1}{q} = 1$, then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$

Equality holds if and only if $a^p = b^q$.
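As an illustrative numeric sanity check of the theorem (the random ranges, grid, and tolerances below are arbitrary choices, not part of the statement):

```python
import random

# Check a*b <= a**p/p + b**q/q for conjugate exponents 1/p + 1/q = 1,
# and the equality case a**p == b**q.
random.seed(0)
for _ in range(1000):
    p = random.uniform(1.01, 10)
    q = p / (p - 1)                      # conjugate exponent
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    assert a * b <= a**p / p + b**q / q + 1e-9

# Equality case: pick a, then b with b**q == a**p, i.e. b = a**(p/q)
p, q = 3.0, 1.5
a = 2.0
b = a**(p / q)
assert abs(a * b - (a**p / p + b**q / q)) < 1e-9
```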

Proof [2]

Since $\frac{1}{p} + \frac{1}{q} = 1$ implies $(p-1)(q-1) = 1$, the graph of $y = x^{p-1}$ on the $xy$-plane is thus also the graph of $x = y^{q-1}$. From sketching a visual representation of the integrals of the area between this curve and the axes, and the area in the rectangle bounded by the lines $x = a$ and $y = b$, and the fact that $y$ is always increasing for increasing $x$ and vice versa, we can see that $\int_0^a x^{p-1}\,dx$ upper bounds the area of the rectangle below the curve (with equality when $b \ge a^{p-1}$) and $\int_0^b y^{q-1}\,dy$ upper bounds the area of the rectangle above the curve (with equality when $b \le a^{p-1}$). Thus,
$$ab \le \int_0^a x^{p-1}\,dx + \int_0^b y^{q-1}\,dy,$$
with equality when $b = a^{p-1}$ (or equivalently, $a^p = b^q$). Young's inequality follows from evaluating the integrals. (See below for a generalization.)
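Evaluating the two integrals then gives the stated bound (a short sketch of this final step):

```latex
\int_0^a x^{p-1}\,dx = \frac{a^p}{p},
\qquad
\int_0^b y^{q-1}\,dy = \frac{b^q}{q},
\qquad\text{so}\qquad
ab \le \frac{a^p}{p} + \frac{b^q}{q}.
```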

A second proof is via Jensen's inequality.

Proof [3]

The claim is certainly true if $a = 0$ or $b = 0$, so henceforth assume that $a > 0$ and $b > 0$. Put $t = 1/p$ and $1 - t = 1/q$. Because the logarithm function is concave,
$$\ln\bigl(t a^p + (1-t) b^q\bigr) \ge t \ln(a^p) + (1-t) \ln(b^q) = \ln a + \ln b = \ln(ab),$$
with the equality holding if and only if $a^p = b^q$. Young's inequality follows by exponentiating.

Yet another proof is to first prove it with $b = 1$ and then apply the resulting inequality to $\frac{a}{b^{q-1}}$. The proof below also illustrates why the Hölder conjugate exponent is the only possible parameter that makes Young's inequality hold for all non-negative values. The details follow:

Proof

Let $p > 1$ and $q > 1$. The inequality
$$a \le \frac{a^p}{p} + \frac{1}{q} \qquad \text{for all } a \ge 0$$
holds if and only if $\frac{1}{p} + \frac{1}{q} \ge 1$, and it is tight (with equality at $a = 1$) exactly when $\frac{1}{p} + \frac{1}{q} = 1$ (and hence $q = \frac{p}{p-1}$). This can be shown by convexity arguments or by simply minimizing the single-variable function $a \mapsto \frac{a^p}{p} + \frac{1}{q} - a$, whose minimum is $\frac{1}{p} + \frac{1}{q} - 1$, attained at $a = 1$.

To prove the full Young's inequality, clearly we may assume that $a > 0$ and $b > 0$. Now, we apply the inequality above to $\frac{a}{b^{q-1}}$ to obtain:
$$\frac{a}{b^{q-1}} \le \frac{1}{p} \frac{a^p}{b^{p(q-1)}} + \frac{1}{q}.$$
It is easy to see that, since $p(q-1) = q$, multiplying both sides by $b^q$ yields Young's inequality.
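The single-variable step can be probed numerically; the sketch below (with arbitrary grid bounds and tolerances) checks that the minimum of $a \mapsto \frac{a^p}{p} + \frac{1}{q} - a$ is zero exactly at the conjugate exponent:

```python
# The inequality a <= a**p/p + 1/q holds for all a >= 0 with a tight
# constant exactly when 1/p + 1/q = 1.
def gap(p, q, n=10000):
    """Minimum of a**p/p + 1/q - a over a grid of a in [0, 10]."""
    return min(a**p / p + 1/q - a for a in (i * 10 / n for i in range(n + 1)))

p = 3.0
q_conj = p / (p - 1)               # conjugate exponent: 1/p + 1/q = 1
assert abs(gap(p, q_conj)) < 1e-6  # minimum is 0: the bound is tight
assert gap(p, 2.0) < -1e-3         # 1/3 + 1/2 < 1: the bound fails for some a
```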

Young's inequality may equivalently be written as
$$a^{\alpha} b^{\beta} \le \alpha a + \beta b, \qquad 0 \le \alpha, \beta \le 1, \quad \alpha + \beta = 1.$$

This form is just the concavity of the logarithm function: $\ln(\alpha a + \beta b) \ge \alpha \ln a + \beta \ln b = \ln(a^{\alpha} b^{\beta})$. Equality holds if and only if $a = b$ or $\{\alpha, \beta\} = \{0, 1\}$. This also follows from the weighted AM–GM inequality.
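A numeric check of this equivalent form (the sample count and tolerance are arbitrary illustrative choices):

```python
import random

# Check a**alpha * b**beta <= alpha*a + beta*b for weights alpha + beta = 1
# (the two-term weighted AM-GM inequality).
random.seed(0)
worst = float("-inf")   # largest observed lhs - rhs; should stay <= 0
for _ in range(1000):
    a, b = random.uniform(0, 100), random.uniform(0, 100)
    alpha = random.random()
    beta = 1 - alpha
    worst = max(worst, a**alpha * b**beta - (alpha * a + beta * b))
assert worst <= 1e-9
```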

Generalizations

Theorem [4]. Suppose $a > 0$ and $b > 0$. If $p > 1$ and $q > 1$ are such that $\frac{1}{p} + \frac{1}{q} = 1$, then
$$ab = \min_{t > 0} \left( \frac{t^p a^p}{p} + \frac{t^{-q} b^q}{q} \right).$$

Using $t = 1$ and replacing $a$ with $a^{1/p}$ and $b$ with $b^{1/q}$ results in the inequality
$$a^{1/p}\, b^{1/q} \le \frac{a}{p} + \frac{b}{q},$$
which is useful for proving Hölder's inequality.

Proof [4]

Define a real-valued function $f$ on the positive real numbers by
$$f(t) = \frac{t^p a^p}{p} + \frac{t^{-q} b^q}{q}$$
for every $t > 0$ and then calculate its minimum: $f'(t) = t^{p-1} a^p - t^{-q-1} b^q$ vanishes when $t^{p+q} a^p = b^q$, at which point the two terms of $f$ are equal, so $f(t) = t^p a^p \left( \frac{1}{p} + \frac{1}{q} \right) = (ab)^{pq/(p+q)} = ab$, using that $\frac{1}{p} + \frac{1}{q} = 1$ gives $p + q = pq$.
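The minimum formulation can be probed numerically; the grid over $t$ and the tolerance below are arbitrary choices:

```python
# For conjugate p, q the minimum over t > 0 of
# t**p * a**p / p + b**q / (t**q * q) equals a*b.
p = 3.0
q = p / (p - 1)
a, b = 2.0, 5.0

def f(t):
    return t**p * a**p / p + b**q / (t**q * q)

best = min(f(1e-3 + i * 1e-3) for i in range(10000))  # grid over t in (0, 10]
assert abs(best - a * b) < 1e-3
```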

Theorem. If $0 \le p_i \le 1$ with $\sum_i p_i = 1$, then
$$\prod_i a_i^{p_i} \le \sum_i p_i a_i.$$
Equality holds if and only if all the $a_i$s with non-zero $p_i$s are equal.
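A numeric check of the multi-term form (the sample sizes and tolerance are arbitrary illustrative choices):

```python
import random

# Check prod(a_i**p_i) <= sum(p_i * a_i) for nonnegative weights summing to 1.
random.seed(1)
worst = float("-inf")   # largest observed lhs - rhs; should stay <= 0
for _ in range(500):
    n = random.randint(2, 6)
    w = [random.random() for _ in range(n)]
    total = sum(w)
    p = [x / total for x in w]                   # weights summing to 1
    a = [random.uniform(0, 50) for _ in range(n)]
    lhs = 1.0
    for ai, pi in zip(a, p):
        lhs *= ai**pi
    worst = max(worst, lhs - sum(pi * ai for pi, ai in zip(p, a)))
assert worst <= 1e-9
```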

Elementary case

An elementary case of Young's inequality is the inequality with exponent 2,
$$ab \le \frac{a^2}{2} + \frac{b^2}{2},$$
which also gives rise to the so-called Young's inequality with $\varepsilon$ (valid for every $\varepsilon > 0$), sometimes called the Peter–Paul inequality: [5]
$$ab \le \frac{a^2}{2\varepsilon} + \frac{\varepsilon b^2}{2}.$$
This name refers to the fact that tighter control of the second term is achieved at the cost of losing some control of the first term – one must "rob Peter to pay Paul".

Proof: Young's inequality with exponent 2 is the special case $p = q = 2$. However, it has a more elementary proof.

Start by observing that the square of every real number is zero or positive. Therefore, for every pair of real numbers $a$ and $b$ we can write:
$$0 \le (a - b)^2.$$
Work out the square of the right hand side:
$$0 \le a^2 - 2ab + b^2.$$
Add $2ab$ to both sides:
$$2ab \le a^2 + b^2.$$
Divide both sides by 2 and we have Young's inequality with exponent 2:
$$ab \le \frac{a^2}{2} + \frac{b^2}{2}.$$

Young's inequality with $\varepsilon$ follows by substituting $a/\sqrt{\varepsilon}$ for $a$ and $\sqrt{\varepsilon}\, b$ for $b$ as below into Young's inequality with exponent 2:
$$ab = \frac{a}{\sqrt{\varepsilon}} \cdot \sqrt{\varepsilon}\, b \le \frac{a^2}{2\varepsilon} + \frac{\varepsilon b^2}{2}.$$
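A numeric check of the Peter–Paul inequality (the sample ranges and tolerance are arbitrary illustrative choices):

```python
import random

# Check a*b <= a**2/(2*eps) + eps*b**2/2 for every eps > 0 and real a, b.
random.seed(2)
worst = float("-inf")   # largest observed lhs - rhs; should stay <= 0
for _ in range(1000):
    a, b = random.uniform(-10, 10), random.uniform(-10, 10)
    eps = random.uniform(0.01, 100)
    worst = max(worst, a * b - (a**2 / (2 * eps) + eps * b**2 / 2))
assert worst <= 1e-9
```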

Matricial generalization

T. Ando proved a generalization of Young's inequality for complex matrices ordered by Loewner ordering. [6] It states that for any pair $A$, $B$ of complex matrices of order $n$ and any $p, q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$, there exists a unitary matrix $U$ such that
$$U^* |A B^*| U \preceq \frac{1}{p} |A|^p + \frac{1}{q} |B|^q,$$
where $^*$ denotes the conjugate transpose of the matrix and $|A| = \sqrt{A^* A}$.

Standard version for increasing functions

For the standard version [7] [8] of the inequality, let $f$ denote a real-valued, continuous and strictly increasing function on $[0, c]$ with $c > 0$ and $f(0) = 0$. Let $f^{-1}$ denote the inverse function of $f$. Then, for all $a \in [0, c]$ and $b \in [0, f(c)]$,
$$ab \le \int_0^a f(x)\,dx + \int_0^b f^{-1}(y)\,dy,$$
with equality if and only if $b = f(a)$.

With $f(x) = x^{p-1}$ (and hence $f^{-1}(y) = y^{q-1}$), this reduces to the standard version for conjugate Hölder exponents.
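A numeric sketch of this version, using the (arbitrarily chosen) increasing function $f(x) = e^x - 1$, whose inverse is $\ln(1+y)$, and a simple midpoint rule:

```python
import math

# Check a*b <= int_0^a f + int_0^b f^{-1} for f(x) = e**x - 1
# (f(0) = 0, strictly increasing), with equality when b = f(a).
def integral(g, hi, n=100000):
    """Midpoint-rule approximation of the integral of g over [0, hi]."""
    h = hi / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

f = lambda x: math.exp(x) - 1
finv = lambda y: math.log(1 + y)

a, b = 1.2, 0.7                      # a in [0, c], b in [0, f(c)] for c = 2
lhs = a * b
rhs = integral(f, a) + integral(finv, b)
assert lhs <= rhs + 1e-6

b_eq = f(a)                          # equality case: b = f(a)
rhs_eq = integral(f, a) + integral(finv, b_eq)
assert abs(a * b_eq - rhs_eq) < 1e-4
```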

For details and generalizations we refer to the paper of Mitroi & Niculescu. [9]

Generalization using Fenchel–Legendre transforms

By denoting the convex conjugate of a real function $f$ by $f^*$, we obtain
$$ab \le f(a) + f^*(b).$$
This follows immediately from the definition of the convex conjugate, $f^*(b) = \sup_a \bigl( ab - f(a) \bigr)$. For a convex function $f$ this also follows from the Legendre transformation.

More generally, if $f$ is defined on a real vector space $X$ and its convex conjugate is denoted by $f^\star$ (and is defined on the dual space $X^\star$), then
$$\langle u, v \rangle \le f^\star(u) + f(v),$$
where $\langle \cdot, \cdot \rangle : X^\star \times X \to \mathbb{R}$ is the dual pairing.

Examples

The convex conjugate of $f(a) = a^p / p$ (for $p > 1$) is $f^*(b) = b^q / q$ with $q$ such that $\frac{1}{p} + \frac{1}{q} = 1$, and thus Young's inequality for conjugate Hölder exponents mentioned above is a special case.
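The computation behind this example, maximizing $ab - a^p/p$ over $a \ge 0$ (a short sketch, using $p(q-1) = q$):

```latex
f^{*}(b) = \sup_{a \ge 0}\Bigl(ab - \frac{a^{p}}{p}\Bigr),
\qquad
\frac{d}{da}\Bigl(ab - \frac{a^{p}}{p}\Bigr) = b - a^{p-1} = 0
\;\Longrightarrow\; a = b^{1/(p-1)} = b^{q-1},

\text{so}\qquad
f^{*}(b) = b \cdot b^{q-1} - \frac{b^{(q-1)p}}{p}
= b^{q}\Bigl(1 - \frac{1}{p}\Bigr) = \frac{b^{q}}{q}.
```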

The Legendre transform of $f(a) = e^a$ is $f^*(b) = b \ln b - b$, hence
$$ab \le e^a + b \ln b - b$$
for all non-negative $a$ and $b$. This estimate is useful in large deviations theory under exponential moment conditions, because the term $b \ln b$ appears in the definition of relative entropy, which is the rate function in Sanov's theorem.
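A numeric check of this estimate over a grid (the grid bounds and tolerance are arbitrary illustrative choices), reading $b \ln b$ as $0$ at $b = 0$:

```python
import math

# Check a*b <= e**a + b*log(b) - b for nonnegative a, b.
def xlogx(b):
    return b * math.log(b) if b > 0 else 0.0

worst = float("-inf")   # largest observed lhs - rhs; should stay <= 0
for i in range(200):
    for j in range(200):
        a, b = i * 0.05, j * 0.05          # grid over [0, 10) x [0, 10)
        worst = max(worst, a * b - (math.exp(a) + xlogx(b) - b))
assert worst <= 1e-9
```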


Notes

  1. Young, W. H. (1912), "On classes of summable functions and their Fourier series", Proceedings of the Royal Society A, 87 (594): 225–229, Bibcode:1912RSPSA..87..225Y, doi:10.1098/rspa.1912.0076, JFM 43.1114.12, JSTOR 93236
  2. Pearse, Erin. "Math 209D - Real Analysis Summer Preparatory Seminar Lecture Notes" (PDF). Retrieved 17 September 2022.
  3. Bahouri, Chemin & Danchin 2011.
  4. Jarchow 1981, pp. 47–55.
  5. Tisdell, Chris (2013), The Peter Paul Inequality, YouTube video on Dr Chris Tisdell's YouTube channel.
  6. T. Ando (1995). "Matrix Young Inequalities". In Huijsmans, C. B.; Kaashoek, M. A.; Luxemburg, W. A. J.; et al. (eds.). Operator Theory in Function Spaces and Banach Lattices. Springer. pp. 33–38. ISBN 978-3-0348-9076-2.
  7. Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) [1934], Inequalities, Cambridge Mathematical Library (2nd ed.), Cambridge: Cambridge University Press, ISBN 0-521-05206-8, MR 0046395, Zbl 0047.05302, Chapter 4.8
  8. Henstock, Ralph (1988), Lectures on the Theory of Integration, Series in Real Analysis Volume I, Singapore, New Jersey: World Scientific, ISBN 9971-5-0450-2, MR 0963249, Zbl 0668.28001, Theorem 2.9
  9. Mitroi, F. C., & Niculescu, C. P. (2011). An extension of Young's inequality. In Abstract and Applied Analysis (Vol. 2011). Hindawi.


References

Bahouri, Hajer; Chemin, Jean-Yves; Danchin, Raphaël (2011). Fourier Analysis and Nonlinear Partial Differential Equations. Grundlehren der mathematischen Wissenschaften. Springer.

Jarchow, Hans (1981). Locally Convex Spaces. Stuttgart: B. G. Teubner.