# Lp space

In mathematics, the Lp spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue (Dunford & Schwartz 1958, III.3), although according to the Bourbaki group (Bourbaki 1987) they were first introduced by Frigyes Riesz (Riesz 1910). Lp spaces form an important class of Banach spaces in functional analysis, and of topological vector spaces. Because of their key role in the mathematical analysis of measure and probability spaces, Lebesgue spaces are also used in the theoretical discussion of problems in physics, statistics, finance, engineering, and other disciplines.

## Applications

### Statistics

In statistics, measures of central tendency and statistical dispersion, such as the mean, median, and standard deviation, are defined in terms of Lp metrics, and measures of central tendency can be characterized as solutions to variational problems.
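As a small numerical sketch of this variational characterization, the following Python snippet searches a coarse grid for the value c that minimizes the sum of absolute deviations (an L1 criterion) and the sum of squared deviations (an L2 criterion). The data set, the grid, and the helper names `l1_cost` and `l2_cost` are illustrative assumptions, not part of the article.

```python
from statistics import mean, median

def l1_cost(data, c):
    """Sum of absolute deviations of the data from c."""
    return sum(abs(x - c) for x in data)

def l2_cost(data, c):
    """Sum of squared deviations of the data from c."""
    return sum((x - c) ** 2 for x in data)

# Hypothetical sample with one outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
# Candidate centres 0.0, 0.1, ..., 100.0 (a coarse grid, not an optimiser).
candidates = [i / 10 for i in range(1001)]

best_l1 = min(candidates, key=lambda c: l1_cost(data, c))
best_l2 = min(candidates, key=lambda c: l2_cost(data, c))

print(best_l1, median(data))  # L1 minimiser next to the median
print(best_l2, mean(data))    # L2 minimiser next to the mean
```

On this sample the L1 minimizer found on the grid coincides with the median and the L2 minimizer with the mean, which is why the median is less sensitive to the outlier.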

In penalized regression, "L1 penalty" and "L2 penalty" refer to penalizing either the L1 norm of a solution's vector of parameter values (i.e. the sum of its absolute values), or its L2 norm (its Euclidean length). Techniques which use an L1 penalty, like LASSO, encourage solutions where many parameters are zero. Techniques which use an L2 penalty, like ridge regression, encourage solutions where most parameter values are small. Elastic net regularization uses a penalty term that is a combination of the L1 norm and the L2 norm of the parameter vector.
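The difference between the two penalties can be seen on two parameter vectors with the same Euclidean length. The sketch below is only illustrative: the function names, the mixing weight `alpha`, and the use of the squared L2 norm in the elastic-net term are assumptions made for this example, and real libraries use their own parametrizations.

```python
def l1_penalty(w):
    """Sum of absolute parameter values."""
    return sum(abs(v) for v in w)

def l2_penalty(w):
    """Euclidean length of the parameter vector."""
    return sum(v * v for v in w) ** 0.5

def elastic_net_penalty(w, alpha=0.5):
    """Assumed form: a convex mix of the L1 norm and the squared L2 norm."""
    return alpha * l1_penalty(w) + (1 - alpha) * l2_penalty(w) ** 2

sparse = [3.0, 0.0, 0.0, 0.0]   # one large parameter, the rest zero
dense = [1.5, 1.5, 1.5, 1.5]    # same Euclidean length, spread out

print(l2_penalty(sparse), l2_penalty(dense))
print(l1_penalty(sparse), l1_penalty(dense))
print(elastic_net_penalty(sparse), elastic_net_penalty(dense))
```

Both vectors have the same L2 penalty, but the sparse one has the smaller L1 penalty, which is the mechanism by which an L1 penalty favours solutions with many zero parameters.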

### Hausdorff–Young inequality

The Fourier transform for the real line (or, for periodic functions, see Fourier series) maps Lp(R) to Lq(R) (or Lp(T) to ℓq, respectively), where 1 ≤ p ≤ 2 and 1/p + 1/q = 1. This is a consequence of the Riesz–Thorin interpolation theorem, and is made precise with the Hausdorff–Young inequality.

By contrast, if p > 2, the Fourier transform does not map into Lq.

### Hilbert spaces

Hilbert spaces are central to many applications, from quantum mechanics to stochastic calculus. The spaces L2 and ℓ2 are both Hilbert spaces. In fact, by choosing a Hilbert basis (i.e., a maximal orthonormal subset of L2 or any Hilbert space), one sees that all Hilbert spaces are isometric to ℓ2(E), where E is a set with an appropriate cardinality.

## The p-norm in finite dimensions

The length of a vector x = (x1, x2, ..., xn) in the n-dimensional real vector space Rn is usually given by the Euclidean norm:

${\displaystyle \left\|x\right\|_{2}=\left({x_{1}}^{2}+{x_{2}}^{2}+\dotsb +{x_{n}}^{2}\right)^{1/2}.}$

The Euclidean distance between two points x and y is the length ||x − y||2 of the straight line between the two points. In many situations, the Euclidean distance is insufficient for capturing the actual distances in a given space. An analogy to this is suggested by taxi drivers in a grid street plan who should measure distance not in terms of the length of the straight line to their destination, but in terms of the rectilinear distance, which takes into account that streets are either orthogonal or parallel to each other. The class of p-norms generalizes these two examples and has an abundance of applications in many parts of mathematics, physics, and computer science.

### Definition

For a real number p ≥ 1, the p-norm or Lp-norm of x is defined by

${\displaystyle \left\|x\right\|_{p}=\left(|x_{1}|^{p}+|x_{2}|^{p}+\dotsb +|x_{n}|^{p}\right)^{1/p}.}$

The absolute value bars are unnecessary when p is a rational number and, in reduced form, has an even numerator.

The Euclidean norm from above falls into this class and is the 2-norm, and the 1-norm is the norm that corresponds to the rectilinear distance.

The L∞-norm or maximum norm (or uniform norm) is the limit of the Lp-norms for p → ∞. It turns out that this limit is equivalent to the following definition:

${\displaystyle \left\|x\right\|_{\infty }=\max \left\{|x_{1}|,|x_{2}|,\dotsc ,|x_{n}|\right\}}$

See L-infinity.

For all p ≥ 1, the p-norms and maximum norm as defined above indeed satisfy the properties of a "length function" (or norm), which are that:

• only the zero vector has zero length,
• the length of the vector is positive homogeneous with respect to multiplication by a scalar (positive homogeneity), and
• the length of the sum of two vectors is no larger than the sum of lengths of the vectors (triangle inequality).

Abstractly speaking, this means that Rn together with the p-norm is a Banach space. This Banach space is the Lp-space over Rn.
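The definitions above can be sketched in a few lines of Python. The function name `p_norm` and the sample vector are illustrative choices; the largest entry is factored out before raising to the power p, an implementation detail assumed here only to avoid overflow for large p.

```python
import math

def p_norm(x, p):
    """p-norm of a finite sequence x for real p >= 1, or p = math.inf."""
    m = max(abs(v) for v in x)
    if p == math.inf or m == 0:
        return float(m)
    # Factor out the largest entry so the powers stay in [0, 1].
    return m * sum((abs(v) / m) ** p for v in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
for p in (1, 2, 4, 16, 64, math.inf):
    print(p, p_norm(x, p))
```

The printed values decrease as p grows and approach the maximum norm of the vector, in line with the limit statement for p → ∞.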

#### Relations between p-norms

The grid distance or rectilinear distance (sometimes called the "Manhattan distance") between two points is never shorter than the length of the line segment between them (the Euclidean or "as the crow flies" distance). Formally, this means that the Euclidean norm of any vector is bounded by its 1-norm:

${\displaystyle \left\|x\right\|_{2}\leq \left\|x\right\|_{1}.}$

This fact generalizes to p-norms in that the p-norm ||x||p of any given vector x does not grow with p:

${\displaystyle \left\|x\right\|_{p+a}\leq \left\|x\right\|_{p}}$ for any vector x and real numbers p ≥ 1 and a ≥ 0. (In fact this remains true for 0 < p < 1 and a ≥ 0.)

For the opposite direction, the following relation between the 1-norm and the 2-norm is known:

${\displaystyle \left\|x\right\|_{1}\leq {\sqrt {n}}\left\|x\right\|_{2}.}$

This inequality depends on the dimension n of the underlying vector space and follows directly from the Cauchy–Schwarz inequality.

In general, for vectors in Cn where 0 < r < p:

${\displaystyle \left\|x\right\|_{p}\leq \left\|x\right\|_{r}\leq n^{{\frac {1}{r}}-{\frac {1}{p}}}\left\|x\right\|_{p}.}$

This is a consequence of Hölder's inequality.
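The two-sided bound can be checked numerically on random vectors. This is only a sanity check under assumed sample sizes and exponent pairs, not a proof; the helper names are hypothetical.

```python
import random

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for p > 0."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def check_norm_chain(x, r, p, tol=1e-9):
    """True if ||x||_p <= ||x||_r <= n^(1/r - 1/p) ||x||_p, for 0 < r < p."""
    n = len(x)
    lower = p_norm(x, p)
    middle = p_norm(x, r)
    upper = n ** (1.0 / r - 1.0 / p) * lower
    # A small relative tolerance absorbs floating-point rounding.
    return lower <= middle * (1 + tol) and middle <= upper * (1 + tol)

rng = random.Random(0)
vectors = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(100)]
pairs = [(0.5, 1), (1, 2), (2, 3), (1.5, 7)]
all_ok = all(check_norm_chain(x, r, p) for x in vectors for r, p in pairs)
print(all_ok)
```

The upper bound is attained by constant vectors: for x = (1, 1, 1, 1), r = 1 and p = 2, both sides of the right-hand inequality equal 4.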

### When 0 < p < 1

In Rn for n > 1, the formula

${\displaystyle \|x\|_{p}=\left(|x_{1}|^{p}+|x_{2}|^{p}+\cdots +|x_{n}|^{p}\right)^{1/p}}$

defines an absolutely homogeneous function for 0 < p < 1; however, the resulting function does not define a norm, because it is not subadditive. On the other hand, the formula

${\displaystyle |x_{1}|^{p}+|x_{2}|^{p}+\dotsb +|x_{n}|^{p}}$

defines a subadditive function at the cost of losing absolute homogeneity. It does define an F-norm, though, which is homogeneous of degree p.

Hence, the function

${\displaystyle d_{p}(x,y)=\sum _{i=1}^{n}|x_{i}-y_{i}|^{p}}$

defines a metric. The metric space (Rn, dp) is denoted by ${\displaystyle \ell _{n}^{p}}$.

Although the p-unit ball ${\displaystyle B_{n}^{p}}$ around the origin in this metric is "concave", the topology defined on Rn by the metric dp is the usual vector space topology of Rn, hence ${\displaystyle \ell _{n}^{p}}$ is a locally convex topological vector space. Beyond this qualitative statement, a quantitative way to measure the lack of convexity of ${\displaystyle \ell _{n}^{p}}$ is to denote by Cp(n) the smallest constant C such that the multiple ${\displaystyle C\,B_{n}^{p}}$ of the p-unit ball contains the convex hull of ${\displaystyle B_{n}^{p}}$, equal to ${\displaystyle B_{n}^{1}}$. The fact that for fixed p < 1 we have

${\displaystyle C_{p}(n)=n^{{\frac {1}{p}}-1}\to \infty ,\quad {\text{as }}n\to \infty }$

shows that the infinite-dimensional sequence space ℓp defined below is no longer locally convex.
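A short Python sketch makes these statements concrete for n = 2 and p = 1/2. The function names are illustrative assumptions. It shows that the quasi-norm violates the triangle inequality on the standard basis vectors, that d_p does not, and that the midpoint of the basis vectors (a point of the convex hull of the p-unit ball) has quasi-norm n^(1/p − 1) = 2.

```python
def quasi_norm(x, p):
    """(sum |x_i|^p)^(1/p); only a quasi-norm when 0 < p < 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def d_p(x, y, p):
    """sum |x_i - y_i|^p, a metric for 0 < p < 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y))

p = 0.5
e1, e2, zero = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
total = [a + b for a, b in zip(e1, e2)]
midpoint = [0.5 * (a + b) for a, b in zip(e1, e2)]

# Triangle inequality fails for the quasi-norm: left side exceeds right side.
print(quasi_norm(total, p), quasi_norm(e1, p) + quasi_norm(e2, p))
# It holds for the metric d_p (here with equality).
print(d_p(e1, e2, p), d_p(e1, zero, p) + d_p(zero, e2, p))
# The midpoint lies in the convex hull of the unit ball but far outside the ball.
print(quasi_norm(midpoint, p), 2 ** (1.0 / p - 1))
```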

### When p = 0

There is one ℓ0 norm and another function called the ℓ0 "norm" (with quotation marks).

The mathematical definition of the ℓ0 norm was established by Banach's Theory of Linear Operations. The space of sequences has a complete metric topology provided by the F-norm

${\displaystyle (x_{n})\mapsto \sum _{n}2^{-n}{\frac {|x_{n}|}{1+|x_{n}|}},}$

which is discussed by Stefan Rolewicz in Metric Linear Spaces. [1] The ℓ0-normed space is studied in functional analysis, probability theory, and harmonic analysis.

Another function, called the ℓ0 "norm" by David Donoho (whose quotation marks warn that this function is not a proper norm), is the number of non-zero entries of the vector x. Many authors abuse terminology by omitting the quotation marks. Defining ${\displaystyle 0^{0}=0}$, the zero "norm" of x is equal to

${\displaystyle |x_{1}|^{0}+|x_{2}|^{0}+\cdots +|x_{n}|^{0}.}$

This is not a norm because it is not homogeneous. For example, scaling the vector x by a positive constant does not change the "norm". Despite these defects as a mathematical norm, the non-zero counting "norm" has uses in scientific computing, information theory, and statistics, notably in compressed sensing in signal processing and computational harmonic analysis. Although the function is not a norm, the associated metric, known as Hamming distance, is a valid distance, since homogeneity is not required for distances.
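The counting "norm" and the associated Hamming distance are easy to sketch in Python. The function names below are illustrative assumptions.

```python
def l0_count(x):
    """Number of non-zero entries of x (the zero "norm")."""
    return sum(1 for v in x if v != 0)

def hamming_distance(x, y):
    """Number of positions where x and y differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

x = [0, 3, 0, -2]
y = [1, 3, 0, 0]
scaled = [10 * v for v in x]

print(l0_count(x), l0_count(scaled))  # scaling leaves the count unchanged
print(hamming_distance(x, y))         # equals the count of x - y
```

The first line of output illustrates the failure of homogeneity: multiplying x by 10 does not change the count.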

### p-Entropy on finite dimensional vector spaces having a p-norm

The p-entropy of a vector x is defined by

${\displaystyle {\mathcal {S}}_{p}(x)=-p^{2}{\frac {d}{dp}}\log(\|x\|_{p}),\quad p>0,\ \|x\|_{p}\neq 0.}$

1-Entropy was introduced by Josiah Willard Gibbs as the entropy of statistical thermodynamics, and again by Claude Shannon in information theory. I. Hirschman conjectured that the 2-entropy sum of a vector and its Fourier transform is bounded below by ${\displaystyle 1-\log(2)}$; the conjecture was proved by William Beckner. The definition of p-entropy as a derivative is suggested by Hirschman's work, and clarifies the relationship between position-momentum uncertainty and entropy. For p-norm spheres animated using p as time, entropy becomes ${\displaystyle {-p^{2}}}$ multiplied by the sphere's radial velocity divided by the sphere's fixed radius. On unit spheres, entropy is a ${\displaystyle (-p^{2})}$ multiple of radial velocity. Throughout, ${\displaystyle \log(t)}$ denotes the natural logarithm with base ${\displaystyle e}$, and the vector space dimension is ${\displaystyle D}$.

• Homogeneity of the norm ${\displaystyle \|c\cdot x\|_{p}=|c|\cdot \|x\|_{p}}$ causes ${\displaystyle {\mathcal {S}}_{p}(c\cdot x)={\mathcal {S}}_{p}(x)}$ for any scalar c which is independent of p. ${\displaystyle {\mathcal {S}}_{p}(x)}$ isn't continuous at the null vector.
• Vectors whose non-zero components all have the same absolute value have p-independent entropy equal to ln(n), where n is the count of non-zero components. Any uniform vector of absolute values is a critical point of entropy, with p-static entropy and zero animation velocity when ${\displaystyle n=1}$.
• Setting ${\displaystyle 0\leq \iota (t)=-t\log(t)\leq e^{-1},\ 0<t\leq 1}$, entropy becomes ${\displaystyle 0\leq {\mathcal {S}}_{p}(x)=\sum _{k=0}^{D-1}\iota \left(\left({\frac {|x_{k}|}{\|x\|_{p}}}\right)^{p}\right)}$ (this agrees with the value ln(n) for uniform vectors). Thus ${\displaystyle \|x\|_{p}}$ is non-increasing in p. Often entropy is only defined for points on the unit sphere. Note: ${\displaystyle \iota (t)}$ is continuous at zero.
• ${\displaystyle \iota (e^{-1})=e^{-1}}$ is the maximum of ${\displaystyle \iota (t),~0\leq t}$. The uniform vector with all components equal to ${\displaystyle e^{-1}}$ is the location of the unconstrained maximum of ${\displaystyle \sum _{k}\iota (|x_{k}|)}$.
• Since ${\displaystyle {\frac {d}{dp}}{\mathcal {S}}_{p}(x)\leq 0}$, the entropy is non-increasing in p. Uniform vectors are local entropy extrema and the uniform vector without zero components is a global maximum, so ${\displaystyle 0\leq {\mathcal {S}}_{p}(x)\leq \log(D)}$, a bound also established by Gibbs.
• Since ${\displaystyle {\frac {d}{dp}}{\mathcal {S}}_{p}(x)\leq 0}$ the rate of inflation of animated spheres is non-increasing, whence ${\displaystyle {\frac {d}{dp}}{\mathcal {S}}_{\frac {1}{p}}(x)\geq 0}$ shows radial deflation on the ${\displaystyle \|x\|_{\frac {1}{p}}=1}$ unit sphere non-decreasing with p.
• The function ${\displaystyle \log \left(\|x\|_{\frac {1}{p}}\right)}$ is convex, so that adjacent secants bound the tangent line at their shared endpoint, yielding ${\displaystyle {\frac {1}{{\frac {1}{p}}-{\frac {1}{r}}}}\log \left({\frac {\|x\|_{p}}{\|x\|_{r}}}\right)\leq {\mathcal {S}}_{p}(x)\leq {\frac {1}{{\frac {1}{o}}-{\frac {1}{p}}}}\log \left({\frac {\|x\|_{o}}{\|x\|_{p}}}\right),~0<o<p<r}$. Special case: ${\displaystyle p\log \left({\frac {\|x\|_{p}}{\|x\|_{\infty }}}\right)\leq {\mathcal {S}}_{p}(x)}$, so that ${\displaystyle {\mathcal {S}}_{p}(x)=0}$ if and only if ${\displaystyle \|x\|_{p}=\|x\|_{\infty }}$.
• The Fourier transform carries any zero entropy vector to a uniform vector of maximum entropy ${\displaystyle \log(D)}$, in which case the sum of entropies of any zero entropy vector and its image is positive, regardless of the value of p. In the following, ${\displaystyle {\mathcal {T}}(x)}$ is a vector function which maps zero entropy vectors to non-zero entropy vectors; the Fourier transform is the archetype. The collection of all vectors with components ${\displaystyle |x_{k}|\leq 1}$ is a sequentially compact set in the Euclidean metric, as is its closed subset of vectors on the unit 2-norm sphere; homogeneity shows that any possible entropy is attainable on a sphere. Set ${\displaystyle 0<\sigma _{p,p}(x)={\mathcal {S}}_{p}(x)+{\mathcal {S}}_{p}({\mathcal {T}}(x))=-p^{2}{\frac {d}{dp}}\log \left(\|x\|_{p}\|{\mathcal {T}}(x)\|_{p}\right)}$. When ${\displaystyle {\mathcal {T}}}$ is also a continuous function and its image of any single non-zero component vector has non-zero entropy, then ${\displaystyle \sigma _{p,p}(x)}$ has a positive minimum and maximum at some non-null vectors on the 2-norm unit sphere, by the extreme value theorem.
• When ${\displaystyle {\frac {1}{p}}+{\frac {1}{q}}=1}$, then ${\displaystyle 0<\sigma _{q,p}(x)={\mathcal {S}}_{q}(x)+{\mathcal {S}}_{p}({\mathcal {T}}(x))=-p^{2}{\frac {d}{dp}}\log \left({\frac {\|{\mathcal {T}}(x)\|_{p}}{\|x\|_{q}}}\right)}$, so that ${\displaystyle \sigma _{q,p}}$ reaches its extreme values at some vectors on the 2-norm unit sphere. Integration yields a bound on the product of p-norms. This finite dimensional result is similar to I. Hirschman's 2-norm conjecture.
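Carrying out the differentiation in the definition of the p-entropy gives the Shannon entropy of the weights w_k = |x_k|^p / Σ_j |x_j|^p. The Python sketch below compares that closed form with a central finite difference of −p² d/dp log ‖x‖_p. The function names, the step size `h`, and the sample vectors are assumptions made for this check.

```python
import math

def log_p_norm(x, p):
    """log ||x||_p for p > 0 and x != 0."""
    return math.log(sum(abs(v) ** p for v in x)) / p

def entropy_closed_form(x, p):
    """Shannon entropy of the weights w_k = |x_k|^p / sum_j |x_j|^p."""
    total = sum(abs(v) ** p for v in x)
    weights = [abs(v) ** p / total for v in x if v != 0]
    return -sum(w * math.log(w) for w in weights)

def entropy_by_derivative(x, p, h=1e-5):
    """-p^2 d/dp log ||x||_p, approximated by a central finite difference."""
    slope = (log_p_norm(x, p + h) - log_p_norm(x, p - h)) / (2 * h)
    return -p * p * slope

x = [3.0, -4.0, 1.0]
uniform = [2.0, -2.0, 2.0, 2.0]
for p in (0.5, 1.0, 2.0, 5.0):
    print(p, entropy_closed_form(x, p), entropy_by_derivative(x, p))
print(entropy_closed_form(uniform, 3.0), math.log(4))
```

For the uniform vector the entropy equals log(4) for every p, and for the non-uniform vector the printed entropies decrease as p grows.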

## The p-norm in infinite dimensions and ℓp spaces

### The sequence space ℓp

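The p-norm can be extended to vectors that have an infinite number of components (sequences), which yields the space ℓp. This contains as special cases ℓ1, the space of sequences whose series is absolutely convergent; ℓ2, the space of square-summable sequences, which is a Hilbert space; and ℓ∞, the space of bounded sequences.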

The space of sequences has a natural vector space structure by applying addition and scalar multiplication coordinate by coordinate. Explicitly, the vector sum and the scalar action for infinite sequences of real (or complex) numbers are given by:

${\displaystyle {\begin{aligned}&(x_{1},x_{2},\ldots ,x_{n},x_{n+1},\ldots )+(y_{1},y_{2},\ldots ,y_{n},y_{n+1},\ldots )\\={}&(x_{1}+y_{1},x_{2}+y_{2},\ldots ,x_{n}+y_{n},x_{n+1}+y_{n+1},\ldots ),\\[6pt]&\lambda \cdot \left(x_{1},x_{2},\ldots ,x_{n},x_{n+1},\ldots \right)\\={}&(\lambda x_{1},\lambda x_{2},\ldots ,\lambda x_{n},\lambda x_{n+1},\ldots ).\end{aligned}}}$

Define the p-norm:

${\displaystyle \left\|x\right\|_{p}=\left(|x_{1}|^{p}+|x_{2}|^{p}+\cdots +|x_{n}|^{p}+|x_{n+1}|^{p}+\cdots \right)^{1/p}}$

Here, a complication arises, namely that the series on the right is not always convergent, so for example, the sequence made up of only ones, (1, 1, 1, ...), will have an infinite p-norm for 1 ≤ p < ∞. The space ℓp is then defined as the set of all infinite sequences of real (or complex) numbers such that the p-norm is finite.

One can check that as p increases, the set ℓp grows larger. For example, the sequence

${\displaystyle \left(1,{\frac {1}{2}},\ldots ,{\frac {1}{n}},{\frac {1}{n+1}},\ldots \right)}$

is not in ℓ1, but it is in ℓp for p > 1, as the series

${\displaystyle 1^{p}+{\frac {1}{2^{p}}}+\cdots +{\frac {1}{n^{p}}}+{\frac {1}{(n+1)^{p}}}+\cdots ,}$

diverges for p = 1 (the harmonic series), but is convergent for p > 1.
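The contrast between p = 1 and p > 1 shows up in partial sums. The sketch below is a numerical illustration only (partial sums cannot prove divergence); the function name and the cut-offs are assumptions.

```python
import math

def partial_sum(p, terms):
    """Sum of 1 / n^p for n = 1, ..., terms."""
    return sum(1.0 / n ** p for n in range(1, terms + 1))

for terms in (10, 1_000, 100_000):
    # p = 1 keeps growing (roughly like log(terms)); p = 2 settles near pi^2 / 6.
    print(terms, partial_sum(1, terms), partial_sum(2, terms))
print(math.pi ** 2 / 6)
```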

One also defines the ∞-norm using the supremum:

${\displaystyle \left\|x\right\|_{\infty }=\sup(|x_{1}|,|x_{2}|,\dotsc ,|x_{n}|,|x_{n+1}|,\ldots )}$

and the corresponding space ℓ∞ of all bounded sequences. It turns out that [2]

${\displaystyle \left\|x\right\|_{\infty }=\lim _{p\to \infty }\left\|x\right\|_{p}}$

if the right-hand side is finite, or the left-hand side is infinite. Thus, we will consider ℓp spaces for 1 ≤ p ≤ ∞.

The p-norm thus defined on ℓp is indeed a norm, and ℓp together with this norm is a Banach space. The fully general Lp space is obtained, as seen below, by considering vectors, not only with finitely or countably-infinitely many components, but with "arbitrarily many components"; in other words, functions. An integral instead of a sum is used to define the p-norm.

### General ℓp-space

In complete analogy to the preceding definition one can define the space ${\displaystyle \ell ^{p}(I)}$ over a general index set ${\displaystyle I}$ (and ${\displaystyle 1\leq p<\infty }$) as

${\displaystyle \ell ^{p}(I)=\left\{(x_{i})_{i\in I}\in \mathbb {K} ^{I};\,\sum _{i\in I}|x_{i}|^{p}<\infty \right\}\,}$,

where convergence on the right means that only countably many summands are nonzero (see also Unconditional convergence). With the norm

${\displaystyle \left\|x\right\|_{p}=\left(\sum _{i\in I}|x_{i}|^{p}\right)^{1/p}}$

the space ${\displaystyle \ell ^{p}(I)}$ becomes a Banach space. In the case where ${\displaystyle I}$ is finite with ${\displaystyle n}$ elements, this construction yields Rn with the ${\displaystyle p}$-norm defined above. If ${\displaystyle I}$ is countably infinite, this is exactly the sequence space ${\displaystyle \ell ^{p}}$ defined above. For uncountable sets ${\displaystyle I}$ this is a non-separable Banach space which can be seen as the locally convex direct limit of ${\displaystyle \ell ^{p}}$-sequence spaces. [3]

The index set ${\displaystyle I}$ can be turned into a measure space by giving it the discrete σ-algebra and the counting measure. Then the space ${\displaystyle \ell ^{p}(I)}$ is just a special case of the more general ${\displaystyle L^{p}}$-space (see below).

## Lp spaces and Lebesgue integrals

An Lp space may be defined as a space of measurable functions for which the ${\displaystyle p}$-th power of the absolute value is Lebesgue integrable, where functions which agree almost everywhere are identified. More generally, let 1 ≤ p < ∞ and (S, Σ, μ) be a measure space. Consider the set of all measurable functions from S to C or R whose absolute value raised to the p-th power has a finite integral, or equivalently, that

${\displaystyle \|f\|_{p}\equiv \left(\int _{S}|f|^{p}\;\mathrm {d} \mu \right)^{1/p}<\infty }$

The set of such functions forms a vector space, with the following natural operations:

${\displaystyle {\begin{aligned}(f+g)(x)&=f(x)+g(x),\\(\lambda f)(x)&=\lambda f(x)\end{aligned}}}$

for every scalar λ.

That the sum of two p-th power integrable functions is again p-th power integrable follows from the inequality

${\displaystyle \|f+g\|_{p}^{p}\leq 2^{p-1}\left(\|f\|_{p}^{p}+\|g\|_{p}^{p}\right).}$

(This comes from the convexity of ${\displaystyle t\mapsto t^{p}}$ for ${\displaystyle p\geq 1}$.)

In fact, more is true. Minkowski's inequality says the triangle inequality holds for || · ||p. Thus the set of p-th power integrable functions, together with the function || · ||p, is a seminormed vector space, which is denoted by ${\displaystyle {\mathcal {L}}^{p}(S,\,\mu )}$.

For p = ∞, the space ${\displaystyle {\mathcal {L}}^{\infty }(S,\mu )}$ is the space of measurable functions bounded almost everywhere, with the essential supremum of its absolute value as a norm:

${\displaystyle \|f\|_{\infty }\equiv \inf\{C\geq 0:|f(x)|\leq C{\text{ for almost every }}x\}.}$

As in the discrete case, if there exists q < ∞ such that f ∈ L∞(S, μ) ∩ Lq(S, μ), then

${\displaystyle \|f\|_{\infty }=\lim _{p\to \infty }\|f\|_{p}.}$
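As a numerical illustration, take f(x) = x on [0, 1] with Lebesgue measure, for which ‖f‖_p = (p + 1)^(−1/p) and the essential supremum is 1. The sketch below approximates the integral with a midpoint rule; the function name, the grid size `n`, and the choice of f are assumptions made for this example.

```python
def lp_norm_on_unit_interval(f, p, n=10_000):
    """Midpoint-rule approximation of (integral_0^1 |f|^p dx)^(1/p)."""
    h = 1.0 / n
    integral = sum(abs(f((i + 0.5) * h)) ** p for i in range(n)) * h
    return integral ** (1.0 / p)

def identity(x):
    return x

for p in (1, 2, 10, 50, 200):
    approx = lp_norm_on_unit_interval(identity, p)
    exact = (p + 1) ** (-1.0 / p)
    print(p, approx, exact)
```

The values increase toward 1 as p grows, consistent with the limit formula for the essential supremum.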

${\displaystyle {\mathcal {L}}^{p}(S,\,\mu )}$ can be made into a normed vector space in a standard way; one simply takes the quotient space with respect to the kernel of || · ||p. Since for any measurable function f, we have that ||f||p = 0 if and only if f = 0 almost everywhere, the kernel of || · ||p does not depend upon p,

${\displaystyle {\mathcal {N}}\equiv \{f:f=0\ \mu {\text{-almost everywhere}}\}=\ker(\|\cdot \|_{p})\qquad \forall \ 1\leq p<\infty }$

In the quotient space, two functions f and g are identified if f = g almost everywhere. The resulting normed vector space is, by definition,

${\displaystyle L^{p}(S,\mu )\equiv {\mathcal {L}}^{p}(S,\mu )/{\mathcal {N}}}$

In general, this process cannot be reversed: there is no consistent way to define a "canonical" representative of each coset of ${\displaystyle {\mathcal {N}}}$ in ${\displaystyle L^{p}}$. For ${\displaystyle L^{\infty }}$, however, there is a theory of lifts enabling such recovery.

When the underlying measure space S is understood, Lp(S, μ) is often abbreviated Lp(μ), or just Lp.

For 1 ≤ p ≤ ∞, Lp(S, μ) is a Banach space. The fact that Lp is complete is often referred to as the Riesz–Fischer theorem, and can be proven using the convergence theorems for Lebesgue integrals.

The above definitions generalize to Bochner spaces.

### Special cases

Similar to the ℓp spaces, L2 is the only Hilbert space among Lp spaces. In the complex case, the inner product on L2 is defined by

${\displaystyle \langle f,g\rangle =\int _{S}f(x){\overline {g(x)}}\,\mathrm {d} \mu (x)}$

The additional inner product structure allows for a richer theory, with applications to, for instance, Fourier series and quantum mechanics. Functions in L2 are sometimes called quadratically integrable functions, square-integrable functions or square-summable functions, but sometimes these terms are reserved for functions that are square-integrable in some other sense, such as in the sense of a Riemann integral (Titchmarsh 1976).

If we use complex-valued functions, the space L∞ is a commutative C*-algebra with pointwise multiplication and conjugation. For many measure spaces, including all sigma-finite ones, it is in fact a commutative von Neumann algebra. An element of L∞ defines a bounded operator on any Lp space by multiplication.

For 1 ≤ p ≤ ∞ the ℓp spaces are a special case of Lp spaces, when S = N and μ is the counting measure on N. More generally, if one considers any set S with the counting measure, the resulting Lp space is denoted ℓp(S). For example, the space ℓp(Z) is the space of all sequences indexed by the integers, and when defining the p-norm on such a space, one sums over all the integers. The space ℓp(n), where n is the set with n elements, is Rn with its p-norm as defined above. As any Hilbert space, every space L2 is linearly isometric to a suitable ℓ2(I), where the cardinality of the set I is the cardinality of an arbitrary Hilbertian basis for this particular L2.

## Properties of Lp spaces

### Dual spaces

The dual space (the Banach space of all continuous linear functionals) of Lp(μ) for 1 < p < ∞ has a natural isomorphism with Lq(μ), where q is such that 1/p + 1/q = 1 (i.e. q = p/(p − 1)). This isomorphism associates g ∈ Lq(μ) with the functional κp(g) ∈ Lp(μ)∗ defined by

${\displaystyle f\mapsto \kappa _{p}(g)(f)=\int fg\,\mathrm {d} \mu \ }$ for every ${\displaystyle f\in L^{p}(\mu )}$

The fact that κp(g) is well defined and continuous follows from Hölder's inequality. κp : Lq(μ) → Lp(μ)∗ is a linear mapping which is an isometry by the extremal case of Hölder's inequality. It is also possible to show (for example with the Radon–Nikodym theorem, see [4]) that any G ∈ Lp(μ)∗ can be expressed this way: i.e., that κp is onto. Since κp is onto and isometric, it is an isomorphism of Banach spaces. With this (isometric) isomorphism in mind, it is usual to say simply that Lq is the dual Banach space of Lp.
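The isometry statement can be illustrated on a finite set with the counting measure, where the integral is a finite sum. In the sketch below, `extremal_f` builds the standard equality case of Hölder's inequality, f = sign(g)|g|^(q−1) / ‖g‖_q^(q−1), which has unit p-norm and pairs with g to give exactly ‖g‖_q. The function names and sample vectors are assumptions, and g is assumed to be non-zero.

```python
import math

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def pairing(f, g):
    """The functional kappa_p(g) applied to f: sum of f_i * g_i."""
    return sum(a * b for a, b in zip(f, g))

def extremal_f(g, q):
    """Unit vector in the p-norm (1/p + 1/q = 1) attaining equality in Hoelder."""
    scale = p_norm(g, q) ** (q - 1)
    return [math.copysign(abs(v) ** (q - 1), v) / scale for v in g]

p = 3.0
q = p / (p - 1)            # conjugate exponent, here 1.5
g = [1.0, -2.0, 3.0]
f = extremal_f(g, q)
h = [0.5, 0.5, -1.0]       # an arbitrary test vector

print(p_norm(f, p))                       # close to 1
print(pairing(f, g), p_norm(g, q))        # the two values agree
print(abs(pairing(h, g)), p_norm(h, p) * p_norm(g, q))  # Hoelder bound
```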

For 1 < p < ∞, the space Lp(μ) is reflexive. Let κp be as above and let κq : Lp(μ) → Lq(μ)∗ be the corresponding linear isometry. Consider the map from Lp(μ) to Lp(μ)∗∗, obtained by composing κq with the transpose (or adjoint) of the inverse of κp:

${\displaystyle j_{p}:L^{p}(\mu )\mathrel {\overset {\kappa _{q}}{\longrightarrow }} L^{q}(\mu )^{*}\mathrel {\overset {\left(\kappa _{p}^{-1}\right)^{*}}{\longrightarrow }} L^{p}(\mu )^{**}}$

This map coincides with the canonical embedding J of Lp(μ) into its bidual. Moreover, the map jp is onto, as composition of two onto isometries, and this proves reflexivity.

If the measure μ on S is sigma-finite, then the dual of L1(μ) is isometrically isomorphic to L∞(μ) (more precisely, the map κ1 corresponding to p = 1 is an isometry from L∞(μ) onto L1(μ)∗).

The dual of L∞ is subtler. Elements of L∞(μ)∗ can be identified with bounded signed finitely additive measures on S that are absolutely continuous with respect to μ. See ba space for more details. If we assume the axiom of choice, this space is much bigger than L1(μ) except in some trivial cases. However, Saharon Shelah proved that there are relatively consistent extensions of Zermelo–Fraenkel set theory (ZF + DC + "Every subset of the real numbers has the Baire property") in which the dual of ℓ∞ is ℓ1. [5]

### Embeddings

Colloquially, if 1 ≤ p < q ≤ ∞, then Lp(S, μ) contains functions that are more locally singular, while elements of Lq(S, μ) can be more spread out. Consider the Lebesgue measure on the half line (0, ∞). A continuous function in L1 might blow up near 0 but must decay sufficiently fast toward infinity. On the other hand, continuous functions in L∞ need not decay at all but no blow-up is allowed. The precise technical result is the following. [6] Suppose that 0 < p < q ≤ ∞. Then:

1. Lq(S, μ) ⊂ Lp(S, μ) iff S does not contain sets of finite but arbitrarily large measure, and
2. Lp(S, μ) ⊂ Lq(S, μ) iff S does not contain sets of non-zero but arbitrarily small measure.

Neither condition holds for the real line with the Lebesgue measure. In both cases the embedding is continuous, in that the identity operator is a bounded linear map from Lq to Lp in the first case, and Lp to Lq in the second. (This is a consequence of the closed graph theorem and properties of Lp spaces.) Indeed, if the domain S has finite measure, one can make the following explicit calculation using Hölder's inequality

${\displaystyle \ \|\mathbf {1} f^{p}\|_{1}\leq \|\mathbf {1} \|_{q/(q-p)}\|f^{p}\|_{q/p}}$

${\displaystyle \ \|f\|_{p}\leq \mu (S)^{1/p-1/q}\|f\|_{q}}$.

The constant appearing in the above inequality is optimal, in the sense that the operator norm of the identity I : Lq(S, μ) → Lp(S, μ) is precisely

${\displaystyle \|I\|_{q,p}=\mu (S)^{1/p-1/q}}$

the case of equality being achieved exactly when f = 1 μ-almost-everywhere.
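The finite-measure bound can be checked on a small discrete measure space, where each point carries a positive weight and the integral is a weighted sum. The weights, the test function, and the helper name `weighted_norm` are assumptions for this sketch.

```python
def weighted_norm(f, mu, p):
    """(sum |f_i|^p * mu_i)^(1/p): the Lp norm on a finite weighted set."""
    return sum(abs(v) ** p * m for v, m in zip(f, mu)) ** (1.0 / p)

mu = [0.5, 1.5, 2.0]        # point masses; total measure mu(S) = 4
p, q = 1.0, 2.0
constant = sum(mu) ** (1.0 / p - 1.0 / q)   # mu(S)^(1/p - 1/q) = 2

f = [1.0, -3.0, 2.0]
ones = [1.0, 1.0, 1.0]

# General f: the p-norm is bounded by the constant times the q-norm.
print(weighted_norm(f, mu, p), constant * weighted_norm(f, mu, q))
# Constant function 1: the bound holds with equality.
print(weighted_norm(ones, mu, p), constant * weighted_norm(ones, mu, q))
```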

### Dense subspaces

Throughout this section we assume that: 1 ≤ p < ∞.

Let (S, Σ, μ) be a measure space. An integrable simple function f on S is one of the form

${\displaystyle f=\sum _{j=1}^{n}a_{j}\mathbf {1} _{A_{j}}}$

where aj is scalar, Aj ∈ Σ has finite measure and ${\displaystyle {\mathbf {1} }_{A_{j}}}$ is the indicator function of the set ${\displaystyle A_{j}}$, for j = 1, ..., n. By construction of the integral, the vector space of integrable simple functions is dense in Lp(S, Σ, μ).

More can be said when S is a normal topological space and Σ its Borel σ-algebra, i.e., the smallest σ-algebra of subsets of S containing the open sets.

Suppose V ⊂ S is an open set with μ(V) < ∞. It can be proved that for every Borel set A ∈ Σ contained in V, and for every ε > 0, there exist a closed set F and an open set U such that

${\displaystyle F\subset A\subset U\subset V\quad {\text{and}}\quad \mu (U)-\mu (F)=\mu (U\setminus F)<\varepsilon }$

It follows that there exists a continuous Urysohn function 0 ≤ φ ≤ 1 on S that is 1 on F and 0 on S ∖ U, with

${\displaystyle \int _{S}|\mathbf {1} _{A}-\varphi |\,\mathrm {d} \mu <\varepsilon \ .}$

If S can be covered by an increasing sequence (Vn) of open sets that have finite measure, then the space of p-integrable continuous functions is dense in Lp(S, Σ, μ). More precisely, one can use bounded continuous functions that vanish outside one of the open sets Vn.

This applies in particular when S = Rd and when μ is the Lebesgue measure. The space of continuous and compactly supported functions is dense in Lp(Rd). Similarly, the space of integrable step functions is dense in Lp(Rd); this space is the linear span of indicator functions of bounded intervals when d = 1, of bounded rectangles when d = 2 and more generally of products of bounded intervals.

Several properties of general functions in Lp(Rd) are first proved for continuous and compactly supported functions (sometimes for step functions), then extended by density to all functions. For example, it is proved this way that translations are continuous on Lp(Rd), in the following sense:

${\displaystyle \forall f\in L^{p}\left(\mathbf {R} ^{d}\right):\quad \left\|\tau _{t}f-f\right\|_{p}\to 0,\quad {\text{as }}\mathbf {R} ^{d}\ni t\to 0,}$

where

${\displaystyle (\tau _{t}f)(x)=f(x-t).}$

## Lp (0 < p < 1)

Let (S, Σ, μ) be a measure space. If 0 < p < 1, then Lp(μ) can be defined as above: it is the vector space of those measurable functions f such that

${\displaystyle N_{p}(f)=\int _{S}|f|^{p}\,d\mu <\infty .}$

As before, we may introduce the p-norm ${\displaystyle \|f\|_{p}=N_{p}(f)^{1/p}}$, but || · ||p does not satisfy the triangle inequality in this case, and defines only a quasi-norm. The inequality ${\displaystyle (a+b)^{p}\leq a^{p}+b^{p}}$, valid for a, b ≥ 0, implies that (Rudin 1991, §1.47)

${\displaystyle N_{p}(f+g)\leq N_{p}(f)+N_{p}(g)}$

and so the function

${\displaystyle d_{p}(f,g)=N_{p}(f-g)=\|f-g\|_{p}^{p}}$

is a metric on Lp(μ). The resulting metric space is complete; the verification is similar to the familiar case when p ≥ 1.

In this setting Lp satisfies a reverse Minkowski inequality, that is for u, v in Lp

${\displaystyle {\Big \|}|u|+|v|{\Big \|}_{p}\geq \|u\|_{p}+\|v\|_{p}}$
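For the counting measure on a finite set, the reverse Minkowski inequality can be spot-checked on random vectors. This sketch is a numerical sanity check under assumed sample sizes and exponents, not a proof; the helper names are hypothetical.

```python
import random

def quasi_norm(x, p):
    """(sum |x_i|^p)^(1/p) for 0 < p < 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def reverse_minkowski_holds(u, v, p, tol=1e-9):
    """Check || |u| + |v| ||_p >= ||u||_p + ||v||_p, up to rounding."""
    lhs = quasi_norm([abs(a) + abs(b) for a, b in zip(u, v)], p)
    rhs = quasi_norm(u, p) + quasi_norm(v, p)
    return lhs >= rhs * (1 - tol)

rng = random.Random(1)
samples = [
    ([rng.uniform(-1, 1) for _ in range(4)], [rng.uniform(-1, 1) for _ in range(4)])
    for _ in range(200)
]
all_ok = all(
    reverse_minkowski_holds(u, v, p) for u, v in samples for p in (0.25, 0.5, 0.9)
)
print(all_ok)
```

A simple instance is u = (1, 0), v = (0, 1) with p = 1/2, where the left-hand side is 4 and the right-hand side is 2.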

This result may be used to prove Clarkson's inequalities, which are in turn used to establish the uniform convexity of the spaces Lp for 1 < p < ∞ (Adams & Fournier 2003).

The space Lp for 0 < p < 1 is an F-space: it admits a complete translation-invariant metric with respect to which the vector space operations are continuous. It is also locally bounded, much like the case p ≥ 1. It is the prototypical example of an F-space that, for most reasonable measure spaces, is not locally convex: in ℓp or Lp([0, 1]), every open convex set containing the 0 function is unbounded for the p-quasi-norm; therefore, the 0 vector does not possess a fundamental system of convex neighborhoods. Specifically, this is true if the measure space S contains an infinite family of disjoint measurable sets of finite positive measure.

The only nonempty convex open set in Lp([0, 1]) is the entire space (Rudin 1991, §1.47). As a particular consequence, there are no nonzero linear functionals on Lp([0, 1]): the dual space is the zero space. In the case of the counting measure on the natural numbers (producing the sequence space Lp(μ) = ℓp), the bounded linear functionals on ℓp are exactly those that are bounded on ℓ1, namely those given by sequences in ℓ∞. Although ℓp does contain non-trivial convex open sets, it fails to have enough of them to give a base for the topology.

The situation of having no linear functionals is highly undesirable for the purposes of doing analysis. In the case of the Lebesgue measure on Rn, rather than work with Lp for 0 < p < 1, it is common to work with the Hardy space Hp whenever possible, as this has quite a few linear functionals: enough to distinguish points from one another. However, the Hahn–Banach theorem still fails in Hp for p < 1 (Duren 1970, §7.5).

### L0, the space of measurable functions

The vector space of (equivalence classes of) measurable functions on (S, Σ, μ) is denoted L0(S, Σ, μ) (Kalton, Peck & Roberts 1984). By definition, it contains all the Lp, and is equipped with the topology of convergence in measure. When μ is a probability measure (i.e., μ(S) = 1), this mode of convergence is named convergence in probability.

The description is easier when μ is finite. If μ is a finite measure on (S, Σ), the 0 function admits for the convergence in measure the following fundamental system of neighborhoods

${\displaystyle V_{\varepsilon }={\Bigl \{}f:\mu {\bigl (}\{x:|f(x)|>\varepsilon \}{\bigr )}<\varepsilon {\Bigr \}},\qquad \varepsilon >0}$

The topology can be defined by any metric d of the form

${\displaystyle d(f,g)=\int _{S}\varphi {\bigl (}|f(x)-g(x)|{\bigr )}\,\mathrm {d} \mu (x)}$

where φ is bounded, continuous, concave and non-decreasing on [0, ∞), with φ(0) = 0 and φ(t) > 0 when t > 0 (for example, φ(t) = min(t, 1)). Such a metric is called a Lévy metric for L0. Under this metric the space L0 is complete (it is again an F-space). The space L0 is in general not locally bounded, and not locally convex.
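As a rough illustration of how this metric behaves, the following sketch evaluates d with φ(t) = min(t, 1) on a finite probability space. The setup is an assumption made for the example: N atoms of mass 1/N stand in for [0, 1], and the function names are hypothetical. The tall, thin spikes f_n (equal to n on a set of measure 1/n and 0 elsewhere) converge to 0 in measure, so d(f_n, 0) = 1/n tends to 0, even though their L1 distance to 0 stays equal to 1.

```python
def l0_metric(f, g, mu):
    """d(f, g) = sum_i phi(|f_i - g_i|) * mu_i with phi(t) = min(t, 1)."""
    return sum(min(abs(a - b), 1.0) * m for a, b, m in zip(f, g, mu))


def l1_distance(f, g, mu):
    """Ordinary L1 distance sum_i |f_i - g_i| * mu_i, for comparison."""
    return sum(abs(a - b) * m for a, b, m in zip(f, g, mu))


N = 1000              # number of atoms (an assumption), each of mass 1/N
mu = [1.0 / N] * N
zero = [0.0] * N

for n in (10, 100, 1000):
    # f_n equals n on the first N // n atoms (total measure 1/n), 0 elsewhere.
    f_n = [float(n) if i < N // n else 0.0 for i in range(N)]
    print(n, l0_metric(f_n, zero, mu), l1_distance(f_n, zero, mu))
```

Because φ is bounded by 1, the height of the spike does not matter to d; only the measure of the set where f_n differs noticeably from 0 does.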

For the infinite Lebesgue measure λ on Rn, the definition of the fundamental system of neighborhoods could be modified as follows

${\displaystyle W_{\varepsilon }=\left\{f:\lambda \left(\left\{x:|f(x)|>\varepsilon {\text{ and }}|x|<{\frac {1}{\varepsilon }}\right\}\right)<\varepsilon \right\}}$

The resulting space L0(Rn, λ) coincides as a topological vector space with L0(Rn, g(x)dλ(x)), for any positive λ-integrable density g.

## Generalizations and extensions

### Weak Lp

Let (S, Σ, μ) be a measure space, and f a measurable function with real or complex values on S. The distribution function of f is defined for t ≥ 0 by

${\displaystyle \lambda _{f}(t)=\mu \left\{x\in S:|f(x)|>t\right\}}$

If f is in Lp(S, μ) for some p with 1 ≤ p < ∞, then by Markov's inequality,

${\displaystyle \lambda _{f}(t)\leq {\frac {\|f\|_{p}^{p}}{t^{p}}}}$

A function f is said to be in the space weak Lp(S, μ), or Lp,w(S, μ), if there is a constant C > 0 such that, for all t > 0,

${\displaystyle \lambda _{f}(t)\leq {\frac {C^{p}}{t^{p}}}}$

The best constant C for this inequality is the Lp,w-norm of f, and is denoted by

${\displaystyle \|f\|_{p,w}=\sup _{t>0}~t\lambda _{f}^{1/p}(t).}$

The weak Lp coincide with the Lorentz spaces Lp,∞, so this notation is also used to denote them.

The Lp,w-norm is not a true norm, since the triangle inequality fails to hold. Nevertheless, for f in Lp(S, μ),

${\displaystyle \|f\|_{p,w}\leq \|f\|_{p}}$

and in particular Lp(S, μ) ⊂ Lp,w(S, μ).

In fact, one has

${\displaystyle \|f\|_{L^{p}}^{p}=\int |f(x)|^{p}\,d\mu (x)\geq \int _{\{|f(x)|>t\}}t^{p}\,d\mu (x)+\int _{\{|f(x)|\leq t\}}|f(x)|^{p}\,d\mu (x)\geq t^{p}\mu (\{|f|>t\}),}$

and raising to power 1/p and taking the supremum in t one has

${\displaystyle \|f\|_{L^{p}}\geq \sup _{t>0}t\;\mu (\{|f|>t\})^{1/p}=\|f\|_{L^{p,w}}.}$
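The inequality above, and the gap between the two quantities, can be checked on a finite measure space. The sketch below is an assumption-laden illustration: the function names are hypothetical and the measure is given by finitely many weighted atoms. Since the distribution function is then a step function, the supremum over t reduces to a maximum over the attained values v of |f|, with μ{|f| > t} replaced by μ{|f| ≥ v}. For f(k) = k^(−1/p) under the counting measure on {1, ..., N}, the weak quasi-norm stays at 1 (up to rounding) while the Lp norm grows with N.

```python
def lp_norm(values, weights, p):
    """(sum_i w_i |f_i|^p)^(1/p) on a finite measure space with atom masses w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def weak_lp_quasinorm(values, weights, p):
    """sup_{t>0} t * mu{|f| > t}^(1/p) for a function on finitely many atoms.

    The supremum is approached as t increases to an attained value v = |f(x)|,
    where mu{|f| > t} becomes mu{|f| >= v}.
    """
    pairs = sorted(((abs(v), w) for v, w in zip(values, weights)), reverse=True)
    best, mass = 0.0, 0.0
    for v, w in pairs:
        mass += w  # total mass of the atoms seen so far, all with |f| >= v
        best = max(best, v * mass ** (1.0 / p))
    return best


p = 2.0
for N in (10, 100, 1000):
    f = [k ** (-1.0 / p) for k in range(1, N + 1)]
    counting = [1.0] * N
    print(N, weak_lp_quasinorm(f, counting, p), lp_norm(f, counting, p))
```

In the limit N → ∞ this sequence lies in weak ℓp but not in ℓp, since the sum of 1/k diverges.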

Under the convention that two functions are equal if they are equal μ-almost everywhere, the spaces Lp,w are complete (Grafakos 2004).

For any 0 < r < p the expression

${\displaystyle |||f|||_{L^{p,\infty }}=\sup _{0<\mu (E)<\infty }\mu (E)^{-1/r+1/p}\left(\int _{E}|f|^{r}\,d\mu \right)^{1/r}}$

is comparable to the Lp,w-norm. Furthermore, in the case p > 1, this expression defines a norm if r = 1. Hence for p > 1 the weak Lp spaces are Banach spaces (Grafakos 2004).

A major result that uses the Lp,w-spaces is the Marcinkiewicz interpolation theorem, which has broad applications to harmonic analysis and the study of singular integrals.

### Weighted Lp spaces

As before, consider a measure space (S, Σ, μ). Let w : S → [0, ∞) be a measurable function. The w-weighted Lp space is defined as Lp(S, wdμ), where wdμ means the measure ν defined by

${\displaystyle \nu (A)\equiv \int _{A}w(x)\,\mathrm {d} \mu (x),\qquad A\in \Sigma ,}$

or, in terms of the Radon–Nikodym derivative, w = dν/dμ. The norm for Lp(S, wdμ) is explicitly

${\displaystyle \|u\|_{L^{p}(S,w\,\mathrm {d} \mu )}\equiv \left(\int _{S}w(x)|u(x)|^{p}\,\mathrm {d} \mu (x)\right)^{1/p}}$

As Lp spaces, the weighted spaces are nothing special, since Lp(S, wdμ) is equal to Lp(S, dν). But they are the natural framework for several results in harmonic analysis (Grafakos 2004); they appear, for example, in the Muckenhoupt theorem. For 1 < p < ∞, the classical Hilbert transform is defined on Lp(T, λ), where T denotes the unit circle and λ the Lebesgue measure, and the (nonlinear) Hardy–Littlewood maximal operator is bounded on Lp(Rn, λ). Muckenhoupt's theorem describes weights w such that the Hilbert transform remains bounded on Lp(T, wdλ) and the maximal operator on Lp(Rn, wdλ).
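The identification of Lp(S, wdμ) with Lp(S, dν) can be made concrete on a finite measure space. In the sketch below, the atom masses `mu`, the weight `w`, the function `u` and both helper names are made-up values and names chosen for illustration; the point is only that the weighted norm with respect to μ and the unweighted norm with respect to ν = wμ are the same number.

```python
def lp_norm(u, measure, p):
    """(sum_i |u_i|^p * measure_i)^(1/p) on a finite measure space."""
    return sum(m * abs(x) ** p for x, m in zip(u, measure)) ** (1.0 / p)


def weighted_lp_norm(u, w, mu, p):
    """(sum_i w_i * |u_i|^p * mu_i)^(1/p), the norm of L^p(S, w dmu)."""
    return sum(wi * abs(x) ** p * m for x, wi, m in zip(u, w, mu)) ** (1.0 / p)


mu = [0.5, 0.25, 0.25]   # base measure: mass of each atom (illustrative values)
w = [2.0, 0.0, 4.0]      # nonnegative weight; it may vanish on some atoms
u = [1.0, -3.0, 2.0]     # the function whose norm is computed

# nu is the measure with density w with respect to mu: nu({i}) = w_i * mu_i.
nu = [wi * m for wi, m in zip(w, mu)]

print(weighted_lp_norm(u, w, mu, 2.0), lp_norm(u, nu, 2.0))
```

Note that the value of u on the atom where w vanishes has no effect on the weighted norm, which is why elements of the weighted space are only determined ν-almost everywhere.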

### Lp spaces on manifolds

One may also define spaces Lp(M) on a manifold, called the intrinsic Lp spaces of the manifold, using densities.

### Vector-valued Lp spaces

Given a measure space (X, Σ, μ) and a locally convex space E, one may also define spaces of p-integrable E-valued functions in a number of ways. The most common of these are the spaces of Bochner-integrable and Pettis-integrable functions. Using the tensor product of locally convex spaces, these may be respectively defined as ${\displaystyle L_{\mu }^{p}\left(X,\Sigma ,\mu \right)\otimes _{\pi }E}$ and ${\displaystyle L_{\mu }^{p}\left(X,\Sigma ,\mu \right)\otimes _{\epsilon }E}$, where ${\displaystyle \otimes _{\pi }}$ and ${\displaystyle \otimes _{\epsilon }}$ respectively denote the projective and injective tensor products of locally convex spaces. When E is a nuclear space, Grothendieck showed that these two constructions are indistinguishable.

## Notes

1. Rolewicz, Stefan (1987), Functional analysis and control theory: Linear systems, Mathematics and its Applications (East European Series), 29 (Translated from the Polish by Ewa Bednarczuk ed.), Dordrecht; Warsaw: D. Reidel Publishing Co.; PWN – Polish Scientific Publishers, pp. xvi+524, doi:10.1007/978-94-015-7758-8, ISBN 90-277-2186-6, MR 0920371, OCLC 13064804
2. Maddox, I. J. (1988), Elements of Functional Analysis (2nd ed.), Cambridge: CUP, page 16
3. Dahmen, Rafael; Lukács, Gábor (2020), "Long colimits of topological groups I: Continuous maps and homeomorphisms", Topology and its Applications, 270, Example 2.14
4. Rudin, Walter (1980), Real and Complex Analysis (2nd ed.), New Delhi: Tata McGraw-Hill, ISBN 9780070542341, Theorem 6.16
5. Schechter, Eric (1997), Handbook of Analysis and its Foundations, London: Academic Press Inc. See Sections 14.77 and 27.44–47
6. Villani, Alfonso (1985), "Another note on the inclusion Lp(μ) ⊂ Lq(μ)", Amer. Math. Monthly, 92 (7): 485–487, doi:10.2307/2322503, JSTOR 2322503, MR 0801221
