Entropic uncertainty

Last updated

In quantum mechanics, information theory, and Fourier analysis, the entropic uncertainty or Hirschman uncertainty is defined as the sum of the temporal and spectral Shannon entropies. It turns out that Heisenberg's uncertainty principle can be expressed as a lower bound on the sum of these entropies. This is stronger than the usual statement of the uncertainty principle in terms of the product of standard deviations.

Contents

In 1957, [1] Hirschman considered a function f and its Fourier transform g such that

where the "≈" indicates convergence in L2, and normalized so that (by Plancherel's theorem),

He showed that for any such functions the sum of the Shannon entropies is non-negative,

A tighter bound,

was conjectured by Hirschman [1] and Everett, [2] proven in 1975 by W. Beckner [3] and in the same year interpreted as a generalized quantum mechanical uncertainty principle by Białynicki-Birula and Mycielski. [4] The equality holds in the case of Gaussian distributions. [5] Note, however, that the above entropic uncertainty function is distinctly different from the quantum Von Neumann entropy represented in phase space.

Sketch of proof

The proof of this tight inequality depends on the so-called (q, p)-norm of the Fourier transformation. (Establishing this norm is the most difficult part of the proof.)

From this norm, one is able to establish a lower bound on the sum of the (differential) Rényi entropies, Hα(|f|²)+Hβ(|g|²), where 1/α + 1/β = 2, which generalize the Shannon entropies. For simplicity, we consider this inequality only in one dimension; the extension to multiple dimensions is straightforward and can be found in the literature cited.

Babenko–Beckner inequality

The (q, p)-norm of the Fourier transform is defined to be [6]

where   and

In 1961, Babenko [7] found this norm for even integer values of q. Finally, in 1975, using Hermite functions as eigenfunctions of the Fourier transform, Beckner [3] proved that the value of this norm (in one dimension) for all q ≥ 2 is

Thus we have the Babenko–Beckner inequality that

Rényi entropy bound

From this inequality, an expression of the uncertainty principle in terms of the Rényi entropy can be derived. [6] [8]

Letting , 2α=p, and 2β=q, so that 1/α + 1/β = 2 and 1/2<α<1<β, we have

Squaring both sides and taking the logarithm, we get

Multiplying both sides by

reverses the sense of the inequality,

Rearranging terms, finally yields an inequality in terms of the sum of the Rényi entropies,

Note that this inequality is symmetric with respect to α and β: One no longer need assume that α<β; only that they are positive and not both one, and that 1/α + 1/β = 2. To see this symmetry, simply exchange the rôles of i and −i in the Fourier transform.

Shannon entropy bound

Taking the limit of this last inequality as α, β → 1 yields the less general Shannon entropy inequality,

valid for any base of logarithm, as long as we choose an appropriate unit of information, bit, nat, etc.

The constant will be different, though, for a different normalization of the Fourier transform, (such as is usually used in physics, with normalizations chosen so that ħ=1 ), i.e.,

In this case, the dilation of the Fourier transform absolute squared by a factor of 2π simply adds log(2π) to its entropy.

Entropy versus variance bounds

The Gaussian or normal probability distribution plays an important role in the relationship between variance and entropy: it is a problem of the calculus of variations to show that this distribution maximizes entropy for a given variance, and at the same time minimizes the variance for a given entropy. In fact, for any probability density function on the real line, Shannon's entropy inequality specifies:

where H is the Shannon entropy and V is the variance, an inequality that is saturated only in the case of a normal distribution.

Moreover, the Fourier transform of a Gaussian probability amplitude function is also Gaussian—and the absolute squares of both of these are Gaussian, too. This can then be used to derive the usual Robertson variance uncertainty inequality from the above entropic inequality, enabling the latter to be tighter than the former. That is (for ħ=1), exponentiating the Hirschman inequality and using Shannon's expression above,

Hirschman [1] explained that entropy—his version of entropy was the negative of Shannon's—is a "measure of the concentration of [a probability distribution] in a set of small measure." Thus a low or large negative Shannon entropy means that a considerable mass of the probability distribution is confined to a set of small measure.

Note that this set of small measure need not be contiguous; a probability distribution can have several concentrations of mass in intervals of small measure, and the entropy may still be low no matter how widely scattered those intervals are. This is not the case with the variance: variance measures the concentration of mass about the mean of the distribution, and a low variance means that a considerable mass of the probability distribution is concentrated in a contiguous interval of small measure.

To formalize this distinction, we say that two probability density functions and are equimeasurable if

where μ is the Lebesgue measure. Any two equimeasurable probability density functions have the same Shannon entropy, and in fact the same Rényi entropy, of any order. The same is not true of variance, however. Any probability density function has a radially decreasing equimeasurable "rearrangement" whose variance is less (up to translation) than any other rearrangement of the function; and there exist rearrangements of arbitrarily high variance, (all having the same entropy.)

See also

Related Research Articles

<span class="mw-page-title-main">Uncertainty principle</span> Foundational principle in quantum physics

The uncertainty principle, also known as Heisenberg's indeterminacy principle, is a fundamental concept in quantum mechanics. It states that there is a limit to the precision with which certain pairs of physical properties, such as position and momentum, can be simultaneously known. In other words, the more accurately one property is measured, the less accurately the other property can be known.

<span class="mw-page-title-main">Exponential distribution</span> Probability distribution

In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes it is found in various other contexts.

In mathematical analysis, Hölder's inequality, named after Otto Hölder, is a fundamental inequality between integrals and an indispensable tool for the study of Lp spaces.

In probability theory and statistics, the moment-generating function of a real-valued random variable is an alternative specification of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. However, not all random variables have moment-generating functions.

<span class="mw-page-title-main">Beta distribution</span> Probability distribution

In probability theory and statistics, the beta distribution is a family of continuous probability distributions defined on the interval [0, 1] or in terms of two positive parameters, denoted by alpha (α) and beta (β), that appear as exponents of the variable and its complement to 1, respectively, and control the shape of the distribution.

<span class="mw-page-title-main">Gamma distribution</span> Probability distribution

In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution, Erlang distribution, and chi-squared distribution are special cases of the gamma distribution. There are two equivalent parameterizations in common use:

  1. With a shape parameter and a scale parameter .
  2. With a shape parameter and an inverse scale parameter , called a rate parameter.

In the theory of stochastic processes, the Karhunen–Loève theorem, also known as the Kosambi–Karhunen–Loève theorem states that a stochastic process can be represented as an infinite linear combination of orthogonal functions, analogous to a Fourier series representation of a function on a bounded interval. The transformation is also known as Hotelling transform and eigenvector transform, and is closely related to principal component analysis (PCA) technique widely used in image processing and in data analysis in many fields.

<span class="mw-page-title-main">Stable distribution</span> Distribution of variables which satisfies a stability property under linear combinations

In probability theory, a distribution is said to be stable if a linear combination of two independent random variables with this distribution has the same distribution, up to location and scale parameters. A random variable is said to be stable if its distribution is stable. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it.

<span class="mw-page-title-main">Inverse-gamma distribution</span> Two-parameter family of continuous probability distributions

In probability theory and statistics, the inverse gamma distribution is a two-parameter family of continuous probability distributions on the positive real line, which is the distribution of the reciprocal of a variable distributed according to the gamma distribution.

In information theory, the Rényi entropy is a quantity that generalizes various notions of entropy, including Hartley entropy, Shannon entropy, collision entropy, and min-entropy. The Rényi entropy is named after Alfréd Rényi, who looked for the most general way to quantify information while preserving additivity for independent events. In the context of fractal dimension estimation, the Rényi entropy forms the basis of the concept of generalized dimensions.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

<span class="mw-page-title-main">Beta prime distribution</span> Probability distribution

In probability theory and statistics, the beta prime distribution is an absolutely continuous probability distribution. If has a beta distribution, then the odds has a beta prime distribution.

Differential entropy is a concept in information theory that began as an attempt by Claude Shannon to extend the idea of (Shannon) entropy, a measure of average (surprisal) of a random variable, to continuous probability distributions. Unfortunately, Shannon did not derive this formula, and rather just assumed it was the correct continuous analogue of discrete entropy, but it is not. The actual continuous version of discrete entropy is the limiting density of discrete points (LDDP). Differential entropy is commonly encountered in the literature, but it is a limiting case of the LDDP, and one that loses its fundamental association with discrete entropy.

In mathematics, a real or complex-valued function f on d-dimensional Euclidean space satisfies a Hölder condition, or is Hölder continuous, when there are real constants C ≥ 0, α > 0, such that

A ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables X and Y, the distribution of the random variable Z that is formed as the ratio Z = X/Y is a ratio distribution.

Inequalities are very important in the study of information theory. There are a number of different contexts in which these inequalities appear.

In mathematics, the Babenko–Beckner inequality (after Konstantin I. Babenko and William E. Beckner) is a sharpened form of the Hausdorff–Young inequality having applications to uncertainty principles in the Fourier analysis of Lp spaces. The (qp)-norm of the n-dimensional Fourier transform is defined to be

Khabibullin's conjecture is a conjecture in mathematics related to Paley's problem for plurisubharmonic functions and to various extremal problems in the theory of entire functions of several variables. The conjecture was named after its proposer, B. N. Khabibullin.

A product distribution is a probability distribution constructed as the distribution of the product of random variables having two other known distributions. Given two statistically independent random variables X and Y, the distribution of the random variable Z that is formed as the product is a product distribution.

In mathematics, Young's inequality for products is a mathematical inequality about the product of two numbers. The inequality is named after William Henry Young and should not be confused with Young's convolution inequality.

References

  1. 1 2 3 Hirschman, I. I. Jr. (1957), "A note on entropy", American Journal of Mathematics , 79 (1): 152–156, doi:10.2307/2372390, JSTOR   2372390.
  2. Hugh Everett, III. The Many-Worlds Interpretation of Quantum Mechanics: the theory of the universal wave function. Everett's Dissertation
  3. 1 2 Beckner, W. (1975), "Inequalities in Fourier analysis", Annals of Mathematics , 102 (6): 159–182, doi:10.2307/1970980, JSTOR   1970980, PMC   432369 , PMID   16592223.
  4. Bialynicki-Birula, I.; Mycielski, J. (1975), "Uncertainty Relations for Information Entropy in Wave Mechanics", Communications in Mathematical Physics , 44 (2): 129, Bibcode:1975CMaPh..44..129B, doi:10.1007/BF01608825, S2CID   122277352
  5. Ozaydin, Murad; Przebinda, Tomasz (2004). "An Entropy-based Uncertainty Principle for a Locally Compact Abelian Group" (PDF). Journal of Functional Analysis. Elsevier Inc. 215 (1): 241–252. doi: 10.1016/j.jfa.2003.11.008 . Retrieved 2011-06-23.
  6. 1 2 Bialynicki-Birula, I. (2006). "Formulation of the uncertainty relations in terms of the Rényi entropies". Physical Review A. 74 (5): 052101. arXiv: quant-ph/0608116 . Bibcode:2006PhRvA..74e2101B. doi:10.1103/PhysRevA.74.052101. S2CID   19123961.
  7. K.I. Babenko. An inequality in the theory of Fourier integrals. Izv. Akad. Nauk SSSR, Ser. Mat. 25 (1961) pp. 531542 English transl., Amer. Math. Soc. Transl. (2) 44, pp. 115-128
  8. H.P. Heinig and M. Smith, Extensions of the Heisenberg–Weil inequality. Internat. J. Math. & Math. Sci., Vol. 9, No. 1 (1986) pp. 185192.

Further reading