Disintegration theorem

Last updated July 14, 2024

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure-zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

Motivation

Consider the unit square $S=[0,1]\times [0,1]$ in the Euclidean plane $\mathbb {R} ^{2}$ . Consider the probability measure $\mu$ defined on $S$ by the restriction of two-dimensional Lebesgue measure $\lambda ^{2}$ to $S$ . That is, the probability of an event $E\subseteq S$ is simply the area of $E$ . We assume $E$ is a measurable subset of $S$ .

Consider a one-dimensional subset of $S$ such as the line segment $L_{x}=\{x\}\times [0,1]$. $L_{x}$ has $\mu$-measure zero, and since Lebesgue measure is complete, every subset of $L_{x}$ is a $\mu$-null set: $E\subseteq L_{x}\implies \mu (E)=0.$

While true, this is somewhat unsatisfying. It would be nice to say that $\mu$ "restricted to" $L_{x}$ is the one-dimensional Lebesgue measure $\lambda ^{1}$ , rather than the zero measure. The probability of a "two-dimensional" event $E$ could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" $E\cap L_{x}$ : more formally, if $\mu _{x}$ denotes one-dimensional Lebesgue measure on $L_{x}$ , then $\mu (E)=\int _{[0,1]}\mu _{x}(E\cap L_{x})\,\mathrm {d} x$ for any "nice" $E\subseteq S$ . The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.
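
The slicing formula above can be checked numerically. The following is a minimal sketch, not part of the theorem: it assumes the particular event $E=\{(x,y)\in S : y\leq x^{2}\}$ (a choice made here purely for illustration), for which the slice $E\cap L_{x}$ has one-dimensional measure $x^{2}$ and the area of $E$ is $1/3$. The function names are hypothetical.

```python
# Numerical sketch of mu(E) = integral over [0,1] of mu_x(E ∩ L_x) dx.
# Assumption (not from the article): E = {(x, y) in S : y <= x**2}.


def slice_measure(x: float) -> float:
    """One-dimensional Lebesgue measure of E ∩ L_x = {x} × [0, x**2]."""
    return x * x


def integrate_slices(n: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of slice_measure over [0, 1]."""
    h = 1.0 / n
    return sum(slice_measure((i + 0.5) * h) for i in range(n)) * h


def area_by_grid(n: int = 200) -> float:
    """Approximate the area of E directly by counting midpoints of an n × n grid."""
    h = 1.0 / n
    hits = 0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if y <= x * x:
                hits += 1
    return hits * h * h


if __name__ == "__main__":
    print(integrate_slices())  # integral of the slice measures
    print(area_by_grid())      # two-dimensional area, computed independently
```

Both quantities approximate $1/3$: the first integrates the one-dimensional measures of the slices, the second estimates the two-dimensional measure of $E$ without reference to slices.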

Statement of the theorem

(Hereafter, ${\mathcal {P}}(X)$ will denote the collection of Borel probability measures on a topological space $(X,T)$ .) The assumptions of the theorem are as follows:

Let $Y$ and $X$ be two Radon spaces (i.e. topological spaces on which every Borel probability measure is inner regular, so that every Borel probability measure on them is a Radon measure; separable completely metrizable spaces are examples).
Let $\mu \in {\mathcal {P}}(Y)$ .
Let $\pi :Y\to X$ be a Borel-measurable function. Here one should think of $\pi$ as a function to "disintegrate" $Y$, in the sense of partitioning $Y$ into $\{\pi ^{-1}(x)\ |\ x\in X\}$. For example, for the motivating example above, one can define $\pi ((a,b))=a$ for $(a,b)\in [0,1]\times [0,1]$, which gives $\pi ^{-1}(a)=\{a\}\times [0,1]$, a slice we want to capture.
Let $\nu \in {\mathcal {P}}(X)$ be the pushforward measure $\nu =\pi _{*}(\mu )=\mu \circ \pi ^{-1}$ . This measure provides the distribution of $x$ (which corresponds to the events $\pi ^{-1}(x)$ ).

The conclusion of the theorem: There exists a $\nu$ -almost everywhere uniquely determined family of probability measures $\{\mu _{x}\}_{x\in X}\subseteq {\mathcal {P}}(Y)$ , which provides a "disintegration" of $\mu$ into $\{\mu _{x}\}_{x\in X}$ , such that:

the function $x\mapsto \mu _{x}$ is Borel measurable, in the sense that $x\mapsto \mu _{x}(B)$ is a Borel-measurable function for each Borel-measurable set $B\subseteq Y$ ;
$\mu _{x}$ "lives on" the fiber $\pi ^{-1}(x)$ : for $\nu$ -almost all $x\in X$ , $\mu _{x}\left(Y\setminus \pi ^{-1}(x)\right)=0,$ and so $\mu _{x}(E)=\mu _{x}(E\cap \pi ^{-1}(x))$ ;
for every Borel-measurable function $f:Y\to [0,\infty ]$ , $\int _{Y}f(y)\,\mathrm {d} \mu (y)=\int _{X}\int _{\pi ^{-1}(x)}f(y)\,\mathrm {d} \mu _{x}(y)\,\mathrm {d} \nu (x).$ In particular, for any event $E\subseteq Y$ , taking $f$ to be the indicator function of $E$ ,^[1] $\mu (E)=\int _{X}\mu _{x}(E)\,\mathrm {d} \nu (x).$
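
When $Y$ is a finite set, the three properties reduce to elementary conditional probability, which makes them easy to verify by hand or by machine. The sketch below is an illustration under that finiteness assumption only; the function names `disintegrate` and `integrate` and the example measure are hypothetical and not taken from the references. Fibres of $\nu$-measure zero are simply left without a measure, which is consistent with the family being determined only $\nu$-almost everywhere.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu, pi):
    """Disintegrate a finitely supported probability measure.

    mu : dict mapping each point y of Y to its probability mu({y})
    pi : function Y -> X
    Returns (nu, mu_x), where nu is the pushforward of mu under pi and
    mu_x[x] is a probability measure on Y supported on the fibre pi^{-1}(x),
    defined for every x with nu({x}) > 0.
    """
    nu = defaultdict(Fraction)  # Fraction() is 0
    for y, p in mu.items():
        nu[pi(y)] += p
    mu_x = {x: {} for x in nu if nu[x] > 0}
    for y, p in mu.items():
        x = pi(y)
        if nu[x] > 0:
            mu_x[x][y] = p / nu[x]  # conditional probability given pi = x
    return dict(nu), mu_x


def integrate(f, measure):
    """Integral of f against a finitely supported measure."""
    return sum(f(y) * p for y, p in measure.items())


# Hypothetical example: Y = {0,1} x {0,1}, pi = projection onto the first coordinate.
mu = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}


def pi(y):
    return y[0]


def f(y):
    return y[0] + 2 * y[1]


nu, mu_x = disintegrate(mu, pi)
lhs = integrate(f, mu)                                   # integral of f over Y
rhs = sum(integrate(f, mu_x[x]) * nu[x] for x in mu_x)   # iterated integral

if __name__ == "__main__":
    print(nu, mu_x, lhs, rhs)
```

Exact rational arithmetic is used so that the identity $\int _{Y}f\,\mathrm {d} \mu =\int _{X}\int _{\pi ^{-1}(x)}f\,\mathrm {d} \mu _{x}\,\mathrm {d} \nu (x)$ can be checked as an equality rather than up to rounding.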

Applications

Product spaces

The original example was a special case of the problem of product spaces, to which the disintegration theorem applies.

When $Y$ is written as a Cartesian product $Y=X_{1}\times X_{2}$ and $\pi _{i}:Y\to X_{i}$ is the natural projection, each fibre $\pi _{1}^{-1}(x_{1})$ can be canonically identified with $X_{2}$, and there exists a Borel family of probability measures $\{\mu _{x_{1}}\}_{x_{1}\in X_{1}}$ in ${\mathcal {P}}(X_{2})$ (which is $(\pi _{1})_{*}(\mu )$-almost everywhere uniquely determined) such that $\mu =\int _{X_{1}}\mu _{x_{1}}\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)=\int _{X_{1}}\mu _{x_{1}}\,\mathrm {d} (\pi _{1})_{*}(\mu )(x_{1}).$ Writing $\mu (\cdot \mid x_{1})$ for $\mu _{x_{1}}$, this means in particular that $\int _{X_{1}\times X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{1},\mathrm {d} x_{2})=\int _{X_{1}}\left(\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}\mid x_{1})\right)\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)$ for every Borel-measurable $f:X_{1}\times X_{2}\to [0,\infty ]$, and $\mu (A\times B)=\int _{A}\mu \left(B\mid x_{1}\right)\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)$ for Borel sets $A\subseteq X_{1}$ and $B\subseteq X_{2}$.

The relation to conditional expectation is given by the identities $\operatorname {E} (f\mid \pi _{1})(x_{1})=\int _{X_{2}}f(x_{1},x_{2})\mu (\mathrm {d} x_{2}\mid x_{1}),$ $\mu (A\times B\mid \pi _{1})(x_{1})=1_{A}(x_{1})\cdot \mu (B\mid x_{1}).$
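
As a worked illustration of these identities, suppose (an assumption made here only for the sake of example) that $X_{1}=X_{2}=[0,1]$ and that $\mu$ has density $p(x_{1},x_{2})=x_{1}+x_{2}$ with respect to two-dimensional Lebesgue measure. The marginal, the conditional measures, one product-set probability and one conditional expectation can then be computed explicitly:

```latex
\begin{align*}
% Marginal (pushforward under \pi_1): integrate the density over x_2.
(\pi_1)_*(\mu)(\mathrm{d}x_1)
  &= \Bigl(\int_0^1 (x_1 + x_2)\,\mathrm{d}x_2\Bigr)\mathrm{d}x_1
   = \bigl(x_1 + \tfrac12\bigr)\,\mathrm{d}x_1, \\
% Conditional measure on the fibre over x_1: joint density over marginal density.
\mu(\mathrm{d}x_2 \mid x_1)
  &= \frac{x_1 + x_2}{x_1 + \tfrac12}\,\mathrm{d}x_2, \\
% Product set A x B with A = [0,1], B = [0,1/2].
\mu\bigl([0,1]\times[0,\tfrac12]\bigr)
  &= \int_0^1 \mu\bigl([0,\tfrac12] \mid x_1\bigr)\,\bigl(x_1 + \tfrac12\bigr)\,\mathrm{d}x_1
   = \int_0^1 \bigl(\tfrac{x_1}{2} + \tfrac18\bigr)\,\mathrm{d}x_1
   = \tfrac38, \\
% Conditional expectation of f(x_1, x_2) = x_2 given \pi_1.
\operatorname{E}(x_2 \mid \pi_1)(x_1)
  &= \int_0^1 x_2\,\frac{x_1 + x_2}{x_1 + \tfrac12}\,\mathrm{d}x_2
   = \frac{\tfrac{x_1}{2} + \tfrac13}{x_1 + \tfrac12}
   = \frac{3x_1 + 2}{6x_1 + 3}.
\end{align*}
```

The value $3/8$ agrees with the direct computation $\int _{0}^{1}\int _{0}^{1/2}(x_{1}+x_{2})\,\mathrm {d} x_{2}\,\mathrm {d} x_{1}=3/8$, as the disintegration identity requires.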

Vector calculus

The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface $\Sigma \subset \mathbb {R} ^{3}$, it is implicit that the "correct" measure on $\Sigma$ is the disintegration of three-dimensional Lebesgue measure $\lambda ^{3}$ on $\Sigma$, and that the disintegration of this measure on $\partial \Sigma$ is the same as the disintegration of $\lambda ^{3}$ on $\partial \Sigma$.^[2]

Conditional distributions

The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability.^[3] The theorem is related to the Borel–Kolmogorov paradox, for example.

References

[1] Dellacherie, C.; Meyer, P.-A. (1978). Probabilities and Potential. North-Holland Mathematics Studies. Amsterdam: North-Holland. ISBN 0-7204-0701-X.
[2] Ambrosio, L.; Gigli, N.; Savaré, G. (2005). Gradient Flows in Metric Spaces and in the Space of Probability Measures. ETH Zürich, Birkhäuser Verlag, Basel. ISBN 978-3-7643-2428-5.
[3] Chang, J.T.; Pollard, D. (1997). "Conditioning as disintegration" (PDF). Statistica Neerlandica. 51 (3): 287. CiteSeerX 10.1.1.55.7544. doi:10.1111/1467-9574.00056. S2CID 16749932.
