Category of Markov kernels


In mathematics, the category of Markov kernels, often denoted Stoch, is the category whose objects are measurable spaces and whose morphisms are Markov kernels. [1] [2] [3] [4] It is analogous to the category of sets and functions, but where the arrows can be interpreted as being stochastic.


Several variants of this category are used in the literature. For example, one can use subprobability kernels [5] instead of probability kernels, or more general s-finite kernels. [6] Also, one can take as morphisms equivalence classes of Markov kernels under almost sure equality; [7] see below.

Definition

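Recall that a Markov kernel between measurable spaces $(X,\mathcal{F})$ and $(Y,\mathcal{G})$ is an assignment $k: X\times\mathcal{G}\to\mathbb{R}$ which is measurable as a function on $X$ and which is a probability measure on $\mathcal{G}$. [4] We denote its values by $k(B|x)$ for $x\in X$ and $B\in\mathcal{G}$, which suggests an interpretation as conditional probability.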

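The category Stoch has: [4]

as objects, measurable spaces;
as morphisms, Markov kernels between them;
for each measurable space $(X,\mathcal{F})$, the identity morphism given by the kernel $\delta(A|x) = 1_A(x)$, which equals $1$ if $x\in A$ and $0$ otherwise, for all $x\in X$ and $A\in\mathcal{F}$;
for kernels $k:(X,\mathcal{F})\to(Y,\mathcal{G})$ and $h:(Y,\mathcal{G})\to(Z,\mathcal{H})$, the composite morphism given by $(h\circ k)(C|x) = \int_Y h(C|y)\,k(dy|x)$ for all $x\in X$ and $C\in\mathcal{H}$.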

This composition formula is sometimes called the Chapman-Kolmogorov equation. [4]

This composition is unital, and associative by the monotone convergence theorem, so that one indeed has a category.
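On finite sets with the discrete sigma-algebra, a Markov kernel reduces to a row-stochastic matrix and the composition integral becomes a finite sum. The following is a minimal sketch under that assumption; the helper names `is_markov_kernel`, `identity` and `compose` are illustrative and not taken from any library.

```python
from fractions import Fraction as F

def is_markov_kernel(k):
    """Check that every row k[x] is a probability distribution."""
    return all(all(p >= 0 for p in row) and sum(row) == 1 for row in k)

def identity(n):
    """Identity kernel on an n-point set: delta({y} | x) = 1 if x == y else 0."""
    return [[F(int(x == y)) for y in range(n)] for x in range(n)]

def compose(h, k):
    """Composite h after k: (h . k)({z} | x) = sum over y of h({z} | y) * k({y} | x)."""
    return [[sum(k_row[y] * h[y][z] for y in range(len(h)))
             for z in range(len(h[0]))]
            for k_row in k]

# k: X -> Y with |X| = 2 and |Y| = 3; h: Y -> Z with |Z| = 2.
k = [[F(1, 2), F(1, 2), F(0)],
     [F(0), F(1, 3), F(2, 3)]]
h = [[F(1), F(0)],
     [F(1, 4), F(3, 4)],
     [F(0), F(1)]]

hk = compose(h, k)  # a kernel X -> Z
```

In this finite setting the composite is the matrix product of the two stochastic matrices, and the unit laws can be checked by composing `k` with `identity(2)` and `identity(3)`.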

Basic properties

Probability measures

The terminal object of Stoch is the one-point space $1$. [4] Morphisms of the form $1\to X$ can be equivalently seen as probability measures on $X$, since they correspond to functions $1\to PX$, i.e. elements of $PX$, where $PX$ denotes the set of probability measures on $X$.

Given kernels $p:1\to X$ and $k:X\to Y$, the composite kernel $k\circ p:1\to Y$ gives the probability measure on $Y$ with values

$$(k\circ p)(B) = \int_X k(B|x)\,p(dx)$$

for every measurable subset $B$ of $Y$. [7]

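Given probability spaces $(X,\mathcal{F},p)$ and $(Y,\mathcal{G},q)$, a measure-preserving Markov kernel $(X,\mathcal{F},p)\to(Y,\mathcal{G},q)$ is a Markov kernel $k:(X,\mathcal{F})\to(Y,\mathcal{G})$ such that for every measurable subset $B\in\mathcal{G}$, [7]

$$q(B) = \int_X k(B|x)\,p(dx).$$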

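Probability spaces and measure-preserving Markov kernels form a category, which can be seen as the slice category $1/\mathrm{Stoch}$ of objects under the one-point space $1$ (also called a coslice category): its objects are the morphisms $p:1\to X$, and its morphisms are the kernels $k$ satisfying $k\circ p = q$.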
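For finite discrete spaces, the composite of a probability measure with a kernel is a vector-matrix product, and the measure-preserving condition is an equality of vectors. The sketch below assumes that setting; `push_forward` and `is_measure_preserving` are hypothetical helper names.

```python
from fractions import Fraction as F

def push_forward(k, p):
    """Composite of p: 1 -> X with k: X -> Y, i.e. q({y}) = sum over x of k({y} | x) * p({x})."""
    return [sum(p[x] * k[x][y] for x in range(len(p)))
            for y in range(len(k[0]))]

def is_measure_preserving(k, p, q):
    """True when k sends the measure p on X to the measure q on Y."""
    return push_forward(k, p) == q

p = [F(1, 4), F(3, 4)]                  # a probability measure on a 2-point set
k = [[F(1, 2), F(1, 2), F(0)],          # a kernel from 2 points to 3 points
     [F(0), F(1, 3), F(2, 3)]]

q = push_forward(k, p)                  # the induced probability measure on Y
```

By construction `k` is measure-preserving from `(X, p)` to `(Y, q)`, and it fails to be measure-preserving for any other choice of target measure.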

Measurable functions

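Every measurable function $f:(X,\mathcal{F})\to(Y,\mathcal{G})$ defines canonically a Markov kernel $\delta_f:(X,\mathcal{F})\to(Y,\mathcal{G})$ as follows,

$$\delta_f(B|x) = 1_B(f(x)) = \begin{cases} 1 & f(x)\in B; \\ 0 & f(x)\notin B \end{cases}$$

for every $x\in X$ and every $B\in\mathcal{G}$. This construction preserves identities and compositions, and is therefore a functor from Meas to Stoch.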
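The functoriality claim can be checked by hand on small finite sets, where the kernel induced by a function is a 0/1 matrix with exactly one 1 per row. This is a sketch under the assumption of discrete sigma-algebras; `kernel_of`, `f` and `g` are illustrative names.

```python
from fractions import Fraction as F

def kernel_of(f, n_in, n_out):
    """Deterministic kernel of f: delta_f({y} | x) = 1 if f(x) == y else 0."""
    return [[F(int(f(x) == y)) for y in range(n_out)] for x in range(n_in)]

def compose(h, k):
    """Composite h after k: (h . k)({z} | x) = sum over y of h({z} | y) * k({y} | x)."""
    return [[sum(k_row[y] * h[y][z] for y in range(len(h)))
             for z in range(len(h[0]))]
            for k_row in k]

def f(x):
    return x % 2        # a function {0, 1, 2} -> {0, 1}

def g(y):
    return 3 - y        # a function {0, 1} -> {0, 1, 2, 3}

# Composing the induced kernels agrees with the kernel of the composite function.
lhs = compose(kernel_of(g, 2, 4), kernel_of(f, 3, 2))
rhs = kernel_of(lambda x: g(f(x)), 3, 4)
```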

Isomorphisms

By functoriality, every isomorphism of measurable spaces (in the category Meas) induces an isomorphism in Stoch. However, in Stoch there are more isomorphisms, and in particular, measurable spaces can be isomorphic in Stoch even when the underlying sets are not in bijection.
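One way to see this, offered here as an illustrative example rather than one drawn from the cited sources, is to compare the one-point space with a two-point set carrying the trivial sigma-algebra. The sketch below represents a finite measurable space as a pair (points, measurable sets) and a kernel as a nested dictionary; all names are hypothetical.

```python
from fractions import Fraction as F

def identity(space):
    """Identity kernel: delta(A | x) = 1 if x in A else 0."""
    points, sigma = space
    return {x: {A: F(int(x in A)) for A in sigma} for x in points}

def atoms(space):
    """Atoms of a finite sigma-algebra: the smallest measurable set around each point."""
    points, sigma = space
    return {min((A for A in sigma if x in A), key=len) for x in points}

def compose(h, k, space_x, space_y, space_z):
    """(h . k)(C | x) = sum over atoms A of Y of h(C | y_A) * k(A | x).

    A measurable function is constant on each atom, so any point y_A of A
    can be used to evaluate h(C | .).
    """
    return {x: {C: sum(h[next(iter(A))][C] * k[x][A] for A in atoms(space_y))
                for C in space_z[1]}
            for x in space_x[0]}

EMPTY = frozenset()
ONE = (["*"], [EMPTY, frozenset({"*"})])
TWO = (["a", "b"], [EMPTY, frozenset({"a", "b"})])   # trivial sigma-algebra

# The only probability measure on TWO assigns mass 1 to the whole set.
k = {"*": {EMPTY: F(0), frozenset({"a", "b"}): F(1)}}          # kernel ONE -> TWO
h = {x: {EMPTY: F(0), frozenset({"*"}): F(1)} for x in "ab"}   # kernel TWO -> ONE

hk = compose(h, k, ONE, TWO, ONE)   # kernel ONE -> ONE
kh = compose(k, h, TWO, ONE, TWO)   # kernel TWO -> TWO
```

Both composites equal the respective identity kernels, so the two spaces are isomorphic in Stoch although their underlying sets have different cardinalities.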

Relationship with other categories

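Stoch is the Kleisli category of the Giry monad. In particular, there is an adjunction $L\dashv R$, with $L:\mathrm{Meas}\to\mathrm{Stoch}$ and $R:\mathrm{Stoch}\to\mathrm{Meas}$, between Stoch and the category of measurable spaces. The left adjoint $L$ is the identity on objects and sends each measurable function to the Markov kernel it induces, as described above.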

Particular limits and colimits

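Since the functor $L:\mathrm{Meas}\to\mathrm{Stoch}$ is a left adjoint, it preserves colimits. [8] Because of this, all colimits in the category of measurable spaces are also colimits in Stoch. For example, the initial object is the empty set with its trivial measurable structure, and the coproduct is given by the disjoint union of measurable spaces with its canonical sigma-algebra.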

In general, the functor $L:\mathrm{Meas}\to\mathrm{Stoch}$ does not preserve limits. This in particular implies that the product of measurable spaces is not a product in Stoch in general. Since the Giry monad is monoidal, however, the product of measurable spaces still makes Stoch a monoidal category. [4]
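The failure of the universal property can be illustrated on a two-point set: a morphism into a categorical product would be determined by its two components, yet distinct joint distributions can share the same marginals. The following is a minimal sketch, assuming discrete finite spaces; the names `marginals`, `independent` and `correlated` are illustrative.

```python
from fractions import Fraction as F

def marginals(joint):
    """Compose a joint distribution on X x Y with the two projection kernels."""
    px = [sum(row) for row in joint]          # marginal on X
    py = [sum(col) for col in zip(*joint)]    # marginal on Y
    return px, py

# Two probability measures on {0, 1} x {0, 1}, i.e. two morphisms 1 -> X x Y.
independent = [[F(1, 4), F(1, 4)],
               [F(1, 4), F(1, 4)]]
correlated = [[F(1, 2), F(0)],
              [F(0), F(1, 2)]]

same_marginals = marginals(independent) == marginals(correlated)
same_joint = independent == correlated
```

Here `same_marginals` is true while `same_joint` is false, so the pair of projections does not determine the morphism uniquely.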

A limit of particular significance for probability theory is de Finetti's theorem, which can be interpreted as the fact that the space of probability measures (Giry monad) is the limit in Stoch of the diagram formed by finite permutations of sequences.

Almost sure version

Sometimes it is useful to consider Markov kernels only up to almost sure equality, for example when talking about disintegrations or about regular conditional probability.

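Given probability spaces $(X,\mathcal{F},p)$ and $(Y,\mathcal{G},q)$, we say that two measure-preserving kernels $k,h:(X,\mathcal{F},p)\to(Y,\mathcal{G},q)$ are almost surely equal if and only if for every measurable subset $B\in\mathcal{G}$,

$$k(B|x) = h(B|x)$$

for $p$-almost all $x\in X$. [7] This defines an equivalence relation on the set of measure-preserving Markov kernels $(X,\mathcal{F},p)\to(Y,\mathcal{G},q)$.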
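On a finite discrete space, almost sure equality amounts to agreement of the two kernels at every point of positive mass. The sketch below assumes that setting; `push_forward` and `almost_surely_equal` are hypothetical helper names.

```python
from fractions import Fraction as F

def push_forward(k, p):
    """q({y}) = sum over x of k({y} | x) * p({x})."""
    return [sum(p[x] * k[x][y] for x in range(len(p)))
            for y in range(len(k[0]))]

def almost_surely_equal(k, h, p):
    """k and h agree at every point x that p charges, i.e. p({x}) > 0."""
    return all(k[x] == h[x] for x in range(len(p)) if p[x] > 0)

p = [F(1, 2), F(1, 2), F(0)]   # the third point has measure zero
k = [[F(1), F(0)],
     [F(1, 2), F(1, 2)],
     [F(1), F(0)]]
h = [[F(1), F(0)],
     [F(1, 2), F(1, 2)],
     [F(0), F(1)]]             # differs from k only on the null point

q_from_k = push_forward(k, p)
q_from_h = push_forward(h, p)
equal_as = almost_surely_equal(k, h, p)
```

The two kernels differ as matrices but induce the same measure on the target and are almost surely equal with respect to `p`. With a measure that charges the third point they would no longer be identified.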

Probability spaces and equivalence classes of Markov kernels under the relation defined above form a category. When restricted to standard Borel probability spaces, the category is often denoted by Krn. [7]


