In mathematics, the Giry monad is a construction that assigns to a measurable space a space of probability measures over it, equipped with a canonical sigma-algebra. [1] [2] [3] [4] [5] It is one of the main examples of a probability monad.
It is implicitly used in probability theory whenever one considers probability measures which depend measurably on a parameter (giving rise to Markov kernels), or when one has probability measures over probability measures (such as in de Finetti's theorem).
Like many iterable constructions, it has the category-theoretic structure of a monad, on the category of measurable spaces.
The Giry monad, like every monad, consists of three structures: [6] [7] [8] a functorial assignment, which in this case assigns to a measurable space $X$ a space $PX$ of probability measures over it; a natural map $\delta \colon X \to PX$ called the unit, which in this case assigns to each element of a space the Dirac measure over it; and a natural map $E \colon PPX \to PX$ called the multiplication, which in this case assigns to each probability measure over probability measures its expected value.
Let $(X, \mathcal{F})$ be a measurable space. Denote by $PX$ the set of probability measures over $(X, \mathcal{F})$. We equip the set $PX$ with a sigma-algebra as follows. First of all, for every measurable set $A \in \mathcal{F}$, define the map $\varepsilon_A \colon PX \to \mathbb{R}$ by $p \mapsto p(A)$. We then define the sigma-algebra $\sigma(PX)$ on $PX$ to be the smallest sigma-algebra which makes the maps $\varepsilon_A$ measurable, for all $A \in \mathcal{F}$ (where $\mathbb{R}$ is assumed equipped with the Borel sigma-algebra). [6]
Equivalently, $\sigma(PX)$ can be defined as the smallest sigma-algebra on $PX$ which makes the maps
$$ p \;\mapsto\; \int_X f \, dp $$
measurable for all bounded measurable $f \colon X \to \mathbb{R}$. [9]
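The following restatement, obtained by unpacking the definition above, describes $\sigma(PX)$ by explicit generating sets: since each $\varepsilon_A$ takes values in $[0,1]$ and the intervals $[0, r)$ generate the Borel sigma-algebra of $[0,1]$, one has
$$ \sigma(PX) \;=\; \sigma\Bigl( \bigl\{\, p \in PX \;:\; p(A) < r \,\bigr\} \;:\; A \in \mathcal{F},\ r \in [0,1] \Bigr). $$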
The assignment $X \mapsto PX$ is part of an endofunctor on the category of measurable spaces, usually denoted again by $P$. Its action on morphisms, i.e. on measurable maps, is via the pushforward of measures. Namely, given a measurable map $f \colon X \to Y$, one assigns to $f$ the map $f_* \colon PX \to PY$ defined by
$$ (f_* p)(B) \;=\; p\bigl(f^{-1}(B)\bigr) $$
for all $p \in PX$ and all measurable sets $B \subseteq Y$. [6]
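For a small illustration (an example chosen here rather than taken from the references), let $X = \{1,2,3\}$ carry the uniform measure $p$, and let $f \colon X \to \{a, b\}$ send $1, 2 \mapsto a$ and $3 \mapsto b$. The pushforward simply transports mass along $f$:
$$ (f_* p)(\{a\}) \;=\; p\bigl(f^{-1}(\{a\})\bigr) \;=\; p(\{1,2\}) \;=\; \tfrac{2}{3}, \qquad (f_* p)(\{b\}) \;=\; p(\{3\}) \;=\; \tfrac{1}{3}. $$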
Given a measurable space $X$, the map $\delta \colon X \to PX$ maps an element $x \in X$ to the Dirac measure $\delta_x \in PX$, defined on measurable subsets $A \subseteq X$ by [6]
$$ \delta_x(A) \;=\; 1_A(x) \;=\; \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} $$
Let $\pi \in PPX$, i.e. a probability measure over the probability measures over $X$. We define the probability measure $E\pi \in PX$ by
$$ (E\pi)(A) \;=\; \int_{PX} p(A) \, \pi(dp) $$
for all measurable $A \subseteq X$. This gives a measurable, natural map $E \colon PPX \to PX$. [6]
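To make the three pieces of structure concrete, here is a minimal Haskell sketch of a finite-support analogue of the Giry monad, restricted to distributions with finitely many atoms and rational weights. The names `Dist`, `pushforward`, `dirac` and `average` are hypothetical and chosen for this sketch; the toy model mirrors the functor action, the unit and the multiplication described above, but it deliberately ignores the measure-theoretic structure that the actual Giry monad carries.

```haskell
-- A finite-support probability distribution: a list of (atom, weight)
-- pairs whose weights are assumed nonnegative and to sum to 1.
-- This is only a discrete toy analogue of PX, for illustration.
newtype Dist a = Dist { atoms :: [(a, Rational)] }
  deriving Show

-- Functor action: the pushforward f_* p, transporting mass along f.
pushforward :: (a -> b) -> Dist a -> Dist b
pushforward f (Dist xs) = Dist [ (f x, w) | (x, w) <- xs ]

-- Unit: the Dirac measure delta_x, all mass on the single point x.
dirac :: a -> Dist a
dirac x = Dist [(x, 1)]

-- Multiplication E : PPX -> PX, averaging a distribution of
-- distributions: each inner atom is weighted by its outer weight.
average :: Dist (Dist a) -> Dist a
average (Dist ds) =
  Dist [ (x, wOuter * wInner) | (Dist xs, wOuter) <- ds
                              , (x, wInner)       <- xs ]

-- Example: a 30/70 mixture of two distributions on Bool.  The atom
-- True ends up with total weight 3/10 * 1/5 + 7/10 * 9/10 = 69/100
-- (spread over two list entries, since duplicates are not merged).
mixtureExample :: Dist Bool
mixtureExample = average (Dist
  [ (Dist [(True, 1/5), (False, 4/5)], 3/10)
  , (Dist [(True, 9/10), (False, 1/10)], 7/10) ])
```

In this discrete setting, `pushforward`, `dirac` and `average` satisfy the monad laws up to reordering and merging of duplicate atoms, which is the finite shadow of the statement that $P$, $\delta$ and $E$ form a monad.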
A mixture distribution, or more generally a compound distribution, can be seen as an application of the map $E$. Let's see this for the case of a finite mixture. Let $\mu_1, \dots, \mu_n$ be probability measures on $X$, and consider the probability measure $\mu$ given by the mixture
$$ \mu(A) \;=\; \sum_{i=1}^n w_i \, \mu_i(A) $$
for all measurable $A \subseteq X$, for some weights $w_i \ge 0$ satisfying $w_1 + \dots + w_n = 1$. We can view the mixture $\mu$ as the average $\mu = E\pi$, where the measure on measures $\pi \in PPX$, which in this case is discrete, is given by
$$ \pi \;=\; \sum_{i=1}^n w_i \, \delta_{\mu_i} . $$
More generally, the map $E \colon PPX \to PX$ can be seen as the most general, non-parametric way to form arbitrary mixture or compound distributions.
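For instance, in the two-component case $\pi = w_1 \delta_{\mu_1} + w_2 \delta_{\mu_2}$, evaluating the integral defining $E$ against this discrete measure recovers the mixture directly:
$$ (E\pi)(A) \;=\; \int_{PX} p(A)\, \pi(dp) \;=\; w_1\, \mu_1(A) + w_2\, \mu_2(A) \;=\; \mu(A). $$
Concretely (a worked example, not from the cited sources), if $\mu_1$ and $\mu_2$ are the Bernoulli measures with success probabilities $0.2$ and $0.9$, and $w_1 = w_2 = \tfrac12$, then $E\pi$ is the Bernoulli measure with success probability $0.5 \cdot 0.2 + 0.5 \cdot 0.9 = 0.55$.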
One of the properties of the sigma-algebra $\sigma(PX)$ is that given measurable spaces $X$ and $Y$, we have a bijective correspondence between measurable functions $X \to PY$ and Markov kernels from $X$ to $Y$. This allows one to view a Markov kernel, equivalently, as a measurably parametrized probability measure. [10]
In more detail, given a measurable function $f \colon X \to PY$, one can obtain the Markov kernel $k \colon X \times \mathcal{F}_Y \to [0,1]$ (where $\mathcal{F}_Y$ denotes the sigma-algebra of $Y$) as follows,
$$ k(x, B) \;=\; f(x)(B) $$
for every $x \in X$ and every measurable $B \in \mathcal{F}_Y$ (note that $f(x) \in PY$ is a probability measure). Conversely, given a Markov kernel $k \colon X \times \mathcal{F}_Y \to [0,1]$, one can form the measurable function $f \colon X \to PY$ mapping $x \in X$ to the probability measure $f(x) \in PY$ defined by
$$ f(x)(B) \;=\; k(x, B) $$
for every measurable $B \in \mathcal{F}_Y$. The two assignments are mutually inverse.
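As a concrete finite illustration (an example chosen here for exposition), take $X = Y = \{1, 2\}$. A Markov kernel from $X$ to $Y$ is then just a row-stochastic matrix, and the corresponding measurable function $f \colon X \to PY$ simply reads off its rows:
$$ k \;=\; \begin{pmatrix} 0.9 & 0.1 \\ 0.4 & 0.6 \end{pmatrix}, \qquad f(1) = 0.9\,\delta_1 + 0.1\,\delta_2, \qquad f(2) = 0.4\,\delta_1 + 0.6\,\delta_2. $$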
From the point of view of category theory, we can interpret this correspondence as an adjunction
$$ \mathrm{Hom}_{\mathrm{Meas}}(X, PY) \;\cong\; \mathrm{Hom}_{\mathrm{Stoch}}(X, Y) $$
between the category of measurable spaces and the category of Markov kernels. In particular, the category of Markov kernels can be seen as the Kleisli category of the Giry monad. [3] [4] [5]
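Under this identification, composition in the Kleisli category is the usual composition of Markov kernels by integration (the Chapman–Kolmogorov formula): given kernels $k$ from $X$ to $Y$ and $h$ from $Y$ to $Z$, their composite from $X$ to $Z$ is
$$ (h \circ k)(x, C) \;=\; \int_Y h(y, C)\, k(x, dy) $$
for every $x \in X$ and every measurable $C \subseteq Z$. In terms of the monad structure, this is the map $E \circ Ph' \circ k'$, where $k' \colon X \to PY$ and $h' \colon Y \to PZ$ denote the corresponding measurable functions.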
Given measurable spaces $X$ and $Y$, one can form the measurable space $X \times Y$ with the product sigma-algebra, which is the product in the category of measurable spaces. Given probability measures $p \in PX$ and $q \in PY$, one can form the product measure $p \otimes q$ on $X \times Y$. This gives a natural, measurable map
$$ PX \times PY \longrightarrow P(X \times Y), \qquad (p, q) \longmapsto p \otimes q, $$
usually denoted by $\nabla$ or by $c$. [4]
The map $\nabla$ is in general not an isomorphism, since there are probability measures on $X \times Y$ which are not product distributions, for example in the presence of correlations. However, the maps $\nabla$ and the isomorphism $1 \cong P1$ (where $1$ denotes the one-point measurable space) make the Giry monad a monoidal monad, and so in particular a commutative strong monad. [4]
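A concrete instance of this failure (an example for illustration): take $X = Y = \{0, 1\}$ and consider the perfectly correlated measure
$$ r \;=\; \tfrac12\, \delta_{(0,0)} + \tfrac12\, \delta_{(1,1)} \;\in\; P(X \times Y). $$
Both marginals of $r$ are uniform, but $r$ is not the product of its marginals: the product measure would assign mass $\tfrac14$ to $(0,1)$, whereas $r$ assigns it mass $0$. Hence $\nabla$ is not surjective in general.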