Ratio of uniforms

The ratio of uniforms is a method initially proposed by Kinderman and Monahan in 1977 [1] for pseudo-random number sampling, that is, for drawing random samples from a statistical distribution. Like rejection sampling and inverse transform sampling, it is an exact simulation method. The basic idea of the method is to use a change of variables to create a bounded set, which can then be sampled uniformly to generate random variables following the original distribution. One feature of this method is that the distribution to sample is only required to be known up to an unknown multiplicative factor, a common situation in computational statistics and statistical physics.

Motivation

Figure: Rejection sampling of a bounded statistical distribution with finite support.

A convenient technique to sample a statistical distribution is rejection sampling. When the probability density function of the distribution is bounded and has finite support, one can define a bounding box around it (a uniform proposal distribution), draw uniform samples in the box and return only the x coordinates of the points that fall below the function (see graph). As a direct consequence of the fundamental theorem of simulation, [2] the returned samples are distributed according to the original distribution.

When the support of the distribution is infinite, it is impossible to draw a rectangular bounding box containing the graph of the function. One can still use rejection sampling, but with a non-uniform proposal distribution. It can be delicate to choose an appropriate proposal distribution, [3] and one also has to know how to efficiently sample this proposal distribution.

The method of the ratio of uniforms offers a solution to this problem, by essentially using as proposal distribution the distribution created by the ratio of two uniform random variables.

Statement

The statement and the proof are adapted from the presentation by Gobet. [4]

Theorem. Let $X$ be a multidimensional random variable with probability density function $p(x)$ on $\mathbb{R}^d$. The function $p$ is only required to be known up to a constant, so we can assume that we only know $f(x) = c\,p(x)$, with $c$ a constant unknown or difficult to compute. Let $r > 0$, a parameter that can be adjusted as we choose to improve the properties of the method. We can define the set $A_{f,r} \subset \mathbb{R}^{d+1}$:

$$A_{f,r} = \left\{ (u, v_1, \ldots, v_d) \in \mathbb{R}^{d+1} \,:\, 0 < u \leq f\!\left(\frac{v_1}{u^r}, \ldots, \frac{v_d}{u^r}\right)^{\frac{1}{rd+1}} \right\}$$

The Lebesgue measure of the set $A_{f,r}$ is finite and equal to $\frac{c}{rd+1}$.

Furthermore, let $(U, V_1, \ldots, V_d)$ be a random variable uniformly distributed on the set $A_{f,r}$. Then $\left(\frac{V_1}{U^r}, \ldots, \frac{V_d}{U^r}\right)$ is a random variable on $\mathbb{R}^d$ distributed like $X$.

Proof

We will first assume that the first statement is correct, i.e. $\mu(A_{f,r}) = \frac{c}{rd+1} < \infty$.

Let $h$ be a bounded measurable function on $\mathbb{R}^d$. Let's consider the expectation of $h\!\left(\frac{V}{U^r}\right)$ on the set $A_{f,r}$ (writing $\frac{V}{U^r}$ for the vector of ratios $\frac{V_i}{U^r}$):

$$\mathbb{E}\left[h\!\left(\frac{V}{U^r}\right)\right] = \frac{1}{\mu(A_{f,r})}\int_{A_{f,r}} h\!\left(\frac{v}{u^r}\right)\,\mathrm{d}u\,\mathrm{d}v$$

With the change of variables $x = \frac{v}{u^r}$, $w = u$, whose Jacobian determinant is $w^{rd}$, and noting that the condition $0 < u \leq f\!\left(\frac{v}{u^r}\right)^{\frac{1}{rd+1}}$ becomes $0 < w \leq f(x)^{\frac{1}{rd+1}}$, we have

$$\int_{A_{f,r}} h\!\left(\frac{v}{u^r}\right)\mathrm{d}u\,\mathrm{d}v = \int_{\mathbb{R}^d}\int_0^{f(x)^{\frac{1}{rd+1}}} h(x)\,w^{rd}\,\mathrm{d}w\,\mathrm{d}x = \frac{1}{rd+1}\int_{\mathbb{R}^d} h(x)\,f(x)\,\mathrm{d}x = \frac{c}{rd+1}\int_{\mathbb{R}^d} h(x)\,p(x)\,\mathrm{d}x$$

so that, dividing by $\mu(A_{f,r}) = \frac{c}{rd+1}$,

$$\mathbb{E}\left[h\!\left(\frac{V}{U^r}\right)\right] = \int_{\mathbb{R}^d} h(x)\,p(x)\,\mathrm{d}x$$

where we can see that $\frac{V}{U^r}$ has indeed the density $p$.

Coming back to the first statement, a similar argument with $h \equiv 1$ (the same change of variables, without the normalization by $\mu(A_{f,r})$) shows that $\mu(A_{f,r}) = \int_{A_{f,r}} \mathrm{d}u\,\mathrm{d}v = \frac{c}{rd+1} < \infty$.

Complements

Rejection sampling in $\tilde{A}_{f,r}$

The above statement does not specify how one should perform the uniform sampling in $A_{f,r}$. However, the interest of this method is that under mild conditions on $f$ (namely that $f(x)^{\frac{1}{rd+1}}$ and $x_i\,f(x)^{\frac{r}{rd+1}}$ for all $i$ are bounded), $A_{f,r}$ is bounded. One can define the rectangular bounding box $\tilde{A}_{f,r}$ such that

$$A_{f,r} \subset \tilde{A}_{f,r} = \left[0,\ \sup_x f(x)^{\frac{1}{rd+1}}\right] \times \prod_{i=1}^{d} \left[\inf_x x_i\,f(x)^{\frac{r}{rd+1}},\ \sup_x x_i\,f(x)^{\frac{r}{rd+1}}\right]$$

This allows one to sample the set $A_{f,r}$ uniformly, by rejection sampling inside $\tilde{A}_{f,r}$. The parameter $r$ can be adjusted to change the shape of $A_{f,r}$ and maximize the acceptance ratio of this sampling.
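For illustration, the following is a minimal Python sketch of this rejection step in the 1-dimensional case ($d = 1$); the function name rou_sample and the way the density and box bounds are supplied are illustrative choices, not part of the method itself:

    import numpy as np

    def rou_sample(f, r, u_max, v_min, v_max, rng=None):
        """Draw one sample from the 1-d density proportional to f,
        by rejection sampling in the box [0, u_max] x [v_min, v_max]."""
        rng = rng or np.random.default_rng()
        while True:
            u = rng.uniform(0.0, u_max)
            v = rng.uniform(v_min, v_max)
            # Accept if (u, v) lies in A_{f,r}: u <= f(v / u^r)^(1/(r+1)),
            # written as u^(r+1) <= f(v / u^r) to avoid the fractional power.
            if u > 0.0 and u ** (r + 1) <= f(v / u ** r):
                return v / u ** r

The acceptance probability of each iteration is the ratio of the measure of $A_{f,r}$ to that of the box, which is what adjusting $r$ seeks to maximize.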

Parametric description of the boundary of $A_{f,r}$

The definition of $A_{f,r}$ is already convenient for the rejection sampling step. For illustration purposes, it can be interesting to draw the set, in which case it can be useful to know the parametric description of its boundary:

$$\left\{\begin{aligned} u(x) &= f(x)^{\frac{1}{rd+1}} \\ v_i(x) &= x_i\, f(x)^{\frac{r}{rd+1}} \end{aligned}\right.$$

or, for the common case where $X$ is a 1-dimensional variable, $(u, v) = \left(f(x)^{\frac{1}{r+1}},\ x\,f(x)^{\frac{r}{r+1}}\right)$.
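As a sketch (assuming NumPy and Matplotlib, with an arbitrary example density and grid), the 1-dimensional boundary curve can be traced as follows:

    import numpy as np
    import matplotlib.pyplot as plt

    r = 1.0
    f = lambda x: np.exp(-x)              # example: unnormalized Exp(1) density
    x = np.linspace(0.0, 10.0, 500)       # grid covering (most of) the support

    u = f(x) ** (1.0 / (r + 1.0))         # u(x) = f(x)^(1/(r+1))
    v = x * f(x) ** (r / (r + 1.0))       # v(x) = x f(x)^(r/(r+1))

    plt.plot(u, v)                        # upper boundary of A_{f,r}
    plt.xlabel("u")
    plt.ylabel("v")
    plt.show()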

Generalized ratio of uniforms

The ratio of uniforms presented above, parameterized only by $r$, can be described with a more general class of transformations. [5]

In the 1-dimensional case, if $g : \mathbb{R}^+ \to \mathbb{R}^+$ is a strictly increasing and differentiable function such that $g(0) = 0$, then we can define $A_{f,g}$ such that

$$A_{f,g} = \left\{ (u, v) \in \mathbb{R}^2 \,:\, 0 < u \leq g^{-1}\!\left(k\,f\!\left(\frac{v}{g'(u)}\right)\right) \right\}$$

with $k$ a positive constant. If $(U, V)$ is a random variable uniformly distributed in $A_{f,g}$, then $\frac{V}{g'(U)}$ is distributed with the density $p$. The standard ratio of uniforms is recovered with $g(u) = \frac{u^{r+1}}{r+1}$ and $k = \frac{1}{r+1}$.
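As a numerical illustration, the following sketch applies this generalized construction with the arbitrary choices $g(u) = u^2$, $k = 1$ and an unnormalized normal density; the bounding box is worked out analytically for these particular choices:

    import numpy as np

    rng = np.random.default_rng()

    f = lambda x: np.exp(-0.5 * x * x)    # unnormalized standard normal density
    g_inv = lambda y: np.sqrt(y)          # inverse of g(u) = u^2
    dg = lambda u: 2.0 * u                # g'(u)

    u_max = 1.0                                 # g^{-1}(f(x)) = sqrt(f(x)) <= 1
    v_max = 2.0 * np.sqrt(2.0) * np.exp(-0.5)   # max over x of 2 |x| sqrt(f(x))

    samples = []
    while len(samples) < 10_000:
        u = rng.uniform(0.0, u_max)
        v = rng.uniform(-v_max, v_max)
        if u > 0.0 and u <= g_inv(f(v / dg(u))):  # (u, v) in A_{f,g} with k = 1
            samples.append(v / dg(u))             # distributed N(0, 1)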

Examples

Figure: Exponential distribution before and after change of variables by the ratio of uniforms method. Top: graph of the exponential distribution on $\mathbb{R}^+$. Bottom: the set $A_{f,1}$ is represented in the space $(u, v)$, inscribed in the bounding box $\tilde{A}_{f,1}$. The colored domains, of equal probability, were added to help the visual association of the corresponding domains of the transformed sets.

The exponential distribution

Assume that we want to sample the exponential distribution, $f(x) = e^{-x}$ for $x \geq 0$, with the ratio of uniforms method. We will take here $r = 1$.

We can start constructing the set $A_{f,1}$:

$$A_{f,1} = \left\{ (u, v) \in \mathbb{R}^2 \,:\, 0 < u \leq \sqrt{f\!\left(\frac{v}{u}\right)} \right\}$$

The condition $u \leq \sqrt{f(v/u)}$ is equivalent, after computation, to $0 \leq v \leq -2u\ln u$, which allows us to plot the shape of the set (see graph).

This inequality also allows us to determine the rectangular bounding box $\tilde{A}_{f,1}$ where $A_{f,1}$ is included. Indeed, with $v(u) = -2u\ln u$, we have $v'(u) = -2(\ln u + 1)$ and $v'(e^{-1}) = 0$, from where $\tilde{A}_{f,1} = [0, 1] \times [0, 2e^{-1}]$.

From here, we can draw pairs of uniform random variables $U \sim \mathcal{U}(0, 1)$ and $V \sim \mathcal{U}(0, 2e^{-1})$ until $V \leq -2U\ln U$, and when that happens, we return $\frac{V}{U}$, which is exponentially distributed.
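In Python, this loop could be sketched as follows (a minimal illustration, not an optimized implementation):

    import numpy as np

    def sample_exponential(rng=None):
        """Draw one Exp(1) variate by the ratio of uniforms with r = 1."""
        rng = rng or np.random.default_rng()
        while True:
            u = rng.uniform(0.0, 1.0)          # U ~ U(0, 1)
            v = rng.uniform(0.0, 2.0 / np.e)   # V ~ U(0, 2/e)
            if u > 0.0 and v <= -2.0 * u * np.log(u):  # (u, v) in A_{f,1}
                return v / u                   # exponentially distributed

The acceptance probability of each iteration is $\frac{\mu(A_{f,1})}{\mu(\tilde{A}_{f,1})} = \frac{1/2}{2/e} = \frac{e}{4} \approx 0.68$.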

Figure: Normal mixture distribution before and after change of variables by the ratio of uniforms method. Top: graph of the mixture distribution on $\mathbb{R}$. Bottom: the set $A_{f,r}$ is represented for two different values of $r$. The solid lines on the top represent the de-transformation of the bounding boxes on the bottom. The solid lines on the bottom represent the locations of different values of $x$ in the set.

A mixture of normal distributions

Consider a mixture of two normal distributions, with density $f$. To apply the method of the ratio of uniforms, with a given $r$, one should first determine the boundaries of the rectangular bounding box $\tilde{A}_{f,r}$ enclosing the set $A_{f,r}$. This can be done numerically, by computing the minimum and maximum of $v(x) = x\,f(x)^{\frac{r}{r+1}}$ and $u(x) = f(x)^{\frac{1}{r+1}}$ on a grid of values of $x$. Then, one can draw uniform samples $(U, V)$ in the bounding box, only keep those that fall inside the set $A_{f,r}$ and return them as $\frac{V}{U^r}$.

It is possible to optimize the acceptance ratio by adjusting the value of , as seen on the graphs.
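A sketch of this grid-based procedure in Python (the mixture below uses arbitrary illustrative weights and parameters, not those of the figure):

    import numpy as np

    def rou_box(f, r, x_grid):
        """Approximate the bounding box of A_{f,r} from a grid of x values."""
        u = f(x_grid) ** (1.0 / (r + 1.0))         # u(x) = f(x)^(1/(r+1))
        v = x_grid * f(x_grid) ** (r / (r + 1.0))  # v(x) = x f(x)^(r/(r+1))
        return u.max(), v.min(), v.max()

    def rou_sample_mixture(f, r, n, x_grid, rng=None):
        """Sample n points from the density proportional to f."""
        rng = rng or np.random.default_rng()
        u_max, v_min, v_max = rou_box(f, r, x_grid)
        out = []
        while len(out) < n:
            u = rng.uniform(0.0, u_max)
            v = rng.uniform(v_min, v_max)
            if u > 0.0 and u ** (r + 1.0) <= f(v / u ** r):  # (u, v) in A_{f,r}
                out.append(v / u ** r)
        return np.array(out)

    # Arbitrary mixture of two normals (weights 0.5/0.5, means -1 and 2):
    f = lambda x: 0.5 * np.exp(-0.5 * (x + 1.0) ** 2) + 0.5 * np.exp(-0.5 * (x - 2.0) ** 2)
    samples = rou_sample_mixture(f, r=1.0, n=10_000, x_grid=np.linspace(-10.0, 10.0, 2001))

Since the box is only approximated on a finite grid, in practice one may slightly enlarge it to be safe.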

Software

Implementations include, for instance, the R packages rust [6] and Runuran. [7]

References

  1. Kinderman, A. J.; Monahan, J. F. (September 1977). "Computer Generation of Random Variables Using the Ratio of Uniform Deviates". ACM Transactions on Mathematical Software. 3 (3): 257–260. doi:10.1145/355744.355750. S2CID 12884505.
  2. Robert, Christian; Casella, George (2004). Monte Carlo Statistical Methods (2nd ed.). Springer-Verlag. p. 47. ISBN 978-0-387-21239-5.
  3. Martino, Luca; Luengo, David; Míguez, Joaquín (16 July 2013). "On the Generalized Ratio of Uniforms as a Combination of Transformed Rejection and Extended Inverse of Density Sampling". p. 13. arXiv:1205.0482 [stat.CO].
  4. Gobet, Emmanuel (2020). Monte-Carlo Methods and Stochastic Processes: From Linear to Non-Linear. CRC Press. ISBN 978-0-367-65846-5. OCLC 1178639517.
  5. Wakefield, J. C.; Gelfand, A. E.; Smith, A. F. M. (1 December 1991). "Efficient generation of random variates via the ratio-of-uniforms method". Statistics and Computing. 1 (2): 129–133. doi:10.1007/BF01889987. ISSN 1573-1375. S2CID 119824513.
  6. Northrop, P. J. (2021). rust: Ratio-of-Uniforms Simulation with Transformation.
  7. Leydold, J.; Hörmann, W. (2021). Runuran: R Interface to the 'UNU.RAN' Random Variate Generators.