Donsker's theorem

Figure: Donsker's invariance principle for simple random walk on $\mathbb{Z}$.

In probability theory, Donsker's theorem (also known as Donsker's invariance principle, or the functional central limit theorem), named after Monroe D. Donsker, is a functional extension of the central limit theorem for empirical distribution functions. Specifically, the theorem states that an appropriately centered and scaled version of the empirical distribution function converges to a Gaussian process.

Let $X_1, X_2, X_3, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables with mean 0 and variance 1. Let $S_n := \sum_{i=1}^{n} X_i$. The stochastic process $S := (S_n)_{n \in \mathbb{N}}$ is known as a random walk. Define the diffusively rescaled random walk (partial-sum process) by

$$W^{(n)}(t) := \frac{S_{\lfloor nt \rfloor}}{\sqrt{n}}, \qquad t \in [0,1].$$

The central limit theorem asserts that $W^{(n)}(1)$ converges in distribution to a standard Gaussian random variable $W(1)$ as $n \to \infty$. Donsker's invariance principle [1] [2] extends this convergence to the whole function $W^{(n)} := (W^{(n)}(t))_{t \in [0,1]}$. More precisely, in its modern form, Donsker's invariance principle states that, as random variables taking values in the Skorokhod space $\mathcal{D}[0,1]$, the random function $W^{(n)}$ converges in distribution to a standard Brownian motion $W := (W(t))_{t \in [0,1]}$ as $n \to \infty$.
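The rescaling can be illustrated numerically. The following is a minimal sketch, not a proof: it assumes Rademacher ($\pm 1$) steps, which have mean 0 and variance 1, and the names `rescaled_walk` and `W` are hypothetical helpers introduced only for this illustration. It estimates the variance of the rescaled walk at $t = 1/2$ and $t = 1$ and the covariance between the two, which for a standard Brownian motion are $1/2$, $1$ and $\min(1/2, 1) = 1/2$.

```python
import math
import random


def rescaled_walk(n, rng):
    """Return the values S_k / sqrt(n) for k = 0..n, with +-1 steps (an assumption)."""
    path = [0.0]
    s = 0
    for _ in range(n):
        s += rng.choice((-1, 1))        # one i.i.d. step with mean 0, variance 1
        path.append(s / math.sqrt(n))   # diffusive rescaling
    return path


def W(path, t):
    """Evaluate the step function S_floor(n t) / sqrt(n) at a time t in [0, 1]."""
    n = len(path) - 1
    return path[math.floor(n * t)]


rng = random.Random(0)                  # fixed seed so the run is reproducible
n, trials = 200, 2000
halves, ends = [], []
for _ in range(trials):
    p = rescaled_walk(n, rng)
    halves.append(W(p, 0.5))
    ends.append(W(p, 1.0))

# The walk is centered, so second moments estimate variances and covariances.
var_half = sum(x * x for x in halves) / trials
var_end = sum(x * x for x in ends) / trials
cov_half_end = sum(a * b for a, b in zip(halves, ends)) / trials

print(f"Var W(1/2)       ~ {var_half:.3f}  (Brownian motion: 0.5)")
print(f"Var W(1)         ~ {var_end:.3f}  (Brownian motion: 1.0)")
print(f"Cov(W(1/2),W(1)) ~ {cov_half_end:.3f}  (Brownian motion: 0.5)")
```

Matching second moments at two times is only a finite-dimensional check. The theorem itself is the stronger statement that the whole path converges in distribution in the Skorokhod space.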

Figure: Donsker–Skorokhod–Kolmogorov theorem for uniform distributions.
Figure: Donsker–Skorokhod–Kolmogorov theorem for normal distributions.

Formal statement

Let $F_n$ be the empirical distribution function of the sequence of i.i.d. random variables $X_1, X_2, X_3, \ldots$ with distribution function $F$. Define the centered and scaled version of $F_n$ by

$$G_n(x) = \sqrt{n}\,\bigl(F_n(x) - F(x)\bigr),$$

indexed by $x \in \mathbb{R}$. By the classical central limit theorem, for fixed $x$, the random variable $G_n(x)$ converges in distribution to a Gaussian (normal) random variable $G(x)$ with zero mean and variance $F(x)(1 - F(x))$ as the sample size $n$ grows.
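The limiting variance for fixed $x$ can be checked directly, because $n F_n(x)$ counts the observations that fall at or below $x$. A short derivation, assuming the scaling $G_n(x) = \sqrt{n}\,(F_n(x) - F(x))$:

```latex
n F_n(x) = \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\} \sim \operatorname{Binomial}\bigl(n, F(x)\bigr),
\qquad
\operatorname{E}[G_n(x)] = \sqrt{n}\,\bigl(F(x) - F(x)\bigr) = 0,
\qquad
\operatorname{Var}[G_n(x)] = n \cdot \frac{F(x)\bigl(1 - F(x)\bigr)}{n} = F(x)\bigl(1 - F(x)\bigr).
```

The mean and variance therefore hold exactly for every $n$. Only the Gaussian shape of the distribution requires the limit.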

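Theorem (Donsker, Skorokhod, Kolmogorov) The sequence of $G_n(x)$, as random elements of the Skorokhod space $\mathcal{D}(-\infty, \infty)$, converges in distribution to a Gaussian process $G$ with zero mean and covariance given by

$$\operatorname{cov}[G(s), G(t)] = \operatorname{E}[G(s)G(t)] = \min\{F(s), F(t)\} - F(s)F(t).$$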

The process G(x) can be written as B(F(x)) where B is a standard Brownian bridge on the unit interval.
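The covariance structure can be checked by simulation. The sketch below assumes Uniform[0,1] samples, so that $F(x) = x$ and the limit is the Brownian bridge itself, with covariance $\min(s,t) - st$. The helper name `empirical_process` and the evaluation points 0.3 and 0.7 are choices made only for this illustration.

```python
import math
import random


def empirical_process(sample, xs, cdf):
    """Return sqrt(n) * (F_n(x) - F(x)) at each point x in xs."""
    n = len(sample)
    values = []
    for x in xs:
        f_n = sum(1 for v in sample if v <= x) / n   # empirical distribution function
        values.append(math.sqrt(n) * (f_n - cdf(x)))
    return values


rng = random.Random(1)          # fixed seed so the run is reproducible
n, trials = 200, 3000
s, t = 0.3, 0.7                 # illustrative evaluation points


def uniform_cdf(x):
    """Distribution function of Uniform[0,1], restricted to x in [0, 1]."""
    return x


g_s, g_t = [], []
for _ in range(trials):
    sample = [rng.random() for _ in range(n)]
    a, b = empirical_process(sample, (s, t), uniform_cdf)
    g_s.append(a)
    g_t.append(b)

# The process is centered, so second moments estimate variance and covariance.
var_s = sum(a * a for a in g_s) / trials
cov_st = sum(a * b for a, b in zip(g_s, g_t)) / trials

print(f"Var G_n(0.3)           ~ {var_s:.3f}  (bridge: {s * (1 - s):.3f})")
print(f"Cov(G_n(0.3),G_n(0.7)) ~ {cov_st:.3f}  (bridge: {min(s, t) - s * t:.3f})")
```

For a general continuous $F$, the same check applies after replacing `uniform_cdf` with $F$ and drawing the samples from $F$, which corresponds to the time change $B(F(x))$.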

Kolmogorov (1933) showed that when $F$ is continuous, the supremum $\sup_t G_n(t)$ and the supremum of the absolute value, $\sup_t |G_n(t)|$, converge in distribution to the laws of the same functionals of the Brownian bridge $B(t)$; see the Kolmogorov–Smirnov test. In 1949 Doob asked whether the convergence in distribution held for more general functionals, thus formulating a problem of weak convergence of random functions in a suitable function space. [3]
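Kolmogorov's limit for the supremum of the absolute value can be compared with simulation. The sketch below assumes Uniform[0,1] samples and uses the classical series for the Kolmogorov distribution, $P(\sup_t |B(t)| \le x) = 1 - 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2x^2}$. The function names, the sample size and the threshold $x = 1$ are illustrative choices.

```python
import math
import random


def kolmogorov_cdf(x, terms=100):
    """P(sup |B(t)| <= x) for a Brownian bridge B, via the alternating series."""
    if x <= 0:
        return 0.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                 for k in range(1, terms + 1))
    return 1.0 - 2.0 * series


def sup_abs_uniform(sample):
    """sup_x |F_n(x) - x| for a sample assumed to come from Uniform[0,1]."""
    u = sorted(sample)
    n = len(u)
    # The supremum is attained just before or at one of the jump points.
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(u))


rng = random.Random(2)          # fixed seed so the run is reproducible
n, trials, x = 200, 2000, 1.0
hits = 0
for _ in range(trials):
    sample = [rng.random() for _ in range(n)]
    if math.sqrt(n) * sup_abs_uniform(sample) <= x:
        hits += 1

empirical = hits / trials
print(f"empirical P(sup|G_n| <= {x}) ~ {empirical:.3f}")
print(f"Kolmogorov limit            = {kolmogorov_cdf(x):.3f}")
```

For finite $n$ the empirical frequency is only close to the limiting value, not equal to it, and the gap shrinks as $n$ grows.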

In 1952 Donsker stated and proved (not quite correctly) [4] a general extension for the Doob–Kolmogorov heuristic approach. In the original paper, Donsker proved that the convergence in law of Gn to the Brownian bridge holds for Uniform[0,1] distributions with respect to uniform convergence in t over the interval [0,1]. [2]

However, Donsker's formulation was not quite correct because of the problem of measurability of the functionals of discontinuous processes. In 1956 Skorokhod and Kolmogorov defined a separable metric $d$, called the Skorokhod metric, on the space of càdlàg functions on [0,1], such that convergence for $d$ to a continuous function is equivalent to convergence for the sup norm, and showed that $G_n$ converges in law in $\mathcal{D}[0,1]$ to the Brownian bridge.

Later Dudley reformulated Donsker's result to avoid the problem of measurability and the need for the Skorokhod metric. One can prove [4] that there exist $X_i$, i.i.d. uniform in [0,1], and a sequence of sample-continuous Brownian bridges $B_n$, such that

$$\|G_n - B_n\|_\infty$$

is measurable and converges in probability to 0. An improved version of this result, providing more detail on the rate of convergence, is the Komlós–Major–Tusnády approximation.

References

  1. Donsker, M. D. (1951). "An invariance principle for certain probability limit theorems". Memoirs of the American Mathematical Society (6). MR 0040613.
  2. Donsker, M. D. (1952). "Justification and extension of Doob's heuristic approach to the Kolmogorov–Smirnov theorems". Annals of Mathematical Statistics. 23 (2): 277–281. doi:10.1214/aoms/1177729445. MR 0047288. Zbl 0046.35103.
  3. Doob, Joseph L. (1949). "Heuristic approach to the Kolmogorov–Smirnov theorems". Annals of Mathematical Statistics. 20 (3): 393–403. doi:10.1214/aoms/1177729991. MR 0030732. Zbl 0035.08901.
  4. Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge University Press. ISBN 978-0-521-46102-3.