The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure. We collect observations and compute relative frequencies. We can estimate the underlying measure P, or a related distribution function F, by means of the empirical measure or the empirical distribution function, respectively. These are uniformly good estimates under certain conditions. Theorems in the area of empirical processes provide rates of this convergence.
Definition
Let X_1, X_2, \dots be a sequence of independent identically distributed random variables with values in the state space S with probability distribution P.
Definition

The empirical measure P_n is defined for measurable subsets A of S and given by

    P_n(A) = \frac{1}{n} \sum_{i=1}^n I_A(X_i) = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(A),

where I_A is the indicator function of A and \delta_X is the Dirac measure.
In particular, the empirical measure of A is simply the empirical mean of the indicator function, P_n(A) = P_n I_A.
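As a minimal sketch of this definition (the standard-normal samples and the set A = [0, 1] are illustrative assumptions, not from the text), the empirical measure of a set is just the fraction of samples that land in it:

```python
import random

def empirical_measure(samples, indicator):
    """P_n(A): the empirical mean of the indicator function of A."""
    return sum(indicator(x) for x in samples) / len(samples)

random.seed(0)
# X_1, ..., X_n i.i.d. standard normal (illustrative choice of P)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# A = [0, 1]; I_A is its indicator function
in_A = lambda x: 1 if 0.0 <= x <= 1.0 else 0
print(empirical_measure(samples, in_A))  # close to P(A) = Phi(1) - Phi(0), about 0.341
```

For n = 10,000 samples the printed value is within a few hundredths of the true measure of A.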
For a fixed measurable function f, P_n f = \frac{1}{n} \sum_{i=1}^n f(X_i) is a random variable with mean E f(X_1) and variance \frac{1}{n} \operatorname{Var} f(X_1).
By the strong law of large numbers, P_n(A) converges to P(A) almost surely for fixed A. Similarly, P_n f converges to E f(X_1) almost surely for a fixed measurable function f. The problem of uniform convergence of P_n to P over a class of sets or functions was open until Vapnik and Chervonenkis solved it in 1968.[1]
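A small simulation illustrates the pointwise convergence above (the uniform distribution on [0, 1] and f(x) = x² are illustrative assumptions, for which E f(X) = 1/3):

```python
import random

random.seed(1)
f = lambda x: x * x      # a fixed measurable function f
true_mean = 1.0 / 3.0    # E f(X_1) for X ~ Uniform[0, 1]

# One long i.i.d. sample; P_n f uses the first n observations
samples = [random.random() for _ in range(100_000)]
for n in (100, 1_000, 10_000, 100_000):
    Pn_f = sum(f(x) for x in samples[:n]) / n  # the empirical mean P_n f
    print(n, Pn_f, abs(Pn_f - true_mean))
```

The deviation |P_n f − E f| shrinks on the order of 1/√n, matching the variance formula Var(P_n f) = Var f(X_1)/n.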
If the class of sets \mathcal{C} (or of functions \mathcal{F}) is Glivenko–Cantelli with respect to P, then P_n converges to P uniformly over \mathcal{C} (or \mathcal{F}). In other words, with probability 1 we have

    \sup_{A \in \mathcal{C}} |P_n(A) - P(A)| \to 0, \qquad \sup_{f \in \mathcal{F}} |P_n f - P f| \to 0.
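For the classic Glivenko–Cantelli class of half-lines \{(-\infty, t]\}, the supremum above is the Kolmogorov–Smirnov distance between the empirical and true distribution functions. A sketch (uniform samples are an illustrative assumption, so the true CDF is F(t) = t on [0, 1]):

```python
import random

def sup_deviation(samples, true_cdf):
    """sup_t |F_n(t) - F(t)|, checked at the jump points of the empirical CDF F_n."""
    xs = sorted(samples)
    n = len(xs)
    # F_n jumps at each order statistic x; the supremum is attained
    # either just before a jump (F_n = i/n) or at it (F_n = (i+1)/n)
    return max(max(abs((i + 1) / n - true_cdf(x)),
                   abs(i / n - true_cdf(x)))
               for i, x in enumerate(xs))

random.seed(2)
for n in (100, 1_000, 10_000):
    samples = [random.random() for _ in range(n)]
    print(n, sup_deviation(samples, lambda t: t))  # shrinks roughly like 1/sqrt(n)
```

The printed suprema decrease with n, as the Glivenko–Cantelli theorem guarantees for this class.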
Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics, Vol. 63. Cambridge, UK: Cambridge University Press. ISBN 0-521-46102-2.