Kolmogorov's three-series theorem

In probability theory, Kolmogorov's Three-Series Theorem, named after Andrey Kolmogorov, gives a criterion for the almost sure convergence of an infinite series of random variables in terms of the convergence of three different series involving properties of their probability distributions. Kolmogorov's three-series theorem, combined with Kronecker's lemma, can be used to give a relatively easy proof of the Strong Law of Large Numbers. [1]
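
To indicate how, here is a minimal sketch under a simplifying assumption of my own (finite variance; the general finite-mean case, as in [1], needs an extra truncation step): let $X_1, X_2, \ldots$ be i.i.d. with $\mathbb{E}[X_1] = 0$ and $\mathrm{Var}(X_1) = \sigma^2 < \infty$. Applying the three-series theorem with $A = 1$ to the scaled variables $X_n/n$, the terms of all three series are bounded by $\sigma^2/n^2$ (Chebyshev's inequality for the first series, the bound $|x| \leq x^2$ on $\{|x| > 1\}$ for the second), so all three converge. Kronecker's lemma with $b_n = n$ then yields the strong law:

$$\sum_{n=1}^{\infty} \frac{X_n}{n} \ \text{converges a.s.} \quad \Longrightarrow \quad \frac{1}{n} \sum_{k=1}^{n} X_k \ \xrightarrow{\ \text{a.s.}\ } \ 0 .$$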

Statement of the theorem

Let $(X_n)_{n \in \mathbb{N}}$ be independent random variables. The random series $\sum_{n=1}^{\infty} X_n$ converges almost surely in $\mathbb{R}$ if the following conditions hold for some $A > 0$, and only if the following conditions hold for every $A > 0$:

  1. $\sum_{n=1}^{\infty} \mathbb{P}(|X_n| > A)$ converges.
  2. Let $Y_n = X_n \mathbf{1}_{\{|X_n| \leq A\}}$. Then $\sum_{n=1}^{\infty} \mathbb{E}[Y_n]$, the series of expected values of $Y_n$, converges.
  3. $\sum_{n=1}^{\infty} \mathrm{Var}(Y_n)$ converges, where $Y_n$ is defined as in the second condition.
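
The three conditions can be checked in closed form for simple families of summands. Below is a small Python sketch (my own illustration; the function names are mine, not from any reference) for the random-sign family $X_n = \pm n^{-p}$ with fair, independent signs, computing partial sums of the three series for a given cutoff $A$:

```python
# Illustrative sketch (my own, not from the article): for X_n = ±(1/n**p)
# with fair, independent signs, the terms of the three series have closed
# forms, so the conditions can be inspected numerically for a cutoff A.

def three_series_terms(n: int, p: float, A: float = 1.0):
    """Terms of the three series for X_n = ±(1/n**p) with fair signs."""
    x = n ** (-p)
    tail = 1.0 if x > A else 0.0       # P(|X_n| > A): all mass sits at ±x
    mean = 0.0                         # E[Y_n] = 0 by symmetry of the sign
    var = x * x if x <= A else 0.0     # Var(Y_n) = x^2 once |X_n| <= A
    return tail, mean, var

def partial_sums(p: float, N: int = 100_000, A: float = 1.0):
    s1 = s2 = s3 = 0.0
    for n in range(1, N + 1):
        t, m, v = three_series_terms(n, p, A)
        s1 += t; s2 += m; s3 += v
    return s1, s2, s3

# p = 1 (random-sign harmonic series): the variance series tends to
# pi^2/6 ~ 1.64493, so all three series converge.
print(partial_sums(p=1.0))
# p = 1/2: the variance series is the harmonic series itself and keeps
# growing (about ln N + 0.5772), so condition (iii) fails.
print(partial_sums(p=0.5))
```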

Proof

Sufficiency of conditions ("if")

Condition (i) and the Borel–Cantelli lemma give that, almost surely, $|X_n| \leq A$ and hence $X_n = Y_n$ for all sufficiently large $n$. Hence $\sum_{n=1}^{\infty} X_n$ converges if and only if $\sum_{n=1}^{\infty} Y_n$ converges. Conditions (ii)–(iii) and Kolmogorov's two-series theorem give the almost sure convergence of $\sum_{n=1}^{\infty} Y_n$.

Necessity of conditions ("only if")

Suppose that $\sum_{n=1}^{\infty} X_n$ converges almost surely.

Without condition (i), by the second Borel–Cantelli lemma there would exist some $A > 0$ such that $|X_n| > A$ for infinitely many values of $n$, almost surely. But then the terms $X_n$ would not tend to zero, so the series $\sum_{n=1}^{\infty} X_n$ would diverge. Therefore, we must have condition (i).

We see that condition (iii) implies condition (ii): by condition (iii), Kolmogorov's two-series theorem applied to the centred variables $Y_n - \mathbb{E}[Y_n]$ gives the almost sure convergence of $\sum_{n=1}^{\infty} (Y_n - \mathbb{E}[Y_n])$, while condition (i) in the case $A = 1$ gives the almost sure convergence of $\sum_{n=1}^{\infty} Y_n$. Hence $\sum_{n=1}^{\infty} \mathbb{E}[Y_n] = \sum_{n=1}^{\infty} Y_n - \sum_{n=1}^{\infty} (Y_n - \mathbb{E}[Y_n])$ converges, so condition (ii) is implied.

Thus, it only remains to demonstrate the necessity of condition (iii), and we will have obtained the full result. It is equivalent to check condition (iii) for the symmetrized series $\sum_{n=1}^{\infty} Z_n$, where $Z_n = Y_n - Y_n'$ and, for each $n$, $Y_n$ and $Y_n'$ are IID — that is, to employ the assumption that $\mathbb{E}[Z_n] = 0$ — since $(Z_n)$ is a sequence of random variables bounded by $2$, converging almost surely, and with $\mathrm{Var}(Z_n) = 2\,\mathrm{Var}(Y_n)$. So we wish to check that if $\sum_{n=1}^{\infty} Z_n$ converges, then $\sum_{n=1}^{\infty} \mathrm{Var}(Z_n)$ converges as well. This is a special case of a more general result from martingale theory with summands equal to the increments of a martingale sequence and the same conditions ($\mathbb{E}[Z_n] = 0$, the series of the variances converging, and the summands bounded). [2] [3] [4]
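
For completeness, the elementary symmetrization identities used here are (a standard computation, recorded for the reader):

$$\mathbb{E}[Z_n] = \mathbb{E}[Y_n] - \mathbb{E}[Y_n'] = 0, \qquad |Z_n| \leq 2, \qquad \mathrm{Var}(Z_n) = \mathrm{Var}(Y_n) + \mathrm{Var}(Y_n') = 2\,\mathrm{Var}(Y_n),$$

so $\sum_{n} \mathrm{Var}(Z_n)$ converges exactly when $\sum_{n} \mathrm{Var}(Y_n)$ does, and $\sum_{n} Z_n$ converges almost surely because the two independent copies $\sum_{n} Y_n$ and $\sum_{n} Y_n'$ both do.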

Example

As an illustration of the theorem, consider the example of the harmonic series with random signs:

$$\sum_{n=1}^{\infty} \pm \frac{1}{n}.$$

Here, "$\pm$" means that each term $\frac{1}{n}$ is taken with a random sign that is either $+1$ or $-1$ with respective probabilities $\tfrac{1}{2}$, $\tfrac{1}{2}$, and all random signs are chosen independently. Let $X_n$ in the theorem denote a random variable that takes the values $\frac{1}{n}$ and $-\frac{1}{n}$ with equal probabilities. With $A = 1$ the summands of the first two series are identically zero and $\mathrm{Var}(Y_n) = \frac{1}{n^2}$. Since $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, the conditions of the theorem are satisfied, so it follows that the harmonic series with random signs converges almost surely. On the other hand, the analogous series of (for example) square-root reciprocals with random signs, namely

$$\sum_{n=1}^{\infty} \pm \frac{1}{\sqrt{n}},$$

diverges almost surely, since condition (iii) in the theorem is not satisfied for any $A$: once $\frac{1}{\sqrt{n}} \leq A$, we have $Y_n = X_n$ and $\mathrm{Var}(Y_n) = \frac{1}{n}$, and $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges. Note that this is different from the behavior of the analogous series with alternating signs, $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$, which does converge (by the alternating series test).
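
A quick simulation makes the dichotomy visible. The sketch below is my own illustration (almost-sure convergence cannot, of course, be verified empirically): it samples a few paths of partial sums of $\sum \pm n^{-p}$, which stabilize for $p = 1$ and keep wandering for $p = \tfrac{1}{2}$.

```python
# Simulation sketch (my own illustration): fix a seed per path so that
# increasing N extends the same random path; compare p = 1 and p = 1/2.
import random

def partial_sum(p: float, N: int, seed: int) -> float:
    rng = random.Random(seed)
    s = 0.0
    for n in range(1, N + 1):
        sign = 1.0 if rng.random() < 0.5 else -1.0
        s += sign / n ** p
    return s

for p in (1.0, 0.5):
    for N in (10_000, 100_000, 1_000_000):
        paths = [partial_sum(p, N, seed) for seed in range(3)]
        print(f"p={p}, N={N}:", ", ".join(f"{s:+.3f}" for s in paths))
# For p = 1, each seeded path changes very little as N grows (a random
# limit exists); for p = 1/2, the partial sums have variance ~ ln N, so
# their typical magnitude keeps growing like sqrt(log N).
```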

Notes

  1. Durrett, Rick. "Probability: Theory and Examples." Duxbury Advanced Series, 3rd ed., Thomson Brooks/Cole, 2005, Section 1.8, pp. 60–69.
  2. Sun, Rongfeng. Lecture notes. http://www.math.nus.edu.sg/~matsr/ProbI/Lecture4.pdf Archived 2018-04-17 at the Wayback Machine.
  3. Loève, M. "Probability Theory." Princeton Univ. Press, 1963, Sect. 16.3.
  4. Feller, W. "An Introduction to Probability Theory and Its Applications," Vol. 2, Wiley, 1971, Sect. IX.9.
