Law of the iterated logarithm

Figure: Plot of $S_n/n$ (red), its standard deviation $1/\sqrt{n}$ (blue), and its bound $\sqrt{2\log\log n/n}$ given by the LIL (green). Notice the way it randomly switches from the upper bound to the lower bound. Both axes are non-linearly transformed (as explained in the figure summary) to make this effect more visible.

In probability theory, the law of the iterated logarithm describes the magnitude of the fluctuations of a random walk. The original statement of the law of the iterated logarithm is due to A. Ya. Khinchin (1924). [1] Another statement was given by A. N. Kolmogorov in 1929. [2]

Statement

Let $\{Y_n\}$ be independent, identically distributed random variables with zero means and unit variances. Let $S_n = Y_1 + \cdots + Y_n$. Then

$$\limsup_{n\to\infty} \frac{S_n}{\sqrt{2n\log\log n}} = 1 \quad \text{a.s.},$$

where "log" is the natural logarithm, "lim sup" denotes the limit superior, and "a.s." stands for "almost surely". [3] [4]

Another statement given by A. N. Kolmogorov in 1929 [5] is as follows.

Let $\{X_n\}$ be independent random variables with zero means and finite variances. Let $S_n = X_1 + \cdots + X_n$ and $B_n = \operatorname{Var}(S_n) = \sum_{k=1}^{n} \operatorname{Var}(X_k)$. If $B_n \to \infty$ and there exists a sequence of positive constants $\{M_n\}$ such that $|X_n| \le M_n$ a.s. and

$$M_n = o\!\left(\sqrt{\frac{B_n}{\log\log B_n}}\right),$$

then we have

$$\limsup_{n\to\infty} \frac{S_n}{\sqrt{2 B_n \log\log B_n}} = 1 \quad \text{a.s.}$$

Note that the first statement covers the case of the standard normal distribution, but the second does not.
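
For instance (a worked check we add here, not part of the original article): for Rademacher increments, $X_n = \pm 1$ with probability $1/2$ each, one has $B_n = n$ and may take $M_n = 1$ for all $n$. Since

$$1 = o\!\left(\sqrt{\frac{n}{\log\log n}}\right),$$

Kolmogorov's hypothesis holds, and his theorem recovers the first statement in this special case. For standard normal increments, by contrast, no finite constants $M_n$ can bound $|X_n|$ almost surely, so the hypothesis fails even though the conclusion still holds by the first statement.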

Discussion

The law of the iterated logarithm operates "in between" the law of large numbers and the central limit theorem. There are two versions of the law of large numbers, the weak and the strong, and they both state that the sums $S_n$, scaled by $n^{-1}$, converge to zero, respectively in probability and almost surely:

$$\frac{S_n}{n}\ \xrightarrow{\ p\ }\ 0, \qquad \frac{S_n}{n}\ \xrightarrow{\ \text{a.s.}\ }\ 0, \qquad \text{as } n\to\infty.$$

On the other hand, the central limit theorem states that the sums $S_n$ scaled by the factor $n^{-1/2}$ converge in distribution to a standard normal distribution. By Kolmogorov's zero–one law, for any fixed $M$, the probability that the event

$$\limsup_{n\to\infty} \frac{S_n}{\sqrt{n}} \geq M$$

occurs is 0 or 1. Then

$$\Pr\!\left(\limsup_{n\to\infty} \frac{S_n}{\sqrt{n}} \geq M\right) \geq \limsup_{n\to\infty} \Pr\!\left(\frac{S_n}{\sqrt{n}} \geq M\right) = \Pr\big(\mathcal{N}(0,1) \geq M\big) > 0,$$

so

$$\limsup_{n\to\infty} \frac{S_n}{\sqrt{n}} = \infty \quad \text{a.s.}$$

An identical argument shows that

$$\liminf_{n\to\infty} \frac{S_n}{\sqrt{n}} = -\infty \quad \text{a.s.}$$

This implies that these quantities cannot converge almost surely. In fact, they cannot even converge in probability, which follows from the equality

$$\frac{S_{2n}}{\sqrt{2n}} - \frac{S_n}{\sqrt{n}} = \frac{1}{\sqrt{2}}\,\frac{S_{2n}-S_n}{\sqrt{n}} - \left(1 - \frac{1}{\sqrt{2}}\right)\frac{S_n}{\sqrt{n}}$$

and the fact that the random variables

$$\frac{S_n}{\sqrt{n}} \qquad \text{and} \qquad \frac{S_{2n}-S_n}{\sqrt{n}}$$

are independent and both converge in distribution to $\mathcal{N}(0,1)$.

The law of the iterated logarithm provides the scaling factor where the two limits become different:

$$\limsup_{n\to\infty} \frac{S_n}{\sqrt{2n\log\log n}} = 1 \quad \text{a.s.}, \qquad \liminf_{n\to\infty} \frac{S_n}{\sqrt{2n\log\log n}} = -1 \quad \text{a.s.}$$

Thus, although the absolute value of the quantity $S_n/\sqrt{2n\log\log n}$ is less than any predefined $\varepsilon > 0$ with probability approaching one, it will nevertheless almost surely be greater than $\varepsilon$ infinitely often; in fact, the quantity will be visiting the neighborhoods of any point in the interval $(-1,1)$ almost surely.
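
The following short simulation (ours, not from the article) makes the contrast concrete: along a single path, $S_n/\sqrt{n}$ keeps fluctuating on a unit scale without settling, while the LIL-normalized sums wander inside $(-1,1)$; over a much longer horizon they would approach the bounds $\pm 1$ infinitely often.

```python
import numpy as np

rng = np.random.default_rng(1)

# One long +/-1 random walk, inspected at geometrically spaced times n = 2^k.
N = 2**22
S = np.cumsum(rng.choice([-1.0, 1.0], size=N))

ns = 2 ** np.arange(4, 23)
clt_scale = S[ns - 1] / np.sqrt(ns)                           # S_n / sqrt(n)
lil_scale = S[ns - 1] / np.sqrt(2 * ns * np.log(np.log(ns)))  # LIL scaling

# The CLT-scaled values keep oscillating; the LIL-scaled values stay in a band.
for n, c, l in zip(ns, clt_scale, lil_scale):
    print(f"n = {n:>8d}   S_n/sqrt(n) = {c:+.2f}   "
          f"S_n/sqrt(2n log log n) = {l:+.2f}")
```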

Figure: Exhibition of limit theorems and their interrelationship.

Generalizations and variants

The law of the iterated logarithm (LIL) for a sum of independent and identically distributed (i.i.d.) random variables with zero mean and bounded increment dates back to Khinchin and Kolmogorov in the 1920s.

Since then, there has been a tremendous amount of work on the LIL for various kinds of dependent structures and for stochastic processes. The following is a small sample of notable developments.

Hartman and Wintner (1940) generalized the LIL to random walks whose increments have zero mean and finite variance. De Acosta (1983) gave a simple proof of the Hartman–Wintner version of the LIL. [6]

Chung (1948) proved another version of the law of the iterated logarithm for the absolute value of a Brownian motion. [7]

Strassen (1964) studied the LIL from the point of view of invariance principles. [8]

Stout (1970) generalized the LIL to stationary ergodic martingales. [9]

Wittmann (1985) generalized the Hartman–Wintner version of the LIL to random walks satisfying milder conditions. [10]

Vovk (1987) derived a version of LIL valid for a single chaotic sequence (Kolmogorov random sequence). [11] This is notable, as it is outside the realm of classical probability theory.

Yongge Wang (1996) showed that the law of the iterated logarithm also holds for polynomial-time pseudorandom sequences. [12] [13] A Java-based software testing tool based on this work tests whether a pseudorandom generator outputs sequences that satisfy the LIL; the idea is sketched below.
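
The idea behind such a test can be sketched as follows (this is our illustrative sketch, not the actual tool: the function name, the threshold, and the pass/fail rule are our assumptions): map the generator's output bits to $\pm 1$, form partial sums, and flag generators whose LIL-normalized sums drift far outside the predicted band.

```python
import math
import random

def lil_statistic(bits):
    """Return max over n >= 16 of |S_n| / sqrt(2 n log log n) for the bit prefix."""
    s, worst = 0.0, 0.0
    for n, b in enumerate(bits, start=1):
        s += 1.0 if b else -1.0
        if n >= 16:  # ensure log(log(n)) > 0
            worst = max(worst, abs(s) / math.sqrt(2 * n * math.log(math.log(n))))
    return worst

# Illustrative check on Python's Mersenne Twister output (the threshold is
# a heuristic of ours, not the published test's calibration).
rng = random.Random(42)
bits = [rng.getrandbits(1) for _ in range(1_000_000)]
stat = lil_statistic(bits)
print(f"LIL statistic: {stat:.3f} -> {'suspicious' if stat > 1.5 else 'plausible'}")
```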

Balsubramani (2014) proved a non-asymptotic LIL that holds over finite-time martingale sample paths. [14] This subsumes the martingale LIL as it provides matching finite-sample concentration and anti-concentration bounds, and enables sequential testing [15] and other applications. [16]
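
As a stylized illustration of how a finite-time LIL enables sequential testing (the boundary shape below is LIL-flavored, but the constant and the exact form are our placeholders, not the paper's calibrated bound): observe coin flips one at a time and reject fairness as soon as the partial sum crosses a boundary that grows like $\sqrt{n \log\log n}$.

```python
import math
import random

def lil_boundary(n, delta=0.05, c=1.7):
    """Stylized LIL-shaped boundary ~ sqrt(n (log log n + log(1/delta))).

    The constant c and the exact form are illustrative placeholders; the
    paper derives calibrated constants making the crossing probability at
    most delta simultaneously over all n.
    """
    return c * math.sqrt(n * (math.log(math.log(max(n, 3))) + math.log(1.0 / delta)))

def sequential_fairness_test(sample_bit, max_n=1_000_000):
    """Flip coins until |S_n| crosses the boundary (reject) or max_n is hit."""
    s = 0
    for n in range(1, max_n + 1):
        s += 1 if sample_bit() else -1
        if n >= 3 and abs(s) > lil_boundary(n):
            return "reject fairness", n
    return "no rejection", max_n

rng = random.Random(0)
print(sequential_fairness_test(lambda: rng.random() < 0.55))  # biased coin
print(sequential_fairness_test(lambda: rng.random() < 0.50))  # fair coin
```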

Notes

  1. A. Khinchine. "Über einen Satz der Wahrscheinlichkeitsrechnung", Fundamenta Mathematicae 6 (1924): pp. 9–20 (The author's name is shown here in an alternate transliteration.)
  2. A. Kolmogoroff. "Über das Gesetz des iterierten Logarithmus". Mathematische Annalen, 101: 126–135, 1929. (At the Göttinger DigitalisierungsZentrum web site)
  3. Leo Breiman. Probability. Original edition published by Addison-Wesley, 1968; reprinted by Society for Industrial and Applied Mathematics, 1992. (See Sections 3.9, 12.9, and 12.10; Theorem 3.52 specifically.)
  4. R. Durrett. Probability: Theory and Examples. Fourth edition published by Cambridge University Press in 2010. (See Theorem 8.8.3.)
  5. A. Kolmogoroff. "Über das Gesetz des iterierten Logarithmus". Mathematische Annalen, 101: 126–135, 1929. (At the Göttinger DigitalisierungsZentrum web site)
  6. A. de Acosta: "A New Proof of the Hartman-Wintner Law of the Iterated Logarithm". Ann. Probab., 1983.
  7. Chung, Kai-lai (1948). "On the maximum partial sums of sequences of independent random variables". Trans. Am. Math. Soc. 61: 205–233.
  8. V. Strassen: "An invariance principle for the law of the iterated logarithm". Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 1964.
  9. W. F. Stout: "The Hartman-Wintner Law of the Iterated Logarithm for Martingales". Ann. Math. Statist., 1970.
  10. R. Wittmann: "A general law of iterated logarithm". Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 1985.
  11. V. Vovk: "The Law of the Iterated Logarithm for Random Kolmogorov, or Chaotic, Sequences". Theory Probab. Appl., 1987.
  12. Y. Wang: "The law of the iterated logarithm for p-random sequences". In: Proc. 11th IEEE Conference on Computational Complexity (CCC), pages 180–189. IEEE Computer Society Press, 1996.
  13. Y. Wang: Randomness and Complexity. PhD thesis, 1996.
  14. A. Balsubramani: "Sharp finite-time iterated-logarithm martingale concentration". arXiv:1405.2639.
  15. A. Balsubramani and A. Ramdas: "Sequential nonparametric testing with the law of the iterated logarithm". 32nd Conference on Uncertainty in Artificial Intelligence (UAI).
  16. C. Daskalakis and Y. Kawase: "Optimal Stopping Rules for Sequential Hypothesis Testing". In 25th Annual European Symposium on Algorithms (ESA 2017). Schloss Dagstuhl–Leibniz-Zentrum für Informatik.
