Random sequence

Fortuna, Goddess of chance depicted by Tadeusz Kuntze, 1754 (National Museum in Warsaw).

The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables, and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in 1951: "A random sequence is a vague notion... in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests traditional with statisticians". [1]


Axiomatic probability theory deliberately avoids a definition of a random sequence. [2] Traditional probability theory does not state whether a specific sequence is random, but generally proceeds to discuss the properties of random variables and stochastic sequences assuming some definition of randomness. The Bourbaki school considered the statement "let us consider a random sequence" an abuse of language. [3]

Early history

Émile Borel was one of the first mathematicians to formally address randomness in 1909. [4] In 1919 Richard von Mises gave the first definition of algorithmic randomness, which was inspired by the law of large numbers, although he used the term collective rather than random sequence. Using the concept of the impossibility of a gambling system, von Mises defined an infinite sequence of zeros and ones as random if it is not biased, i.e. it has the frequency stability property (the frequency of zeros goes to 1/2), and every sub-sequence we can select from it by a "proper" method of selection is also not biased. [5]
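
Von Mises' frequency stability property concerns the limit of relative frequencies in an infinite sequence; the minimal Python sketch below only checks a finite prefix against an arbitrary tolerance, so it is merely suggestive. The function names and the tolerance value are illustrative choices, not part of von Mises' definition.

    import random

    def frequency_of_zeros(prefix):
        """Relative frequency of 0s in a finite 0/1 prefix."""
        return prefix.count(0) / len(prefix)

    def looks_frequency_stable(prefix, tolerance=0.01):
        """Crude finite proxy for 'the frequency of zeros goes to 1/2'."""
        return abs(frequency_of_zeros(prefix) - 0.5) < tolerance

    prefix = [random.randint(0, 1) for _ in range(100_000)]
    print(looks_frequency_stable(prefix))   # typically True for fair coin flips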

The sub-sequence selection criterion imposed by von Mises is important, because although 0101010101... is not biased, by selecting the odd positions we get 000000..., which is not random. Von Mises never totally formalized his definition of a proper selection rule for sub-sequences, but in 1940 Alonzo Church defined it as any recursive function which, having read the first N elements of the sequence, decides if it wants to select element number N + 1. Church was a pioneer in the field of computable functions, and the definition he made relied on the Church–Turing thesis for computability. [6] This definition is often called Mises–Church randomness.
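
The monotone character of a Church-style selection rule can be made concrete with a small sketch: a rule is a computable predicate that sees only the first N elements and decides whether element N + 1 is selected. The two example rules below (select the odd positions, select the element after each zero) are illustrative and are not taken from Church's paper.

    def apply_selection_rule(sequence, rule):
        """Sub-sequence picked out by a monotone (Church-style) selection rule."""
        selected = []
        for n in range(len(sequence)):
            if rule(sequence[:n]):            # the rule sees only the first n elements...
                selected.append(sequence[n])  # ...and decides about element n + 1
        return selected

    select_odd_positions = lambda prefix: len(prefix) % 2 == 0
    select_after_a_zero = lambda prefix: len(prefix) > 0 and prefix[-1] == 0

    seq = [0, 1] * 10                         # the biased example 0101010101...
    print(apply_selection_rule(seq, select_odd_positions))  # all zeros: clearly biased
    print(apply_selection_rule(seq, select_after_a_zero))   # all ones: also biased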

Modern approaches

During the 20th century various technical approaches to defining random sequences were developed, and now three distinct paradigms can be identified. In the mid-1960s, A. N. Kolmogorov and D. W. Loveland independently proposed a more permissive selection rule. [7] [8] In their view Church's recursive function definition was too restrictive in that it read the elements in order. Instead they proposed a rule based on a partially computable process which, having read any N elements of the sequence, decides if it wants to select another element which has not been read yet. This definition is often called Kolmogorov–Loveland stochasticity. But this method was considered too weak by Alexander Shen, who showed that there is a Kolmogorov–Loveland stochastic sequence which does not conform to the general notion of randomness.
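
A rough sketch of what such a non-monotonic selection process looks like on a finite sequence: at each step it names a position it has not inspected yet and decides, before reading the bit there, whether that bit joins the selected sub-sequence. The concrete zig-zag reading order and the "keep every other step" policy below are purely illustrative.

    def nonmonotonic_select(sequence, next_position, keep):
        """Selected sub-sequence of a finite 0/1 sequence under a non-monotonic rule."""
        seen = {}                                  # position -> bit, in reading order
        selected = []
        while True:
            pos = next_position(seen)              # may jump around freely
            if pos is None or pos in seen or not 0 <= pos < len(sequence):
                break
            if keep(seen, pos):                    # decided before the bit is read
                selected.append(sequence[pos])
            seen[pos] = sequence[pos]
        return selected

    # Read positions 0, 19, 1, 18, 2, 17, ... and keep the bits read at even-numbered steps.
    zigzag = lambda seen: len(seen) // 2 if len(seen) % 2 == 0 else 19 - len(seen) // 2
    keep_even_steps = lambda seen, pos: len(seen) % 2 == 0

    print(nonmonotonic_select([0, 1] * 10, zigzag, keep_even_steps))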

In 1966 Per Martin-Löf introduced a new notion which is now generally considered the most satisfactory notion of algorithmic randomness. His original definition involved measure theory, but it was later shown that it can be expressed in terms of Kolmogorov complexity. Kolmogorov's definition of a random string was that it is random if it has no description shorter than itself via a universal Turing machine. [9]
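
Kolmogorov complexity is not computable, so it cannot be evaluated directly; a common illustrative stand-in is the output length of a general-purpose compressor, which upper-bounds the length of one particular kind of description. The sketch below uses Python's zlib in that spirit; it is a crude proxy for the definition, not the definition itself.

    import random
    import zlib

    def compressed_length(bits):
        """Bytes needed by zlib to describe an ASCII 0/1 string (an upper-bound proxy)."""
        return len(zlib.compress(bits.encode("ascii"), 9))

    patterned = "01" * 5_000                               # the sequence 0101...01
    coin_flips = "".join(random.choice("01") for _ in range(10_000))

    print(compressed_length(patterned))    # far below the string length: a short description exists
    print(compressed_length(coin_flips))   # much larger: no short description is found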

Three basic paradigms for dealing with random sequences have now emerged: [10]

  • The frequency / measure-theoretic approach. This approach started with the work of Richard von Mises and Alonzo Church. In the 1960s Per Martin-Löf noticed that the sets coding such frequency-based stochastic properties are a special kind of measure zero sets, and that a more general and smooth definition can be obtained by considering all effectively measure zero sets.
  • The complexity / compressibility approach. This paradigm was championed by A. N. Kolmogorov along with contributions from Leonid Levin and Gregory Chaitin. For finite sequences, Kolmogorov defined randomness of a string of zeros and ones of length K in terms of its entropy, i.e. Kolmogorov complexity: if the complexity of the string is close to K it is very random, and if the complexity is far below K it is not so random.
  • The predictability approach. This paradigm was due to Claus P. Schnorr and uses a slightly different definition of constructive martingales than the martingales used in traditional probability theory. [11] Schnorr showed how the existence of a selective betting strategy implied the existence of a selection rule for a biased sub-sequence. If one only requires a recursive martingale to succeed on a sequence, instead of requiring that it succeed constructively, then one gets the concept of recursive randomness. Yongge Wang showed [12] [13] that the recursive randomness concept is different from Schnorr's randomness concept. A minimal betting-strategy sketch follows this list.
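
The betting-strategy view mentioned in the last item can be illustrated as follows (this conveys only the flavour of the martingale approach, not Schnorr's constructive definition): a gambler starts with capital 1, splits the current capital between the two possible next bits, the stake on the bit that actually appears is doubled, and the rest is lost. A strategy succeeds if its capital grows without bound; the "bet that the bits alternate" strategy below is an illustrative choice.

    import random

    def run_martingale(sequence, bet_on_one):
        """Capital after betting through a finite 0/1 sequence, starting from capital 1.

        bet_on_one(history) is the fraction of capital staked on the next bit being 1;
        the remainder is staked on 0, and the winning stake is doubled (a fair game).
        """
        capital, history = 1.0, []
        for bit in sequence:
            p = bet_on_one(history)
            capital *= 2 * (p if bit == 1 else 1 - p)
            history.append(bit)
        return capital

    # Bet heavily that the next bit differs from the last one seen.
    alternate = lambda h: 0.5 if not h else (0.9 if h[-1] == 0 else 0.1)

    print(run_martingale([0, 1] * 20, alternate))                                # grows exponentially on 0101...
    print(run_martingale([random.randint(0, 1) for _ in range(40)], alternate))  # typically stays small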

In most cases, theorems relating the three paradigms (often equivalence) have been proven. [14]

It is important to realize that for each of the definitions given above for infinite sequences, if one adds a billion zeros to the front of the random sequence the new sequence will still be considered random. Hence any application of these concepts to practical problems needs to be performed with care. [15]


Related Research Articles

Kolmogorov complexity

In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963.

Complexity characterises the behaviour of a system or model whose components interact in multiple ways and follow local rules, meaning there is no reasonable higher instruction to define the various possible interactions.

Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of these outcomes is called an event. Central subjects in probability theory include discrete and continuous random variables, probability distributions, and stochastic processes, which provide mathematical abstractions of non-deterministic or uncertain processes or measured quantities that may either be single occurrences or evolve over time in a random fashion. Although it is not possible to perfectly predict random events, much can be said about their behavior. Two major results in probability theory describing such behaviour are the law of large numbers and the central limit theorem.

Stochastic process

In probability theory and related fields, a stochastic or random process is a mathematical object usually defined as a family of random variables. Historically, the random variables were associated with or indexed by a set of numbers, usually viewed as points in time, giving the interpretation of a stochastic process representing numerical values of some system randomly changing over time, such as the growth of a bacterial population, an electrical current fluctuating due to thermal noise, or the movement of a gas molecule. Stochastic processes are widely used as mathematical models of systems and phenomena that appear to vary in a random manner. They have applications in many disciplines such as biology, chemistry, ecology, neuroscience, physics, image processing, signal processing, control theory, information theory, computer science, cryptography and telecommunications. Furthermore, seemingly random changes in financial markets have motivated the extensive use of stochastic processes in finance.

Ray Solomonoff was the inventor of algorithmic probability, his General Theory of Inductive Inference, and was a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.

Solomonoff's theory of inductive inference is a mathematical theory of induction introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory.

Law of the iterated logarithm

In probability theory, the law of the iterated logarithm describes the magnitude of the fluctuations of a random walk. The original statement of the law of the iterated logarithm is due to A. Ya. Khinchin (1924). Another statement was given by A. N. Kolmogorov in 1929.
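
In the simplest setting of a symmetric random walk S_n = X_1 + ... + X_n built from independent ±1 steps (notation introduced here only for illustration), the law takes the standard form

    \limsup_{n \to \infty} \frac{S_n}{\sqrt{2\, n \log \log n}} = 1 \qquad \text{almost surely.}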

Per Martin-Löf

Per Erik Rutger Martin-Löf is a Swedish logician, philosopher, and mathematical statistician. He is internationally renowned for his work on the foundations of probability, statistics, mathematical logic, and computer science. Since the late 1970s, Martin-Löf's publications have been mainly in logic. In philosophical logic, Martin-Löf has wrestled with the philosophy of logical consequence and judgment, partly inspired by the work of Brentano, Frege, and Husserl. In mathematical logic, Martin-Löf has been active in developing intuitionistic type theory as a constructive foundation of mathematics; Martin-Löf's work on type theory has influenced computer science.

Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects, such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."

Intuitively, an algorithmically random sequence is a sequence of binary digits that appears random to any algorithm running on a universal Turing machine. The notion can be applied analogously to sequences on any finite alphabet. Random sequences are key objects of study in algorithmic information theory.

In 1973 Kolmogorov proposed a non-probabilistic approach to statistics and model selection. Let each datum be a finite binary string and a model be a finite set of binary strings. Consider model classes consisting of models of given maximal Kolmogorov complexity. The Kolmogorov structure function of an individual data string expresses the relation between the complexity level constraint on a model class and the least log-cardinality of a model in the class containing the data. The structure function determines all stochastic properties of the individual data string: for every constrained model class it determines the individual best-fitting model in the class, irrespective of whether the true model is in the model class considered or not. In the classical case we talk about a set of data with a probability distribution, and the properties are those of the expectations. In contrast, here we deal with individual data strings, and the properties of the individual string are the focus. In this setting, a property holds with certainty rather than with high probability as in the classical case. The Kolmogorov structure function precisely quantifies the goodness-of-fit of an individual model with respect to individual data.
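
In the notation commonly used in the literature (written out here only to make the verbal description concrete), the structure function of a string x maps a complexity budget \alpha to the least log-cardinality of a finite set containing x whose own complexity stays within the budget:

    h_x(\alpha) = \min_{S} \{\, \log_2 |S| : x \in S,\ K(S) \le \alpha \,\}

where K(S) denotes the Kolmogorov complexity of the finite set S.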

Randomness

In the common parlance, randomness is the apparent lack of pattern or predictability in events. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual random events are by definition unpredictable, but since they often follow a probability distribution, the frequency of different outcomes over numerous events is predictable. For example, when throwing two dice, the outcome of any particular roll is unpredictable, but a sum of 7 will occur twice as often as 4. In this view, randomness is a measure of uncertainty of an outcome, rather than its haphazardness, and applies to concepts of chance, probability, and information entropy.
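
The dice figure can be checked by direct enumeration; the short sketch below counts ordered pairs and is an illustrative calculation rather than part of the article above.

    from itertools import product

    pairs = list(product(range(1, 7), repeat=2))   # the 36 equally likely ordered outcomes
    print(sum(a + b == 7 for a, b in pairs))       # 6 ways, so P(7) = 6/36 = 1/6
    print(sum(a + b == 4 for a, b in pairs))       # 3 ways, so P(4) = 3/36 = 1/12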

This page lists articles related to probability theory. In particular, it lists many articles corresponding to specific probability distributions. Such articles are marked here by a code of the form (X:Y), which refers to the number of random variables involved and the type of the distribution. For example (2:DC) indicates a distribution with two random variables, discrete or continuous. Other codes are just abbreviations for topics. The list of codes can be found in the table of contents.

Simplicity theory is a cognitive theory that seeks to explain the attractiveness of situations or events to human minds. It is based on work done by scientists such as behavioural scientist Nick Chater, computer scientist Paul Vitanyi, psychologist Jacob Feldman, and artificial intelligence researchers Jean-Louis Dessalles and Jürgen Schmidhuber. It claims that interesting situations appear simpler than expected to the observer.

Impossibility of a gambling system

The principle of the impossibility of a gambling system is a concept in probability. It states that in a random sequence, the methodical selection of subsequences does not change the probability of specific elements. The first mathematical demonstration is attributed to Richard von Mises.

History of randomness

In ancient history, the concepts of chance and randomness were intertwined with that of fate. Many ancient peoples threw dice to determine fate, and this later evolved into games of chance. At the same time, most ancient cultures used various methods of divination to attempt to circumvent randomness and fate.

Yongge Wang is a computer science professor at the University of North Carolina at Charlotte specializing in algorithmic complexity and cryptography. He is the inventor of IEEE P1363 cryptographic standards SRP5 and WANG-KE and has contributed to the mathematical theory of algorithmic randomness. He co-authored a paper demonstrating that a recursively enumerable real number is an algorithmically random sequence if and only if it is a Chaitin's constant for some encoding of programs. He also showed the separation of Schnorr randomness from recursive randomness. He also invented a distance based statistical testing technique to improve NIST SP800-22 testing in randomness tests. In cryptographic research, he is known for the invention of the quantum resistant random linear code based encryption scheme RLCE.

The phenomenon of statistical stability, one of the most surprising physical phenomena, is the weak dependence of statistics on the sample size once that size is large. This effect is typical, for example, of relative frequencies of mass events and of averages. The phenomenon is widespread and so can be regarded as a fundamental natural phenomenon.
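
A minimal fair-coin simulation illustrates this stability; the sample sizes are arbitrary choices for the sketch.

    import random

    def head_frequency(n):
        """Relative frequency of heads in n fair-coin flips."""
        return sum(random.randint(0, 1) for _ in range(n)) / n

    for n in (10, 100, 10_000, 1_000_000):
        print(n, round(head_frequency(n), 4))   # fluctuations shrink as n grows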

References

Notes

  1. "What is meant by the word Random" in Mathematics and common sense by Philip J. Davis 2006 ISBN   1-56881-270-1 pages 180-182
  2. Inevitable Randomness in Discrete Mathematics by József Beck 2009 ISBN   0-8218-4756-2 page 44
  3. Algorithms: main ideas and applications by Vladimir Andreevich Uspenskiĭ, Alekseĭ Lʹvovich Semenov 1993 Springer ISBN 0-7923-2210-X page 166
  4. E. Borel, Les probabilités dénombrables et leurs applications arithmétiques, Rend. Circ. Mat. Palermo 27 (1909) 247–271
  5. Laurent Bienvenu "Kolmogorov Loveland Stochasticity" in STACS 2007: 24th Annual Symposium on Theoretical Aspects of Computer Science by Wolfgang Thomas ISBN 3-540-70917-7 page 260
  6. Church, Alonzo (1940). "On the Concept of Random Sequence". Bull. Amer. Math. Soc. 46 (2): 130–136. doi:10.1090/S0002-9904-1940-07154-X.
  7. A. N. Kolmogorov, Three approaches to the quantitative definition of information Problems of Information and Transmission, 1(1):1–7, 1965.
  8. D.W. Loveland, A new interpretation of von Mises' concept of random sequence Z. Math. Logik Grundlagen Math 12 (1966) 279–294
  9. An introduction to Kolmogorov complexity and its applications by Ming Li, P. M. B. Vitányi 1997 ISBN 0387948686 pages 149–151
  10. R. Downey, Some Recent Progress in Algorithmic Randomness in Mathematical foundations of computer science 2004: by Jiří Fiala, Václav Koubek 2004 ISBN   3-540-22823-3 page 44
  11. Schnorr, C. P. (1971). "A unified approach to the definition of a random sequence". Mathematical Systems Theory. 5 (3): 246–258. doi:10.1007/bf01694181.
  12. Yongge Wang: Randomness and Complexity. PhD Thesis, 1996. http://webpages.uncc.edu/yonwang/papers/IPL97.pdf
  13. Wang, Yongge (1999). "A separation of two randomness concepts". Information Processing Letters. 69 (3): 115–118. CiteSeerX   10.1.1.46.199 . doi:10.1016/S0020-0190(98)00202-6.
  14. Wolfgang Merkle, Kolmogorov Loveland Stochasticity in Automata, languages and programming: 29th international colloquium, ICALP 2002, by Peter Widmayer et al. ISBN   3-540-43864-5 page 391
  15. Algorithms: main ideas and applications by Vladimir Andreevich Uspenskiĭ, Alekseĭ Lʹvovich Semenov 1993 Springer ISBN 0-7923-2210-X page 176