Universality probability

Universality probability is a probability notion in computability theory and algorithmic information theory that concerns universal Turing machines: informally, it measures how likely a universal machine is to remain universal when fed random input.

Background

A Turing machine is a basic model of computation. Some Turing machines are built for particular calculations. For example, one Turing machine might take two numbers as input and output their product; another might take a list of numbers as input and output the same numbers sorted in order.
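To make the model concrete, the following is a minimal sketch in Python of a Turing machine given by a rule table. The machine, its state names, and its tape conventions are illustrative assumptions, not a standard construction; it increments a binary number whose least significant bit starts under the head.

```python
# Minimal Turing machine sketch: a rule table mapping
# (state, symbol) -> (symbol to write, head move, next state).
RULES = {
    ("carry", "1"): ("0", -1, "carry"),  # 1 plus carry = 0, carry left
    ("carry", "0"): ("1", 0, "halt"),    # 0 plus carry = 1, done
    ("carry", "_"): ("1", 0, "halt"),    # ran off the tape: write 1, done
}

def run(tape_string):
    tape = dict(enumerate(tape_string))  # sparse tape; blank cells read "_"
    head, state = len(tape) - 1, "carry"
    while state != "halt":
        write, move, state = RULES[(state, tape.get(head, "_"))]
        tape[head] = write
        head += move
    return "".join(tape.get(i, "_") for i in range(min(tape), max(tape) + 1))

print(run("1011"))  # binary 11 + 1 = 12, prints "1100"
```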

A Turing machine which has the ability to simulate any other Turing machine is called universal. In other words, a Turing machine (TM) is said to be a universal Turing machine (UTM) if, for any other TM, there is some input (or "header") such that the first TM, once given that header, forever after behaves like the second TM.
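In symbols, the defining property of a universal machine U can be stated as follows (a standard formulation, with σ_M the header for machine M and ≃ meaning that both sides either diverge together or converge with the same output):

```latex
% U is universal: every machine M has a header \sigma_M such that,
% after reading \sigma_M, U behaves exactly like M on the remaining input.
\forall M \;\exists \sigma_M \;\forall x :\quad U(\sigma_M x) \;\simeq\; M(x).
```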

An interesting mathematical and philosophical question then arises: if a universal Turing machine is given random input (for a suitable definition of randomness), how probable is it that it remains universal forever?

Definition

Given a prefix-free Turing machine, its universality probability is the probability that the machine remains universal when a random binary string is prepended to its inputs. More formally, it is the measure of the set of reals (infinite binary sequences) with the property that every initial segment of the sequence, used as such a prefix, preserves the universality of the given Turing machine. This notion was introduced by the computer scientist Chris Wallace and was first explicitly discussed in print in an article by Dowe [1] (and a subsequent article [2]). However, relevant discussion also appears in an earlier article by Wallace and Dowe. [3]
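A sketch of the formal definition in standard notation, with μ the uniform (Lebesgue) measure on the space 2^ω of infinite binary sequences, X↾n the first n bits of X, and U_σ the machine obtained from U by fixing the prefix σ:

```latex
% P(U): the measure of the set of infinite binary sequences all of whose
% finite prefixes, when prepended to U's input, leave U universal.
P(U) \;=\; \mu\bigl(\{\, X \in 2^{\omega} : U_{X \upharpoonright n}
\text{ is universal for every } n \,\}\bigr),
\qquad U_{\sigma}(\rho) := U(\sigma\rho).
```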

Universality probabilities of prefix-free UTMs are non-zero

Although the universality probability of a universal Turing machine (UTM) was originally suspected to be zero, relatively simple proofs exist that the supremum of the set of universality probabilities is equal to 1, such as a proof based on random walks [4] and a proof in Barmpalias and Dowe (2012). Once a single prefix-free UTM with non-zero universality probability is exhibited, it immediately follows that all prefix-free UTMs have non-zero universality probability. Further, because the supremum of the set of universality probabilities is 1 and because the set { m/2^n | n > 0 and 0 < m < 2^n } is dense in the interval [0, 1], suitable constructions of UTMs (e.g., if U is a UTM, define a UTM U2 by letting U2(0s) halt for all strings s and U2(1s) = U(s) for all s) show that the set of universality probabilities is dense in the open interval (0, 1), as the calculation below illustrates.
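A short calculation shows where this density comes from: the first bit of the random input to U2 decides between a machine that halts on everything (and so can never be universal again) and an exact copy of U.

```latex
% The first input bit of U_2 decides its behaviour:
% a leading 0 yields a machine that halts on all inputs (never universal),
% a leading 1 yields a copy of U.
P(U_2) \;=\; \underbrace{\tfrac{1}{2}\cdot 0}_{\text{first bit }0}
       \;+\; \underbrace{\tfrac{1}{2}\cdot P(U)}_{\text{first bit }1}
       \;=\; \tfrac{1}{2}\,P(U).
```

Iterating such constructions rescales attainable values by dyadic factors, which together with the supremum being 1 yields values arbitrarily close to every point of (0, 1).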

Characterization and randomness of universality probability

Universality probability was thoroughly studied and characterized by Barmpalias and Dowe in 2012. [5] Seen as real numbers, these probabilities were completely characterized in terms of notions from computability theory and algorithmic information theory. It was shown that when the underlying machine is universal, these numbers are highly algorithmically random; more specifically, each is Martin-Löf random relative to the third iteration of the halting problem. In other words, they are random relative to null sets that can be defined with four quantifiers in Peano arithmetic. Conversely, given such a highly random number (with appropriate approximation properties), there is a Turing machine whose universality probability is that number.
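Schematically, writing ∅''' for the third iteration of the halting problem (the third Turing jump), the randomness half of the characterization reads as follows (a paraphrase of the result, not its exact statement):

```latex
% If U is a universal prefix-free machine, its universality probability
% is Martin-Löf random relative to the third jump of the halting problem;
% in the standard hierarchy this level is known as 4-randomness.
U \text{ universal} \;\Longrightarrow\;
P(U) \text{ is Martin-L\"of random relative to } \emptyset'''.
```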

Relation with Chaitin's constant

Universality probabilities are closely related to the Chaitin constant, which is the halting probability of a universal prefix-free machine. In a sense, they are complementary to the halting probabilities of universal machines relative to the third iteration of the halting problem. In particular, the universality probability can be seen as the non-halting probability of a machine with an oracle for the third iteration of the halting problem. Conversely, the non-halting probability of any prefix-free machine with this highly non-computable oracle is the universality probability of some prefix-free machine.
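Schematically, with Ω_V^∅''' denoting the halting probability of a prefix-free machine V equipped with the oracle ∅''', the correspondence can be sketched as follows (an illustrative rendering; the precise statement is in Barmpalias and Dowe (2012)):

```latex
% Universality probability as a non-halting probability relative to
% the third jump: some oracle machine V satisfies
P(U) \;=\; 1 - \Omega_{V}^{\emptyset'''}.
```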

Probabilities of machines as examples of highly random numbers

Universality probability provides a concrete and somewhat natural example of a highly random number (in the sense of algorithmic information theory). In the same sense, Chaitin's constant provides a concrete example of a random number (but for a much weaker notion of algorithmic randomness).

Related Research Articles

Kolmogorov complexity

In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963, and is a generalization of classical information theory.
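With a universal machine U fixed, the standard definition is:

```latex
% K(x): the length of a shortest program p that makes U output x.
K_U(x) \;=\; \min\{\, |p| \;:\; U(p) = x \,\}.
```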

In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.
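For a universal prefix-free machine U, the standard definition is a sum over all programs on which U halts; prefix-freeness makes the sum converge to a value in (0, 1):

```latex
% Halting probability: each halting program p contributes 2^{-|p|}.
\Omega_U \;=\; \sum_{U(p)\,\downarrow} 2^{-|p|}.
```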

Gregory Chaitin

Gregory John Chaitin is an Argentine-American mathematician and computer scientist. Beginning in the late 1960s, Chaitin made contributions to algorithmic information theory and metamathematics, in particular a computer-theoretic result equivalent to Gödel's incompleteness theorem. He is considered, together with Andrei Kolmogorov and Ray Solomonoff, to be one of the founders of what is today known as algorithmic complexity. Along with the works of e.g. Solomonoff, Kolmogorov, Martin-Löf, and Leonid Levin, algorithmic information theory became a foundational part of theoretical computer science, information theory, and mathematical logic. It is a common subject in several computer science curricula. Besides computer scientists, Chaitin's work has drawn the attention of many philosophers and mathematicians to fundamental problems in mathematical creativity and digital philosophy.

In complexity theory and computability theory, an oracle machine is an abstract machine used to study decision problems. It can be visualized as a Turing machine with a black box, called an oracle, which is able to solve certain problems in a single operation. The problem can be of any complexity class. Even undecidable problems, such as the halting problem, can be used.

Turing machine

A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algorithm.

The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in 1951: "A random sequence is a vague notion... in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests traditional with statisticians".

Computability theory, also known as recursion theory, is a branch of mathematical logic, computer science, and the theory of computation that originated in the 1930s with the study of computable functions and Turing degrees. The field has since expanded to include the study of generalized computability and definability. In these areas, computability theory overlaps with proof theory and effective descriptive set theory.

Hypercomputation or super-Turing computation is a set of hypothetical models of computation that can provide outputs that are not Turing-computable. For example, a machine that could solve the halting problem would be a hypercomputer; so too would one that could correctly evaluate every statement in Peano arithmetic.

Minimum message length (MML) is a Bayesian information-theoretic method for statistical model comparison and selection. It provides a formal information theory restatement of Occam's Razor: even when models are equal in their measure of fit-accuracy to the observed data, the one generating the most concise explanation of data is more likely to be correct. MML was invented by Chris Wallace, first appearing in the seminal paper "An information measure for classification". MML is intended not just as a theoretical construct, but as a technique that may be deployed in practice. It differs from the related concept of Kolmogorov complexity in that it does not require use of a Turing-complete language to model data.
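In its simplest two-part form, the MML principle can be stated schematically: choose the hypothesis H that minimizes the combined length of a message stating H and then encoding the data D assuming H:

```latex
% Two-part MML message: first state the model, then the data given it.
\widehat{H} \;=\; \arg\min_{H} \Bigl( \underbrace{-\log P(H)}_{\text{model}}
\;+\; \underbrace{-\log P(D \mid H)}_{\text{data given model}} \Bigr).
```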

Ray Solomonoff was an American mathematician who invented algorithmic probability, his General Theory of Inductive Inference, and was a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.

Algorithmic probability

In algorithmic information theory, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s. It is used in inductive inference theory and analyses of algorithms. In his general theory of inductive inference, Solomonoff uses the method together with Bayes' rule to obtain probabilities of prediction for an algorithm's future outputs.
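For a universal prefix-free machine U, the standard definition sums the contributions of all programs that output x:

```latex
% m(x): probability that U outputs x when its input bits are drawn
% uniformly at random; each such program p contributes 2^{-|p|}.
m(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|}.
```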

Solomonoff's theory of inductive inference is a mathematical theory of induction introduced by Ray Solomonoff, based on probability theory and theoretical computer science. In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory.

No free lunch in search and optimization

In computational complexity and optimization the no free lunch theorem is a result that states that for certain types of mathematical problems, the computational cost of finding a solution, averaged over all problems in the class, is the same for any solution method. The name alludes to the saying "no such thing as a free lunch", that is, no method offers a "short cut". This is under the assumption that the search space is a probability density function. It does not apply to the case where the search space has underlying structure that can be exploited more efficiently than random search or even has closed-form solutions that can be determined without search at all. For such probabilistic assumptions, the outputs of all procedures solving a particular type of problem are statistically identical. A colourful way of describing such a circumstance, introduced by David Wolpert and William G. Macready in connection with the problems of search and optimization, is to say that there is no free lunch. Wolpert had previously derived no free lunch theorems for machine learning. Before Wolpert's article was published, Cullen Schaffer independently proved a restricted version of one of Wolpert's theorems and used it to critique the current state of machine learning research on the problem of induction.
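In the Wolpert–Macready formulation, the theorem can be stated schematically as follows (with d^y_m a sample of m observed cost values, f ranging over all objective functions, and a_1, a_2 any two search algorithms):

```latex
% Averaged over all objective functions f, the probability of observing
% any particular sequence of cost values is the same for every algorithm.
\sum_{f} P\bigl(d^{y}_{m} \mid f, m, a_1\bigr)
\;=\; \sum_{f} P\bigl(d^{y}_{m} \mid f, m, a_2\bigr).
```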

Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects (as opposed to stochastically generated), such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" (except for a constant that only depends on the chosen universal programming language) the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."

Intuitively, an algorithmically random sequence is a sequence of binary digits that appears random to any algorithm running on a universal Turing machine. The notion can be applied analogously to sequences on any finite alphabet. Random sequences are key objects of study in algorithmic information theory.

In theoretical computer science the random-access stored-program (RASP) machine model is an abstract machine used for the purposes of algorithm development and algorithm complexity theory.

Description numbers are numbers that arise in the theory of Turing machines. They are very similar to Gödel numbers, and are also occasionally called "Gödel numbers" in the literature. Given some universal Turing machine, every Turing machine can, given its encoding on that machine, be assigned a number. This is the machine's description number. These numbers play a key role in Alan Turing's proof of the undecidability of the halting problem, and are very useful in reasoning about Turing machines as well.

In computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run forever. The halting problem is undecidable, meaning that no general algorithm exists that solves the halting problem for all possible program–input pairs.
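The classic diagonal argument behind this undecidability can be sketched in Python. The decider halts below is hypothetical; the point of the sketch is that no correct implementation of it can exist.

```python
def halts(program, arg):
    """Hypothetical halting decider: returns True iff program(arg) halts.
    No total, correct implementation can exist, as shown below."""
    raise NotImplementedError

def diagonal(program):
    # Do the opposite of whatever `halts` predicts about program(program).
    if halts(program, program):
        while True:   # predicted to halt, so loop forever
            pass
    return            # predicted to loop, so halt immediately

# Contradiction: consider diagonal(diagonal).
#  - If halts(diagonal, diagonal) is True, diagonal(diagonal) loops forever.
#  - If it is False, diagonal(diagonal) halts.
# Either way the decider is wrong, so no such decider exists.
```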

In mathematics, a set of natural numbers is called a K-trivial set if its initial segments viewed as binary strings are easy to describe: the prefix-free Kolmogorov complexity is as low as possible, close to that of a computable set. Solovay proved in 1975 that a set can be K-trivial without being computable.
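Formally, with K the prefix-free Kolmogorov complexity and A↾n the first n bits of the characteristic sequence of A, the definition reads: A is K-trivial if there is a constant c such that

```latex
% Initial segments of A are no harder to describe than their lengths.
\forall n : \quad K(A \upharpoonright n) \;\le\; K(n) + c.
```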

Péter Gács, professionally also known as Peter Gacs, is a Hungarian-American mathematician and computer scientist, professor, and an external member of the Hungarian Academy of Sciences. He is well known for his work in reliable computation, randomness in computing, algorithmic complexity, algorithmic probability, and information theory.

References

    • Dowe, D. L. (5 September 2008). "Foreword re C. S. Wallace". Computer Journal. 51 (5): 523–560. doi:10.1093/comjnl/bxm117.
    • Wallace, C. S. & Dowe, D. L. (1999). "Minimum message length and Kolmogorov complexity". Computer Journal. 42: 270–283.
    • Hernandez-Orallo, J. & Dowe, D. L. (2013). "On Potential Cognitive Abilities in the Machine Kingdom". Minds and Machines. 23 (2): 179–210.
    • Barmpalias, G. & Dowe, D. L. (2012). "Universality probability of a prefix-free machine". Philosophical Transactions of the Royal Society A. 370: 3488–3511. doi:10.1098/rsta.2011.0319. PMID 22711870.
