Chaitin's constant

In the computer science subfield of algorithmic information theory, a Chaitin constant (Chaitin omega number) [1] or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.

Although there are infinitely many halting probabilities, one for each (universal, see below) method of encoding programs, it is common to use the letter Ω to refer to them as if there were only one. Because Ω depends on the program encoding used, it is sometimes called Chaitin's construction when not referring to any specific encoding.

Each halting probability is a normal and transcendental real number that is not computable, which means that there is no algorithm to compute its digits. Each halting probability is Martin-Löf random, meaning there is not even any algorithm which can reliably guess its digits.

Background

The definition of a halting probability relies on the existence of a prefix-free universal computable function. Such a function, intuitively, represents a programming language with the property that no valid program can be obtained as a proper extension of another valid program.

Suppose that F is a partial function that takes one argument, a finite binary string, and possibly returns a single binary string as output. The function F is called computable if there is a Turing machine that computes it, in the sense that for any finite binary strings x and y, F(x) = y if and only if the Turing machine halts with y on its tape when given the input x.

The function F is called universal if the following property holds: for every computable function f of a single variable there is a string w such that for all x, F(w x) = f(x); here w x represents the concatenation of the two strings w and x. This means that F can be used to simulate any computable function of one variable. Informally, w represents a "script" for the computable function f, and F represents an "interpreter" that parses the script as a prefix of its input and then executes it on the remainder of the input.

The domain of F is the set of all inputs p on which it is defined. For F that are universal, such a p can generally be seen both as the concatenation of a program part and a data part, and as a single program for the function F.

The function F is called prefix-free if there are no two elements p, p′ in its domain such that p′ is a proper extension of p. This can be rephrased as: the domain of F is a prefix-free code (instantaneous code) on the set of finite binary strings. A simple way to enforce prefix-freeness is to use machines whose means of input is a binary stream from which bits can be read one at a time. There is no end-of-stream marker; the end of input is determined by when the universal machine decides to stop reading more bits, and the remaining bits are not considered part of the accepted string. Here, the difference between the two notions of program mentioned in the last paragraph becomes clear; one is easily recognized by some grammar, while the other requires arbitrary computation to recognize.
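
To make the stream mechanism concrete, the following sketch shows a toy machine of this kind. It is hypothetical and deliberately non-universal (so that its halting probability is easy to analyze in later sections): it reads bits one at a time, stops reading after the first 0, and halts only if it has read an even number of 1s.

```python
from typing import Iterator

def toy_stream_machine(stream: Iterator[int]) -> int:
    """A toy (non-universal) prefix-free machine.

    It reads 1s from the stream until the first 0, then stops reading;
    it halts only if the number of 1s read was even, and diverges
    otherwise. The accepted strings {0, 110, 11110, ...} form a
    prefix-free set: no accepted string is a proper extension of
    another, because the machine never reads past the first 0.
    """
    ones = 0
    while next(stream) == 1:   # read one bit at a time
        ones += 1
    if ones % 2 != 0:
        while True:            # odd count: diverge (never halt)
            pass
    return ones                # even count: halt with output

# The machine reads only the prefix 110; the trailing bits are ignored.
print(toy_stream_machine(iter([1, 1, 0, 1, 1, 1])))  # prints 2
```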

The domain of any universal computable function is a computably enumerable set but never a computable set. The domain is always Turing equivalent to the halting problem.

Definition

Let P_F be the domain of a prefix-free universal computable function F. The constant Ω_F is then defined as

\[
\Omega_F = \sum_{p \in P_F} 2^{-|p|},
\]

where |p| denotes the length of a string p. This is an infinite sum which has one summand for every p in the domain of F. The requirement that the domain be prefix-free, together with Kraft's inequality, ensures that this sum converges to a real number between 0 and 1. If F is clear from context then Ω_F may be denoted simply Ω, although different prefix-free universal computable functions lead to different values of Ω.
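
No algorithm outputs Ω_F, but its value can be enumerated from below by dovetailing: run ever more programs for ever more steps and add 2^−|p| for each program p found to halt. A minimal sketch, assuming the toy machine from the Background section (whose halting probability works out to 2^−1 + 2^−3 + 2^−5 + ⋯ = 2/3); for a genuine universal machine, `halts_within` would instead simulate the machine for a bounded number of steps:

```python
from fractions import Fraction
from itertools import count, product

def toy_halts_within(program: str, steps: int) -> bool:
    """Hypothetical stand-in for running a program on a prefix-free
    machine: encodes the toy machine above, taking one step per bit.
    Its domain {0, 110, 11110, ...} is prefix-free, and its halting
    probability is 2^-1 + 2^-3 + 2^-5 + ... = 2/3.
    """
    k = program.find("0")
    if k == -1 or program != "1" * k + "0":
        return False          # reads forever, or has unread trailing bits
    return k % 2 == 0 and steps >= len(program)

def omega_lower_bounds(halts_within):
    """Dovetail over all programs; yield increasing lower bounds on Omega."""
    halted = set()
    for t in count(1):                        # stage t: lengths <= t, t steps
        for n in range(1, t + 1):
            for bits in product("01", repeat=n):
                p = "".join(bits)
                if p not in halted and halts_within(p, t):
                    halted.add(p)
        yield sum(Fraction(1, 2 ** len(p)) for p in halted)

bounds = omega_lower_bounds(toy_halts_within)
for _ in range(10):
    print(float(next(bounds)))                # climbs toward 2/3
```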

Relationship to the halting problem

Knowing the first N bits of Ω, one could decide the halting problem for all programs of size up to N. Let the program p for which the halting problem is to be solved be at most N bits long. In dovetailing fashion, all programs of all lengths are run, until enough have halted to jointly contribute enough probability to match these first N bits. If the program p hasn't halted yet, then it never will, since its contribution to the halting probability would affect the first N bits. Thus, the halting problem would be solved for p.
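
A sketch of this procedure, with the caveat that it is purely illustrative: the first N bits of Ω are taken as given, although no halting algorithm can produce them, and `toy_halts_within` is the toy stand-in defined in the previous section rather than a universal machine.

```python
from fractions import Fraction
from itertools import count, product

def halts(p: str, omega_bits: str, halts_within) -> bool:
    """Decide whether program p halts, given the first N bits of Omega
    (N >= len(p)) as the string omega_bits."""
    assert len(p) <= len(omega_bits)
    # 0.b1b2...bN as an exact rational; Omega lies in [target, target + 2^-N).
    target = sum(Fraction(int(b), 2 ** (i + 1))
                 for i, b in enumerate(omega_bits))
    halted, mass = set(), Fraction(0)
    for t in count(1):                        # dovetail all programs
        for n in range(1, t + 1):
            for bits in product("01", repeat=n):
                q = "".join(bits)
                if q not in halted and halts_within(q, t):
                    halted.add(q)
                    mass += Fraction(1, 2 ** len(q))
        if mass >= target:
            # Any not-yet-halted q with len(q) <= N would add at least
            # 2^-N, forcing Omega >= target + 2^-N: impossible.
            return p in halted

# Toy demo: the toy machine's Omega is 2/3 = 0.10101... in binary, so
# its first 2 bits decide halting for all programs of length <= 2.
print(halts("0", "10", toy_halts_within))    # True: "0" halts
print(halts("10", "10", toy_halts_within))   # False: "10" diverges
```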

Because many outstanding problems in number theory, such as Goldbach's conjecture, are equivalent to solving the halting problem for specific programs (which would search for counterexamples and halt if one is found), knowing enough bits of Chaitin's constant would also imply knowing the answers to these problems. But as the halting problem is not generally solvable, calculating any but the first few bits of Chaitin's constant is not possible; this just reduces hard problems to impossible ones, much like trying to build an oracle machine for the halting problem would be.

Interpretation as a probability

The Cantor space is the collection of all infinite sequences of 0s and 1s. A halting probability can be interpreted as the measure of a certain subset of Cantor space under the usual probability measure on Cantor space. It is from this interpretation that halting probabilities take their name.

The probability measure on Cantor space, sometimes called the fair-coin measure, is defined so that for any binary string x the set of sequences that begin with x has measure 2^−|x|. This implies that for each natural number n, the set of sequences f in Cantor space such that f(n) = 1 has measure 1/2, and the set of sequences whose nth element is 0 also has measure 1/2.
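
For example, the set of sequences beginning with the three-bit string 011 has measure

\[
2^{-|011|} = 2^{-3} = \tfrac{1}{8}.
\]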

Let F be a prefix-free universal computable function. The domain P of F consists of an infinite set of binary strings

\[
P = \{p_1, p_2, \ldots\}.
\]

Each of these strings p_i determines a subset S_i of Cantor space; the set S_i contains all sequences in Cantor space that begin with p_i. These sets are disjoint because P is a prefix-free set. The sum

\[
\sum_{p \in P} 2^{-|p|}
\]

represents the measure of the set

\[
S_1 \cup S_2 \cup \cdots.
\]

In this way, Ω_F represents the probability that a randomly selected infinite sequence of 0s and 1s begins with a bit string (of some finite length) that is in the domain of F. It is for this reason that Ω_F is called a halting probability.
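
This interpretation can be tested empirically for the toy machine of the earlier sketches, whose halting probability is 2/3: sample random bit streams and count how often their finite prefixes land in the domain. (For a genuinely universal machine the simulation itself might never stop, so this sampling is another illustration rather than a general method.)

```python
import random
from typing import Iterator

def stream_prefix_in_domain(stream: Iterator[int]) -> bool:
    """Read bits like the toy machine: 1s until the first 0. The prefix
    read is in the machine's domain exactly when the count of 1s is
    even. (Checking membership directly sidesteps simulating the
    machine's divergence on odd counts.)"""
    ones = 0
    while next(stream) == 1:
        ones += 1
    return ones % 2 == 0

def random_bits(rng: random.Random) -> Iterator[int]:
    while True:                      # an infinite fair-coin bit stream
        yield rng.randrange(2)

rng = random.Random(0)
trials = 100_000
hits = sum(stream_prefix_in_domain(random_bits(rng)) for _ in range(trials))
print(hits / trials)                 # ~ 0.666..., the toy machine's Omega
```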

Properties

Each Chaitin constant Ω has the following properties: it is algorithmically random (Martin-Löf random), meaning that the shortest program to output the first n bits of Ω must be of size at least n − O(1) [2]; it is a normal and transcendental number; it is not computable; it is left-c.e., meaning that the set of rational numbers q with q < Ω is computably enumerable [3]; and it is Turing equivalent to the halting problem, placing it at level Δ⁰₂ of the arithmetical hierarchy.

Not every set that is Turing equivalent to the halting problem is a halting probability. A finer equivalence relation, Solovay equivalence, can be used to characterize the halting probabilities among the left-c.e. reals. [4] One can show that a real number in [0,1] is a Chaitin constant (i.e. the halting probability of some prefix-free universal computable function) if and only if it is left-c.e. and algorithmically random. [4] Ω is among the few definable algorithmically random numbers and is the best-known algorithmically random number, but it is not at all typical of all algorithmically random numbers. [5]

Uncomputability

A real number is called computable if there is an algorithm which, given n, returns the first n digits of the number. This is equivalent to the existence of a program that enumerates the digits of the real number.
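
For contrast, 1/3, whose binary expansion is 0.010101..., is computable: a trivial sketch of the digit algorithm that the definition asks for.

```python
def binary_digits_of_one_third(n: int) -> str:
    """Return the first n binary digits of 1/3 = 0.010101...: the digit
    at position i (0-based) is simply i % 2. Any real whose digits can
    be produced on demand like this is computable; no Omega admits such
    a function."""
    return "".join(str(i % 2) for i in range(n))

print(binary_digits_of_one_third(8))   # 01010101
```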

No halting probability is computable. The proof of this fact relies on an algorithm which, given the first n digits of Ω, solves Turing's halting problem for programs of length up to n. Since the halting problem is undecidable, Ω cannot be computed.

The algorithm proceeds as follows. Given the first n digits of Ω and a k ≤ n, the algorithm enumerates the domain of F until enough elements of the domain have been found so that the probability they represent is within 2^−(k+1) of Ω. After this point, no additional program of length k can be in the domain, because each of these would add 2^−k to the measure, which is impossible. Thus the set of strings of length k in the domain is exactly the set of such strings already enumerated.
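
In symbols: once the enumerated partial sum Ω_k satisfies

\[
0 \le \Omega - \Omega_k < 2^{-(k+1)},
\]

any further program q with |q| ≤ k would contribute 2^{-|q|} ≥ 2^{-k} > 2^{-(k+1)} to the sum, which is impossible.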

Algorithmic randomness

A real number is random if its binary expansion is an algorithmically random sequence. Calude, Hertling, Khoussainov, and Wang showed [6] that a recursively enumerable real number is algorithmically random if and only if it is a Chaitin Ω number.

Incompleteness theorem for halting probabilities

For each specific consistent effectively represented axiomatic system for the natural numbers, such as Peano arithmetic, there exists a constant N such that no bit of Ω after the Nth can be proven to be 1 or 0 within that system. The constant N depends on how the formal system is effectively represented, and thus does not directly reflect the complexity of the axiomatic system. This incompleteness result is similar to Gödel's incompleteness theorem in that it shows that no consistent formal theory for arithmetic can be complete.

Super Omega

As mentioned above, the first n bits of Gregory Chaitin's constant Ω are random or incompressible in the sense that we cannot compute them by a halting algorithm with fewer than n − O(1) bits. However, consider the short but never halting algorithm which systematically lists and runs all possible programs; whenever one of them halts, its probability is added to the output (initialized to zero). After finite time the first n bits of the output will never change again (it does not matter that this time itself is not computable by a halting program). So there is a short non-halting algorithm whose output converges (after finite time) onto the first n bits of Ω. In other words, the enumerable first n bits of Ω are highly compressible in the sense that they are limit-computable by a very short algorithm; they are not random with respect to the set of enumerating algorithms. Jürgen Schmidhuber (2000) constructed a limit-computable "Super Ω" which in a sense is much more random than the original limit-computable Ω, as one cannot significantly compress the Super Ω by any enumerating non-halting algorithm.
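
The stabilization is easy to watch for the toy machine of the earlier sketches (an assumption of these illustrations, not Chaitin's construction), whose halting probability is 2/3 = 0.101010... in binary: the leading bits of the increasing lower bounds eventually freeze, even though for a genuine Ω no halting program can certify when.

```python
from fractions import Fraction

def first_bits(x: Fraction, n: int) -> str:
    """First n binary digits of a rational x in [0, 1)."""
    out = []
    for _ in range(n):
        x *= 2
        if x >= 1:
            out.append("1")
            x -= 1
        else:
            out.append("0")
    return "".join(out)

# The partial sums 2^-1 + 2^-3 + ... + 2^-(2m+1) play the role of the
# enumerator's output for the toy machine; the first 6 bits stabilize
# at 101010 (= 2/3) after a few stages.
approx = Fraction(0)
for m in range(8):
    approx += Fraction(1, 2 ** (2 * m + 1))
    print(first_bits(approx, 6))
```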

For an alternative "Super Ω", the universality probability of a prefix-free universal Turing machine (UTM), namely the probability that it remains universal even when every input of it (as a binary string) is prefixed by a random binary string, can be seen as the non-halting probability of a machine with an oracle for the third iteration of the halting problem (i.e., ∅‴ in Turing-jump notation). [7]


References

  1. mathworld.wolfram.com, "Chaitin's Constant". Retrieved 28 May 2012.
  2. Downey & Hirschfeldt 2010, Theorem 6.1.3.
  3. Downey & Hirschfeldt 2010, Theorem 5.1.11.
  4. Downey & Hirschfeldt 2010, p. 405.
  5. Downey & Hirschfeldt 2010, pp. 228–229.
  6. Calude, Cristian S.; Hertling, Peter H.; Khoussainov, Bakhadyr; Wang, Yongge (1998), "Recursively Enumerable Reals and Chaitin Ω Numbers" (PDF), STACS 98, vol. 1373, Springer Berlin Heidelberg, pp. 596–606, Bibcode:1998LNCS.1373..596C, doi:10.1007/bfb0028594, ISBN 978-3-540-64230-5, S2CID 5493426, archived (PDF) from the original on 19 January 2004, retrieved 20 March 2022.
  7. Barmpalias, G.; Dowe, D. L. (2012). "Universality probability of a prefix-free machine". Philosophical Transactions of the Royal Society A. 370 (1): 3488–3511 (Theme Issue 'The foundations of computation, physics and mentality: the Turing legacy', compiled and edited by Barry Cooper and Samson Abramsky). Bibcode:2012RSPTA.370.3488B. doi:10.1098/rsta.2011.0319. PMID 22711870.

Works cited

Downey, Rodney G.; Hirschfeldt, Denis R. (2010). Algorithmic Randomness and Complexity. Theory and Applications of Computability. Springer.