Ackermann function

In computability theory, the Ackermann function, named after Wilhelm Ackermann, is one of the simplest [1] and earliest-discovered examples of a total computable function that is not primitive recursive. All primitive recursive functions are total and computable, but the Ackermann function illustrates that not all total computable functions are primitive recursive.

After Ackermann's publication [2] of his function (which had three non-negative integer arguments), many authors modified it to suit various purposes, so that today "the Ackermann function" may refer to any of numerous variants of the original function. One common version is the two-argument Ackermann–Péter function developed by Rózsa Péter and Raphael Robinson. Its value grows very rapidly; for example, $A(4, 2)$ results in $2^{65536} - 3$, an integer of 19,729 decimal digits. [3]

History

In the late 1920s, the mathematicians Gabriel Sudan and Wilhelm Ackermann, students of David Hilbert, were studying the foundations of computation. Both Sudan and Ackermann are credited [4] with discovering total computable functions (termed simply "recursive" in some references) that are not primitive recursive. Sudan published the lesser-known Sudan function; shortly afterwards and independently, in 1928, Ackermann published his function $\varphi$ (from Greek, the letter phi). Ackermann's three-argument function, $\varphi(m, n, p)$, is defined such that for $p = 0, 1, 2$ it reproduces the basic operations of addition, multiplication, and exponentiation as

$$\varphi(m, n, 0) = m + n, \qquad \varphi(m, n, 1) = m \times n, \qquad \varphi(m, n, 2) = m^n,$$

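and for p > 2 it extends these basic operations in a way that can be compared to the hyperoperations:

$$\varphi(m, n, 3) = m[4](n + 1), \qquad \varphi(m, n, p) \gtrapprox m[p + 1](n + 1) \quad \text{for } p > 3.$$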

(Aside from its historic role as a total-computable-but-not-primitive-recursive function, Ackermann's original function is seen to extend the basic arithmetic operations beyond exponentiation, although not as seamlessly as do variants of Ackermann's function that are specifically designed for that purpose—such as Goodstein's hyperoperation sequence.)

In On the Infinite, [5] David Hilbert hypothesized that the Ackermann function was not primitive recursive, but it was Ackermann, Hilbert's personal secretary and former student, who actually proved the hypothesis in his paper On Hilbert's Construction of the Real Numbers. [2] [6]

Rózsa Péter [7] and Raphael Robinson [8] later developed a two-variable version of the Ackermann function that became preferred by almost all authors.

The generalized hyperoperation sequence, e.g. $G(m, a, b) = a[m]b$, is a version of the Ackermann function as well. [9]

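In 1963 R.C. Buck based an intuitive two-variable [n 1] variant $F$ on the hyperoperation sequence: [10] [11]

$$F(m, n) = 2[m]n.$$

Compared to most other versions, Buck's function has no unessential offsets:

$$\begin{aligned}
F(0, n) &= 2[0]n = n + 1, \\
F(1, n) &= 2[1]n = 2 + n, \\
F(2, n) &= 2[2]n = 2 \times n, \\
F(3, n) &= 2[3]n = 2^n, \\
F(4, n) &= 2[4]n = 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}} \quad (n \text{ copies of } 2).
\end{aligned}$$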

Many other versions of the Ackermann function have been investigated. [12] [13]

Definition

Definition: as m-ary function

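Ackermann's original three-argument function $\varphi(m, n, p)$ is defined recursively as follows for nonnegative integers $m$, $n$, and $p$:

$$\begin{aligned}
\varphi(m, n, 0) &= m + n \\
\varphi(m, 0, 1) &= 0 \\
\varphi(m, 0, 2) &= 1 \\
\varphi(m, 0, p) &= m && \text{for } p > 2 \\
\varphi(m, n, p) &= \varphi(m, \varphi(m, n - 1, p), p - 1) && \text{for } n, p > 0
\end{aligned}$$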
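The recursion for the original function can be tried out directly on small inputs. The following is a minimal sketch, assuming the five defining equations take the form in which the levels p = 0, 1, 2 give addition, multiplication, and exponentiation; the name `phi` is chosen here for illustration.

```python
def phi(m: int, n: int, p: int) -> int:
    """Ackermann's three-argument function, for small inputs only."""
    if p == 0:
        return m + n                      # level 0: addition
    if n == 0:
        # Base cases for n = 0 depend on the level p.
        if p == 1:
            return 0                      # m * 0
        if p == 2:
            return 1                      # m ** 0
        return m                          # assumed base case for p > 2
    # General step: phi(m, n, p) = phi(m, phi(m, n - 1, p), p - 1)
    return phi(m, phi(m, n - 1, p), p - 1)


print(phi(3, 4, 0), phi(3, 4, 1), phi(3, 4, 2))  # 7 12 81
```

Plain recursion is adequate only for very small arguments; values at level p = 3 and above outgrow both the recursion depth and the running time almost immediately.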

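Of the various two-argument versions, the one developed by Péter and Robinson (called "the" Ackermann function by most authors) is defined for nonnegative integers $m$ and $n$ as follows:

$$\begin{aligned}
A(0, n) &= n + 1 \\
A(m + 1, 0) &= A(m, 1) \\
A(m + 1, n + 1) &= A(m, A(m + 1, n))
\end{aligned}$$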

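The Ackermann function has also been expressed in relation to the hyperoperation sequence: [14] [15]

$$A(m, n) = 2[m](n + 3) - 3$$

or, written in Knuth's up-arrow notation (extended to integer indices $\geq -2$):

$$A(m, n) = 2 \uparrow^{m - 2} (n + 3) - 3$$

or, equivalently, in terms of Buck's function F: [10]

$$A(m, n) = F(m, n + 3) - 3$$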
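The two-argument definition and its relation to the hyperoperation sequence can be checked numerically for small arguments. This is a sketch only: `ackermann` follows the Péter–Robinson recursion literally, and `hyper` is a hypothetical helper for a[m]b that uses ordinary arithmetic for the levels up to exponentiation.

```python
def ackermann(m: int, n: int) -> int:
    """Two-argument Ackermann function by direct recursion (small inputs only)."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


def hyper(m: int, a: int, b: int) -> int:
    """Hyperoperation a[m]b; levels 0..3 use built-in arithmetic."""
    if m == 0:
        return b + 1
    if m == 1:
        return a + b
    if m == 2:
        return a * b
    if m == 3:
        return a ** b
    if b == 0:
        return 1
    return hyper(m - 1, a, hyper(m, a, b - 1))


# Check A(m, n) = 2[m](n + 3) - 3 on a small grid.
for m in range(4):
    for n in range(4):
        assert ackermann(m, n) == hyper(m, 2, n + 3) - 3
print(ackermann(2, 1), ackermann(3, 3))  # 5 61
```

The grid is deliberately small: already A(4, 1) = 65533 exceeds what naive recursion can reach within Python's default recursion limit.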

Definition: as iterated 1-ary function

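Define $f^n$ as the n-th iterate of $f$:

$$f^0(x) = x, \qquad f^{n+1}(x) = f(f^n(x)).$$

Iteration is the process of composing a function with itself a certain number of times. Function composition is an associative operation, so $f(f^n(x)) = f^n(f(x))$.

Conceiving the Ackermann function as a sequence of unary functions, one can set $A_m(n) = A(m, n)$.

The function then becomes a sequence $A_0, A_1, A_2, \ldots$ of unary [n 2] functions, defined from iteration:

$$A_0(n) = n + 1, \qquad A_{m+1}(n) = A_m^{n+1}(1).$$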
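The view of the Ackermann function as a sequence of unary functions translates naturally into a function that returns functions. The sketch below assumes the iteration rule that the next function applies the previous one n + 1 times to the starting value 1; `ackermann_unary` is a name chosen here.

```python
from typing import Callable


def ackermann_unary(m: int) -> Callable[[int], int]:
    """Return the m-th unary Ackermann function as a Python callable."""
    if m == 0:
        return lambda n: n + 1            # A_0 is the successor function

    prev = ackermann_unary(m - 1)

    def a_m(n: int) -> int:
        # Apply the previous function n + 1 times, starting from 1.
        value = 1
        for _ in range(n + 1):
            value = prev(value)
        return value

    return a_m


print([ackermann_unary(m)(2) for m in range(4)])  # [3, 4, 7, 29]
```

Because each level is a loop over the level below, the nesting depth of calls is only m, so this form avoids the deep recursion of the two-argument definition, although the running time still grows with the value computed.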

Computation

The recursive definition of the Ackermann function can naturally be transposed to a term rewriting system (TRS).

TRS, based on 2-ary function

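The definition of the 2-ary Ackermann function leads to the obvious reduction rules [16] [17]

$$\begin{array}{lll}
\text{(r1)} & A(0, n) & \rightarrow S(n) \\
\text{(r2)} & A(S(m), 0) & \rightarrow A(m, S(0)) \\
\text{(r3)} & A(S(m), S(n)) & \rightarrow A(m, A(S(m), n))
\end{array}$$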

Example

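Compute $A(1, 2) \rightarrow_{*} 4$

The reduction sequence is [n 3]

Leftmost-outermost (one-step) strategy:

    A(S(0), S(S(0)))
    →r3 A(0, A(S(0), S(0)))
    →r1 S(A(S(0), S(0)))
    →r3 S(A(0, A(S(0), 0)))
    →r1 S(S(A(S(0), 0)))
    →r2 S(S(A(0, S(0))))
    →r1 S(S(S(S(0))))

Leftmost-innermost (one-step) strategy:

    A(S(0), S(S(0)))
    →r3 A(0, A(S(0), S(0)))
    →r3 A(0, A(0, A(S(0), 0)))
    →r2 A(0, A(0, A(0, S(0))))
    →r1 A(0, A(0, S(S(0))))
    →r1 A(0, S(S(S(0))))
    →r1 S(S(S(S(0))))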

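To compute $A(m, n)$ one can use a stack, which initially contains the elements $\langle m, n \rangle$.

Then repeatedly the two top elements are replaced according to the rules [n 4]

$$\begin{array}{lll}
\text{(r1)} & 0,\ n & \rightarrow (n + 1) \\
\text{(r2)} & (m + 1),\ 0 & \rightarrow m,\ 1 \\
\text{(r3)} & (m + 1),\ (n + 1) & \rightarrow m,\ (m + 1),\ n
\end{array}$$

Schematically, starting from $\langle m, n \rangle$: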

    WHILE stackLength <> 1
    {
        POP 2 elements;
        PUSH 1 or 2 or 3 elements, applying the rules r1, r2, r3
    }

The pseudocode is published in Grossman & Zeitman (1988).
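The stack scheme above can be written out as a short loop. This is an illustrative sketch rather than the published algorithm: it assumes the stack top is the right end of a Python list and that the three rules replace the top two elements as the two-argument definition dictates.

```python
def ackermann_stack(m: int, n: int) -> int:
    """Evaluate A(m, n) with an explicit stack instead of recursion."""
    stack = [m, n]
    while len(stack) != 1:
        n = stack.pop()                    # top element
        m = stack.pop()                    # element below it
        if m == 0:
            stack.append(n + 1)            # r1: push 1 element
        elif n == 0:
            stack.extend((m - 1, 1))       # r2: push 2 elements
        else:
            stack.extend((m - 1, m, n - 1))  # r3: push 3 elements
    return stack[0]


print(ackermann_stack(2, 1), ackermann_stack(3, 3))  # 5 61
```

Replacing recursion by an explicit list removes the dependence on the interpreter's recursion limit, but the number of loop iterations is unchanged, so only small arguments remain practical.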

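For example, on input $\langle 2, 1 \rangle$, the stack configurations reflect the reduction [n 5]

| Stack configuration | Reduction |
|---|---|
| 2, 1 | A(2, 1) |
| → 1, 2, 0 | →r3 A(1, A(2, 0)) |
| → 1, 1, 1 | →r2 A(1, A(1, 1)) |
| → 1, 0, 1, 0 | →r3 A(1, A(0, A(1, 0))) |
| → 1, 0, 0, 1 | →r2 A(1, A(0, A(0, 1))) |
| → 1, 0, 2 | →r1 A(1, A(0, 2)) |
| → 1, 3 | →r1 A(1, 3) |
| → 0, 1, 2 | →r3 A(0, A(1, 2)) |
| → 0, 0, 1, 1 | →r3 A(0, A(0, A(1, 1))) |
| → 0, 0, 0, 1, 0 | →r3 A(0, A(0, A(0, A(1, 0)))) |
| → 0, 0, 0, 0, 1 | →r2 A(0, A(0, A(0, A(0, 1)))) |
| → 0, 0, 0, 2 | →r1 A(0, A(0, A(0, 2))) |
| → 0, 0, 3 | →r1 A(0, A(0, 3)) |
| → 0, 4 | →r1 A(0, 4) |
| → 5 | →r1 5 |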

Remarks

Grossman & Zeitman's own algorithm, inherently iterative, computes $A(m, n)$ within $O(m\,A(m, n))$ time and within $O(m)$ space.

TRS, based on iterated 1-ary function

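The definition of the iterated 1-ary Ackermann functions leads to different reduction rules, where $A(i, m, n)$ stands for $A_m^i(n)$:

$$\begin{array}{lll}
\text{(r4)} & A(S(0), 0, n) & \rightarrow S(n) \\
\text{(r5)} & A(S(0), S(m), n) & \rightarrow A(S(n), m, S(0)) \\
\text{(r6)} & A(S(S(x)), m, n) & \rightarrow A(S(0), m, A(S(x), m, n))
\end{array}$$

As function composition is associative, instead of rule r6 one can define

$$\begin{array}{lll}
\text{(r7)} & A(S(S(x)), m, n) & \rightarrow A(S(x), m, A(S(0), m, n))
\end{array}$$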

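As in the previous section, the computation of $A_m^1(n)$ can be implemented with a stack.

Initially the stack contains the three elements $\langle 1, m, n \rangle$.

Then repeatedly the three top elements are replaced according to the rules [n 4]

$$\begin{array}{lll}
\text{(r4)} & 1,\ 0,\ n & \rightarrow (n + 1) \\
\text{(r5)} & 1,\ (m + 1),\ n & \rightarrow (n + 1),\ m,\ 1 \\
\text{(r6)} & (i + 2),\ m,\ n & \rightarrow 1,\ m,\ (i + 1),\ m,\ n
\end{array}$$

Schematically, starting from $\langle 1, m, n \rangle$: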

    WHILE stackLength <> 1
    {
        POP 3 elements;
        PUSH 1 or 3 or 5 elements, applying the rules r4, r5, r6;
    }
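The three-element scheme can be sketched in the same way. The code assumes that each stack triple (i, m, n) stands for the i-fold iterate of the m-th unary function applied to n, and that rules r4, r5, r6 push 1, 3, and 5 elements respectively, as the schematic loop states; the function name is chosen here.

```python
def ackermann_iter_stack(m: int, n: int) -> int:
    """Evaluate A(m, n) with triples (i, m, n) on an explicit stack."""
    stack = [1, m, n]                      # one application of the m-th function to n
    while len(stack) != 1:
        n = stack.pop()
        m = stack.pop()
        i = stack.pop()
        if i == 1 and m == 0:
            stack.append(n + 1)                # r4: push 1 element
        elif i == 1:
            stack.extend((n + 1, m - 1, 1))    # r5: push 3 elements
        else:
            stack.extend((1, m, i - 1, m, n))  # r6: push 5 elements
    return stack[0]


print(ackermann_iter_stack(2, 1))  # 5
```

The stack length stays odd throughout (it changes by -2, 0, or +2 per step), so popping three elements is always possible until a single result remains.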

Example

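On input $\langle 1, 2, 1 \rangle$ the successive stack configurations are

    1,2,1 → 2,1,1 → 1,1,1,1,1 → 1,1,2,0,1 → 1,1,1,0,1,0,1 → 1,1,1,0,2 → 1,1,3 → 4,0,1
    → 1,0,3,0,1 → 1,0,1,0,2,0,1 → 1,0,1,0,1,0,1,0,1 → 1,0,1,0,1,0,2 → 1,0,1,0,3 → 1,0,4 → 5

The corresponding equalities are

$$\begin{aligned}
A_2(1) &= A_1^2(1) = A_1(A_1(1)) = A_1(A_0^2(1)) = A_1(A_0(A_0(1))) = A_1(A_0(2)) = A_1(3) \\
&= A_0^4(1) = A_0(A_0^3(1)) = A_0(A_0(A_0^2(1))) = A_0(A_0(A_0(A_0(1)))) \\
&= A_0(A_0(A_0(2))) = A_0(A_0(3)) = A_0(4) = 5
\end{aligned}$$

When reduction rule r7 is used instead of rule r6, the replacements in the stack will follow

$$\begin{array}{lll}
\text{(r7)} & (i + 2),\ m,\ n & \rightarrow (i + 1),\ m,\ 1,\ m,\ n
\end{array}$$

The successive stack configurations will then be

    1,2,1 → 2,1,1 → 1,1,1,1,1 → 1,1,2,0,1 → 1,1,1,0,1,0,1 → 1,1,1,0,2 → 1,1,3 → 4,0,1
    → 3,0,1,0,1 → 3,0,2 → 2,0,1,0,2 → 2,0,3 → 1,0,1,0,3 → 1,0,4 → 5

The corresponding equalities are

$$\begin{aligned}
A_2(1) &= A_1^2(1) = A_1(A_1(1)) = A_1(A_0^2(1)) = A_1(A_0(A_0(1))) = A_1(A_0(2)) = A_1(3) \\
&= A_0^4(1) = A_0^3(A_0(1)) = A_0^3(2) = A_0^2(A_0(2)) = A_0^2(3) \\
&= A_0(A_0(3)) = A_0(4) = 5
\end{aligned}$$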

Remarks

TRS, based on hyperoperators

As Sundblad (1971) or Porto & Matos (1980) showed explicitly, the Ackermann function can be expressed in terms of the hyperoperation sequence:

$$A(m, n) = 2[m](n + 3) - 3$$

or, after removal of the constant 2 from the parameter list, in terms of Buck's function

$$A(m, n) = F(m, n + 3) - 3$$

Buck's function $F(m, n) = 2[m]n$, [10] itself a variant of the Ackermann function, can be computed with the following reduction rules:

Instead of rule b6 one can define the rule

To compute the Ackermann function it suffices to add three reduction rules

These rules take care of the base case A(0,n), the alignment (n+3) and the fudge (-3).

Example

Compute

using reduction rule : [n 5]     using reduction rule : [n 5]

The matching equalities are

Remarks

Huge numbers

To demonstrate how the computation of results in many steps and in a large number: [n 5]

Table of values

Computing the Ackermann function can be restated in terms of an infinite table. First, place the natural numbers along the top row. To determine a number in the table, take the number immediately to the left. Then use that number to look up the required number in the column given by that number and one row up. If there is no number to its left, simply look at the column headed "1" in the previous row. Here is a small upper-left portion of the table:

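Values of A(m, n)

| m \ n | 0 | 1 | 2 | 3 | 4 | n |
|---|---|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 | 5 | n + 1 |
| 1 | 2 | 3 | 4 | 5 | 6 | n + 2 |
| 2 | 3 | 5 | 7 | 9 | 11 | 2n + 3 |
| 3 | 5 | 13 | 29 | 61 | 125 | 2^(n+3) − 3 |
| 4 | 13 | 65533 | 2^65536 − 3 | 2^(2^65536) − 3 | 2^(2^(2^65536)) − 3 | 2↑↑(n+3) − 3 |
| 5 | 65533 | 2↑↑↑4 − 3 | 2↑↑↑5 − 3 | 2↑↑↑6 − 3 | 2↑↑↑7 − 3 | 2↑↑↑(n+3) − 3 |
| 6 | 2↑↑↑↑3 − 3 | 2↑↑↑↑4 − 3 | 2↑↑↑↑5 − 3 | 2↑↑↑↑6 − 3 | 2↑↑↑↑7 − 3 | 2↑↑↑↑(n+3) − 3 |
| m | 2[m]3 − 3 | 2[m]4 − 3 | 2[m]5 − 3 | 2[m]6 − 3 | 2[m]7 − 3 | 2[m](n+3) − 3 |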

The numbers here which are only expressed with recursive exponentiation or Knuth arrows are very large and would take up too much space to notate in plain decimal digits.
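The table-lookup procedure described above can be carried out mechanically for the first few rows. The sketch below rebuilds each needed stretch of the previous row on demand, which is simple but wasteful; `table_row` is a name introduced here for illustration.

```python
def table_row(m: int, width: int) -> list[int]:
    """First `width` entries of row m, obtained only by table lookups."""
    if m == 0:
        return [n + 1 for n in range(width)]   # top row
    row: list[int] = []
    for n in range(width):
        # Column to read one row up: "1" for the first entry,
        # otherwise the value immediately to the left.
        column = 1 if n == 0 else row[-1]
        row.append(table_row(m - 1, column + 1)[column])
    return row


for m in range(4):
    print(m, table_row(m, 5))
# 0 [1, 2, 3, 4, 5]
# 1 [2, 3, 4, 5, 6]
# 2 [3, 5, 7, 9, 11]
# 3 [5, 13, 29, 61, 125]
```

Row 4 is out of reach for this approach: its second entry already requires reading column 13 of row 3, whose neighbours index far into rows 2, 1, and 0.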

Despite the large values occurring in this early section of the table, some even larger numbers have been defined, such as Graham's number, which cannot be written with any small number of Knuth arrows. This number is constructed with a technique similar to applying the Ackermann function to itself recursively.

This is a repeat of the above table, but with the values replaced by the relevant expression from the function definition to show the pattern clearly:

Values of A(m, n)

| m \ n | 0 | 1 | 2 | 3 | 4 | n |
|---|---|---|---|---|---|---|
| 0 | 0+1 | 1+1 | 2+1 | 3+1 | 4+1 | n + 1 |
| 1 | A(0, 1) | A(0, A(1, 0)) = A(0, 2) | A(0, A(1, 1)) = A(0, 3) | A(0, A(1, 2)) = A(0, 4) | A(0, A(1, 3)) = A(0, 5) | A(0, A(1, n−1)) |
| 2 | A(1, 1) | A(1, A(2, 0)) = A(1, 3) | A(1, A(2, 1)) = A(1, 5) | A(1, A(2, 2)) = A(1, 7) | A(1, A(2, 3)) = A(1, 9) | A(1, A(2, n−1)) |
| 3 | A(2, 1) | A(2, A(3, 0)) = A(2, 5) | A(2, A(3, 1)) = A(2, 13) | A(2, A(3, 2)) = A(2, 29) | A(2, A(3, 3)) = A(2, 61) | A(2, A(3, n−1)) |
| 4 | A(3, 1) | A(3, A(4, 0)) = A(3, 13) | A(3, A(4, 1)) = A(3, 65533) | A(3, A(4, 2)) | A(3, A(4, 3)) | A(3, A(4, n−1)) |
| 5 | A(4, 1) | A(4, A(5, 0)) | A(4, A(5, 1)) | A(4, A(5, 2)) | A(4, A(5, 3)) | A(4, A(5, n−1)) |
| 6 | A(5, 1) | A(5, A(6, 0)) | A(5, A(6, 1)) | A(5, A(6, 2)) | A(5, A(6, 3)) | A(5, A(6, n−1)) |

Properties

General remarks

Not primitive recursive

The Ackermann function grows faster than any primitive recursive function and therefore is not itself primitive recursive. The sketch of the proof is this: a primitive recursive function defined using up to k recursions must grow slower than $f_{k+1}$, the (k+1)-th function in the fast-growing hierarchy, but the Ackermann function grows at least as fast as $f_\omega$.

Specifically, one shows that for every primitive recursive function $f(x_1, \ldots, x_n)$ there exists a non-negative integer $t$ such that for all non-negative integers $x_1, \ldots, x_n$,

$$f(x_1, \ldots, x_n) < A(t, \max_i x_i).$$

Once this is established, it follows that $A$ itself is not primitive recursive, since otherwise putting $x_1 = x_2 = t$ would lead to the contradiction $A(t, t) < A(t, t)$.

The proof proceeds as follows: define the class $\mathcal{A}$ of all functions that grow slower than the Ackermann function

$$\mathcal{A} = \left\{ f \;\middle|\; \exists t\ \forall x_1 \cdots \forall x_n:\ f(x_1, \ldots, x_n) < A(t, \max_i x_i) \right\}$$

and show that $\mathcal{A}$ contains all primitive recursive functions. The latter is achieved by showing that $\mathcal{A}$ contains the constant functions, the successor function, the projection functions and that it is closed under the operations of function composition and primitive recursion.

Use in computational complexity

The Ackermann function appears in the time complexity of some problems, [19] such as reachability in vector addition systems [20] and Petri nets, [21] thus showing that they are computationally infeasible for large instances. The inverse of the Ackermann function appears in some time complexity results.

Inverse

Since the function $f(n) = A(n, n)$ considered above grows very rapidly, its inverse function, $f^{-1}$, grows very slowly. This inverse Ackermann function $f^{-1}$ is usually denoted by α. In fact, α(n) is less than 5 for any practical input size n, since A(4, 4) is on the order of $2^{2^{2^{2^{16}}}}$.
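How slowly the inverse grows can be seen from the first few diagonal values A(0, 0) = 1, A(1, 1) = 3, A(2, 2) = 7, and A(3, 3) = 61. The sketch below assumes one particular convention, namely that α(n) is the smallest k with A(k, k) ≥ n; other conventions differ by small constants. The names `DIAGONAL` and `inverse_ackermann` are introduced here.

```python
# Diagonal values A(k, k) for k = 0..3; A(4, 4) is far too large to store.
DIAGONAL = (1, 3, 7, 61)


def inverse_ackermann(n: int) -> int:
    """Smallest k with A(k, k) >= n (assumed convention)."""
    for k, value in enumerate(DIAGONAL):
        if n <= value:
            return k
    return 4  # correct for every n up to A(4, 4)


print([inverse_ackermann(n) for n in (1, 3, 4, 61, 62, 10**100)])
# [0, 1, 2, 3, 4, 4]
```

The final `return 4` covers every input that could be represented in practice, which is the sense in which α is treated as a constant in complexity bounds.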

This inverse appears in the time complexity of some algorithms, such as the disjoint-set data structure and Chazelle's algorithm for minimum spanning trees. Sometimes Ackermann's original function or other variations are used in these settings, but they all grow at similarly high rates. In particular, some modified functions simplify the expression by eliminating the −3 and similar terms.

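A two-parameter variation of the inverse Ackermann function can be defined as follows, where $\lfloor x \rfloor$ is the floor function:

$$\alpha(m, n) = \min\{\, i \geq 1 : A(i, \lfloor m/n \rfloor) \geq \log_2 n \,\}.$$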

This function arises in more precise analyses of the algorithms mentioned above, and gives a more refined time bound. In the disjoint-set data structure, m represents the number of operations while n represents the number of elements; in the minimum spanning tree algorithm, m represents the number of edges while n represents the number of vertices. Several slightly different definitions of α(m, n) exist; for example, $\log_2 n$ is sometimes replaced by n, and the floor function is sometimes replaced by a ceiling.
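A two-parameter inverse of this kind can be evaluated without ever forming a huge Ackermann value, by capping the computation at the threshold. The sketch assumes the variant that takes the least i ≥ 1 with A(i, ⌊m/n⌋) ≥ log₂ n; `ackermann_capped` and `alpha` are hypothetical helper names.

```python
def ackermann_capped(i: int, j: int, cap: int) -> int:
    """Return min(A(i, j), cap) without building values above cap."""
    if i == 0:
        return min(j + 1, cap)
    value = ackermann_capped(i - 1, 1, cap)          # A(i, 0) = A(i-1, 1)
    for _ in range(j):
        if value >= cap:
            return cap                               # later values only grow
        value = ackermann_capped(i - 1, value, cap)  # A(i, k+1) = A(i-1, A(i, k))
    return min(value, cap)


def alpha(m: int, n: int) -> int:
    """Least i >= 1 with A(i, m // n) >= log2(n), for m >= 0 and n >= 1."""
    # A(...) is an integer, so comparing with ceil(log2 n) is equivalent.
    cap = (n - 1).bit_length()
    i = 1
    while ackermann_capped(i, m // n, cap) < cap:
        i += 1
    return i


print(alpha(10, 10), alpha(1000, 1000), alpha(2**20, 2**20))  # 2 3 4
```

The cap works because the Ackermann function is increasing in its second argument: once an intermediate value reaches the threshold, the final value must reach it too.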

Other studies define the inverse with m held constant, so that it applies to a particular row of the table. [22]

The inverse of the Ackermann function is primitive recursive. [23]

Use as benchmark

The Ackermann function, due to its definition in terms of extremely deep recursion, can be used as a benchmark of a compiler's ability to optimize recursion. The first published use of Ackermann's function in this way was in 1970 by Dragoș Vaida [24] and, almost simultaneously, in 1971, by Yngve Sundblad. [14]

Sundblad's seminal paper was taken up by Brian Wichmann (co-author of the Whetstone benchmark) in a trilogy of papers written between 1975 and 1982. [25] [26] [27]

See also

Notes

  1. with parameter order reversed
  2. 'curried'
  3. In each step the underlined redex is rewritten.
  4. here: leftmost-innermost strategy!
  5. For better readability
    S(0) is notated as 1,
    S(S(0)) is notated as 2,
    S(S(S(0))) is notated as 3,
    etc.
  6. The maximum depth of recursion refers to the number of levels of activation of a procedure which exist during the deepest call of the procedure. Cornelius & Kirby (1975)
  7. LOOP n+1 TIMES DO F

References

  1. Monin & Hinchey 2003, p. 61.
  2. Ackermann 1928.
  3. "Decimal expansion of A(4,2)". kosara.net. August 27, 2000. Archived from the original on January 20, 2010.
  4. Calude, Marcus & Tevy 1979.
  5. Hilbert 1926, p. 185.
  6. van Heijenoort 1977.
  7. Péter 1935.
  8. Robinson 1948.
  9. Ritchie 1965, p. 1028.
  10. Buck 1963.
  11. Meeussen & Zantema 1992, p. 6.
  12. Munafo 1999a.
  13. Ritchie 1965.
  14. Sundblad 1971.
  15. Porto & Matos 1980.
  16. Grossman & Zeitman 1988.
  17. Paulson 2021.
  18. Cohen 1987, p. 56, Proposition 3.16 (see in proof).
  19. Brubaker 2023.
  20. Czerwiński & Orlikowski 2022.
  21. Leroux 2022.
  22. Pettie 2002.
  23. Matos 2014.
  24. Vaida 1970.
  25. Wichmann 1976.
  26. Wichmann 1977.
  27. Wichmann 1982.

Bibliography