Lambda calculus

Last updated

Lambda calculus (also written as λ-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation that can be used to simulate any Turing machine. It was introduced by the mathematician Alonzo Church in the 1930s as part of his research into the foundations of mathematics.


Lambda calculus consists of constructing lambda terms and performing reduction operations on them. In the simplest form of lambda calculus, terms are built using only the following rules:

xVariableA character or string representing a parameter or mathematical/logical value.
x.M)AbstractionFunction definition (M is a lambda term). The variable x becomes bound in the expression.
(MN)ApplicationApplying a function to an argument. M and N are lambda terms.

producing expressions such as: (λxy.(λz.(λx.z x) (λy.z y)) (x y)). Parentheses can be dropped if the expression is unambiguous. For some applications, terms for logical and mathematical constants and operations may be included.

The reduction operations include:

x.M[x]) → (λy.M[y])α-conversionRenaming the bound variables in the expression. Used to avoid name collisions.
((λx.M) E) → (M[x := E])β-reductionReplacing the bound variables with the argument expression in the body of the abstraction.

If De Bruijn indexing is used, then α-conversion is no longer required as there will be no name collisions. If repeated application of the reduction steps eventually terminates, then by the Church–Rosser theorem it will produce a β-normal form.

Explanation and applications

Lambda calculus is Turing complete, that is, it is a universal model of computation that can be used to simulate any Turing machine. [1] Its namesake, the Greek letter lambda (λ), is used in lambda expressions and lambda terms to denote binding a variable in a function.

Lambda calculus may be untyped or typed. In typed lambda calculus, functions can be applied only if they are capable of accepting the given input's "type" of data. Typed lambda calculi are weaker than the untyped lambda calculus, which is the primary subject of this article, in the sense that typed lambda calculi can express less than the untyped calculus can, but on the other hand typed lambda calculi allow more things to be proved; in the simply typed lambda calculus it is, for example, a theorem that every evaluation strategy terminates for every simply typed lambda-term, whereas evaluation of untyped lambda-terms need not terminate. One reason there are many different typed lambda calculi has been the desire to do more (of what the untyped calculus can do) without giving up on being able to prove strong theorems about the calculus.

Lambda calculus has applications in many different areas in mathematics, philosophy, [2] linguistics, [3] [4] and computer science. [5] Lambda calculus has played an important role in the development of the theory of programming languages. Functional programming languages implement the lambda calculus. Lambda calculus is also a current research topic in Category theory. [6]


The lambda calculus was introduced by mathematician Alonzo Church in the 1930s as part of an investigation into the foundations of mathematics. [7] [8] The original system was shown to be logically inconsistent in 1935 when Stephen Kleene and J. B. Rosser developed the Kleene–Rosser paradox. [9] [10]

Subsequently, in 1936 Church isolated and published just the portion relevant to computation, what is now called the untyped lambda calculus. [11] In 1940, he also introduced a computationally weaker, but logically consistent system, known as the simply typed lambda calculus. [12]

Until the 1960s when its relation to programming languages was clarified, the lambda calculus was only a formalism. Thanks to Richard Montague and other linguists' applications in the semantics of natural language, the lambda calculus has begun to enjoy a respectable place in both linguistics [13] and computer science. [14]

Origin of the lambda symbol

There is a bit of controversy over the reason for Church's use of the Greek letter lambda (λ) as the notation for function-abstraction in the lambda calculus, perhaps in part due to conflicting explanations by Church himself. According to Cardone and Hindley (2006):

By the way, why did Church choose the notation “λ”? In [an unpublished 1964 letter to Harald Dickson] he stated clearly that it came from the notation “” used for class-abstraction by Whitehead and Russell, by first modifying “” to “∧” to distinguish function-abstraction from class-abstraction, and then changing “∧” to “λ” for ease of printing.

This origin was also reported in [Rosser, 1984, p.338]. On the other hand, in his later years Church told two enquirers that the choice was more accidental: a symbol was needed and λ just happened to be chosen.

Dana Scott has also addressed this controversy in various public lectures. [15] Scott recounts that he once posed a question about the origin of the lambda symbol to Church's son-in-law John Addison, who then wrote his father-in-law a postcard:

Dear Professor Church,

Russell had the iota operator, Hilbert had the epsilon operator. Why did you choose lambda for your operator?

According to Scott, Church's entire response consisted of returning the postcard with the following annotation: "eeny, meeny, miny, moe".

Informal description


Computable functions are a fundamental concept within computer science and mathematics. The lambda calculus provides a simple semantics for computation, enabling properties of computation to be studied formally. The lambda calculus incorporates two simplifications that make this semantics simple. The first simplification is that the lambda calculus treats functions "anonymously", without giving them explicit names. For example, the function

can be rewritten in anonymous form as

(read as "a tuple of x and y is mapped to "). Similarly,

can be rewritten in anonymous form as

where the input is simply mapped to itself.

The second simplification is that the lambda calculus only uses functions of a single input. An ordinary function that requires two inputs, for instance the function, can be reworked into an equivalent function that accepts a single input, and as output returns another function, that in turn accepts a single input. For example,

can be reworked into

This method, known as currying, transforms a function that takes multiple arguments into a chain of functions each with a single argument.

Function application of the function to the arguments (5, 2), yields at once


whereas evaluation of the curried version requires one more step

// the definition of has been used with in the inner expression. This is like β-reduction.
// the definition of has been used with . Again, similar to β-reduction.

to arrive at the same result.

The lambda calculus

The lambda calculus consists of a language of lambda terms, which is defined by a certain formal syntax, and a set of transformation rules, which allow manipulation of the lambda terms. These transformation rules can be viewed as an equational theory or as an operational definition.

As described above, all functions in the lambda calculus are anonymous functions, having no names. They only accept one input variable, with currying used to implement functions with several variables.

Lambda terms

The syntax of the lambda calculus defines some expressions as valid lambda calculus expressions and some as invalid, just as some strings of characters are valid C programs and some are not. A valid lambda calculus expression is called a "lambda term".

The following three rules give an inductive definition that can be applied to build all syntactically valid lambda terms:

  • a variable, , is itself a valid lambda term
  • if is a lambda term, and is a variable, then is a lambda term (called an abstraction);
  • if and are lambda terms, then is a lambda term (called an application).

Nothing else is a lambda term. Thus a lambda term is valid if and only if it can be obtained by repeated application of these three rules. However, some parentheses can be omitted according to certain rules. For example, the outermost parentheses are usually not written. See Notation , below.

An abstraction is a definition of an anonymous function that is capable of taking a single input and substituting it into the expression . It thus defines an anonymous function that takes and returns . For example, is an abstraction for the function using the term for . The definition of a function with an abstraction merely "sets up" the function but does not invoke it. The abstraction binds the variable in the term .

An application represents the application of a function to an input , that is, it represents the act of calling function on input to produce .

There is no concept in lambda calculus of variable declaration. In a definition such as (i.e. ), the lambda calculus treats as a variable that is not yet defined. The abstraction is syntactically valid, and represents a function that adds its input to the yet-unknown .

Bracketing may be used and may be needed to disambiguate terms. For example, and denote different terms (although they coincidentally reduce to the same value). Here, the first example defines a function whose lambda term is the result of applying x to the child function, while the second example is the application of the outermost function to the input x, which returns the child function. Therefore, both examples evaluate to the identity function .

Functions that operate on functions

In lambda calculus, functions are taken to be 'first class values', so functions may be used as the inputs, or be returned as outputs from other functions.

For example, represents the identity function, , and represents the identity function applied to . Further, represents the constant function, the function that always returns , no matter the input. In lambda calculus, function application is regarded as left-associative, so that means .

There are several notions of "equivalence" and "reduction" that allow lambda terms to be "reduced" to "equivalent" lambda terms.

Alpha equivalence

A basic form of equivalence, definable on lambda terms, is alpha equivalence. It captures the intuition that the particular choice of a bound variable, in an abstraction, does not (usually) matter. For instance, and are alpha-equivalent lambda terms, and they both represent the same function (the identity function). The terms and are not alpha-equivalent, because they are not bound in an abstraction. In many presentations, it is usual to identify alpha-equivalent lambda terms.

The following definitions are necessary in order to be able to define β-reduction:

Free variables

The free variables of a term are those variables not bound by an abstraction. The set of free variables of an expression is defined inductively:

  • The free variables of are just
  • The set of free variables of is the set of free variables of , but with removed
  • The set of free variables of is the union of the set of free variables of and the set of free variables of .

For example, the lambda term representing the identity has no free variables, but the function has a single free variable, .

Capture-avoiding substitutions

Suppose , and are lambda terms and and are variables. The notation indicates substitution of for in in a capture-avoiding manner. This is defined so that:

  • ;
  • if ;
  • ;
  • ;
  • if and is not in the free variables of . The variable is said to be "fresh" for .

For example, , and .

The freshness condition (requiring that is not in the free variables of ) is crucial in order to ensure that substitution does not change the meaning of functions. For example, a substitution is made that ignores the freshness condition: . This substitution turns the constant function into the identity by substitution.

In general, failure to meet the freshness condition can be remedied by alpha-renaming with a suitable fresh variable. For example, switching back to our correct notion of substitution, in the abstraction can be renamed with a fresh variable , to obtain , and the meaning of the function is preserved by substitution.


The β-reduction rule states that an application of the form reduces to the term . The notation is used to indicate that β-reduces to . For example, for every , . This demonstrates that really is the identity. Similarly, , which demonstrates that is a constant function.

The lambda calculus may be seen as an idealised version of a functional programming language, like Haskell or Standard ML. Under this view, β-reduction corresponds to a computational step. This step can be repeated by additional β-reductions until there are no more applications left to reduce. In the untyped lambda calculus, as presented here, this reduction process may not terminate. For instance, consider the term . Here . That is, the term reduces to itself in a single β-reduction, and therefore the reduction process will never terminate.

Another aspect of the untyped lambda calculus is that it does not distinguish between different kinds of data. For instance, it may be desirable to write a function that only operates on numbers. However, in the untyped lambda calculus, there is no way to prevent a function from being applied to truth values, strings, or other non-number objects.

Formal definition


Lambda expressions are composed of:

The set of lambda expressions, Λ, can be defined inductively:

  1. If x is a variable, then x ∈ Λ.
  2. If x is a variable and M ∈ Λ, then (λx.M) ∈ Λ.
  3. If M, N ∈ Λ, then (M N) ∈ Λ.

Instances of rule 2 are known as abstractions and instances of rule 3 are known as applications. [16] [17]


To keep the notation of lambda expressions uncluttered, the following conventions are usually applied:

Free and bound variables

The abstraction operator, λ, is said to bind its variable wherever it occurs in the body of the abstraction. Variables that fall within the scope of an abstraction are said to be bound. In an expression λx.M, the part λx is often called binder, as a hint that the variable x is getting bound by appending λx to M. All other variables are called free. For example, in the expression λy.x x y, y is a bound variable and x is a free variable. Also a variable is bound by its nearest abstraction. In the following example the single occurrence of x in the expression is bound by the second lambda: λx.yx.z x).

The set of free variables of a lambda expression, M, is denoted as FV(M) and is defined by recursion on the structure of the terms, as follows:

  1. FV(x) = {x}, where x is a variable.
  2. FV(λx.M) = FV(M) \ {x}.
  3. FV(M N) = FV(M) ∪ FV(N). [20]

An expression that contains no free variables is said to be closed. Closed lambda expressions are also known as combinators and are equivalent to terms in combinatory logic.


The meaning of lambda expressions is defined by how expressions can be reduced. [21]

There are three kinds of reduction:

We also speak of the resulting equivalences: two expressions are α-equivalent, if they can be α-converted into the same expression. β-equivalence and η-equivalence are defined similarly.

The term redex, short for reducible expression, refers to subterms that can be reduced by one of the reduction rules. For example, (λx.M) N is a β-redex in expressing the substitution of N for x in M. The expression to which a redex reduces is called its reduct; the reduct of (λx.M) N is M[x := N].

If x is not free in M, λx.M x is also an η-redex, with a reduct of M.


α-conversion, sometimes known as α-renaming, [22] allows bound variable names to be changed. For example, α-conversion of λx.x might yield λy.y. Terms that differ only by α-conversion are called α-equivalent. Frequently, in uses of lambda calculus, α-equivalent terms are considered to be equivalent.

The precise rules for α-conversion are not completely trivial. First, when α-converting an abstraction, the only variable occurrences that are renamed are those that are bound to the same abstraction. For example, an α-conversion of λxx.x could result in λyx.x, but it could not result in λyx.y. The latter has a different meaning from the original. This is analogous to the programming notion of variable shadowing.

Second, α-conversion is not possible if it would result in a variable getting captured by a different abstraction. For example, if we replace x with y in λxy.x, we get λyy.y, which is not at all the same.

In programming languages with static scope, α-conversion can be used to make name resolution simpler by ensuring that no variable name masks a name in a containing scope (see α-renaming to make name resolution trivial).

In the De Bruijn index notation, any two α-equivalent terms are syntactically identical.


Substitution, written M[V := N], is the process of replacing all free occurrences of the variable V in the expression M with expression N. Substitution on terms of the lambda calculus is defined by recursion on the structure of terms, as follows (note: x and y are only variables while M and N are any lambda expression):

x[x := N] = N
y[x := N] = y, if xy
(M1M2)[x := N] = (M1[x := N]) (M2[x := N])
x.M)[x := N] = λx.M
y.M)[x := N] = λy.(M[x := N]), if xy and y ∉ FV(N)

To substitute into an abstraction, it is sometimes necessary to α-convert the expression. For example, it is not correct for (λx.y)[y := x] to result in λx.x, because the substituted x was supposed to be free but ended up being bound. The correct substitution in this case is λz.x, up to α-equivalence. Substitution is defined uniquely up to α-equivalence.


β-reduction captures the idea of function application. β-reduction is defined in terms of substitution: the β-reduction of (λV.M) N is M[V := N].

For example, assuming some encoding of 2, 7, ×, we have the following β-reduction: (λn.n × 2) 7 → 7 × 2.

β-reduction can be seen to be the same as the concept of local reducibility in natural deduction, via the Curry–Howard isomorphism.


η-reduction expresses the idea of extensionality, which in this context is that two functions are the same if and only if they give the same result for all arguments. η-reduction converts between λx.fx and f whenever x does not appear free in f.

η-reduction can be seen to be the same as the concept of local completeness in natural deduction, via the Curry–Howard isomorphism.

Normal forms and confluence

For the untyped lambda calculus, β-reduction as a rewriting rule is neither strongly normalising nor weakly normalising.

However, it can be shown that β-reduction is confluent when working up to α-conversion (i.e. we consider two normal forms to be equal if it is possible to α-convert one into the other).

Therefore, both strongly normalising terms and weakly normalising terms have a unique normal form. For strongly normalising terms, any reduction strategy is guaranteed to yield the normal form, whereas for weakly normalising terms, some reduction strategies may fail to find it.

Encoding datatypes

The basic lambda calculus may be used to model booleans, arithmetic, data structures and recursion, as illustrated in the following sub-sections.

Arithmetic in lambda calculus

There are several possible ways to define the natural numbers in lambda calculus, but by far the most common are the Church numerals, which can be defined as follows:

0 := λfx.x
1 := λfx.fx
2 := λfx.f (fx)
3 := λfx.f (f (fx))

and so on. Or using the alternative syntax presented above in Notation :

0 := λfx.x
1 := λfx.fx
2 := λfx.f (fx)
3 := λfx.f (f (fx))

A Church numeral is a higher-order function—it takes a single-argument function f, and returns another single-argument function. The Church numeral n is a function that takes a function f as argument and returns the n-th composition of f, i.e. the function f composed with itself n times. This is denoted f(n) and is in fact the n-th power of f (considered as an operator); f(0) is defined to be the identity function. Such repeated compositions (of a single function f) obey the laws of exponents, which is why these numerals can be used for arithmetic. (In Church's original lambda calculus, the formal parameter of a lambda expression was required to occur at least once in the function body, which made the above definition of 0 impossible.)

One way of thinking about the Church numeral n, which is often useful when analysing programs, is as an instruction 'repeat n times'. For example, using the PAIR and NIL functions defined below, one can define a function that constructs a (linked) list of n elements all equal to x by repeating 'prepend another x element' n times, starting from an empty list. The lambda term is

λnx.n (PAIR x) NIL

By varying what is being repeated, and varying what argument that function being repeated is applied to, a great many different effects can be achieved.

We can define a successor function, which takes a Church numeral n and returns n + 1 by adding another application of f, where '(mf)x' means the function 'f' is applied 'm' times on 'x':

SUCC := λnfx.f (nfx)

Because the m-th composition of f composed with the n-th composition of f gives the m+n-th composition of f, addition can be defined as follows:

PLUS := λ (nfx)

PLUS can be thought of as a function taking two natural numbers as arguments and returning a natural number; it can be verified that

PLUS 2 3



are β-equivalent lambda expressions. Since adding m to a number n can be accomplished by adding 1 m times, an alternative definition is:

PLUS := λmn.m SUCC n [23]

Similarly, multiplication can be defined as

MULT := λmnf.m (nf) [19]


MULT := λmn.m (PLUS n) 0

since multiplying m and n is the same as repeating the add n function m times and then applying it to zero. Exponentiation has a rather simple rendering in Church numerals, namely

POW := λbe.eb [20]

The predecessor function defined by PRED n = n − 1 for a positive integer n and PRED 0 = 0 is considerably more difficult. The formula

PRED := λnfx.ngh.h (gf)) (λu.x) (λu.u)

can be validated by showing inductively that if T denotes gh.h (gf)), then T(n)u.x) = (λh.h(f(n−1)(x))) for n > 0. Two other definitions of PRED are given below, one using conditionals and the other using pairs. With the predecessor function, subtraction is straightforward. Defining

SUB := λmn.n PRED m,

SUB mn yields mn when m > n and 0 otherwise.

Logic and predicates

By convention, the following two definitions (known as Church booleans) are used for the boolean values TRUE and FALSE:

TRUE := λxy.x
FALSE := λxy.y
(Note that FALSE is equivalent to the Church numeral zero defined above)

Then, with these two lambda terms, we can define some logic operators (these are just possible formulations; other expressions are equally correct):

AND := λpq.pqp
OR := λpq.ppq
IFTHENELSE := λpab.pab

We are now able to compute some logic functions, for example:

≡ (λxy.x) FALSE TRUE →β FALSE

and we see that AND TRUE FALSE is equivalent to FALSE.

A predicate is a function that returns a boolean value. The most fundamental predicate is ISZERO, which returns TRUE if its argument is the Church numeral 0, and FALSE if its argument is any other Church numeral:


The following predicate tests whether the first argument is less-than-or-equal-to the second:

LEQ := λmn.ISZERO (SUB mn),

and since m = n, if LEQ mn and LEQ nm, it is straightforward to build a predicate for numerical equality.

The availability of predicates and the above definition of TRUE and FALSE make it convenient to write "if-then-else" expressions in lambda calculus. For example, the predecessor function can be defined as:

PRED := λn.ngk.ISZERO (g 1) k (PLUS (gk) 1)) (λv.0) 0

which can be verified by showing inductively that ngk.ISZERO (g 1) k (PLUS (gk) 1)) (λv.0) is the add n − 1 function for n > 0.


A pair (2-tuple) can be defined in terms of TRUE and FALSE, by using the Church encoding for pairs. For example, PAIR encapsulates the pair (x,y), FIRST returns the first element of the pair, and SECOND returns the second.

PAIR := λxyf.fxy
FIRST := λp.p TRUE
NIL := λx.TRUE
NULL := λp.pxy.FALSE)

A linked list can be defined as either NIL for the empty list, or the PAIR of an element and a smaller list. The predicate NULL tests for the value NIL. (Alternatively, with NIL := FALSE, the construct lhtz.deal_with_head_h_and_tail_t) (deal_with_nil) obviates the need for an explicit NULL test).

As an example of the use of pairs, the shift-and-increment function that maps (m, n) to (n, n + 1) can be defined as


which allows us to give perhaps the most transparent version of the predecessor function:

PRED := λn.FIRST (n Φ (PAIR 0 0)).

Additional programming techniques

There is a considerable body of programming idioms for lambda calculus. Many of these were originally developed in the context of using lambda calculus as a foundation for programming language semantics, effectively using lambda calculus as a low-level programming language. Because several programming languages include the lambda calculus (or something very similar) as a fragment, these techniques also see use in practical programming, but may then be perceived as obscure or foreign.

Named constants

In lambda calculus, a library would take the form of a collection of previously defined functions, which as lambda-terms are merely particular constants. The pure lambda calculus does not have a concept of named constants since all atomic lambda-terms are variables, but one can emulate having named constants by setting aside a variable as the name of the constant, using abstraction to bind that variable in the main body, and apply that abstraction to the intended definition. Thus to use f to mean M (some explicit lambda-term) in N (another lambda-term, the "main program"), one can say


Authors often introduce syntactic sugar, such as let, to permit writing the above in the more intuitive order

let f = M in N

By chaining such definitions, one can write a lambda calculus "program" as zero or more function definitions, followed by one lambda-term using those functions that constitutes the main body of the program.

A notable restriction of this let is that the name f is not defined in M, since M is outside the scope of the abstraction binding f; this means a recursive function definition cannot be used as the M with let. The more advanced letrec syntactic sugar construction that allows writing recursive function definitions in that naive style instead additionally employs fixed-point combinators.

Recursion and fixed points

Recursion is the definition of a function using the function itself. Lambda calculus cannot express this as directly as some other notations: all functions are anonymous in lambda calculus, so we can't refer to a value which is yet to be defined, inside the lambda term defining that same value. However, recursion can still be achieved by arranging for a lambda expression to receive itself as its argument value, for example inx.xx) E.

Consider the factorial function F(n) recursively defined by

F(n) = 1, if n = 0; else n× F(n − 1).

In the lambda expression which is to represent this function, a parameter (typically the first one) will be assumed to receive the lambda expression itself as its value, so that calling it applying it to an argument will amount to recursion. Thus to achieve recursion, the intended-as-self-referencing argument (called r here) must always be passed to itself within the function body, at a call point:

G := λr. λn.(1, if n = 0; else n× (rr (n−1)))
withrrx = F x = G rxto hold, sor = Gand
F := G G = (λx.xx) G

The self-application achieves replication here, passing the function's lambda expression on to the next invocation as an argument value, making it available to be referenced and called there.

This solves it but requires re-writing each recursive call as self-application. We would like to have a generic solution, without a need for any re-writes:

G := λr. λn.(1, if n = 0; else n× (r (n−1)))
withrx = F x = G rxto hold, sor = G r =: FIX Gand
F := FIX GwhereFIX g := (r where r = gr) = g (FIX g)
so that FIX G = G (FIX G) = (λn.(1, if n = 0; else n× ((FIX G) (n−1))))

Given a lambda term with first argument representing recursive call (e.g. G here), the fixed-point combinator FIX will return a self-replicating lambda expression representing the recursive function (here, F). The function does not need to be explicitly passed to itself at any point, for the self-replication is arranged in advance, when it is created, to be done each time it is called. Thus the original lambda expression (FIX G) is re-created inside itself, at call-point, achieving self-reference.

In fact, there are many possible definitions for this FIX operator, the simplest of them being:

Y := λg.(λx.g (xx)) (λx.g (xx))

In the lambda calculus, Yg is a fixed-point of g, as it expands to:

h.(λx.h (xx)) (λx.h (xx))) g
x.g (xx)) (λx.g (xx))
g ((λx.g (xx)) (λx.g (xx)))
g (Yg)

Now, to perform our recursive call to the factorial function, we would simply call (Y G) n,  where n is the number we are calculating the factorial of. Given n = 4, for example, this gives:

(Y G) 4
G (Y G) 4
rn.(1, if n = 0; else n × (r (n−1)))) (Y G) 4
n.(1, if n = 0; else n × ((Y G) (n−1)))) 4
1, if 4 = 0; else 4 × ((Y G) (4−1))
4 × (G (Y G) (4−1))
4 × ((λn.(1, if n = 0; else n × ((Y G) (n−1)))) (4−1))
4 × (1, if 3 = 0; else 3 × ((Y G) (3−1)))
4 × (3 × (G (Y G) (3−1)))
4 × (3 × ((λn.(1, if n = 0; else n × ((Y G) (n−1)))) (3−1)))
4 × (3 × (1, if 2 = 0; else 2 × ((Y G) (2−1))))
4 × (3 × (2 × (G (Y G) (2−1))))
4 × (3 × (2 × ((λn.(1, if n = 0; else n × ((Y G) (n−1)))) (2−1))))
4 × (3 × (2 × (1, if 1 = 0; else 1 × ((Y G) (1−1)))))
4 × (3 × (2 × (1 × (G (Y G) (1−1)))))
4 × (3 × (2 × (1 × ((λn.(1, if n = 0; else n × ((Y G) (n−1)))) (1−1)))))
4 × (3 × (2 × (1 × (1, if 0 = 0; else 0 × ((Y G) (0−1))))))
4 × (3 × (2 × (1 × (1))))

Every recursively defined function can be seen as a fixed point of some suitably defined function closing over the recursive call with an extra argument, and therefore, using Y, every recursively defined function can be expressed as a lambda expression. In particular, we can now cleanly define the subtraction, multiplication and comparison predicate of natural numbers recursively.

Standard terms

Certain terms have commonly accepted names:

I := λx.x
K := λxy.x
S := λxyz.xz (yz)
B := λxyz.x (yz)
C := λxyz.xzy
W := λxy.xyy
U := λx.xx
ω := λx.xx
Ω := ω ω
Y := λg.(λx.g (xx)) (λx.g (xx))

Several of these have direct applications in the elimination of abstraction that turns lambda terms into combinator calculus terms.

Abstraction elimination

If N is a lambda-term without abstraction, but possibly containing named constants (combinators), then there exists a lambda-term T(x,N) which is equivalent to λx.N but lacks abstraction (except as part of the named constants, if these are considered non-atomic). This can also be viewed as anonymising variables, as T(x,N) removes all occurrences of x from N, while still allowing argument values to be substituted into the positions where N contains an x. The conversion function T can be defined by:

T(x, x) := I
T(x, N) := KN if x is not free in N.
T(x, MN) := ST(x, M) T(x, N)

In either case, a term of the form T(x,N) P can reduce by having the initial combinator I, K, or S grab the argument P, just like β-reduction of x.N)P would do. I returns that argument. K throws the argument away, just like x.N) would do if x has no free occurrence in N. S passes the argument on to both subterms of the application, and then applies the result of the first to the result of the second.

The combinators B and C are similar to S, but pass the argument on to only one subterm of an application (B to the "argument" subterm and C to the "function" subterm), thus saving a subsequent K if there is no occurrence of x in one subterm. In comparison to B and C, the S combinator actually conflates two functionalities: rearranging arguments, and duplicating an argument so that it may be used in two places. The W combinator does only the latter, yielding the B, C, K, W system as an alternative to SKI combinator calculus.

Typed lambda calculus

A typed lambda calculus is a typed formalism that uses the lambda-symbol () to denote anonymous function abstraction. In this context, types are usually objects of a syntactic nature that are assigned to lambda terms; the exact nature of a type depends on the calculus considered (see kinds below). From a certain point of view, typed lambda calculi can be seen as refinements of the untyped lambda calculus but from another point of view, they can also be considered the more fundamental theory and untyped lambda calculus a special case with only one type. [24]

Typed lambda calculi are foundational programming languages and are the base of typed functional programming languages such as ML and Haskell and, more indirectly, typed imperative programming languages. Typed lambda calculi play an important role in the design of type systems for programming languages; here typability usually captures desirable properties of the program, e.g. the program will not cause a memory access violation.

Typed lambda calculi are closely related to mathematical logic and proof theory via the Curry–Howard isomorphism and they can be considered as the internal language of classes of categories, e.g. the simply typed lambda calculus is the language of Cartesian closed categories (CCCs).

Computable functions and lambda calculus

A function F: NN of natural numbers is a computable function if and only if there exists a lambda expression f such that for every pair of x, y in N, F(x)=y if and only if fx =β y,  where x and y are the Church numerals corresponding to x and y, respectively and =β meaning equivalence with β-reduction. This is one of the many ways to define computability; see the Church–Turing thesis for a discussion of other approaches and their equivalence.

Undecidability of equivalence

There is no algorithm that takes as input any two lambda expressions and outputs TRUE or FALSE depending on whether or not the two expressions are equivalent. [11] More precisely, no computable function can decide the equivalence. This was historically the first problem for which undecidability could be proven. As usual for such a proof, computable means computable by any model of computation that is Turing complete.

Church's proof first reduces the problem to determining whether a given lambda expression has a normal form. A normal form is an equivalent expression that cannot be reduced any further under the rules imposed by the form. Then he assumes that this predicate is computable, and can hence be expressed in lambda calculus. Building on earlier work by Kleene and constructing a Gödel numbering for lambda expressions, he constructs a lambda expression e that closely follows the proof of Gödel's first incompleteness theorem. If e is applied to its own Gödel number, a contradiction results.

Lambda calculus and programming languages

As pointed out by Peter Landin's 1965 paper "A Correspondence between ALGOL 60 and Church's Lambda-notation" [25] , sequential procedural programming languages can be understood in terms of the lambda calculus, which provides the basic mechanisms for procedural abstraction and procedure (subprogram) application.

Anonymous functions

For example, in Lisp the "square" function can be expressed as a lambda expression as follows:


The above example is an expression that evaluates to a first-class function. The symbol lambda creates an anonymous function, given a list of parameter names, (x) – just a single argument in this case, and an expression that is evaluated as the body of the function, (* x x). Anonymous functions are sometimes called lambda expressions.

For example, Pascal and many other imperative languages have long supported passing subprograms as arguments to other subprograms through the mechanism of function pointers. However, function pointers are not a sufficient condition for functions to be first class datatypes, because a function is a first class datatype if and only if new instances of the function can be created at run-time. And this run-time creation of functions is supported in Smalltalk, JavaScript, and more recently in Scala, Eiffel ("agents"), C# ("delegates") and C++11, among others.

Reduction strategies

Whether a term is normalising or not, and how much work needs to be done in normalising it if it is, depends to a large extent on the reduction strategy used. The distinction between reduction strategies relates to the distinction in functional programming languages between eager evaluation and lazy evaluation.

Full β-reductions
Any redex can be reduced at any time. This means essentially the lack of any particular reduction strategy—with regard to reducibility, "all bets are off".
Applicative order
The leftmost, innermost redex is always reduced first. Intuitively this means a function's arguments are always reduced before the function itself. Applicative order always attempts to apply functions to normal forms, even when this is not possible.
Most programming languages (including Lisp, ML and imperative languages like C and Java) are described as "strict", meaning that functions applied to non-normalising arguments are non-normalising. This is done essentially using applicative order, call by value reduction (see below), but usually called "eager evaluation".
Normal order
The leftmost, outermost redex is always reduced first. That is, whenever possible the arguments are substituted into the body of an abstraction before the arguments are reduced.
Call by value
Only the outermost redexes are reduced: a redex is reduced only when its right hand side has reduced to a value (variable or abstraction).
Call by name
As normal order, but no reductions are performed inside abstractions. For example, λx.(λx.x)x is in normal form according to this strategy, although it contains the redex x.x)x.
Call by need
As normal order, but function applications that would duplicate terms instead name the argument, which is then reduced only "when it is needed". Called in practical contexts "lazy evaluation". In implementations this "name" takes the form of a pointer, with the redex represented by a thunk.

Applicative order is not a normalising strategy. The usual counterexample is as follows: define Ω = ωω where ω = λx.xx. This entire expression contains only one redex, namely the whole expression; its reduct is again Ω. Since this is the only available reduction, Ω has no normal form (under any evaluation strategy). Using applicative order, the expression KIΩ = (λxy.x) (λx.x)Ω is reduced by first reducing Ω to normal form (since it is the rightmost redex), but since Ω has no normal form, applicative order fails to find a normal form for KIΩ.

In contrast, normal order is so called because it always finds a normalising reduction, if one exists. In the above example, KIΩ reduces under normal order to I, a normal form. A drawback is that redexes in the arguments may be copied, resulting in duplicated computation (for example, x.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y) using this strategy; now there are two redexes, so full evaluation needs two more steps, but if the argument had been reduced first, there would now be none).

The positive tradeoff of using applicative order is that it does not cause unnecessary computation, if all arguments are used, because it never substitutes arguments containing redexes and hence never needs to copy them (which would duplicate work). In the above example, in applicative order x.xx) ((λx.x)y) reduces first to x.xx)y and then to the normal order yy, taking two steps instead of three.

Most purely functional programming languages (notably Miranda and its descendants, including Haskell), and the proof languages of theorem provers, use lazy evaluation, which is essentially the same as call by need. This is like normal order reduction, but call by need manages to avoid the duplication of work inherent in normal order reduction using sharing. In the example given above, x.xx) ((λx.x)y) reduces to ((λx.x)y) ((λx.x)y), which has two redexes, but in call by need they are represented using the same object rather than copied, so when one is reduced the other is too.

A note about complexity

While the idea of β-reduction seems simple enough, it is not an atomic step, in that it must have a non-trivial cost when estimating computational complexity. [26] To be precise, one must somehow find the location of all of the occurrences of the bound variable V in the expression E, implying a time cost, or one must keep track of these locations in some way, implying a space cost. A naïve search for the locations of V in E is O(n) in the length n of E. This has led to the study of systems that use explicit substitution. Sinot's director strings [27] offer a way of tracking the locations of free variables in expressions.

Parallelism and concurrency

The Church–Rosser property of the lambda calculus means that evaluation (β-reduction) can be carried out in any order, even in parallel. This means that various nondeterministic evaluation strategies are relevant. However, the lambda calculus does not offer any explicit constructs for parallelism. One can add constructs such as Futures to the lambda calculus. Other process calculi have been developed for describing communication and concurrency.

Optimal reduction

In Lévy's 1988 paper "Sharing in the Evaluation of lambda Expressions", he defines a notion of optimal sharing, such that no work is duplicated. For example, performing a β-reduction in normal order on x.xx) (II) reduces it to II (II). The argument II is duplicated by the application to the first lambda term. If the reduction was done in an applicative order first, we save work because work is not duplicated: x.xx) (II) reduces to x.xx) I. On the other hand, using applicative order can result in redundant reductions or even possibly never reduce to normal form. For example, performing a β-reduction in normal order on f.f I) (λy.(λx.xx) (y I)) yields (λy.(λx.xx) (y I)) I, x.xx) (II) which we know we can do without duplicating work. Doing the same but in applicative order yields f.f I) (λy.y I (y I)), (λy.y I (y I)) I, I I (I I), and now work is duplicated.

Lévy shows the existence of lambda terms where there does not exist a sequence of reductions which reduces them without duplicating work. The below lambda term is such an example.

((λg.(g(g(λx.x)))) (λh.((λf.(f(f(λz.z)))) (λw.(h(w(λy.y)))))))

It is composed of three similar terms, x=((λg. ... ) (λh.y)) and y=((λf. ...) (λw.z) ), and finally z=λw.(h(w(λy.y))). There are only two possible β-reductions to be done here, on x and on y. Reducing the outer x term first results in the inner y term being duplicated, and each copy will have to be reduced, but reducing the inner y term first will duplicate its argument z, which will cause work to be duplicated when the values of h and w are made known. Incidentally, the above term reduces to the identity function (λy.y), and is constructed by making wrappers which make the identity function available to the binders g=λh..., f=λw..., h=λx.x (at first), and w=λz.z (at first), all of which are applied to the innermost term λy.y.

The precise notion of duplicated work relies on noticing that after the first reduction of I I is done, the value of the other I I can be determined, because they have the same structure (and in fact they have exactly the same values), and result from a common ancestor. Such similar structures can each be assigned a label that can be tracked across reductions. If a name is assigned to the redex that produces all the resulting II terms, and then all duplicated occurrences of II can be tracked and reduced in one go. However, it is not obvious that a redex will produce the II term. Identifying the structures that are similar in different parts of a lambda term can involve a complex algorithm and can possibly have a complexity equal to the history of the reduction itself.

While Lévy defines the notion of optimal sharing, he does not provide an algorithm to do it. In Vincent van Oostrom, Kees-Jan van de Looij, and Marijn Zwitserlood's paper Lambdascope: Another optimal implementation of the lambda-calculus , they provide such an algorithm by transforming lambda terms into interaction nets, which are then reduced. Roughly speaking, the resulting reduction is optimal because every term that would have the same labels as per Lévy's paper would also be the same graph in the interaction net. In the paper, they mention that their prototype implementation of Lambdascope performs as well as the optimised version of the reference optimal higher order machine BOHM.

More details can be found in the short article About the efficient reduction of lambda terms.


The fact that lambda calculus terms act as functions on other lambda calculus terms, and even on themselves, led to questions about the semantics of the lambda calculus. Could a sensible meaning be assigned to lambda calculus terms? The natural semantics was to find a set D isomorphic to the function space DD, of functions on itself. However, no nontrivial such D can exist, by cardinality constraints because the set of all functions from D to D has greater cardinality than D, unless D is a singleton set.

In the 1970s, Dana Scott showed that, if only continuous functions were considered, a set or domain D with the required property could be found, thus providing a model for the lambda calculus.

This work also formed the basis for the denotational semantics of programming languages.

Variations and extensions

These extensions are in the lambda cube:

These formal systems are extensions of lambda calculus that are not in the lambda cube:

These formal systems are variations of lambda calculus:

These formal systems are related to lambda calculus:

See also

Related Research Articles

In mathematics, and in other disciplines involving formal languages, including mathematical logic and computer science, a free variable is a notation (symbol) that specifies places in an expression where substitution may take place and is not a parameter of this or any container expression. Some older books use the terms real variable and apparent variable for free variable and bound variable, respectively. The idea is related to a placeholder, or a wildcard character that stands for an unspecified symbol.

Combinatory logic is a notation to eliminate the need for quantified variables in mathematical logic. It was introduced by Moses Schönfinkel and Haskell Curry, and has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming languages. It is based on combinators which were introduced by Schönfinkel in 1920 with the idea of providing an analogous way to build up functions - and to remove any mention of variables - particularly in predicate logic. A combinator is a higher-order function that uses only function application and earlier defined combinators to define a result from its arguments.

In mathematics and computer science in general, a fixed point of a function is a value that is mapped to itself by the function. In combinatory logic for computer science, a fixed-point combinator is a higher-order function that returns some fixed point of its argument function, if one exists.

In programming language theory and proof theory, the Curry–Howard correspondence is the direct relationship between computer programs and mathematical proofs.

System F, also known as the (Girard–Reynolds) polymorphic lambda calculus or the second-order lambda calculus, is a typed lambda calculus that differs from the simply typed lambda calculus by the introduction of a mechanism of universal quantification over types. System F thus formalizes the notion of parametric polymorphism in programming languages, and forms a theoretical basis for languages such as Haskell and ML. System F was discovered independently by logician Jean-Yves Girard (1972) and computer scientist John C. Reynolds (1974).

Lambda cube a set of 3 independent extensions (type-dependent terms, term-dependent types, type-dependent types) of simply typed λ-calculus, generating 8 different typed systems (λ→, System F, λω̱, Fω, λP, λP2, λPω̱, calculus of constructions) arranged i

In mathematical logic and type theory, the λ-cube is a framework introduced by Henk Barendregt to investigate the different dimensions in which the calculus of constructions is a generalization of the simply typed λ-calculus. Each dimension of the cube corresponds to a new kind of dependency between terms and types. Here, "dependency" refers to the capacity of a term or type to bind a term or type. The respective dimensions of the λ-cube correspond to:

The SKI combinator calculus is a combinatory logic, a computational system that may be perceived as a reduced version of the untyped lambda calculus. It can be thought of as a computer programming language, though it is not convenient for writing software. Instead, it is important in the mathematical theory of algorithms because it is an extremely simple Turing complete language. It was introduced by Moses Schönfinkel and Haskell Curry.

The simply typed lambda calculus, a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor: that builds function types. It is the canonical and simplest example of a typed lambda calculus. The simply typed lambda calculus was originally introduced by Alonzo Church in 1940 as an attempt to avoid paradoxical uses of the untyped lambda calculus, and it exhibits many desirable and interesting properties.

Lambda lifting is a meta-process that restructures a computer program so that functions are defined independently of each other in a global scope. An individual "lift" transforms a local function into a global function. It is a two step process, consisting of;

In mathematics, Church encoding is a means of representing data and operators in the lambda calculus. The Church numerals are a representation of the natural numbers using lambda notation. The method is named for Alonzo Church, who first encoded data in the lambda calculus this way.

In mathematical logic, the de Bruijn index is a tool invented by the Dutch mathematician Nicolaas Govert de Bruijn for representing terms of lambda calculus without naming the bind variables. Terms written using these indices are invariant with respect to α-conversion, so the check for α-equivalence is the same as that for syntactic equality. Each De Bruijn index is a natural number that represents an occurrence of a variable in a λ-term, and denotes the number of binders that are in scope between that occurrence and its corresponding binder. The following are some examples:

In mathematical logic and computer science, the lambda-mu calculus is an extension of the lambda calculus introduced by M. Parigot. It introduces two new operators: the μ operator and the bracket operator. Proof-theoretically, it provides a well-behaved formulation of classical natural deduction.

In mathematics, in the area of lambda calculus and computation, directors or director strings are a mechanism for keeping track of the free variables in a term. Loosely speaking, they can be understood as a kind of memoization for free variables; that is, as an optimization technique for rapidly locating the free variables in a term algebra or in a lambda expression. Director strings were introduced by Kennaway and Sleep in 1982 and further developed by Sinot, Fernández and Mackie as a mechanism for understanding and controlling the computational complexity cost of beta reduction.

In computer science, partial application refers to the process of fixing a number of arguments to a function, producing another function of smaller arity. Given a function , we might fix the first argument, producing a function of type . Evaluation of this function might be represented as . Note that the result of partial function application in this case is a function that takes two arguments. Partial application is sometimes incorrectly called currying, which is a related, but distinct concept.

In algebra, a λ-ring or lambda ring is a commutative ring together with some operations λn on it that behave like the exterior powers of vector spaces. Many rings considered in K-theory carry a natural λ-ring structure. λ-rings also provide a powerful formalism for studying an action of the symmetric functions on the ring of polynomials, recovering and extending many classical results.

Computable topology is a discipline in mathematics that studies the topological and algebraic structure of computation. Computable topology is not to be confused with algorithmic or computational topology, which studies the application of computation to topology.

In computer science, a "let" expression associates a function definition with a restricted scope.

Deductive lambda calculus considers what happens when lambda terms are regarded as mathematical expressions. One interpretation of the untyped lambda calculus is as a programming language where evaluation proceeds by performing reductions on an expression until it is in normal form. In this interpretation, if the expression never reduces to normal form then the program never terminates, and the value is undefined. Considered as a mathematical deductive system, each reduction would not alter the value of the expression. The expression would equal the reduction of the expression.

Formal definitions of the Lambda calculus. Lambda calculus is a programming language based on lambda abstraction and function application. Two definitions of the language are given here.

Krivine machine

In theoretical computer science, the Krivine machine is an abstract machine. As an abstract machine, it shares features with Turing machines and the SECD machine. The Krivine machine explains how to compute a recursive function. More specifically it aims to define rigorously head normal form reduction of a lambda term using call-by-name reduction. Thanks to its formalism, it tells in details how a kind of reduction works and sets the theoretical foundation of the operational semantics of functional programming languages. On the other hand, Krivine machine implements call-by-name because it evaluates the body of a β-redex before it evaluates its parameter. In other words, in an expression u it evaluates first λx. t before evaluating u. In functional programming, this would mean that in order to evaluate a function applied to a parameter, it evaluates first the function before evaluating the parameter.


  1. Turing, A. M. (December 1937). "Computability and λ-Definability". The Journal of Symbolic Logic. 2 (4): 153–163. doi:10.2307/2268280. JSTOR   2268280.
  2. Coquand, Thierry, "Type Theory", The Stanford Encyclopedia of Philosophy (Summer 2013 Edition), Edward N. Zalta (ed.).
  3. Moortgat, Michael (1988). Categorial Investigations: Logical and Linguistic Aspects of the Lambek Calculus. Foris Publications. ISBN   9789067653879.
  4. Bunt, Harry; Muskens, Reinhard, eds. (2008), Computing Meaning, Springer, ISBN   9781402059575
  5. Mitchell, John C. (2003), Concepts in Programming Languages, Cambridge University Press, p. 57, ISBN   9780521780988 .
  6. Pierce, Benjamin C. Basic Category Theory for Computer Scientists. p. 53.
  7. Church, A. (1932). "A set of postulates for the foundation of logic". Annals of Mathematics. Series 2. 33 (2): 346–366. doi:10.2307/1968337. JSTOR   1968337.
  8. For a full history, see Cardone and Hindley's "History of Lambda-calculus and Combinatory Logic" (2006).
  9. Kleene, S. C.; Rosser, J. B. (July 1935). "The Inconsistency of Certain Formal Logics". The Annals of Mathematics. 36 (3): 630. doi:10.2307/1968646.
  10. Church, Alonzo (December 1942). "Review of Haskell B. Curry, The Inconsistency of Certain Formal Logics". The Journal of Symbolic Logic. 7 (4): 170–171. doi:10.2307/2268117. JSTOR   2268117.
  11. 1 2 Church, A. (1936). "An unsolvable problem of elementary number theory". American Journal of Mathematics. 58 (2): 345–363. doi:10.2307/2371045. JSTOR   2371045.
  12. Church, A. (1940). "A Formulation of the Simple Theory of Types". Journal of Symbolic Logic. 5 (2): 56–68. doi:10.2307/2266170. JSTOR   2266170.
  13. Partee, B. B. H.; ter Meulen, A.; Wall, R. E. (1990). Mathematical Methods in Linguistics. Springer. Retrieved 29 Dec 2016.
  14. Alama, Jesse "The Lambda Calculus", The Stanford Encyclopedia of Philosophy (Summer 2013 Edition), Edward N. Zalta (ed.).
  15. Dana Scott, "Looking Backward; Looking Forward", Invited Talk at the Workshop in honour of Dana Scott’s 85th birthday and 50 years of domain theory, 7-8 July, FLoC 2018 (talk 7 July 2018). The relevant passage begins at 32:50. (See also this extract of a May 2016 talk at the University of Birmingham, UK.)
  16. Barendregt, Hendrik Pieter (1984). The Lambda Calculus: Its Syntax and Semantics. Studies in Logic and the Foundations of Mathematics. 103 (Revised ed.). North Holland. ISBN   0-444-87508-5.
  17. Corrections.
  18. 1 2 "Example for Rules of Associativity". Retrieved 2012-06-18.
  19. 1 2 Selinger, Peter (2008), Lecture Notes on the Lambda Calculus (PDF), 0804, Department of Mathematics and Statistics, University of Ottawa, p. 9, arXiv: 0804.3434 , Bibcode:2008arXiv0804.3434S
  20. 1 2 Barendregt, Henk; Barendsen, Erik (March 2000), Introduction to Lambda Calculus (PDF)
  21. de Queiroz, Ruy J. G. B. (1988). "A Proof-Theoretic Account of Programming and the Role of Reduction Rules". Dialectica. 42 (4): 265–282. doi:10.1111/j.1746-8361.1988.tb00919.x.
  22. Turbak, Franklyn; Gifford, David (2008), Design concepts in programming languages, MIT press, p. 251, ISBN   978-0-262-20175-9
  23. Felleisen, Matthias; Flatt, Matthew (2006), Programming Languages and Lambda Calculi (PDF), p. 26, archived from the original (PDF) on 2009-02-05; a note (accessed 2017) at the original location suggests that the authors consider the work originally referenced to have been superseded by a book.
  24. Types and Programming Languages, p. 273, Benjamin C. Pierce
  25. Landin, P. J. (1965). "A Correspondence between ALGOL 60 and Church's Lambda-notation". Communications of the ACM. 8 (2): 89–101. doi:10.1145/363744.363749.
  26. Statman, R. (1979). "The typed λ-calculus is not elementary recursive". Theoretical Computer Science. 9 (1): 73–81. doi:10.1016/0304-3975(79)90007-0.
  27. Sinot, F.-R. (2005). "Director Strings Revisited: A Generic Approach to the Efficient Representation of Free Variables in Higher-order Rewriting". Journal of Logic and Computation. 15 (2): 201–218. doi:10.1093/logcom/exi010.[ permanent dead link ]

Further reading

Monographs/textbooks for graduate students:

Some parts of this article are based on material from FOLDOC, used with permission.

  1. Frink Jr., Orrin (1944). "Review: The Calculi of Lambda-Conversion by Alonzo Church" (PDF). Bull. Amer. Math. Soc. 50 (3): 169–172. doi:10.1090/s0002-9904-1944-08090-7.