A-normal form

In computer science, A-normal form (abbreviated ANF, sometimes expanded as administrative normal form or as atomic normal form) is an intermediate representation of programs in functional programming language compilers. In ANF, all arguments to a function must be trivial (constants or variables). That is, evaluation of each argument must halt immediately.

ANF was introduced by Sabry and Felleisen in 1992 [1] as a simpler alternative to continuation-passing style (CPS). Some of the advantages of using CPS as an intermediate representation are that optimizations are easier to perform on programs in CPS than in the source language, and that it is also easier for compilers to generate machine code for programs in CPS. Flanagan et al. [2] showed how compilers could use ANF to achieve those same benefits with one source-level transformation; in contrast, for realistic compilers the CPS transformation typically involves additional phases, for example, to simplify CPS terms.

Grammar

Consider the pure λ-calculus with constants and let-expressions. The ANF restriction is enforced by

  1. allowing only values (variables, constants, and λ-terms) to serve as operands of function applications, and
  2. requiring that the result of a non-trivial expression (such as a function application) be immediately captured in a let-bound variable.

The following BNF grammars show how one would modify the syntax of λ-expressions to implement the constraints of ANF:

Original:

EXP ::= λ VAR . EXP
      | EXP EXP
      | VAR
      | CONST
      | let VAR = EXP in EXP

CONST ::= f | g | h

ANF:

EXP ::= VAL
      | let VAR = VAL in EXP
      | let VAR = VAL VAL in EXP

VAL ::= VAR
      | CONST
      | λ VAR . EXP

CONST ::= f | g | h

Variants of ANF used in compilers or in research often also allow records, tuples, multi-argument functions, primitive operations, and conditional expressions.
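
For illustration, the core ANF grammar above can be rendered as a pair of mutually recursive datatypes; the following OCaml sketch (type and constructor names are illustrative, not taken from the cited papers) mirrors the EXP/VAL distinction directly:

(* A sketch of the ANF grammar above as OCaml datatypes.
   Type and constructor names are illustrative. *)
type value =                                (* VAL *)
  | Var of string                           (*   VAR                      *)
  | Const of string                         (*   CONST, e.g. f, g, h      *)
  | Lam of string * expr                    (*   λ VAR . EXP              *)

and expr =                                  (* EXP *)
  | Ret of value                            (*   VAL                      *)
  | LetVal of string * value * expr         (*   let VAR = VAL in EXP     *)
  | LetApp of string * value * value * expr (*   let VAR = VAL VAL in EXP *)

Because applications may appear only on the right-hand side of a let binding, any term built from these datatypes satisfies the ANF restriction by construction.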

Examples

The expression:

f(g(x),h(y))

is written in ANF as:

let v0 = g(x) in
    let v1 = h(y) in
        f(v0,v1)

Imagine the sort of assembly code this function call would produce:

;; let v0 = g(x)
move x into args[0]
call g
move result into temp[0]

;; let v1 = h(y)
move y into args[0]
call h
move result into temp[1]

;; f(v0, v1)
move temp[0] into args[0]
move temp[1] into args[1]
call f

One can see the immediate similarity between ANF and the compiled form of the function; this property is part of what makes ANF a good intermediate representation for optimisations in compilers.
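
To make the transformation itself concrete, the following OCaml sketch (illustrative only; a simplified normaliser, not the exact algorithm of Sabry and Felleisen or of Flanagan et al.) converts direct-style terms into the ANF datatypes from the sketch in the Grammar section above. Each non-trivial subterm is named by a fresh let-bound variable, exactly as v0 and v1 are introduced for g(x) and h(y):

(* Direct-style input terms: applications may be arbitrarily nested. *)
type term =
  | TVar of string
  | TConst of string
  | TLam of string * term
  | TApp of term * term
  | TLet of string * term * term

(* Generate fresh names v0, v1, ... for intermediate results. *)
let counter = ref 0
let fresh () =
  let n = !counter in
  incr counter;
  Printf.sprintf "v%d" n

(* normalize t k: rewrite t into ANF and pass the value naming its
   result to k, which builds the rest of the ANF term around it. *)
let rec normalize (t : term) (k : value -> expr) : expr =
  match t with
  | TVar x   -> k (Var x)
  | TConst c -> k (Const c)
  | TLam (x, body) ->
      (* A λ-term is itself a value; its body is normalised independently. *)
      k (Lam (x, normalize body (fun v -> Ret v)))
  | TLet (x, rhs, body) ->
      normalize rhs (fun v -> LetVal (x, v, normalize body k))
  | TApp (fn, arg) ->
      (* Name the application's result with a fresh let-bound variable. *)
      normalize fn (fun vf ->
        normalize arg (fun va ->
          let r = fresh () in
          LetApp (r, vf, va, k (Var r))))

(* Example: f (g x) normalises to
   let v0 = g x in let v1 = f v0 in v1 *)
let example =
  normalize (TApp (TConst "f", TApp (TConst "g", TVar "x"))) (fun v -> Ret v)

The accumulator k plays the role of a compile-time continuation: it receives the variable naming each intermediate result and builds the remainder of the ANF term, which is how a single source-level pass in the style described above can achieve the sequencing that CPS makes explicit.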

See also

  - Continuation-passing style
  - Static single assignment form
  - CEK machine
  - Let expression

References

  1. Sabry, Amr; Felleisen, Matthias (1992). "Reasoning about Programs in Continuation-Passing Style". Proceedings of the 1992 ACM Conference on LISP and Functional Programming (LFP '92). San Francisco, CA, USA. CiteSeerX 10.1.1.22.7509.
  2. Flanagan, Cormac; Sabry, Amr; Duba, Bruce F.; Felleisen, Matthias (1993). "The Essence of Compiling with Continuations" (PDF). Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (PLDI '93). Albuquerque, NM, USA. Retrieved 2012-11-16.