Reduction strategy

Last updated

In rewriting, a reduction strategy or rewriting strategy is a relation specifying a rewrite for each object or term, compatible with a given reduction relation. [1] Some authors use the term to refer to an evaluation strategy. [2] [3]

Contents

Definitions

Formally, for an abstract rewriting system , a reduction strategy is a binary relation on with , where is the transitive closure of (but not the reflexive closure). [1] In addition the normal forms of the strategy must be the same as the normal forms of the original rewriting system, i.e. for all , there exists a with iff . [4]

A one step reduction strategy is one where . Otherwise it is a many step strategy. [5]

A deterministic strategy is one where is a partial function, i.e. for each there is at most one such that . Otherwise it is a nondeterministic strategy. [5]

Term rewriting

In a term rewriting system a rewriting strategy specifies, out of all the reducible subterms (redexes), which one should be reduced (contracted) within a term.

One-step strategies for term rewriting include: [5]

Many-step strategies include: [5]

Parallel outermost and Gross-Knuth reduction are hypernormalizing for all almost-orthogonal term rewriting systems, meaning that these strategies will eventually reach a normal form if it exists, even when performing (finitely many) arbitrary reductions between successive applications of the strategy. [8]

Stratego is a domain-specific language designed specifically for programming term rewriting strategies. [9]

Lambda calculus

In the context of the lambda calculus, normal-order reduction refers to leftmost-outermost reduction in the sense given above. [10] Normal-order reduction is normalizing, in the sense that if a term has a normal form, then normal‐order reduction will eventually reach it, hence the name normal. This is known as the standardization theorem. [11] [12]

Leftmost reduction is sometimes used to refer to normal order reduction, as with a pre-order traversal the notions coincide, and similarly the leftmost-outermost redex is the redex with leftmost starting character when the lambda term is considered as a string of characters. [13] [14] When "leftmost" is defined using an in-order traversal the notions are distinct. For example, in the term with defined here, the leftmost redex of the in-order traversal is while the leftmost-outermost redex is the entire expression. [15]

Applicative order reduction refers to leftmost-innermost reduction. [10] In contrast to normal order, applicative order reduction may not terminate, even when the term has a normal form. [10] For example, using applicative order reduction, the following sequence of reductions is possible:

But using normal-order reduction, the same starting point reduces quickly to normal form:

Full β-reduction refers to the nondeterministic one-step strategy that allows reducing any redex at each step. [3] Takahashi's parallel β-reduction is the strategy that reduces all redexes in the term simultaneously. [16]

Weak reduction

Normal and applicative order reduction are strong in that they allow reduction under lambda abstractions. In contrast, weak reduction does not reduce under a lambda abstraction. [17] Call-by-name reduction is the weak reduction strategy that reduces the leftmost outermost redex not inside a lambda abstraction, while call-by-value reduction is the weak reduction strategy that reduces the leftmost innermost redex not inside a lambda abstraction. These strategies were devised to reflect the call-by-name and call-by-value evaluation strategies. [18] In fact, applicative order reduction was also originally introduced to model the call-by-value parameter passing technique found in Algol 60 and modern programming languages. When combined with the idea of weak reduction, the resulting call-by-value reduction is indeed a faithful approximation. [19]

Unfortunately, weak reduction is not confluent, [17] and the traditional reduction equations of the lambda calculus are useless, because they suggest relationships that violate the weak evaluation regime. [19] However, it is possible to extend the system to be confluent by allowing a restricted form of reduction under an abstraction, in particular when the redex does not involve the variable bound by the abstraction. [17] For example, λx.(λy.x)z is in normal form for a weak reduction strategy because the redex y.x)z is contained in a lambda abstraction. But the term λx.(λy.y)z can still be reduced under the extended weak reduction strategy, because the redex y.y)z does not refer to x. [20]

Optimal reduction

Optimal reduction is motivated by the existence of lambda terms where there does not exist a sequence of reductions which reduces them without duplicating work. For example, consider

((λg.(g(g(λx.x))))  (λh.((λf.(f(f(λz.z))))       (λw.(h(w(λy.y))))))) 

It is composed of three nested terms, x=((λg. ... ) (λh.y)), y=((λf. ...) (λw.z) ), and z=λw.(h(w(λy.y))). There are only two possible β-reductions to be done here, on x and on y. Reducing the outer x term first results in the inner y term being duplicated, and each copy will have to be reduced, but reducing the inner y term first will duplicate its argument z, which will cause work to be duplicated when the values of h and w are made known. [lower-alpha 1]

Optimal reduction is not a reduction strategy for the lambda calculus in a narrow sense because performing β-reduction loses the information about the substituted redexes being shared. Instead it is defined for the labelled lambda calculus, an annotated lambda calculus which captures a precise notion of the work that should be shared. [21] :113–114

Labels consist of a countably infinite set of atomic labels, and concatenations , overlinings and underlinings of labels. A labelled term is a lambda calculus term where each subterm has a label. The standard initial labeling of a lambda term gives each subterm a unique atomic label. [21] :132 Labelled β-reduction is given by: [22]

where concatenates labels, , and substitution is defined as follows (using the Barendregt convention): [22]

The system can be proven to be confluent. Optimal reduction is then defined to be normal order or leftmost-outermost reduction using reduction by families, i.e. the parallel reduction of all redexes with the same function part label. [23] The strategy is optimal in the sense that it performs the optimal (minimal) number of family reduction steps. [24]

A practical algorithm for optimal reduction was first described in 1989, [25] more than a decade after optimal reduction was first defined in 1974. [26] The Bologna Optimal Higher-order Machine (BOHM) is a prototype implementation of an extension of the technique to interaction nets. [21] :362 [27] Lambdascope is a more recent implementation of optimal reduction, also using interaction nets. [28] [lower-alpha 2]

Call by need reduction

Call by need reduction can be defined similarly to optimal reduction as weak leftmost-outermost reduction using parallel reduction of redexes with the same label, for a slightly different labelled lambda calculus. [17] An alternate definition changes the beta rule to an operation that finds the next "needed" computation, evaluates it, and substitutes the result into all locations. This requires extending the beta rule to allow reducing terms that are not syntactically adjacent. [29] As with call-by-name and call-by-value, call-by-need reduction was devised to mimic the behavior of the evaluation strategy known as "call-by-need" or lazy evaluation.

See also

Notes

  1. Incidentally, the above term reduces to the identity function (λy.y), and is constructed by making wrappers which make the identity function available to the binders g=λh..., f=λw..., h=λx.x (at first), and w=λz.z (at first), all of which are applied to the innermost term λy.y.
  2. A summary of recent research on optimal reduction can be found in the short article About the efficient reduction of lambda terms.

Related Research Articles

Lambda calculus is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation that can be used to simulate any Turing machine. It was introduced by the mathematician Alonzo Church in the 1930s as part of his research into the foundations of mathematics.

<span class="mw-page-title-main">Exponential distribution</span> Probability distribution

In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes it is found in various other contexts.

Combinatory logic is a notation to eliminate the need for quantified variables in mathematical logic. It was introduced by Moses Schönfinkel and Haskell Curry, and has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming languages. It is based on combinators, which were introduced by Schönfinkel in 1920 with the idea of providing an analogous way to build up functions—and to remove any mention of variables—particularly in predicate logic. A combinator is a higher-order function that uses only function application and earlier defined combinators to define a result from its arguments.

<span class="mw-page-title-main">Church–Rosser theorem</span>

In lambda calculus, the Church–Rosser theorem states that, when applying reduction rules to terms, the ordering in which the reductions are chosen does not make a difference to the eventual result.

In mathematics and computer science in general, a fixed point of a function is a value that is mapped to itself by the function.

<span class="mw-page-title-main">Lambda cube</span>

In mathematical logic and type theory, the λ-cube is a framework introduced by Henk Barendregt to investigate the different dimensions in which the calculus of constructions is a generalization of the simply typed λ-calculus. Each dimension of the cube corresponds to a new kind of dependency between terms and types. Here, "dependency" refers to the capacity of a term or type to bind a term or type. The respective dimensions of the λ-cube correspond to:

In the lambda calculus, a term is in beta normal form if no beta reduction is possible. A term is in beta-eta normal form if neither a beta reduction nor an eta reduction is possible. A term is in head normal form if there is no beta-redex in head position. The normal form of a term, if one exists, is unique. However, a term may have more than one head normal form.

The simply typed lambda calculus, a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor that builds function types. It is the canonical and simplest example of a typed lambda calculus. The simply typed lambda calculus was originally introduced by Alonzo Church in 1940 as an attempt to avoid paradoxical use of the untyped lambda calculus.

Orthogonality as a property of term rewriting systems (TRSs) describes where the reduction rules of the system are all left-linear, that is each variable occurs only once on the left hand side of each reduction rule, and there is no overlap between them, i.e. the TRS has no critical pairs. For example is not left-linear.

Interaction nets are a graphical model of computation devised by Yves Lafont in 1990 as a generalisation of the proof structures of linear logic. An interaction net system is specified by a set of agent types and a set of interaction rules. Interaction nets are an inherently distributed model of computation in the sense that computations can take place simultaneously in many parts of an interaction net, and no synchronisation is needed. The latter is guaranteed by the strong confluence property of reduction in this model of computation. Thus interaction nets provide a natural language for massive parallelism. Interaction nets are at the heart of many implementations of the lambda calculus, such as efficient closed reduction and optimal, in Lévy's sense, Lambdascope.

A ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables X and Y, the distribution of the random variable Z that is formed as the ratio Z = X/Y is a ratio distribution.

In mathematical logic, the De Bruijn index is a tool invented by the Dutch mathematician Nicolaas Govert de Bruijn for representing terms of lambda calculus without naming the bound variables. Terms written using these indices are invariant with respect to α-conversion, so the check for α-equivalence is the same as that for syntactic equality. Each De Bruijn index is a natural number that represents an occurrence of a variable in a λ-term, and denotes the number of binders that are in scope between that occurrence and its corresponding binder. The following are some examples:

In programming language semantics, normalisation by evaluation (NBE) is a style of obtaining the normal form of terms in the λ-calculus by appealing to their denotational semantics. A term is first interpreted into a denotational model of the λ-term structure, and then a canonical (β-normal and η-long) representative is extracted by reifying the denotation. Such an essentially semantic, reduction-free, approach differs from the more traditional syntactic, reduction-based, description of normalisation as reductions in a term rewrite system where β-reductions are allowed deep inside λ-terms.

In mathematical logic and computer science, the lambda-mu calculus is an extension of the lambda calculus introduced by M. Parigot. It introduces two new operators: the μ operator and the bracket operator. Proof-theoretically, it provides a well-behaved formulation of classical natural deduction.

In the study of denotational semantics of the lambda calculus, Böhm trees, Lévy-Longo trees, and Berarducci trees are tree-like mathematical objects that capture the "meaning" of a term up to some set of "meaningless" terms.

In mathematics, the Fox–Wright function (also known as Fox–Wright Psi function, not to be confused with Wright Omega function) is a generalisation of the generalised hypergeometric function pFq(z) based on ideas of Charles Fox (1928) and E. Maitland Wright (1935):

In mathematics, Ricci calculus constitutes the rules of index notation and manipulation for tensors and tensor fields on a differentiable manifold, with or without a metric tensor or connection. It is also the modern name for what used to be called the absolute differential calculus, developed by Gregorio Ricci-Curbastro in 1887–1896, and subsequently popularized in a paper written with his pupil Tullio Levi-Civita in 1900. Jan Arnoldus Schouten developed the modern notation and formalism for this mathematical framework, and made contributions to the theory, during its applications to general relativity and differential geometry in the early twentieth century.

Computable topology is a discipline in mathematics that studies the topological and algebraic structure of computation. Computable topology is not to be confused with algorithmic or computational topology, which studies the application of computation to topology.

Lambda calculus is a formal mathematical system based on lambda abstraction and function application. Two definitions of the language are given here: a standard definition, and a definition using mathematical formulas.

<span class="mw-page-title-main">Krivine machine</span> Abstract machine

In theoretical computer science, the Krivine machine is an abstract machine. As an abstract machine, it shares features with Turing machines and the SECD machine. The Krivine machine explains how to compute a recursive function. More specifically it aims to define rigorously head normal form reduction of a lambda term using call-by-name reduction. Thanks to its formalism, it tells in details how a kind of reduction works and sets the theoretical foundation of the operational semantics of functional programming languages. On the other hand, Krivine machine implements call-by-name because it evaluates the body of a β-redex before it applies the body to its parameter. In other words, in an expression u it evaluates first λx. t before applying it to u. In functional programming, this would mean that in order to evaluate a function applied to a parameter, it evaluates first the function before applying it to the parameter.

References

  1. 1 2 Kirchner, Hélène (26 August 2015). "Rewriting Strategies and Strategic Rewrite Programs". In Martí-Oliet, Narciso; Ölveczky, Peter Csaba; Talcott, Carolyn (eds.). Logic, Rewriting, and Concurrency: Essays Dedicated to José Meseguer on the Occasion of His 65th Birthday. Springer. ISBN   978-3-319-23165-5 . Retrieved 14 August 2021.
  2. Selinger, Peter; Valiron, Benoît (2009). "Quantum Lambda Calculus" (PDF). Semantic Techniques in Quantum Computation: 23. doi:10.1017/CBO9781139193313.005. ISBN   9780521513746 . Retrieved 21 August 2021.
  3. 1 2 Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. p. 56. ISBN   0-262-16209-1.
  4. Klop, Jan Willem; van Oostrom, Vincent; van Raamsdonk, Femke (2007). "Reduction Strategies and Acyclicity" (PDF). Rewriting, Computation and Proof. Lecture Notes in Computer Science. Vol. 4600. pp. 89–112. CiteSeerX   10.1.1.104.9139 . doi:10.1007/978-3-540-73147-4_5. ISBN   978-3-540-73146-7.
  5. 1 2 3 4 5 Klop, J. W. "Term Rewriting Systems" (PDF). Papers by Nachum Dershowitz and students. Tel Aviv University. p. 77. Retrieved 14 August 2021.
  6. 1 2 Horwitz, Susan B. "Lambda Calculus". CS704 Notes. University of Wisconsin Madison. Retrieved 19 August 2021.
  7. Barendregt, H. P.; Eekelen, M. C. J. D.; Glauert, J. R. W.; Kennaway, J. R.; Plasmeijer, M. J.; Sleep, M. R. (1987). Term graph rewriting. Parallel Architectures and Languages Europe. Vol. 259. pp. 141–158. doi:10.1007/3-540-17945-3_8.
  8. Antoy, Sergio; Middeldorp, Aart (September 1996). "A sequential reduction strategy" (PDF). Theoretical Computer Science. 165 (1): 75–95. doi:10.1016/0304-3975(96)00041-2 . Retrieved 8 September 2021.
  9. Kieburtz, Richard B. (November 2001). "A Logic for Rewriting Strategies". Electronic Notes in Theoretical Computer Science. 58 (2): 138–154. doi: 10.1016/S1571-0661(04)00283-X .
  10. 1 2 3 Mazzola, Guerino; Milmeister, Gérard; Weissmann, Jody (21 October 2004). Comprehensive Mathematics for Computer Scientists 2. Springer Science & Business Media. p. 323. ISBN   978-3-540-20861-7.
  11. Curry, Haskell B.; Feys, Robert (1958). Combinatory Logic. Vol. I. Amsterdam: North Holland. pp. 139–142. ISBN   0-7204-2208-6.
  12. Kashima, Ryo. "A Proof of the Standardization Theorem in λ-Calculus" (PDF). Tokyo Institute of Technology. Retrieved 19 August 2021.
  13. Vial, Pierre (7 December 2017). Non-Idempotent Typing Operators, beyond the λ-Calculus (PDF) (PhD). Sorbonne Paris Cité. p. 62.
  14. Partain, William D. (December 1989). Graph Reduction Without Pointers (PDF) (PhD). University of North Carolina at Chapel Hill. Retrieved 10 January 2022.
  15. Van Oostrom, Vincent; Toyama, Yoshihito (2016). Normalisation by Random Descent (PDF). 1st International Conference on Formal Structures for Computation and Deduction. p. 32:3. doi:10.4230/LIPIcs.FSCD.2016.32.
  16. Takahashi, M. (April 1995). "Parallel Reductions in λ-Calculus". Information and Computation. 118 (1): 120–127. doi: 10.1006/inco.1995.1057 .
  17. 1 2 3 4 Blanc, Tomasz; Lévy, Jean-Jacques; Maranget, Luc (2005). "Sharing in the Weak Lambda-Calculus". Processes, Terms and Cycles: Steps on the Road to Infinity: Essays Dedicated to Jan Willem Klop on the Occasion of His 60th Birthday. Springer. pp. 70–87. CiteSeerX   10.1.1.129.147 . doi:10.1007/11601548_7. ISBN   978-3-540-32425-6.
  18. Sestoft, Peter (2002). Mogensen, T; Schmidt, D; Sudborough, I. H. (eds.). Demonstrating Lambda Calculus Reduction (PDF). pp. 420–435. ISBN   3-540-00326-6.{{cite book}}: |work= ignored (help)
  19. 1 2 Felleisen, Matthias (2009). Semantics engineering with PLT Redex. Cambridge, Mass.: MIT Press. p. 42. ISBN   978-0262062756.
  20. Sestini, Filippo (2019). Normalization by Evaluation for Typed Weak lambda-Reduction (PDF). 24th International Conference on Types for Proofs and Programs (TYPES 2018). doi:10.4230/LIPIcs.TYPES.2018.6.
  21. 1 2 3 Asperti, Andrea; Guerrini, Stefano (1998). The optimal implementation of functional programming languages. Cambridge, UK: Cambridge University Press. ISBN   0521621127.
  22. 1 2 Fernández, Maribel; Siafakas, Nikolaos (30 March 2010). "Labelled Lambda-calculi with Explicit Copy and Erase". Electronic Proceedings in Theoretical Computer Science. 22: 49–64. arXiv: 1003.5515v1 . doi:10.4204/EPTCS.22.5. S2CID   15500633.
  23. Lévy, Jean-Jacques (9–11 November 1987). Sharing in the Evaluation of lambda Expressions (PDF). Second Franco-Japanese Symposium on Programming of Future Generation Computers. Cannes, France. p. 187. ISBN   0444705260.
  24. Terese (2003). Term rewriting systems. Cambridge, UK: Cambridge University Press. p. 518. ISBN   978-0-521-39115-3.
  25. Lamping, John (1990). An algorithm for optimal lambda calculus reduction (PDF). 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '90. pp. 16–30. doi:10.1145/96709.96711.
  26. Lévy, Jean-Jacques (June 1974). Réductions sures dans le lambda-calcul (PDF) (PhD) (in French). Université Paris VII. pp. 81–109. OCLC   476040273 . Retrieved 17 August 2021.
  27. Asperti, Andrea. "Bologna Optimal Higher-Order Machine, Version 1.1". GitHub.
  28. van Oostrom, Vincent; van de Looij, Kees-Jan; Zwitserlood, Marijn (2004). ] (Lambdascope): Another optimal implementation of the lambda-calculus (PDF). Workshop on Algebra and Logic on Programming Systems (ALPS).
  29. Chang, Stephen; Felleisen, Matthias (2012). "The Call-by-Need Lambda Calculus, Revisited" (PDF). Programming Languages and Systems. Lecture Notes in Computer Science. Vol. 7211. pp. 128–147. doi:10.1007/978-3-642-28869-2_7. ISBN   978-3-642-28868-5. S2CID   6350826.