Director string

Last updated February 19, 2020

In mathematics, in the area of lambda calculus and computation, directors or director strings are a mechanism for keeping track of the free variables in a term. Loosely speaking, they can be understood as a kind of memoization for free variables; that is, as an optimization technique for rapidly locating the free variables in a term algebra or in a lambda expression. Director strings were introduced by Kennaway and Sleep in 1982 and further developed by Sinot, Fernández and Mackie^[1] as a mechanism for understanding and controlling the computational complexity cost of beta reduction.

Motivation

In beta reduction, one defines the value of the expression on the left to be that on the right:

(\lambda x.E)y\equiv E[x:=y]\,

or

(\lambda x.E)y\equiv E[y/x]

(Replace all x in E(body) by y)

While this is a conceptually simple operation, the computational complexity of the step can be non-trivial: a naive algorithm would scan the expression E for all occurrences of the free variable x. Such an algorithm is clearly O(n) in the length of the expression E. Thus, one is motivated to somehow track the occurrences of the free variables in the expression. One may attempt to track the position of every free variable, wherever it may occur in the expression, but this can clearly become very costly in terms of storage; furthermore, it provides a level of detail that is not really needed. Director strings suggest that the correct model is to track free variables in a hierarchical fashion, by tracking their use in component terms.

Definition

Consider, for simplicity, a term algebra, that is, a collection of free variables, constants, and operators which may be freely combined. Assume that a term t takes the form

t::=f(t_{1},t_{2},\dots ,t_{n})

where f is a function, of arity n, with no free variables, and the $t_{i}$ are terms that may or may not contain free variables. Let V denote the set of all free variables that may occur in the set of all terms. The director is then the map

\sigma _{t}:V\to P(\lbrace 1,2,\dots ,n\rbrace )

from the free variables to the power set $P(X)$ of the set $X=\lbrace 1,2,\dots ,n\rbrace$ . The values taken by $\sigma _{t}$ are simply a list of the indices of the $t_{i}$ in which a given free variable occurs. Thus, for example, if a free variable $x\in V$ occurs in $t_{3}$ and $t_{5}$ but in no other terms, then one has $\sigma _{t}(x)=\lbrace 3,5\rbrace$ .

Thus, for every term $t\in T$ in the set of all terms T, one maintains a function $\sigma _{t}$ , and instead of working only with terms t, one works with pairs $(t,\sigma _{t})$ . Thus, the time complexity of finding the free variables in t is traded for the space complexity of maintaining a list of the terms in which a variable occurs.

General case

Although the above definition is formulated in terms of a term algebra, the general concept applies more generally, and can be defined both for combinatory algebras and for lambda calculus proper, specifically, within the framework of explicit substitution.

Related Research Articles

In formal language theory, a context-free grammar (CFG) is a certain type of formal grammar: a set of production rules that describe all possible strings in a given formal language. Production rules are simple replacements. For example, the rule

Lambda calculus is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation that can be used to simulate any Turing machine. It was introduced by the mathematician Alonzo Church in the 1930s as part of his research into the foundations of mathematics.

In probability theory, a normaldistribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is

In mathematics, a coefficient is a multiplicative factor in some term of a polynomial, a series, or any expression; it is usually a number, but may be any expression. In the latter case, the variables appearing in the coefficients are often called parameters, and must be clearly distinguished from the other variables.

In logic and computer science, unification is an algorithmic process of solving equations between symbolic expressions.

Combinatory logic is a notation to eliminate the need for quantified variables in mathematical logic. It was introduced by Moses Schönfinkel and Haskell Curry, and has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming languages. It is based on combinators which were introduced by Schönfinkel in 1920 with the idea of providing an analogous way to build up functions - and to remove any mention of variables - particularly in predicate logic. A combinator is a higher-order function that uses only function application and earlier defined combinators to define a result from its arguments.

In mathematics, computer science, and logic, rewriting covers a wide range of methods of replacing subterms of a formula with other terms. The objects of focus for this article include rewriting systems. In their most basic form, they consist of a set of objects, plus relations on how to transform those objects.

In mathematics, a differential-algebraic system of equations (DAEs) is a system of equations that either contains differential equations and algebraic equations, or is equivalent to such a system. Such systems occur as the general form of differential equations for vector–valued functions x in one independent variable t,

The simply typed lambda calculus, a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor: $that builds function types. It is the canonical and simplest example of a typed lambda calculus. The simply typed lambda calculus was originally introduced by Alonzo Church in 1940 as an attempt to avoid paradoxical uses of the untyped lambda calculus, and it exhibits many desirable and interesting properties.$

In theoretical computer science and mathematical logic a string rewriting system (SRS), historically called a semi-Thue system, is a rewriting system over strings from a alphabet. Given a binary relation $between fixed strings over the alphabet, called rewrite rules, denoted by, an SRS extends the rewriting relation to all strings in which the left- and right-hand side of the rules appear as substrings, that is, where,,, and are strings.$

In mathematics and computer science, a word problem for a set S with respect to a system of finite encodings of its elements is the algorithmic problem of deciding whether two given representatives represent the same element of the set. The problem is commonly encountered in abstract algebra, where given a presentation of an algebraic structure by generators and relators, the problem is to determine if two expressions represent the same element; a prototypical example is the word problem for groups. Less formally, the word problem in an algebra is: given a set of identities E, and two expressions x and y, is it possible to transform x into y using the identities in E as rewriting rules in both directions? While answering this question may not seem hard, the remarkable result that emerges, in many important cases, is that the problem is undecidable.

In formal language theory, a string is defined as a finite sequence of members of an underlying base set; this set is called the alphabet of a string or collection of strings. The members of the set are called symbols, and are typically thought of as representing letters, characters, or digits. For example, a common alphabet is {0,1}, the binary alphabet, and a binary string is a string drawn from the alphabet {0,1}. An infinite sequence of letters may be constructed from elements of an alphabet as well.

In statistical theory, a pseudolikelihood is an approximation to the joint probability distribution of a collection of random variables. The practical use of this is that it can provide an approximation to the likelihood function of a set of observed data which may either provide a computationally simpler problem for estimation, or may provide a way of obtaining explicit estimates of model parameters.

Substitution is a fundamental concept in logic. A substitution is a syntactic transformation on formal expressions. To apply a substitution to an expression means to consistently replace its variable, or placeholder, symbols by other expressions. The resulting expression is called a substitution instance, or short instance, of the original expression.

In statistics, Bayesian multivariate linear regression is a Bayesian approach to multivariate linear regression, i.e. linear regression where the predicted outcome is a vector of correlated random variables rather than a single scalar random variable. A more general treatment of this approach can be found in the article MMSE estimator.

The Swendsen–Wang algorithm is the first non-local or cluster algorithm for Monte Carlo simulation for large systems near criticality. It has been introduced by Robert Swendsen and Jian-Sheng Wang in 1987 at Carnegie Mellon.

In computational mathematics, computer algebra, also called symbolic computation or algebraic computation, is a scientific area that refers to the study and development of algorithms and software for manipulating mathematical expressions and other mathematical objects. Although computer algebra could be considered a subfield of scientific computing, they are generally considered as distinct fields because scientific computing is usually based on numerical computation with approximate floating point numbers, while symbolic computation emphasizes exact computation with expressions containing variables that have no given value and are manipulated as symbols.

In analogy to natural language, where a noun phrase refers to an object and a whole sentence refers to a fact, in mathematical logic, a term denotes a mathematical object and a formula denotes a mathematical fact. In particular, terms appear as components of a formula.

Computable topology is a discipline in mathematics that studies the topological and algebraic structure of computation. Computable topology is not to be confused with algorithmic or computational topology, which studies the application of computation to topology.

Anti-unification is the process of constructing a generalization common to two given symbolic expressions. As in unification, several frameworks are distinguished depending on which expressions are allowed, and which expressions are considered equal. If variables representing functions are allowed in an expression, the process is called "higher-order anti-unification", otherwise "first-order anti-unification". If the generalization is required to have an instance literally equal to each input expression, the process is called "syntactical anti-unification", otherwise "E-anti-unification", or "anti-unification modulo theory".

References

↑ F.-R. Sinot, M. Fernández and I. Mackie. Efficient Reductions with Director Strings. In Proc. Rewriting Techniques and Applications. Springer LNCS vol 2706, 2003

F.-R. Sinot. "Director Strings Revisited: A Generic Approach to the Efficient Representation of Free Variables in Higher-order Rewriting." Journal of Logic and Computation15(2), pages 201-218, 2005.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] F.-R. Sinot, M. Fernández and I. Mackie. Efficient Reductions with Director Strings. In Proc. Rewriting Techniques and Applications. Springer LNCS vol 2706, 2003