Difference list

Last updated April 16, 2024

In computer science, the term difference list refers to a data structure representing a list with an efficient O(1) concatenation operation and conversion to a linked list in time proportional to its length. Difference lists can be implemented using first-class functions or using unification. Whether a difference list is more efficient than other list representations depends on usage patterns. If an algorithm builds a list by concatenating smaller lists, which are themselves built by concatenating still smaller lists, then the usage of difference lists can improve performance by effectively "flattening" the list building computations.

Implementation using functions

A difference list f is a single-argument function append L, which when given a linked list X as argument, returns a linked list containing L prepended to X. Concatenation of difference lists is implemented as function composition. The contents may be retrieved using f [].^[1]

This implementation is typically used in functional programming languages such as Haskell, although it could be used in imperative languages as well.

As functions, difference lists are a Cayley representation of lists as monoids, or more specifically their transformation monoid induced by left multiplication.

Examples of use are in the ShowS type in the Prelude of Haskell, and in Donald Bruce Stewart's difference list library for Haskell.^[2]

Implementation using unification

Another implementation in the logic programming language Prolog uses unification variables.^[3] A difference list is a pair OpenList-Hole, where the first element OpenList is a list containing an unbound unification variable (hole) and the second element Hole is a reference to the hole.

Related Research Articles

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

Prolog is a logic programming language that has its origins in artificial intelligence, automated theorem proving and computational linguistics.

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed. A string is generally considered as a data type and is often implemented as an array data structure of bytes that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures.

In logic and computer science, specifically automated reasoning, unification is an algorithmic process of solving equations between symbolic expressions, each of the form Left-hand side = Right-hand side. For example, using x,y,z as variables, and taking f to be an uninterpreted function, the singleton equation set { f(1,y) = f(x,2) } is a syntactic first-order unification problem that has the substitution { x ↦ 1, y ↦ 2 } as its only solution.

In computer programming, cons is a fundamental function in most dialects of the Lisp programming language. consconstructs memory objects which hold two values or pointers to two values. These objects are referred to as (cons) cells, conses, non-atomic s-expressions ("NATSes"), or (cons) pairs. In Lisp jargon, the expression "to cons x onto y" means to construct a new object with (cons xy). The resulting pair has a left half, referred to as the car, and a right half, referred to as the cdr.

In computer science, a list or sequence is an abstract data type that represents a finite number of ordered values, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical concept of a tuple or finite sequence; the (potentially) infinite analog of a list is a stream. Lists are a basic example of containers, as they contain other values. If the same value occurs multiple times, each occurrence is considered a distinct item.

In computer science, declarative programming is a programming paradigm—a style of building the structure and elements of computer programs—that expresses the logic of a computation without describing its control flow.

In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact: "either it will or will not be a match." The patterns generally have the form of either sequences or tree structures. Uses of pattern matching include outputting the locations of a pattern within a token sequence, to output some component of the matched pattern, and to substitute the matching pattern with some other token sequence.

In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite type, i.e., a type formed by combining other types.

Curry is a declarative programming language, an implementation of the functional logic programming paradigm, and based on the Haskell language. It merges elements of functional and logic programming, including constraint programming integration.

In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjan's 1986 article.

The Joy programming language in computer science is a purely functional programming language that was produced by Manfred von Thun of La Trobe University in Melbourne, Australia. Joy is based on composition of functions rather than lambda calculus. It has turned out to have many similarities to Forth, due not to design but to an independent evolution and convergence. It was also inspired by the function-level programming style of John Backus's FP.

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

A definite clause grammar (DCG) is a way of expressing grammar, either for natural or formal languages, in a logic programming language such as Prolog. It is closely related to the concept of attribute grammars / affix grammars. DCGs are usually associated with Prolog, but similar languages such as Mercury also include DCGs. They are called definite clause grammars because they represent a grammar as a set of definite clauses in first-order logic.

In computer programming, append is the operation for concatenating linked lists or arrays in some high-level programming languages.

In computer programming, homoiconicity is a property of some programming languages. A language is homoiconic if a program written in it can be manipulated as data using the language. The program's internal representation can thus be inferred just by reading the program itself. This property is often summarized by saying that the language treats code as data.

In many programming languages, map is a higher-order function that applies a given function to each element of a collection, e.g. a list or set, returning the results in a collection of the same type. It is often called apply-to-all when considered in functional form.

In functional programming, fold refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value. Typically, a fold is presented with a combining function, a top node of a data structure, and possibly some default values to be used under certain conditions. The fold then proceeds to combine elements of the data structure's hierarchy, using the function in a systematic way.

In computer science, a finger tree is a purely functional data structure that can be used to efficiently implement other functional data structures. A finger tree gives amortized constant time access to the "fingers" (leaves) of the tree, which is where data is stored, and concatenation and splitting logarithmic time in the size of the smaller piece. It also stores in each internal node the result of applying some associative operation to its descendants. This "summary" data stored in the internal nodes can be used to provide the functionality of data structures other than trees.

The syntax and semantics of Prolog, a programming language, are the sets of rules that define how a Prolog program is written and how it is interpreted, respectively. The rules are laid out in ISO standard ISO/IEC 13211 although there are differences in the Prolog implementations.

References

↑ A Novel Representation of Lists and Its Application to the Function "Reverse" by John Hughes (1986)
↑ Difference Lists in Haskell (programming language)
↑ Open Lists and Difference Lists in Prolog Retrieved 2019-02-17

This algorithms or data structures-related article is a stub. You can help Wikipedia by expanding it.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] A Novel Representation of Lists and Its Application to the Function "Reverse" by John Hughes (1986)

[2] Difference Lists in Haskell (programming language)

[3] Open Lists and Difference Lists in Prolog Retrieved 2019-02-17

[1]

[2]

[3]

Difference list

Contents

Implementation using functions

Implementation using unification

Related Research Articles

References