Anamorphism

Last updated

In computer programming, an anamorphism is a function that generates a sequence by repeated application of the function to its previous result. You begin with some value A and apply a function f to it to get B. Then you apply f to B to get C, and so on until some terminating condition is reached. The anamorphism is the function that generates the list of A, B, C, etc. You can think of the anamorphism as unfolding the initial value into a sequence.

Contents

The above layman's description can be stated more formally in category theory: the anamorphism of a coinductive type denotes the assignment of a coalgebra to its unique morphism to the final coalgebra of an endofunctor. These objects are used in functional programming as unfolds .

The categorical dual (aka opposite) of the anamorphism is the catamorphism.

Anamorphisms in functional programming

In functional programming, an anamorphism is a generalization of the concept of unfolds on coinductive lists. Formally, anamorphisms are generic functions that can corecursively construct a result of a certain type and which is parameterized by functions that determine the next single step of the construction.

The data type in question is defined as the greatest fixed point ν X . F X of a functor F. By the universal property of final coalgebras, there is a unique coalgebra morphism A → ν X . F X for any other F-coalgebra a : A → F A. Thus, one can define functions from a type A _into_ a coinductive datatype by specifying a coalgebra structure a on A.

Example: Potentially infinite lists

As an example, the type of potentially infinite lists (with elements of a fixed type value) is given as the fixed point [value] = ν X . value × X + 1, i.e. a list consists either of a value and a further list, or it is empty. A (pseudo-)Haskell-Definition might look like this:

data[value]=(value:[value])|[]

It is the fixed point of the functor F value, where:

dataMaybea=Justa|NothingdataFvaluex=Maybe(value,x)

One can easily check that indeed the type [value] is isomorphic to F value [value], and thus [value] is the fixed point. (Also note that in Haskell, least and greatest fixed points of functors coincide, therefore inductive lists are the same as coinductive, potentially infinite lists.)

The anamorphism for lists (then usually known as unfold) would build a (potentially infinite) list from a state value. Typically, the unfold takes a state value x and a function f that yields either a pair of a value and a new state, or a singleton to mark the end of the list. The anamorphism would then begin with a first seed, compute whether the list continues or ends, and in case of a nonempty list, prepend the computed value to the recursive call to the anamorphism.

A Haskell definition of an unfold, or anamorphism for lists, called ana, is as follows:

ana::(state->Maybe(value,state))->state->[value]anafstateOld=casefstateOldofNothing->[]Just(value,stateNew)->value:anafstateNew

We can now implement quite general functions using ana, for example a countdown:

f::Int->Maybe(Int,Int)fcurrent=letoneSmaller=current-1inifoneSmaller<0thenNothingelseJust(oneSmaller,oneSmaller)

This function will decrement an integer and output it at the same time, until it is negative, at which point it will mark the end of the list. Correspondingly, ana f 3 will compute the list [2,1,0].

Anamorphisms on other data structures

An anamorphism can be defined for any recursive type, according to a generic pattern, generalizing the second version of ana for lists.

For example, the unfold for the tree data structure

dataTreea=Leafa|Branch(Treea)a(Treea)

is as follows

ana::(b->Eithera(b,a,b))->b->Treeaanaunspoolx=caseunspoolxofLefta->LeafaRight(l,x,r)->Branch(anaunspooll)x(anaunspoolr)

To better see the relationship between the recursive type and its anamorphism, note that Tree and List can be defined thus:

newtypeLista=List{unCons::Maybe(a,Lista)}newtypeTreea=Tree{unNode::Eithera(Treea,a,Treea))}

The analogy with ana appears by renaming b in its type:

newtypeLista=List{unCons::Maybe(a,Lista)}anaList::(list_a->Maybe(a,list_a))->(list_a->Lista)newtypeTreea=Tree{unNode::Eithera(Treea,a,Treea))}anaTree::(tree_a->Eithera(tree_a,a,tree_a))->(tree_a->Treea)

With these definitions, the argument to the constructor of the type has the same type as the return type of the first argument of ana, with the recursive mentions of the type replaced with b.

History

One of the first publications to introduce the notion of an anamorphism in the context of programming was the paper Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire, [1] by Erik Meijer et al., which was in the context of the Squiggol programming language.

Applications

Functions like zip and iterate are examples of anamorphisms. zip takes a pair of lists, say ['a','b','c'] and [1,2,3] and returns a list of pairs [('a',1),('b',2),('c',3)]. Iterate takes a thing, x, and a function, f, from such things to such things, and returns the infinite list that comes from repeated application of f, i.e. the list [x, (f x), (f (f x)), (f (f (f x))), ...].

zip(a:as)(b:bs)=if(as==[])||(bs==[])-- || means 'or'then[(a,b)]else(a,b):(zipasbs)iteratefx=x:(iteratef(fx))

To prove this, we can implement both using our generic unfold, ana, using a simple recursive routine:

zip2=anaunspfinwherefin(as,bs)=(as==[])||(bs==[])unsp((a:as),(b:bs))=((a,b),(as,bs))iterate2f=ana(\a->(a,fa))(\x->False)

In a language like Haskell, even the abstract functions fold, unfold and ana are merely defined terms, as we have seen from the definitions given above.

Anamorphisms in category theory

In category theory, anamorphisms are the categorical dual of catamorphisms (and catamorphisms are the categorical dual of anamorphisms).

That means the following. Suppose (A, fin) is a final F-coalgebra for some endofunctor F of some category into itself. Thus, fin is a morphism from A to FA, and since it is assumed to be final we know that whenever (X, f) is another F-coalgebra (a morphism f from X to FX), there will be a unique homomorphism h from (X, f) to (A, fin), that is a morphism h from X to A such that fin . h = Fh . f. Then for each such f we denote by anaf that uniquely specified morphism h.

In other words, we have the following defining relationship, given some fixed F, A, and fin as above:

Notation

A notation for ana f found in the literature is . The brackets used are known as lens brackets, after which anamorphisms are sometimes referred to as lenses.

See also

Related Research Articles

In mathematics and computer science, currying is the technique of converting a function that takes multiple arguments into a sequence of functions that each takes a single argument. For example, currying a function that takes three arguments creates three functions:

In mathematics, specifically category theory, a functor is a mapping between categories. Functors were first considered in algebraic topology, where algebraic objects are associated to topological spaces, and maps between these algebraic objects are associated to continuous maps between spaces. Nowadays, functors are used throughout modern mathematics to relate various categories. Thus, functors are important in all areas within mathematics to which category theory is applied.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.

In mathematics and computer science in general, a fixed point of a function is a value that is mapped to itself by the function.

In functional programming, a monad is a software design pattern with a structure that combines program fragments (functions) and wraps their return values in a type with additional computation. In addition to defining a wrapping monadic type, monads define two operators: one to wrap a value in the monad type, and another to compose together functions that output values of the monad type. General-purpose languages use monads to reduce boilerplate code needed for common operations. Functional languages use monads to turn complicated sequences of functions into succinct pipelines that abstract away control flow, and side-effects.

In computer science, corecursion is a type of operation that is dual to recursion. Whereas recursion works analytically, starting on data further from a base case and breaking it down into smaller data and repeating until one reaches a base case, corecursion works synthetically, starting from a base case and building it up, iteratively producing data further removed from a base case. Put simply, corecursive algorithms use the data that they themselves produce, bit by bit, as they become available, and needed, to produce further bits of data. A similar but distinct concept is generative recursion which may lack a definite "direction" inherent in corecursion and recursion.

In mathematics, specifically in category theory, an -coalgebra is a structure defined according to a functor , with specific properties as defined below. For both algebras and coalgebras, a functor is a convenient and general way of organizing a signature. This has applications in computer science: examples of coalgebras include lazy, infinite data structures, such as streams, and also transition systems.

<i>F</i>-algebra

In mathematics, specifically in category theory, F-algebras generalize the notion of algebraic structure. Rewriting the algebraic laws in terms of morphisms eliminates all references to quantified elements from the axioms, and these algebraic laws may then be glued together in terms of a single functor F, the signature.

Recursion (computer science) Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

In mathematics, an initial algebra is an initial object in the category of F-algebras for a given endofunctor F. This initiality provides a general framework for induction and recursion.

In category theory, the concept of catamorphism denotes the unique homomorphism from an initial algebra into some other algebra.

In many programming languages, map is the name of a higher-order function that applies a given function to each element of a collection, e.g. a list or set, returning the results in a collection of the same type. It is often called apply-to-all when considered in functional form.

In functional programming, fold refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value. Typically, a fold is presented with a combining function, a top node of a data structure, and possibly some default values to be used under certain conditions. The fold then proceeds to combine elements of the data structure's hierarchy, using the function in a systematic way.

A zipper is a technique of representing an aggregate data structure so that it is convenient for writing programs that traverse the structure arbitrarily and update its contents, especially in purely functional programming languages. The zipper was described by Gérard Huet in 1997. It includes and generalizes the gap buffer technique sometimes used with arrays.

In computing, a rose tree is a term for the value of a tree data structure with a variable and unbounded number of branches per node. The term is mostly used in the functional programming community, e.g., in the context of the Bird–Meertens formalism. Apart from the multi-branching property, the most essential characteristic of rose trees is the coincidence of bisimilarity with identity: two distinct rose trees are never bisimilar.

In computer science, and in particular functional programming, a hylomorphism is a recursive function, corresponding to the composition of an anamorphism followed by a catamorphism. Fusion of these two recursive computations into a single recursive pattern then avoids building the intermediate data structure. This is an example of deforestation, a program optimization strategy. A related type of function is a metamorphism, which is a catamorphism followed by an anamorphism.

In formal methods of computer science, a paramorphism (from Greek παρά, meaning "close together") is an extension of the concept of catamorphism first introduced by Lambert Meertens to deal with a form which “eats its argument and keeps it too”, as exemplified by the factorial function. Its categorical dual is the apomorphism.

In type theory, a system has inductive types if it has facilities for creating a new type from constants and functions that create terms of that type. The feature serves a role similar to data structures in a programming language and allows a type theory to add concepts like numbers, relations, and trees. As the name suggests, inductive types can be self-referential, but usually only in a way that permits structural recursion.

This article describes the features in Haskell.

In type theory, a polynomial functor is a kind of endofunctor of a category of types that is intimately related to the concept of inductive and coinductive types. Specifically, all W-types are initial algebras of such functors.

References

  1. Meijer, Erik; Fokkinga, Maarten; Paterson, Ross (1991). "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire": 124–144. CiteSeerX   10.1.1.41.125 .{{cite journal}}: Cite journal requires |journal= (help)