Arrow (computer science)

In computer science, arrows or bolts are a type class used in programming to describe computations in a pure and declarative fashion. First proposed by computer scientist John Hughes as a generalization of monads, arrows provide a referentially transparent way of expressing relationships between the logical steps of a computation. [1] Unlike monads, arrows do not limit steps to having one and only one input. As a result, they have found use in functional reactive programming, point-free programming, and parsers, among other applications. [1] [2]

Motivation and history

While arrows were in use before being recognized as a distinct class, it wasn't until 2000 that John Hughes first published research focusing on them. Until then, monads had proven sufficient for most problems requiring the combination of program logic in pure code. However, some useful libraries, such as the Fudgets library for graphical user interfaces and certain efficient parsers, defied rewriting in a monadic form. [1]

The formal concept of arrows was developed to explain these exceptions to monadic code, and in the process, monads themselves turned out to be a subset of arrows. [1] Since then, arrows have been an active area of research. Their underlying laws and operations have been refined several times, with recent formulations such as arrow calculus requiring only five laws. [3]

Relation to category theory

In category theory, the Kleisli categories of all monads form a proper subset of Hughes arrows. [1] While Freyd categories were for a time believed to be equivalent to arrows, [4] it has since been proven that arrows are more general: arrows are not merely equivalent, but directly equal, to enriched Freyd categories. [5]

Definition

Like all type classes, arrows can be thought of as a set of qualities that can be applied to any data type. In the Haskell programming language, arrows allow functions (represented in Haskell by the -> symbol) to combine in a reified form. The term "arrow" may also come from the fact that some (but not all) arrows correspond to the morphisms (also known as "arrows" in category theory) of various Kleisli categories. As a relatively new concept, there is no single, standard definition, but all formulations are logically equivalent, feature some required methods, and strictly obey certain mathematical laws. [6]

Functions

The description currently used by the Haskell standard libraries requires only three basic operations:

arr : (s -> t) -> A s t
first : A s t -> A (s, u) (t, u)
(>>>) : A s t -> A t u -> A s u

Although only these three procedures are strictly necessary to define an arrow, other methods can be derived to make arrows easier to work with in practice and theory.
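As a concrete illustration, ordinary functions form the simplest arrow instance, and the three operations above are available from Haskell's Control.Arrow module. A minimal sketch (the helper names double, doubleFst, and pipeline are illustrative, not part of the library):

```haskell
import Control.Arrow (arr, first, (>>>))

-- For plain functions, arr is essentially the identity:
-- it lifts a pure function into the arrow.
double :: Int -> Int
double = arr (* 2)

-- first applies an arrow to the first component of a pair,
-- threading the second component through unchanged.
doubleFst :: (Int, String) -> (Int, String)
doubleFst = first double

-- (>>>) pipes the output of one arrow into the next.
pipeline :: Int -> Int
pipeline = arr (+ 1) >>> double

main :: IO ()
main = do
  print (doubleFst (3, "kept"))  -- (6,"kept")
  print (pipeline 5)             -- 12
```

The same three operations work unchanged for any other Arrow instance, such as Kleisli arrows over a monad.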

One particularly helpful method, (***), can be derived from arr and first (and, conversely, first can be derived from it):

(***) : A s t -> A u v -> A (s, u) (t, v)
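The derivation works by swapping the pair so that first can reach the second component; conversely, first falls out of (***) by pairing with the identity arrow. A sketch, where secondA, prod, and firstA are local stand-ins for Control.Arrow's own second, (***), and first:

```haskell
import Control.Arrow (Arrow, arr, first, (>>>))
import Data.Tuple (swap)

-- second, derived from first by swapping the pair before and after.
secondA :: Arrow a => a s t -> a (u, s) (u, t)
secondA f = arr swap >>> first f >>> arr swap

-- (***) runs two arrows side by side on the components of a pair.
prod :: Arrow a => a s t -> a u v -> a (s, u) (t, v)
prod f g = first f >>> secondA g

-- Conversely, first is recovered from (***) by pairing with identity.
firstA :: Arrow a => a s t -> a (s, u) (t, u)
firstA f = f `prod` arr id

main :: IO ()
main = print (prod (+ 1) show (2 :: Int, True))  -- (3,"True")
```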

Arrow laws

In addition to having some well-defined procedures, arrows must obey certain rules for any types they may be applied to:

arr id == id
arr (f >>> g) == arr f >>> arr g
first (f >>> g) == first f >>> first g
arr (first f) == first (arr f)

The remaining laws restrict how first behaves when the order of a composition is reversed, which also allows expressions to be simplified:

arr (id *** g) >>> first f == first f >>> arr (id *** g)
first f >>> arr (\(s, t) -> s) == arr (\(s, t) -> s) >>> f
first (first f) >>> arr (\((s, t), u) -> (s, (t, u))) == arr (\((s, t), u) -> (s, (t, u))) >>> first f
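For the ordinary-function instance these laws can be spot-checked pointwise. A small sketch, with illustrative helper names, checking the exchange law and the association law at sample inputs:

```haskell
import Control.Arrow (arr, first, (***), (>>>))

-- Reassociation used by the association law.
assoc :: ((a, b), c) -> (a, (b, c))
assoc ((s, t), u) = (s, (t, u))

-- Exchange law: pure post-processing of the second component
-- commutes with first.
lhs1, rhs1 :: (Int, Int) -> (Int, Int)
lhs1 = arr (id *** (* 2)) >>> first (+ 1)
rhs1 = first (+ 1) >>> arr (id *** (* 2))

-- Association law: nested firsts agree after reassociating the tuple.
lhs2, rhs2 :: ((Int, Int), Int) -> (Int, (Int, Int))
lhs2 = first (first (+ 1)) >>> arr assoc
rhs2 = arr assoc >>> first (+ 1)

main :: IO ()
main = do
  print (lhs1 (3, 4) == rhs1 (3, 4))            -- True
  print (lhs2 ((1, 2), 3) == rhs2 ((1, 2), 3))  -- True
```

Checking at sample points does not prove the laws, of course; for functions they hold by simple equational reasoning.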

Applications

Arrows may be extended to fit specific situations by defining additional operations and restrictions. Commonly used versions include arrows with choice, which allow a computation to make conditional decisions, and arrows with feedback, which allow a step to take its own outputs as inputs. Another set of arrows, known as arrows with application, are rarely used in practice because they are actually equivalent to monads. [6]
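As a sketch of arrows with choice, Haskell's Kleisli construction turns any monad into an arrow, and the (|||) operator of the ArrowChoice class merges the two branches of an Either. The names safeDiv and describe below are illustrative:

```haskell
import Control.Arrow (Kleisli (..), arr, runKleisli, (|||))

-- Kleisli Maybe is the arrow corresponding to the Maybe monad:
-- each step may fail by producing Nothing.
safeDiv :: Kleisli Maybe (Int, Int) Int
safeDiv = Kleisli (\(x, y) -> if y == 0 then Nothing else Just (x `div` y))

-- (|||) picks one arrow per Either tag, merging both branches
-- into a single result type.
describe :: Kleisli Maybe (Either Int String) String
describe = Kleisli (Just . show) ||| arr id

main :: IO ()
main = do
  print (runKleisli safeDiv (10, 2))      -- Just 5
  print (runKleisli safeDiv (10, 0))      -- Nothing
  print (runKleisli describe (Left 7))    -- Just "7"
  print (runKleisli describe (Right "x")) -- Just "x"
```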

Utility

Arrows have several benefits, mostly stemming from their ability to make program logic explicit yet concise. Besides avoiding side effects, purely functional programming creates more opportunities for static code analysis. This in turn can theoretically lead to better compiler optimizations, easier debugging, and features like syntactic sugar. [6]

Although no program strictly requires arrows, they generalize away much of the dense function passing that pure, declarative code would otherwise require. They can also encourage code reuse by giving common linkages between program steps their own class definitions. The ability to apply to types generically also contributes to reusability and keeps interfaces simple. [6]

Arrows do have some disadvantages, including the initial effort of defining an arrow that satisfies the arrow laws. Because monads are usually easier to implement, and the extra features of arrows may be unnecessary, it is often preferable to use a monad. [6] Another issue, which applies to many functional programming constructs, is efficiently compiling code with arrows into the imperative style used by computer instruction sets. [citation needed]

Limitations

Because every arrow must supply an arr function that lifts arbitrary pure functions, the applicability of arrows is limited. For example, bidirectional transformations cannot be arrows, because arr would require providing not only a pure function but also its inverse. [8] This also limits the use of arrows to describe push-based reactive frameworks that stop unnecessary propagation. Similarly, the use of pairs to tuple values together results in an awkward coding style that requires additional combinators to regroup values, and raises fundamental questions about the equivalence of arrows grouped in different ways. These limitations remain an open problem, and extensions such as Generalized Arrows [8] and N-ary FRP [9] attempt to address them.

Much of the utility of arrows is subsumed by more general classes such as Profunctor (which requires only pre- and post-composition with pure functions), which have applications in optics. An arrow is essentially a strong profunctor that is also a category, although the laws differ slightly.
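A minimal local sketch of that relationship, with the class declarations inlined rather than imported from the profunctors package (whose real classes carry the same shapes):

```haskell
-- Local stand-ins for the Profunctor hierarchy; the canonical
-- versions live in the `profunctors` package.
class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d

class Profunctor p => Strong p where
  first' :: p a b -> p (a, c) (b, c)

-- Plain functions are a strong profunctor: pre-compose on the
-- input, post-compose on the output.
instance Profunctor (->) where
  dimap pre post f = post . f . pre

instance Strong (->) where
  first' f (a, c) = (f a, c)

-- (->) is also a Category (id and (.)); a strong profunctor that is
-- a category is essentially an Arrow.
main :: IO ()
main = print (dimap (+ 1) show (* 2) (3 :: Int))  -- "8"
```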

References

  1. Hughes, John (May 2000). "Generalising Monads to Arrows". Science of Computer Programming. 37 (1–3): 67–111. doi:10.1016/S0167-6423(99)00023-4. ISSN 0167-6423.
  2. Paterson, Ross (27 March 2003). "Chapter 10: Arrows and Computation". In Gibbons, Jeremy; de Moor, Oege (eds.). The Fun of Programming. Palgrave Macmillan. pp. 201–222. ISBN 978-1403907721.
  3. Lindley, Sam; Wadler, Philip; Yallop, Jeremy (January 2010). "The Arrow Calculus". Journal of Functional Programming. 20 (1): 51–69. doi:10.1017/S095679680999027X. ISSN 0956-7968.
  4. Jacobs, Bart; Heunen, Chris; Hasuo, Ichiro (2009). "Categorical Semantics for Arrows". Journal of Functional Programming. 19 (3–4): 403–438. doi:10.1017/S0956796809007308.
  5. Atkey, Robert (8 March 2011). "What is a Categorical Model of Arrows?". Electronic Notes in Theoretical Computer Science. 229 (5): 19–37. doi:10.1016/j.entcs.2011.02.014. ISSN 1571-0661.
  6. Hughes, John (2005). "Programming with Arrows". Advanced Functional Programming. 5th International Summer School on Advanced Functional Programming, 14–21 August 2004, Tartu, Estonia. Springer. pp. 73–129. doi:10.1007/11546382_2. ISBN 978-3-540-28540-3.
  7. Paterson, Ross (2002). "Control.Arrow". base-4.5.0.0: Basic libraries. haskell.org.
  8. Joseph, Adam Megacz (2014). "Generalized Arrows". Technical Report No. UCB/EECS-2014-130. EECS Department, University of California, Berkeley.
  9. Sculthorpe, Neil (2011). Towards Safe and Efficient Functional Reactive Programming (PhD thesis). University of Nottingham.