Type family

Last updated

In computer science, a type family associates data types with other data types, using a type-level function defined by an open-ended collection of valid instances of input types and the corresponding output types. [1]

Contents

Type families are a feature of some type systems that allow partial functions between types to be defined by pattern matching. This is in contrast to data type constructors, which define injective functions from all types of a particular kind to a new set of types, and type synonyms (a.k.a. typedef), which define functions from all types of a particular kind to another existing set of types using a single case.

Type families and type classes are closely related: normal type classes define partial functions from types to a collection of named values by pattern matching on the input types, while type families define partial functions from types to types by pattern matching on the input types. In fact, in many uses of type families there is a single type class which logically contains both values and types associated with each instance. A type family declared inside a type class is called an associated type. [2]

Programming languages with support for type families or similar features include Haskell (with a common language extension), [3] Standard ML (through its module system), [4] Rust, [5] Scala (under the name "abstract types"), [6] and C++ (through use of typedefs in templates). [7]

Variations

The TypeFamilies extension in the Glasgow Haskell Compiler supports both type synonym families and data families. Type synonym families are the more flexible (but harder to type-check) form, permitting the types in the codomain of the type function to be any type whatsoever with the appropriate kind. [7] Data families, on the other hand, restrict the codomain by requiring each instance to define a new type constructor for the function's result. This ensures that the function is injective, allowing clients' contexts to deconstruct the type family and obtain the original argument type. [1]

Motivation and examples

Type families are useful in abstracting patterns where a common "organization" or "structure" of types is repeated, but with different specific types in each case. Typical use cases include describing abstract data types like generic collections, or design patterns like model–view–controller.

Self-optimizing abstract data types

One of the original motivations for the introduction of associated types was to allow abstract data types to be parameterized by their content type such that the data structure implementing the abstract type varies in a "self-optimizing" way. [2] Normal algebraic data type parameters can only describe data structures that behave uniformly with respect to all argument types. Associated types, however, can describe a family of data structures that have a uniform interface but vary in implementation according to one or more type parameters. For example, [2] using Haskell's associated types notation, we can declare a type class of valid array element types, with an associated data family representing an array of that element type:

classArrayElemewheredataArrayeindex::Arraye->Int->e

Instances can then be defined for this class, which define both the data structure used and the operations on the data structure in a single location. For efficiency, we might use a packed bit vector representation for arrays of Boolean values, while using a normal array data structure for integer values. The data structure for arrays of ordered pairs is defined recursively as a pair of arrays of each of the element types.

instanceArrayElemBoolwheredataArrayBool=BoolArrayBitVectorindex(BoolArrayar)i=indexBitVectorariinstanceArrayElemIntwheredataArrayInt=IntArrayUIntArrindex(IntArrayar)i=indexUIntArrariinstance(ArrayElema,ArrayElemb)=>ArrayElem(a,b)wheredataArray(a,b)=PairArray(Arraya)(Arrayb)index(PairArrayarbr)=(indexari,indexbri)

With these definitions, when a client refers to an Array (Int, Bool), an implementation is automatically selected using the defined instances.

A class for collections

Inverting the previous example, we can also use type families to define a class for collection types, where the type function maps each collection type to its corresponding element type: [7]

classCollectscwheretypeElemcempty::cinsert::Elemc->c->ctoList::c->[Elemc]instanceCollects[e]wheretypeElem[e]=eempty=[]insert=(:)toList=idinstanceOrde=>Collects(Set.Sete)wheretypeElem(Set.Sete)=eempty=Set.emptyinsert=Set.inserttoList=Set.toList

In this example, the use of a type synonym family instead of a data family is essential, since multiple collection types may have the same element type.

Comparison with functional dependencies

Functional dependencies are another type system feature that have similar uses to associated types. While an associated type adds a named type function mapping the enclosing type class's parameters to another type, a functional dependency lists the result type as another parameter of the type class and adds a constraint between the type parameters (e.g. "parameter a uniquely determines parameter b", written a -> b). The most common uses of functional dependencies can be directly converted to associated types and vice versa. [7]

Type families are regarded as being generally easier to type-check than functional dependencies. Another advantage of associated types over functional dependencies is that the latter requires clients using the type class to state all of the dependent types in their contexts, including ones they do not use; since associated types do not require this, adding another associated type to the class requires updating only the class's instances, while clients can remain unchanged. The main advantages of functional dependencies over type families are in their added flexibility in handling a few unusual cases. [8]

Related Research Articles

In computer science, an abstract data type (ADT) is a mathematical model for data types, defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations. This mathematical model contrasts with data structures, which are concrete representations of data, and are the point of view of an implementer, not a user. For example, a stack has push/pop operations that follow a Last-In-First-Out rule, and can be concretely implemented using either a list or an array. Another example is a set which stores values, without any particular order, and no repeated values. Values themselves are not retrieved from sets, rather one tests a value for membership to obtain a Boolean "in" or "not in".

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

<span class="mw-page-title-main">Miranda (programming language)</span>

Miranda is a lazy, purely functional programming language designed by David Turner as a successor to his earlier programming languages SASL and KRC, using some concepts from ML and Hope. It was produced by Research Software Ltd. of England and was the first purely functional language to be commercially supported.

<span class="mw-page-title-main">Data type</span> Attribute of data

In computer science and computer programming, a data type is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these values as machine types. A data type specification in a program constrains the possible values that an expression, such as a variable or a function call, might take. On literal data, it tells the compiler or interpreter how the programmer intends to use the data. Most programming languages support basic data types of integer numbers, floating-point numbers, characters and Booleans.

Generic programming is a style of computer programming in which algorithms are written in terms of data types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplicate code.

A method in object-oriented programming (OOP) is a procedure associated with an object, and generally also a message. An object consists of state data and behavior; these compose an interface, which specifies how the object may be used. A method is a behavior of an object parametrized by a user.

In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite type, i.e., a type formed by combining other types.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

In computer programming, a function object is a construct allowing an object to be invoked or called as if it were an ordinary function, usually with the same syntax. In some languages, particularly C++, function objects are often called functors.

The Glasgow Haskell Compiler (GHC) is a native or machine code compiler for the functional programming language Haskell. It provides a cross-platform software environment for writing and testing Haskell code and supports many extensions, libraries, and optimisations that streamline the process of generating and executing code. GHC is the most commonly used Haskell compiler. It is free and open-source software released under a BSD license. The lead developers are Simon Peyton Jones and Simon Marlow.

In functional programming, a monad is a structure that combines program fragments (functions) and wraps their return values in a type with additional computation. In addition to defining a wrapping monadic type, monads define two operators: one to wrap a value in the monad type, and another to compose together functions that output values of the monad type. General-purpose languages use monads to reduce boilerplate code needed for common operations. Functional languages use monads to turn complicated sequences of functions into succinct pipelines that abstract away control flow, and side-effects.

In computer programming languages, a recursive data type is a data type for values that may contain other values of the same type. Data of recursive types are usually viewed as directed graphs.

In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T and a type variable a, and means that a can only be instantiated to a type whose members support the overloaded operations associated with T.

Platform Invocation Services, commonly referred to as P/Invoke, is a feature of Common Language Infrastructure implementations, like Microsoft's Common Language Runtime, that enables managed code to call native code.

In functional programming, a generalized algebraic data type is a generalization of parametric algebraic data types.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

In computer science, a finger tree is a purely functional data structure that can be used to efficiently implement other functional data structures. A finger tree gives amortized constant time access to the "fingers" (leaves) of the tree, which is where data is stored, and concatenation and splitting logarithmic time in the size of the smaller piece. It also stores in each internal node the result of applying some associative operation to its descendants. This "summary" data stored in the internal nodes can be used to provide the functionality of data structures other than trees.

The syntax and semantics of PHP, a programming language, form a set of rules that define how a PHP program can be written and interpreted.

This article describes the features in the programming language Haskell.

Swift is a high-level general-purpose, multi-paradigm, compiled programming language developed by Apple Inc. and the open-source community. Swift compiles to machine code, as it is an LLVM-based compiler. Swift was first released in June 2014, and the Swift toolchain has shipped in Xcode since version 6, released in 2014.

References

  1. 1 2 Kiselyov, Oleg; Peyton Jones, Simon; Shan, Chung-chieh (2010). "Fun with Type Functions" (PDF).
  2. 1 2 3 Chakravarty, Manuel M. T.; Keller, Gabriele; Peyton Jones, Simon; Marlow, Simon (2005). "Associated Types with Class". Proceedings of the 32nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press: 1–13.
  3. "Type Functions, Type Families, and Associated Types in GHC - The Master Plan". April 2019. Retrieved 4 April 2019.
  4. Wehr, Stefan; Chakravarty, Manuel M. T. (2008). "ML Modules and Haskell Type Classes: A Constructive Comparison". Proceedings of the Sixth ASIAN Symposium on Programming Languages and Systems. Lecture Notes in Computer Science. 5356. Springer-Verlag: 188–204. doi:10.1007/978-3-540-89330-1_14. ISBN   978-3-540-89329-5.
  5. "Associated Items - The Rust Reference". January 2024.
  6. "A Tour of Scala: Abstract Types" . Retrieved 23 February 2013.
  7. 1 2 3 4 Chakravarty, Manuel M. T.; Keller, Gabriele; Peyton Jones, Simon (2005). "Associated type synonyms". Proceedings of the tenth ACM SIGPLAN international conference on Functional programming. ACM Press. pp. 241–253. doi:10.1145/1086365.1086397. ISBN   1595930647. S2CID   16406612.{{cite book}}: CS1 maint: date and year (link)
  8. "Type Families (TF) vs Functional Dependencies (FD)" . Retrieved 4 April 2019.