Type class

Last updated

In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T and a type variable a, and means that a can only be instantiated to a type whose members support the overloaded operations associated with T.

Contents

Type classes were first implemented in the Haskell programming language after first being proposed by Philip Wadler and Stephen Blott as an extension to "eqtypes" in Standard ML, [1] [2] and were originally conceived as a way of implementing overloaded arithmetic and equality operators in a principled fashion. [3] [2] In contrast with the "eqtypes" of Standard ML, overloading the equality operator through the use of type classes in Haskell does not need extensive modification of the compiler frontend or the underlying type system. [4]

Overview

Type classes are defined by specifying a set of function or constant names, together with their respective types, that must exist for every type that belongs to the class. In Haskell, types can be parameterized; a type class Eq intended to contain types that admit equality would be declared in the following way:

classEqawhere(==)::a->a->Bool(/=)::a->a->Bool

where a is one instance of the type class Eq, and a defines the function signatures for 2 functions (the equality and inequality functions), which each take 2 arguments of type a and return a Boolean.

The type variable a has kind ( is also known as Type in the latest Glasgow Haskell Compiler (GHC) release), [5] meaning that the kind of Eq is

Eq::Type->Constraint

The declaration may be read as stating a "type a belongs to type class Eq if there are functions named (==), and (/=), of the appropriate types, defined on it". A programmer could then define a function elem (which determines if an element is in a list) in the following way:

elem::Eqa=>a->[a]->Boolelemy[]=Falseelemy(x:xs)=(x==y)||elemyxs

The function elem has the type a -> [a] -> Bool with the context Eq a, which constrains the types which a can range over to those a which belong to the Eq type class. (Haskell => can be called a 'class constraint'.)

Any type t can be made a member of a given type class C by using an instance declaration that defines implementations of all of C's methods for the particular type t. For example, if a new data type t is defined, this new type can be made an instance of Eq by providing an equality function over values of type t in any way that is useful. Once this is done, the function elem can be used on [t], that is, lists of elements of type t.

Type classes are different from classes in object-oriented programming languages. In particular, Eq is not a type: there is no such thing as a value of type Eq.

Type classes are closely related to parametric polymorphism. For example, the type of elem as specified above would be the parametrically polymorphic type a -> [a] -> Bool were it not for the type class constraint "Eq a =>".

Higher-kinded polymorphism

A type class need not take a type variable of kind Type but can take one of any kind. These type classes with higher kinds are sometimes called constructor classes (the constructors referred to are type constructors such as Maybe, rather than data constructors such as Just). An example is the Monad class:

classMonadmwherereturn::a->ma(>>=)::ma->(a->mb)->mb

That m is applied to a type variable indicates that it has kind Type -> Type, i.e., it takes a type and returns a type, the kind of Monad is thus:

Monad::(Type->Type)->Constraint

Multi-parameter type classes

Type classes permit multiple type parameters, and so type classes can be seen as relations on types. [6] For example, in the GHC standard library, the class IArray expresses a general immutable array interface. In this class, the type class constraint IArray a e means that a is an array type that contains elements of type e. (This restriction on polymorphism is used to implement unboxed array types, for example.)

Like multimethods,[ citation needed ] multi-parameter type classes support calling different implementations of a method depending on the types of multiple arguments, and indeed return types. Multi-parameter type classes do not require searching for the method to call on every call at runtime; [7] :minute 25:12 rather the method to call is first compiled and stored in the dictionary of the type class instance, just as with single-parameter type classes.

Haskell code that uses multi-parameter type classes is not portable, as this feature is not part of the Haskell 98 standard. The popular Haskell implementations, GHC and Hugs, support multi-parameter type classes.

Functional dependencies

In Haskell, type classes have been refined to allow the programmer to declare functional dependencies between type parametersa concept inspired from relational database theory. [8] [9] That is, the programmer can assert that a given assignment of some subset of the type parameters uniquely determines the remaining type parameters. For example, a general monad m which carries a state parameter of type s satisfies the type class constraint Monad.State s m. In this constraint, there is a functional dependency m -> s. This means that for a given monad m of type class Monad.State, the state type accessible from m is uniquely determined. This aids the compiler in type inference, as well as aiding the programmer in type-directed programming.

Simon Peyton Jones has objected to the introduction of functional dependencies in Haskell on grounds of complexity. [10]

Type classes and implicit parameters

Type classes and implicit parameters are very similar in nature, although not quite the same. A polymorphic function with a type class constraint such as:

sum::Numa=>[a]->a

can be intuitively treated as a function that implicitly accepts an instance of Num:

sum_::Num_a->[a]->a

The instance Num_ a is essentially a record that contains the instance definition of Num a. (This is in fact how type classes are implemented under the hood by the Glasgow Haskell Compiler.)

However, there is a crucial difference: implicit parameters are more flexible; different instances of Num Int can be passed. In contrast, type classes enforce the so-called coherence property, which requires that there should only be one unique choice of instance for any given type. The coherence property makes type classes somewhat antimodular, which is why orphan instances (instances that are defined in a module that neither contains the class nor the type of interest) are strongly discouraged. However, coherence adds another level of safety to a language, providing a guarantee that two disjoint parts of the same code will share the same instance. [11]

As an example, an ordered set (of type Set a) requires a total ordering on the elements (of type a) to function. This can be evidenced by a constraint Ord a, which defines a comparison operator on the elements. However, there can be numerous ways to impose a total order. Since set algorithms are generally intolerant of changes in the ordering once a set has been constructed, passing an incompatible instance of Ord a to functions that operate on the set may lead to incorrect results (or crashes). Thus, enforcing coherence of Ord a in this particular scenario is crucial.

Instances (or "dictionaries") in Scala type classes are just ordinary values in the language, rather than a completely separate kind of entity. [12] [13] While these instances are by default supplied by finding appropriate instances in scope to be used as the implicit parameters for explicitly-declared implicit formal parameters, that they are ordinary values means that they can be supplied explicitly, to resolve ambiguity. As a result, Scala type classes do not satisfy the coherence property and are effectively a syntactic sugar for implicit parameters.

This is an example taken from the Cats documentation: [14]

// A type class to provide textual representationtraitShow[A]{defshow(f:A):String}// A polymorphic function that works only when there is an implicit // instance of Show[A] availabledeflog[A](a:A)(implicits:Show[A])=println(s.show(a))// An instance for StringimplicitvalstringShow=newShow[String]{defshow(s:String)=s}// The parameter stringShow was inserted by the compiler.scala>log("a string")astring

Coq (version 8.2 onward) also supports type classes by inferring the appropriate instances. [15] Recent versions of Agda 2 also provide a similar feature, called "instance arguments". [16]

Other approaches to operator overloading

In Standard ML, the mechanism of "equality types" corresponds roughly to Haskell's built-in type class Eq, but all equality operators are derived automatically by the compiler. The programmer's control of the process is limited to designating which type components in a structure are equality types and which type variables in a polymorphic type range over equality types.

SML's and OCaml's modules and functors can play a role similar to that of Haskell's type classes, the principal difference being the role of type inference, which makes type classes suitable for ad hoc polymorphism. [17] The object oriented subset of OCaml is yet another approach which is somewhat comparable to the one of type classes.

An analogous notion for overloaded data (implemented in GHC) is that of type family. [18]

In C++, notably C++20, has support for type classes using the Concepts (C++). As an illustration, the above mentioned Haskell example of typeclass Eq would be written as

template<typenameT>conceptEqual=requires(Ta,Tb){{a==b}->std::convertible_to<bool>;{a!=b}->std::convertible_to<bool>;};

In Clean typeclasses are similar to Haskell, but have a slightly different syntax.

Rust supports traits, which are a limited form of type classes with coherence. [19]

Mercury has typeclasses, although they are not exactly the same as in Haskell.[ further explanation needed ]

In Scala, type classes are a programming idiom that can be implemented with existing language features such as implicit parameters, not a separate language feature per se. Because of the way they are implemented in Scala, it is possible to explicitly specify which type class instance to use for a type at a particular place in the code, in case of ambiguity. However, this is not necessarily a benefit as ambiguous type class instances can be error-prone.

The proof assistant Coq has also supported type classes in recent versions. Unlike in ordinary programming languages, in Coq, any laws of a type class (such as the monad laws) that are stated within the type class definition, must be mathematically proved of each type class instance before using them.

Related Research Articles

<span class="mw-page-title-main">Miranda (programming language)</span>

Miranda is a lazy, purely functional programming language designed by David Turner as a successor to his earlier programming languages SASL and KRC, using some concepts from ML and Hope. It was produced by Research Software Ltd. of England and was the first purely functional language to be commercially supported.

Generic programming is a style of computer programming in which algorithms are written in terms of data types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplicate code.

A method in object-oriented programming (OOP) is a procedure associated with an object, and generally also a message. An object consists of state data and behavior; these compose an interface, which specifies how the object may be used. A method is a behavior of an object parametrized by a user.

In programming language theory and type theory, polymorphism is the use of a single symbol to represent multiple different types.

Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some branches of computer science and linguistics.

In computer programming, a function object is a construct allowing an object to be invoked or called as if it were an ordinary function, usually with the same syntax. In some languages, particularly C++, function objects are often called functors.

The Glasgow Haskell Compiler (GHC) is a native or machine code compiler for the functional programming language Haskell. It provides a cross-platform software environment for writing and testing Haskell code and supports many extensions, libraries, and optimisations that streamline the process of generating and executing code. GHC is the most commonly used Haskell compiler. It is free and open-source software released under a BSD license.

In functional programming, a monad is a structure that combines program fragments (functions) and wraps their return values in a type with additional computation. In addition to defining a wrapping monadic type, monads define two operators: one to wrap a value in the monad type, and another to compose together functions that output values of the monad type. General-purpose languages use monads to reduce boilerplate code needed for common operations. Functional languages use monads to turn complicated sequences of functions into succinct pipelines that abstract away control flow, and side-effects.

<span class="mw-page-title-main">C Sharp (programming language)</span> Programming language

C# is a general-purpose high-level programming language supporting multiple paradigms. C# encompasses static typing, strong typing, lexically scoped, imperative, declarative, functional, generic, object-oriented (class-based), and component-oriented programming disciplines.

In programming languages and type theory, parametric polymorphism allows a single piece of code to be given a "generic" type, using variables in place of actual types, and then instantiated with particular types as needed. Parametrically polymorphic functions and data types are sometimes called generic functions and generic datatypes, respectively, and they form the basis of generic programming.

In functional programming, a generalized algebraic data type is a generalization of a parametric algebraic data type (ADT).

In the area of mathematical logic and computer science known as type theory, a kind is the type of a type constructor or, less commonly, the type of a higher-order type operator. A kind system is essentially a simply typed lambda calculus "one level up", endowed with a primitive type, denoted and called "type", which is the kind of any data type which does not need any type parameters.

In programming languages and type theory, an option type or maybe type is a polymorphic type that represents encapsulation of an optional value; e.g., it is used as the return type of functions which may or may not return a meaningful value when they are applied. It consists of a constructor which either is empty, or which encapsulates the original data type A.

Concepts are an extension to the templates feature provided by the C++ programming language. Concepts are named Boolean predicates on template parameters, evaluated at compile time. A concept may be associated with a template, in which case it serves as a constraint: it limits the set of arguments that are accepted as template parameters.

This article describes the features in the programming language Haskell.

Haskell is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell has pioneered several programming language features such as type classes, which enable type-safe operator overloading, and monadic input/output (IO). It is named after logician Haskell Curry. Haskell's main implementation is the Glasgow Haskell Compiler (GHC).

A Hindley–Milner (HM) type system is a classical type system for the lambda calculus with parametric polymorphism. It is also known as Damas–Milner or Damas–Hindley–Milner. It was first described by J. Roger Hindley and later rediscovered by Robin Milner. Luis Damas contributed a close formal analysis and proof of the method in his PhD thesis.

<span class="mw-page-title-main">Yesod (web framework)</span>

Yesod is a web framework based on the programming language Haskell for productive development of type-safe, representational state transfer (REST) model based, high performance web applications, developed by Michael Snoyman, et al. It is free and open-source software released under an MIT License.

In computer science, a type family associates data types with other data types, using a type-level function defined by an open-ended collection of valid instances of input types and the corresponding output types.

<span class="mw-page-title-main">Strongly typed identifier</span>

A strongly typed identifier is user-defined data type which serves as an identifier or key that is strongly typed. This is a solution to the "primitive obsession" code smell as mentioned by Martin Fowler. The data type should preferably be immutable if possible. It is common for implementations to handle equality testing, serialization and model binding.

References

  1. Morris, John G. (2013). Type Classes and Instance Chains: A Relational Approach (PDF) (PhD). Department of Computer Science, Portland State University. doi:10.15760/etd.1010.
  2. 1 2 Wadler, P.; Blott, S. (1989). "How to make ad-hoc polymorphism less ad hoc". Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '89). Association for Computing Machinery. pp. 60–76. doi:10.1145/75277.75283. ISBN   0897912942. S2CID   15327197.
  3. Kaes, Stefan (March 1988). "Parametric overloading in polymorphic programming languages". Proceedings 2nd European Symposium on Programming Languages. doi: 10.1007/3-540-19027-9_9 .
  4. Appel, A.W.; MacQueen, D.B. (1991). "Standard ML of New Jersey". In Maluszyński, J.; Wirsing, M. (eds.). Programming Language Implementation and Logic Programming. PLILP 1991. Lecture Notes in Computer Science. Vol. 528. Springer. pp. 1–13. CiteSeerX   10.1.1.55.9444 . doi:10.1007/3-540-54444-5_83. ISBN   3-540-54444-5.
  5. Type from Data.Kind appeared in version 8 of the Glasgow Haskell Compiler
  6. Haskell' page MultiParamTypeClasses .
  7. In GHC, the C Core uses Girard & Reynold's System F type signatures to identify a typed case for processing in the optimization phases. -- Simon Peyton-Jones "Into the Core - Squeezing Haskell into Nine Constructors" Erlang User Conference, Sep 14, 2016
  8. Jones, Mark P. (2000). "Type Classes with Functional Dependencies". In Smolka, G. (ed.). Programming Languages and Systems. ESOP 2000. Lecture Notes in Computer Science. Vol. 1782. Springer. pp. 230–244. CiteSeerX   10.1.1.26.7153 . doi:10.1007/3-540-46425-5_15. ISBN   3-540-46425-5.
  9. Haskell' page FunctionalDependencies .
  10. Peyton Jones, Simon (2006). "MPTCs and functional dependencies". Haskell-prime mailing list.
  11. Kmett, Edward (January 21, 2015). Type Classes vs. the World (video). Boston Haskell Meetup. Archived from the original on 2021-12-21.
  12. Oliveira, Bruno C.d.S.; Moors, Adriaan; Odersky, Martin (2010). "Type Classes as Objects and Implicits" (PDF). Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '10). Association for Computing Machinery. pp. 341–360. CiteSeerX   10.1.1.205.2737 . doi:10.1145/1869459.1869489. ISBN   9781450302036. S2CID   207183083.
  13. "The Neophyte's Guide to Scala Part 12: Type classes - Daniel Westheide".
  14. typelevel.org, Scala Cats
  15. Castéran, P.; Sozeau, M. (2014). "A Gentle Introduction to Type Classes and Relations in Coq" (PDF). CiteSeerX   10.1.1.422.8091 .
  16. "Modelling Type Classes With Instance Arguments".
  17. Dreyer, Derek; Harper, Robert; Chakravarty, Manuel M.T. (2007). "Modular Type Classes". Proceedings of the 34th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '07). pp. 63–70. See p. 63. doi:10.1145/1190216.1190229. ISBN   978-1595935755. S2CID   1828213. TR-2006-03.
  18. "GHC/Type families - HaskellWiki".
  19. Turon, Aaron (2017). Specialization, coherence, and API evolution (Report).