In functional programming, a **generalized algebraic data type** (**GADT**, also **first-class phantom type**,^{ [1] }**guarded recursive datatype**,^{ [2] } or **equality-qualified type**^{ [3] }) is a generalization of parametric algebraic data types.

In a GADT, the product constructors (called data constructors in Haskell) can provide an explicit instantiation of the ADT as the type instantiation of their return value. This allows defining functions with a more advanced type behaviour. For a data constructor of Haskell 2010, the return value has the type instantiation implied by the instantiation of the ADT parameters at the constructor's application.

`-- A parametric ADT that is not a GADTdataLista=Nil|Consa(Lista)integers=Cons12(Cons107Nil)-- the type of integers is List Intstrings=Cons"boat"(Cons"dock"Nil)-- the type of strings is List String-- A GADTdataExprawhereEBool::Bool->ExprBoolEInt::Int->ExprIntEEqual::ExprInt->ExprInt->ExprBooleval::Expra->aevale=caseeofEBoola->aEInta->aEEqualab->(evala)==(evalb)expr1=EEqual(EInt2)(EInt3)-- the type of expr1 is Expr Boolret=evalexpr1-- ret is False`

They are currently implemented in the GHC compiler as a non-standard extension, used by, among others, Pugs and Darcs. OCaml supports GADT natively since version 4.00.^{ [4] }

The GHC implementation provides support for existentially quantified type parameters and for local constraints.

An early version of generalized algebraic data types were described by Augustsson & Petersson (1994) and based on pattern matching in ALF.

Generalized algebraic data types were introduced independently by Cheney & Hinze (2003) and prior by Xi, Chen & Chen (2003) as extensions to ML's and Haskell's algebraic data types.^{ [5] } Both are essentially equivalent to each other. They are similar to the * inductive families of data types * (or *inductive datatypes*) found in Coq's Calculus of Inductive Constructions and other dependently typed languages, modulo the dependent types and except that the latter have an additional positivity restriction which is not enforced in GADTs.^{ [6] }

Sulzmann, Wazny & Stuckey (2006) introduced *extended algebraic data types* which combine GADTs together with the existential data types and type class constraints introduced by Perry (1991) , Läufer & Odersky (1994) and Läufer (1996) .

Type inference in the absence of any programmer supplied type annotations is undecidable ^{ [7] } and functions defined over GADTs do not admit principal types in general.^{ [8] } Type reconstruction requires several design trade-offs and is an area of active research (Peyton Jones, Washburn & Weirich 2004; Peyton Jones et al. 2006; Pottier & Régis-Gianas 2006 ; Sulzmann, Schrijvers & Stuckey 2006; Simonet & Pottier 2007 ; Schrijvers et al. 2009; Lin & Sheard 2010a ; Lin & Sheard 2010b ; Vytiniotis, Peyton Jones & Schrijvers 2010 ; Vytiniotis et al. 2011 ).

In spring 2021, Scala 3.0 is released.^{ [9] } This major update of Scala introduce the possibility to write GADTs^{ [10] } with the same syntax as ADTs, which is not the case in other programming languages according to Martin Odersky.^{ [11] }

Applications of GADTs include generic programming, modelling programming languages (higher-order abstract syntax), maintaining invariants in data structures, expressing constraints in embedded domain-specific languages, and modelling objects.^{ [12] }

An important application of GADTs is to embed higher-order abstract syntax in a type safe fashion. Here is an embedding of the simply typed lambda calculus with an arbitrary collection of base types, tuples and a fixed point combinator:

`dataLam::*->*whereLift::a->Lama-- ^ lifted valuePair::Lama->Lamb->Lam(a,b)-- ^ productLam::(Lama->Lamb)->Lam(a->b)-- ^ lambda abstractionApp::Lam(a->b)->Lama->Lamb-- ^ function applicationFix::Lam(a->a)->Lama-- ^ fixed point`

And a type safe evaluation function:

`eval::Lamt->teval(Liftv)=veval(Pairlr)=(evall,evalr)eval(Lamf)=\x->eval(f(Liftx))eval(Appfx)=(evalf)(evalx)eval(Fixf)=(evalf)(eval(Fixf))`

The factorial function can now be written as:

`fact=Fix(Lam(\f->Lam(\y->Lift(ifevaly==0then1elseevaly*(evalf)(evaly-1)))))eval(fact)(10)`

We would have run into problems using regular algebraic data types. Dropping the type parameter would have made the lifted base types existentially quantified, making it impossible to write the evaluator. With a type parameter we would still be restricted to a single base type. Furthermore, ill-formed expressions such as `App (Lam (\x -> Lam (\y -> App x y))) (Lift True)`

would have been possible to construct, while they are type incorrect using the GADT. A well-formed analogue is `App (Lam (\x -> Lam (\y -> App x y))) (Lift (\z -> True))`

. This is because the type of `x `

is `Lam (a -> b)`

, inferred from the type of the `Lam`

data constructor.

- ↑ Cheney & Hinze 2003.
- ↑ Xi, Chen & Chen 2003.
- ↑ Sheard & Pasalic 2004.
- ↑ "OCaml 4.00.1".
*ocaml.org*. - ↑ Cheney & Hinze 2003, p. 25.
- ↑ Cheney & Hinze 2003, pp. 25–26.
- ↑ Peyton Jones, Washburn & Weirich 2004, p. 7.
- ↑ Schrijvers et al. 2009, p. 1.
- ↑ Kmetiuk, Anatolii. "SCALA 3 IS HERE!🎉🎉🎉".
*scala-lang.org*. École Polytechnique Fédérale Lausanne (EPFL) Lausanne, Switzerland. Retrieved 19 May 2021. - ↑ "SCALA 3 — BOOK ALGEBRAIC DATA TYPES".
*scala-lang.org*. École Polytechnique Fédérale Lausanne (EPFL) Lausanne, Switzerland. Retrieved 19 May 2021. - ↑ Odersky, Martin. "A Tour of Scala 3 - Martin Odersky".
*youtube.com*. Scala Days Conferences. Retrieved 19 May 2021. - ↑ Peyton Jones, Washburn & Weirich 2004, p. 3.

- Applications

- Augustsson, Lennart; Petersson, Kent (September 1994). "Silly type families" (PDF).Cite journal requires
`|journal=`

(help) - Cheney, James; Hinze, Ralf (2003). "First-Class Phantom Types".
*Technical Report CUCIS TR2003-1901*. Cornell University. hdl:1813/5614. - Xi, Hongwei; Chen, Chiyan; Chen, Gang (2003).
*Guarded Recursive Datatype Constructors*.*Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'03)*. ACM Press. pp. 224–235. CiteSeerX 10.1.1.59.4622 . doi:10.1145/604131.604150. ISBN 978-1581136289. - Sheard, Tim; Pasalic, Emir (2004). "Meta-programming with built-in type equality".
*Proceedings of the Fourth International Workshop on Logical Frameworks and Meta-languages (LFM'04), Cork*.**199**: 49–65. doi: 10.1016/j.entcs.2007.11.012 .

- Semantics

- Patricia Johann and Neil Ghani (2008). "Foundations for Structured Programming with GADTs".
- Arie Middelkoop, Atze Dijkstra and S. Doaitse Swierstra (2011). "A lean specification for GADTs: system F with first-class equality proofs".
*Higher-Order and Symbolic Computation*.

- Type reconstruction

- Peyton Jones, Simon; Washburn, Geoffrey; Weirich, Stephanie (2004). "Wobbly types: type inference for generalised algebraic data types" (PDF).
*Technical Report MS-CIS-05-25*. University of Pennsylvania. - Peyton Jones, Simon; Vytiniotis, Dimitrios; Weirich, Stephanie; Washburn, Geoffrey (2006). "Simple Unification-based Type Inference for GADTs" (PDF).
*Proceedings of the ACM International Conference on Functional Programming (ICFP'06), Portland*. - Sulzmann, Martin; Wazny, Jeremy; Stuckey, Peter J. (2006). "A Framework for Extended Algebraic Data Types". In Hagiya, M.; Wadler, P. (eds.).
*8th International Symposium on Functional and Logic Programming (FLOPS 2006)*. Lecture Notes in Computer Science.**3945**. pp. 46–64. - Sulzmann, Martin; Schrijvers, Tom; Stuckey, Peter J. (2006). "Principal Type Inference for GHC-Style Multi-Parameter Type Classes". In Kobayashi, Naoki (ed.).
*Programming Languages and Systems: 4th Asian Symposium (APLAS 2006)*. Lecture Notes in Computer Science.**4279**. pp. 26–43. - Schrijvers, Tom; Peyton Jones, Simon; Sulzmann, Martin; Vytiniotis, Dimitrios (2009). "Complete and Decidable Type Inference for GADTs" (PDF).
*Proceedings of the ACM International Conference on Functional Programming (ICFP'09), Edinburgh*. - Lin, Chuan-kai (2010).
*Practical Type Inference for the GADT Type System*(PDF) (Doctoral Dissertation thesis). Portland State University.

- Other

- Andrew Kennedy and Claudio V. Russo. "Generalized algebraic data types and object-oriented programming".
*In Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming, systems, languages, and applications*. ACM Press, 2005.

Wikibooks has a book on the topic of: Haskell/GADT |

- Generalised Algebraic Datatype Page on the Haskell wiki
- Generalised Algebraic Data Types in the GHC Users' Guide
- Generalized Algebraic Data Types and Object-Oriented Programming
- GADTs – Haskell Prime – Trac
- Papers about type inference for GADTs, bibliography by Simon Peyton Jones
- Type inference with constraints, bibliography by Simon Peyton Jones
- Emulating GADTs in Java via the Yoneda lemma

In computer science, **functional programming** is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.

**Standard ML** (**SML**) is a general-purpose modular functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.

**Generic programming** is a style of computer programming in which algorithms are written in terms of types *to-be-specified-later* that are then *instantiated* when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication. Such software entities are known as *generics* in Ada, C#, Delphi, Eiffel, F#, Java, Nim, Python, Rust, Swift, TypeScript and Visual Basic .NET. They are known as *parametric polymorphism* in ML, Scala, Julia, and Haskell ; *templates* in C++ and D; and *parameterized types* in the influential 1994 book *Design Patterns*.

In computer programming, especially functional programming and type theory, an **algebraic data type** is a kind of composite type, i.e., a type formed by combining other types.

In computer science, a **tagged union**, also called a **variant**, **variant record**, **choice type**, **discriminated union**, **disjoint union**, **sum type** or **coproduct**, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a **tag** field explicitly indicates which one is in use. It can be thought of as a type that has several "cases", each of which should be handled correctly when that type is manipulated. This is critical in defining recursive datatypes, in which some component of a value may have the same type as the value itself, for example in defining a type for representing trees, where it is necessary to distinguish multi-node subtrees and leaves. Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type, since only one is in use at a time.

The **Glasgow Haskell Compiler** (**GHC**) is an open-source native code compiler for the functional programming language Haskell. It provides a cross-platform environment for the writing and testing of Haskell code and it supports numerous extensions, libraries, and optimisations that streamline the process of generating and executing code. GHC is the most commonly used Haskell compiler. The lead developers are Simon Peyton Jones and Simon Marlow.

**Hope** is a small functional programming language developed in the 1970s at the University of Edinburgh. It predates Miranda and Haskell and is contemporaneous with ML, also developed at the University. Hope was derived from NPL, a simple functional language developed by Rod Burstall and John Darlington in their work on program transformation. NPL and Hope are notable for being the first languages with call-by-pattern evaluation and algebraic data types.

Many programming language type systems support subtyping. For instance, if the type `Cat`

is a subtype of `Animal`

, then an expression of type `Cat`

should be substitutable wherever an expression of type `Animal`

is used.

**SIGPLAN** is the Association for Computing Machinery's Special Interest Group on programming languages.

In computer programming languages, a **recursive data type** is a data type for values that may contain other values of the same type. Data of recursive types are usually viewed as directed graphs.

**Scala** is a strong statically typed general-purpose programming language which supports both object-oriented programming and functional programming. Designed to be concise, many of Scala's design decisions are aimed to address criticisms of Java.

In programming languages and type theory, **parametric polymorphism** is a way to make a language more expressive, while still maintaining full static type-safety. Using parametric polymorphism, a function or a data type can be written generically so that it can handle values *identically* without depending on their type. Such functions and data types are called **generic functions** and **generic datatypes** respectively and form the basis of generic programming.

In computer science, a **type class** is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class `T`

and a type variable `a`

, and means that `a`

can only be instantiated to a type whose members support the overloaded operations associated with `T`

.

**Haxe** is an open source high-level cross-platform programming language and compiler that can produce applications and source code, for many different computing platforms from one code-base. It is free and open-source software, released under the MIT License. The compiler, written in OCaml, is released under the GNU General Public License (GPL) version 2.

The **expression problem** is a term used in discussing strengths and weaknesses of various programming paradigms and programming languages.

In computer science, **polymorphic recursion** refers to a recursive parametrically polymorphic function where the type parameter changes with each recursive invocation made, instead of staying constant. Type inference for polymorphic recursion is equivalent to semi-unification and therefore undecidable and requires the use of a semi-algorithm or programmer supplied type annotations.

**Haskell** is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial application, Haskell has pioneered a number of advanced programming language features such as type classes, which enable type-safe operator overloading. Haskell's main implementation is the Glasgow Haskell Compiler (GHC). It is named after logician Haskell Curry.

In computer science, a **type family** associates data types with other data types, using a type-level function defined by an open-ended collection of valid instances of input types and the corresponding output types.

**Flix** is a functional, imperative, and logic programming language developed at Aarhus University, with funding from the Independent Research Fund Denmark, and by a community of open source contributors. The Flix language supports algebraic data types, pattern matching, parametric polymorphism, currying, higher-order functions, extensible records, channel and process-based concurrency, and tail call elimination. Two notable features of Flix are its type and effect system and its support for first-class Datalog constraints.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.