# Enumerated type

In computer programming, an enumerated type (also called enumeration, enum, or factor in the R programming language, and a categorical variable in statistics) is a data type consisting of a set of named values called elements, members, enumeral, or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language. An enumerated type can be seen as a degenerate tagged union of unit type. A variable that has been declared as having an enumerated type can be assigned any of the enumerators as a value. In other words, an enumerated type has values that are different from each other, and that can be compared and assigned, but are not specified by the programmer as having any particular concrete representation in the computer's memory; compilers and interpreters can represent them arbitrarily.

For example, the four suits in a deck of playing cards may be four enumerators named Club, Diamond, Heart, and Spade, belonging to an enumerated type named suit. If a variable V is declared having suit as its data type, one can assign any of those four values to it.

Although the enumerators are usually distinct, some languages may allow the same enumerator to be listed twice in the type's declaration. The names of enumerators need not be semantically complete or compatible in any sense. For example, an enumerated type called color may be defined to consist of the enumerators Red, Green, Zebra, Missing, and Bacon. In some languages, the declaration of an enumerated type also intentionally defines an ordering of its members; in others, the enumerators are unordered; in others still, an implicit ordering arises from the compiler concretely representing enumerators as integers.

Some enumerated types may be built into the language. The Boolean type, for example, is often a pre-defined enumeration of the values False and True. Many languages allow users to define new enumerated types.

Values and variables of an enumerated type are usually implemented as fixed-length bit strings, often in a format and size compatible with some integer type. Some languages, especially system programming languages, allow the user to specify the bit combination to be used for each enumerator. In type theory, enumerated types are often regarded as tagged unions of unit types. Since such types are of the form $1+1+\cdots+1$, they may also be written as natural numbers.

## Rationale

Some early programming languages did not originally have enumerated types. If a programmer wanted a variable, for example myColor, to have a value of red, the variable red would be declared and assigned some arbitrary value, usually an integer constant. The variable red would then be assigned to myColor. Other techniques assigned arbitrary values to strings containing the names of the enumerators.

These arbitrary values were sometimes referred to as magic numbers since there often was no explanation as to how the numbers were obtained or whether their actual values were significant. These magic numbers could make the source code harder for others to understand and maintain.

Enumerated types, on the other hand, make the code more self-documenting. Depending on the language, the compiler could automatically assign default values to the enumerators thereby hiding unnecessary detail from the programmer. These values may not even be visible to the programmer (see information hiding). Enumerated types can also prevent a programmer from writing illogical code such as performing mathematical operations on the values of the enumerators. If the value of a variable that was assigned an enumerator were to be printed, some programming languages could also print the name of the enumerator rather than its underlying numerical value. A further advantage is that enumerated types can allow compilers to enforce semantic correctness. For instance: `myColor = TRIANGLE` can be forbidden, whilst `myColor = RED` is accepted, even if TRIANGLE and RED are both internally represented as 1.
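The contrast can be sketched in Python, chosen here only for illustration. The names `Color`, `Shape`, and `favourite` are hypothetical. Python checks types at run time, so the sketch shows the distinction between the two styles rather than compile-time rejection; a static checker could additionally flag an assignment of a `Shape` to a variable annotated as `Color`.

```python
from enum import Enum

# Magic-number style: both names are just the integer 1.
RED = 1
TRIANGLE = 1
my_color = TRIANGLE            # accepted silently
print(my_color == RED)         # True: the mistake cannot be detected


class Color(Enum):
    RED = 1
    GREEN = 2


class Shape(Enum):
    TRIANGLE = 1
    SQUARE = 2


favourite: Color = Color.RED
print(favourite.name)                 # RED (the name, not the number)
print(favourite == Shape.TRIANGLE)    # False, although both wrap 1
```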

Conceptually, an enumerated type is similar to a list of nominals (numeric codes), since each possible value of the type is assigned a distinctive natural number. A given enumerated type is thus a concrete implementation of this notion. When order is meaningful and/or used for comparison, then an enumerated type becomes an ordinal type.
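As a minimal sketch of an ordinal enumeration, the Python fragment below uses the standard `IntEnum` base class so that members compare by their declared integer positions. The type `Suit` and the helper `successor` are hypothetical names introduced for this illustration.

```python
from enum import IntEnum


class Suit(IntEnum):
    """Enumerators compare by their declared integer positions."""
    CLUBS = 0
    DIAMONDS = 1
    HEARTS = 2
    SPADES = 3


def successor(suit: Suit) -> Suit:
    """Return the next enumerator; raises ValueError past the last one."""
    return Suit(suit + 1)


print(Suit.CLUBS < Suit.SPADES)                              # True
print(successor(Suit.DIAMONDS).name)                         # HEARTS
print([s.name for s in sorted([Suit.SPADES, Suit.CLUBS])])   # ['CLUBS', 'SPADES']
```

A plain `Enum` in Python defines no ordering, so choosing `IntEnum` here is what turns the nominal type into an ordinal one.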

## Conventions

Programming languages tend to have their own, oftentimes multiple, programming styles and naming conventions. The variable assigned to an enumeration is usually a noun in singular form, and frequently follows either a PascalCase or uppercase convention, while lowercase and others are seen less frequently.

## Syntax in several programming languages

### Pascal and syntactically similar languages

#### Pascal

In Pascal, an enumerated type can be implicitly declared by listing the values in a parenthesised list:

`var suit: (clubs, diamonds, hearts, spades);`

The declaration will often appear in a type synonym declaration, such that it can be used for multiple variables:

`type`
`  cardsuit = (clubs, diamonds, hearts, spades);`
`  card = record`
`    suit: cardsuit;`
`    value: 1 .. 13;`
`  end;`
`var`
`  hand: array [1 .. 13] of card;`
`  trump: cardsuit;`

The order in which the enumeration values are given matters. An enumerated type is an ordinal type, and the `pred` and `succ` functions will give the prior or next value of the enumeration, and `ord` can convert enumeration values to their integer representation. Standard Pascal does not offer a conversion from arithmetic types to enumerations, however. Extended Pascal offers this functionality via an extended `succ` function. Some other Pascal dialects allow it via type-casts. Some modern descendants of Pascal, such as Modula-3, provide a special conversion syntax using a method called `VAL`; Modula-3 also treats `BOOLEAN` and `CHAR` as special pre-defined enumerated types and uses `ORD` and `VAL` for standard ASCII decoding and encoding.

Pascal-style languages also allow an enumeration to be used as an array index:

`var suitcount: array [cardsuit] of integer;`

#### Ada

In Ada, the use of "=" was replaced with "is", leaving the definition quite similar:

`type Cardsuit is (clubs, diamonds, hearts, spades);`

In addition to `Pred`, `Succ`, `Val` and `Pos`, Ada also supports simple string conversions via `Image` and `Value`.

Similar to C-style languages, Ada allows the internal representation of the enumeration to be specified:

`for Cardsuit use (clubs => 1, diamonds => 2, hearts => 4, spades => 8);`

Unlike C-style languages, Ada also allows the number of bits of the enumeration to be specified:

`for Cardsuit'Size use 4; -- 4 bits`

Additionally, enumerations can be used as array indexes, as in Pascal, and Ada defines attributes for enumerations:

`Shuffle : constant array (Cardsuit) of Cardsuit :=`
`  (Clubs    => Cardsuit'Succ (Clubs), -- see attributes of enumerations 'First, 'Last, 'Succ, 'Pred`
`   Diamonds => Hearts,                -- an explicit value`
`   Hearts   => Cardsuit'Last,         -- last enumeration value of type Cardsuit, e.g., spades`
`   Spades   => Cardsuit'First);       -- first enumeration value of type Cardsuit, e.g., clubs`

Like Modula-3, Ada treats `Boolean` and `Character` as special pre-defined (in package "`Standard`") enumerated types. Unlike Modula-3, Ada also lets programmers define their own character types:

`type Cards is ('7', '8', '9', 'J', 'Q', 'K', 'A');`

### C and syntactically similar languages

#### C

The original K&R dialect of the programming language C had no enumerated types. [1] In C, enumerations are created by explicit definitions (the `enum` keyword by itself does not cause allocation of storage) which use the `enum` keyword and are reminiscent of struct and union definitions:

`enum cardsuit { Clubs, Diamonds, Hearts, Spades };`
`struct card { enum cardsuit suit; short int value; } hand[13];`
`enum cardsuit trump;`

C exposes the integer representation of enumeration values directly to the programmer. Integers and enum values can be mixed freely, and all arithmetic operations on enum values are permitted. It is even possible for an enum variable to hold an integer that does not represent any of the enumeration values. In fact, according to the language definition, the above code will define `Clubs`, `Diamonds`, `Hearts`, and `Spades` as constants of type `int`, which will only be converted (silently) to `enum cardsuit` if they are stored in a variable of that type.

C also allows the programmer to choose the values of the enumeration constants explicitly, even without type. For example,

`enum cardsuit { Clubs = 1, Diamonds = 2, Hearts = 4, Spades = 8 };`

could be used to define a type that allows mathematical sets of suits to be represented as an `enum cardsuit` by bitwise logic operations.
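For comparison, the same power-of-two encoding can be sketched in Python using the standard library's `enum.Flag`. This is an illustrative analogue rather than C; the class name `Cardsuit` mirrors the declaration above, and `red_suits` is a hypothetical variable.

```python
from enum import Flag


class Cardsuit(Flag):
    CLUBS = 1
    DIAMONDS = 2
    HEARTS = 4
    SPADES = 8


# Bitwise OR builds a set of suits; bitwise AND tests membership.
red_suits = Cardsuit.DIAMONDS | Cardsuit.HEARTS

print(red_suits.value)                      # 6
print(bool(red_suits & Cardsuit.HEARTS))    # True
print(bool(red_suits & Cardsuit.SPADES))    # False
```

Because each enumerator occupies its own bit, any subset of the four suits maps to a distinct integer between 0 and 15.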

#### C#

Enumerated types in the C# programming language preserve most of the "small integer" semantics of C's enums. Some arithmetic operations are not defined for enums, but an enum value can be explicitly converted to an integer and back again, and an enum variable can have values that were not declared by the enum definition. For example, given

`enum CardSuit { Clubs, Diamonds, Spades, Hearts };`

the expressions `CardSuit.Diamonds + 1` and `CardSuit.Hearts - CardSuit.Clubs` are allowed directly (because it may make sense to step through the sequence of values or ask how many steps there are between two values), but `CardSuit.Hearts*CardSuit.Spades` is deemed to make less sense and is only allowed if the values are first converted to integers.

C# also provides the C-like feature of being able to define specific integer values for enumerations. By doing this it is possible to perform binary operations on enumerations, thus treating enumeration values as sets of flags. These flags can be tested using binary operations or with the Enum type's builtin 'HasFlag' method.

The enumeration definition defines names for the selected integer values and is syntactic sugar, as it is possible to assign to an enum variable other integer values that are not in the scope of the enum definition. [2] [3] [4]

#### C++

C++ has enumeration types that are directly inherited from C's and work mostly like these, except that an enumeration is a real type in C++, giving added compile-time checking. Also (as with structs), the C++ `enum` keyword is automatically combined with a typedef, so that instead of naming the type `enum name`, one simply names it `name`. This can be simulated in C using a typedef: `typedef enum {Value1, Value2} name;`

C++11 provides a second, type-safe enumeration type that is not implicitly converted to an integer type. It allows I/O streaming to be defined for that type. Additionally, the enumerators do not leak into the enclosing scope, so they must be qualified as `Type::enumerator`. This kind of enumeration is declared with the phrase `enum class`. For example:

`enum class Color {Red, Green, Blue};`

The underlying type is an implementation-defined integral type that is large enough to hold all enumerated values (it doesn't have to be the smallest possible type!). In C++ you can specify the underlying type directly. That allows "forward declarations" of enumerations:

`enum class Color : long {Red, Green, Blue}; // must fit in size and memory layout the type 'long'`
`enum class Shapes : char; // forward declaration. If later there are values defined that don't fit in 'char' it is an error.`

#### Go

Go uses the predeclared identifier `iota` to create enumerated constants. [5]

`type ByteSize float64`
`const (`
`    _ = iota // ignore first value by assigning to blank identifier`
`    KB ByteSize = 1 << (10 * iota)`
`    MB`
`    GB`
`)`

#### Java

The J2SE version 5.0 of the Java programming language added enumerated types whose declaration syntax is similar to that of C:

`enum Cardsuit { CLUBS, DIAMONDS, SPADES, HEARTS };`
`...`
`Cardsuit trump;`

The Java type system, however, treats enumerations as a type separate from integers, and intermixing of enum and integer values is not allowed. In fact, an enum type in Java is actually a special compiler-generated class rather than an arithmetic type, and enum values behave as global pre-generated instances of that class. Enum types can have instance methods and a constructor (the arguments of which can be specified separately for each enum value). All enum types implicitly extend the `Enum` abstract class. An enum type cannot be instantiated directly. [6]

Internally, each enum value contains an integer, corresponding to the order in which the values are declared in the source code, starting from 0. The programmer cannot set a custom integer for an enum value directly, but one can define overloaded constructors that assign arbitrary values to self-defined members of the enum class. Defining getters then allows access to those self-defined members. The internal integer can be obtained from an enum value using the `ordinal()` method, and the list of enum values of an enumeration type can be obtained in order using the `values()` method. It is generally discouraged for programmers to convert enums to integers and vice versa. [7] Enumerated types are `Comparable`, using the internal integer; as a result, they can be sorted.

The Java standard library provides utility classes to use with enumerations. The `EnumSet` class implements a `Set` of enum values; it is implemented as a bit array, which makes it very compact and as efficient as explicit bit manipulation, but safer. The `EnumMap` class implements a `Map` of enum values to objects. It is implemented as an array, with the integer value of the enum value serving as the index.

#### Perl

Dynamically typed languages in the syntactic tradition of C (e.g., Perl or JavaScript) do not, in general, provide enumerations. But in Perl programming the same result can be obtained with the shorthand strings list and hashes (possibly slices):

`my @enum = qw(Clubs Diamonds Hearts Spades);`
`my (%set1, %set2);`
`@set1{@enum} = (); # all cleared`
`@set2{@enum} = (1) x @enum; # all set to 1`
`$set1{Clubs} ... # false`
`$set2{Diamonds} ... # true`

#### Raku

Raku (formerly known as Perl 6) supports enumerations. There are multiple ways to declare enumerations in Raku, all creating a back-end Map.

`enum Cat <sphynx siamese bengal shorthair other>; # Using "quote-words"`
`enum Cat ('sphynx', 'siamese', 'bengal', 'shorthair', 'other'); # Using a list`
`enum Cat (sphynx => 0, siamese => 1, bengal => 2, shorthair => 3, other => 4); # Using Pair constructors`
`enum Cat (:sphynx(0), :siamese(1), :bengal(2), :shorthair(3), :other(4)); # Another way of using Pairs; :0sphynx also works`

#### Rust

Though Rust uses the `enum` keyword like C, it uses it to describe tagged unions, which enums can be considered a degenerate form of. Rust’s enums are therefore much more flexible and can contain struct and tuple variants.

`enum Message {`
`    Quit,`
`    Move { x: i32, y: i32 }, // struct`
`    Write(String), // single-element tuple`
`    ChangeColor(i32, i32, i32), // three-element tuple`
`}`

#### Swift

In C, enumerations assign related names to a set of integer values. In Swift, enumerations are much more flexible and need not provide a value for each case of the enumeration. If a value (termed a raw value) is provided for each enumeration case, the value can be a string, a character, or a value of any integer or floating-point type.

Alternatively, enumeration cases can specify associated values of any type to be stored along with each different case value, much as unions or variants do in other languages. One can define a common set of related cases as part of one enumeration, each of which has a different set of values of appropriate types associated with it.

In Swift, enumerations are a first-class type. They adopt many features traditionally supported only by classes, such as computed properties to provide additional information about the enumeration’s current value, and instance methods to provide functionality related to the values the enumeration represents. Enumerations can also define initializers to provide an initial case value and can be extended to expand their functionality beyond their original implementation; and can conform to protocols to provide standard functionality.

`enum CardSuit {`
`    case clubs`
`    case diamonds`
`    case hearts`
`    case spades`
`}`

Unlike C and Objective-C, Swift enumeration cases are not assigned a default integer value when they are created. In the CardSuit example above, clubs, diamonds, hearts, and spades do not implicitly equal 0, 1, 2 and 3. Instead, the different enumeration cases are fully-fledged values in their own right, with an explicitly-defined type of CardSuit.

Multiple cases can appear on a single line, separated by commas:

`enum CardSuit {`
`    case clubs, diamonds, hearts, spades`
`}`

When working with enumerations that store integer or string raw values, one doesn’t need to explicitly assign a raw value for each case because Swift will automatically assign the values.

For instance, when integers are used for raw values, the implicit value for each case is one more than the previous case. If the first case doesn’t have a value set, its value is 0.

The enumeration below uses integer raw values to represent each planet’s order from the sun:

`enum Planet: Int {`
`    case mercury = 1, venus, earth, mars, jupiter, saturn, uranus, neptune`
`}`

In the example above, Planet.mercury has an explicit raw value of 1, Planet.venus has an implicit raw value of 2, and so on.

#### TypeScript

TypeScript adds an `enum` data type to JavaScript.

`enum Cardsuit {Clubs, Diamonds, Hearts, Spades};`
`var c: Cardsuit = Cardsuit.Diamonds;`

By default, enums number members starting at 0; this can be overridden by setting the value of the first:

`enum Cardsuit {Clubs = 1, Diamonds, Hearts, Spades};`
`var c: Cardsuit = Cardsuit.Diamonds;`

All the values can be set:

`enum Cardsuit {Clubs = 1, Diamonds = 2, Hearts = 4, Spades = 8};`
`var c: Cardsuit = Cardsuit.Diamonds;`

TypeScript supports mapping the numeric value to its name. For example, this finds the name of the value 2:

`enum Cardsuit {Clubs = 1, Diamonds, Hearts, Spades};`
`var suitName: string = Cardsuit[2];`
`alert(suitName);`

#### Python

An `enum` module was added to the Python standard library in version 3.4.

`from enum import Enum`
`class Cards(Enum):`
`    clubs = 1`
`    diamonds = 2`
`    hearts = 3`
`    spades = 4`

There is also a functional API for creating enumerations with automatically generated indices (starting with one):

`Cards = Enum("Cards", ["clubs", "diamonds", "hearts", "spades"])`
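The automatic numbering can be checked with a short sketch that lists each member's name and value and looks members up both ways. It uses only the standard `enum` module.

```python
from enum import Enum

Cards = Enum("Cards", ["clubs", "diamonds", "hearts", "spades"])

# Members are numbered from 1 in the order given.
print([(member.name, member.value) for member in Cards])
# [('clubs', 1), ('diamonds', 2), ('hearts', 3), ('spades', 4)]

# Lookup works by value or by name.
print(Cards(2) is Cards.diamonds)        # True
print(Cards["spades"] is Cards.spades)   # True
```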

Python enumerations do not enforce semantic correctness (a meaningless comparison to an incompatible enumeration always returns False rather than raising a TypeError):

`>>> Color = Enum("Color", ["red", "green", "blue"])`
`>>> Shape = Enum("Shape", ["circle", "triangle", "square", "hexagon"])`
`>>> def has_vertices(shape):`
`...     return shape != Shape.circle`
`...`
`>>> has_vertices(Color.green)`
`True`
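One possible workaround, sketched here under the assumption that the caller wants a hard failure, is an explicit `isinstance` check. This strict variant of `has_vertices` is an illustration, not a standard-library feature.

```python
from enum import Enum

Color = Enum("Color", ["red", "green", "blue"])
Shape = Enum("Shape", ["circle", "triangle", "square", "hexagon"])


def has_vertices(shape):
    """Strict variant: reject anything that is not a Shape member."""
    if not isinstance(shape, Shape):
        raise TypeError(f"expected a Shape, got {shape!r}")
    return shape is not Shape.circle


print(has_vertices(Shape.square))   # True
try:
    has_vertices(Color.green)
except TypeError as err:
    print("rejected:", err)
```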

#### Fortran

Fortran only has enumerated types for interoperability with C; hence, the semantics is similar to C and, as in C, the enum values are just integers and no further type check is done. The C example from above can be written in Fortran as

`enum, bind(C)`
`  enumerator :: CLUBS = 1, DIAMONDS = 2, HEARTS = 4, SPADES = 8`
`end enum`

#### Visual Basic/VBA

Enumerated datatypes in Visual Basic (up to version 6) and VBA are automatically assigned the "`Long`" datatype and also become a datatype themselves:

`'Zero-based`
`Enum CardSuit`
`    Clubs`
`    Diamonds`
`    Hearts`
`    Spades`
`End Enum`
`Sub EnumExample()`
`    Dim suit As CardSuit`
`    suit = Diamonds`
`    MsgBox suit`
`End Sub`

Example code in VB.NET:

`Enum CardSuit`
`    Clubs`
`    Diamonds`
`    Hearts`
`    Spades`
`End Enum`
`Sub EnumExample()`
`    Dim suit As CardSuit`
`    suit = CardSuit.Diamonds`
`    MessageBox.show(suit)`
`End Sub`

#### Lisp

Common Lisp uses the member type specifier, e.g.,

`(deftype cardsuit () '(member club diamond heart spade))`

that states that object is of type cardsuit if it is `#'eql` to club, diamond, heart or spade. The member type specifier is not valid as a Common Lisp Object System (CLOS) parameter specializer, however. Instead, `(eql atom)`, which is the equivalent to `(member atom)` may be used (that is, only one member of the set may be specified with an eql type specifier, however, it may be used as a CLOS parameter specializer.) In other words, to define methods to cover an enumerated type, a method must be defined for each specific element of that type.

`` (deftype finite-element-set-type (&rest elements) `(member ,@elements)) ``

may be used to define arbitrary enumerated types at runtime. For instance

`(finite-element-set-type club diamond heart spade)`

would refer to a type equivalent to the prior definition of cardsuit, as of course would simply have been using

`(member club diamond heart spade)`

but may be less confusing with the function `#'member` for stylistic reasons.

## Algebraic data type in functional programming

In functional programming languages in the ML lineage (e.g., Standard ML (SML), OCaml, and Haskell), an algebraic data type with only nullary constructors can be used to implement an enumerated type. For example (in the syntax of SML signatures):

`datatype cardsuit = Clubs | Diamonds | Hearts | Spades`
`type card = { suit: cardsuit, value: int }`
`val hand : card list`
`val trump : cardsuit`

In these languages the small-integer representation is completely hidden from the programmer, if indeed such a representation is employed by the implementation. However, Haskell has the `Enum` type class which a type can derive or implement to get a mapping between the type and `Int`.

## Databases

Some databases support enumerated types directly. MySQL provides an enumerated type `ENUM` with allowable values specified as strings when a table is created. The values are stored as numeric indices with the empty string stored as 0, the first string value stored as 1, the second string value stored as 2, etc. Values can be stored and retrieved as numeric indexes or string values.
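The index mapping described above can be modelled with a toy Python sketch. It only illustrates the numbering scheme and is not how MySQL is implemented; `ALLOWED`, `to_index`, and `from_index` are hypothetical names, and the list of suits is an assumed column definition.

```python
# Toy model of the index mapping only; not MySQL's implementation.
ALLOWED = ["Clubs", "Diamonds", "Hearts", "Spades"]   # hypothetical column values


def to_index(value: str) -> int:
    """Map a string to its stored index: '' -> 0, first value -> 1, and so on."""
    if value == "":
        return 0
    return ALLOWED.index(value) + 1   # ValueError for a string not in the list


def from_index(index: int) -> str:
    """Map a stored index back to its string."""
    if index == 0:
        return ""
    if not 1 <= index <= len(ALLOWED):
        raise ValueError(f"index {index} is out of range")
    return ALLOWED[index - 1]


print(to_index("Clubs"))    # 1
print(from_index(2))        # Diamonds
```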

## XML Schema

XML Schema supports enumerated types through the enumeration facet used for constraining most primitive datatypes such as strings.

`<xs:element name="cardsuit">`
`  <xs:simpleType>`
`    <xs:restriction base="xs:string">`
`      <xs:enumeration value="Clubs"/>`
`      <xs:enumeration value="Diamonds"/>`
`      <xs:enumeration value="Hearts"/>`
`      <xs:enumeration value="Spades"/>`
`    </xs:restriction>`
`  </xs:simpleType>`
`</xs:element>`
