Top type

In mathematical logic and computer science, some type theories and type systems include a top type, commonly denoted top or by the symbol ⊤. The top type is sometimes also called the universal type or universal supertype, because all other types in the type system of interest are subtypes of it; in most cases, it contains every possible object of the type system. It contrasts with the bottom type, or universal subtype, which is a subtype of every other type and often contains no members at all.

Support in programming languages

Several typed programming languages provide explicit support for the top type.

In statically-typed languages, there are two different, often confused, concepts when discussing the top type.

  1. A universal base class or other item at the top of a run time class hierarchy (often relevant in object-oriented programming) or type hierarchy. It is often possible to create objects with this (run time) type, or to find it when examining the type hierarchy programmatically in languages that support doing so.
  2. A (compile time) static type whose variables can be assigned any value (or a subset thereof, such as any object pointer value), similar to dynamic typing.

The first concept often implies the second: if a universal base class exists, then a variable that can point to an object of this class can also point to an object of any class. However, several languages have types in the second sense above (e.g., void * in C++, id in Objective-C, interface{} in Go): static types whose variables can accept any object value, but which do not correspond to real run time types that an object can have in the type system, so they are not top types in the first sense.

In dynamically-typed languages, the second concept does not exist (any value can be assigned to any variable anyway), so only the first (class hierarchy) is discussed. This article tries to stay with the first concept when discussing top types, but also mentions the second concept in languages where it is significant.

Most object-oriented programming languages include a universal base class:
Name | Languages
Object | Smalltalk, JavaScript, Ruby (pre-1.9.2), [1] and some others
java.lang.Object | Java. Often written without the package prefix, as Object. It is not a supertype of the primitive types; however, since Java 1.5, autoboxing allows implicit or explicit type conversion of a primitive value to Object, e.g., ((Object)42).toString()
System.Object [2] | C#, Visual Basic .NET, and other .NET Framework languages
std::any | C++ since C++17
object | Python since the type/class unification [3] in version 2.2 (new-style objects only; old-style objects in 2.x lack this as a base class)
TObject | Object Pascal
t | Lisp, many dialects such as Common Lisp
Any? | Kotlin [4]
Any | Scala, [5] Swift [6]
ANY | Eiffel [7]
UNIVERSAL | Perl 5
Variant | Visual Basic up to version 6
interface{} | Go
BasicObject | Ruby (version 1.9.2 and beyond)
any and unknown [8] | TypeScript (unknown was introduced in version 3.0 [9])
mixed | PHP (as of version 8.0)
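
As an illustration of one of the entries above, the following minimal sketch checks, in Python 3, that object behaves as a universal base class. The names Animal and is_universal_base are hypothetical and exist only for this example.

```python
class Animal:
    """A user-defined class declared without an explicit base class."""


def is_universal_base(cls: type, samples: list) -> bool:
    """Return True if every sample value is an instance of cls."""
    return all(isinstance(value, cls) for value in samples)


# A mix of numbers, strings, None, containers, instances, classes and functions.
samples = [42, "text", 3.14, None, [1, 2], Animal(), Animal, len]

# object is the last entry in every class's method resolution order (MRO).
assert Animal.__mro__ == (Animal, object)
assert int.__mro__[-1] is object

# Every sample, including the class and the built-in function, is an object.
assert is_universal_base(object, samples)

# An ordinary class such as int does not play that role.
assert not is_universal_base(int, samples)
```

Because the check runs against run time values, it illustrates the first of the two concepts discussed above (a class at the top of the run time hierarchy) rather than a static type.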

The following object-oriented languages have no universal base class:

Other languages

Languages that are not object-oriented usually have neither a universal supertype nor support for subtype polymorphism.

While Haskell purposefully lacks subtyping, it has several other forms of polymorphism, including parametric polymorphism. The most general type parameter is an unconstrained type variable a (one with no type class constraint). In Rust, <T: ?Sized> is the most general parameter (<T> is not, as it implies the Sized trait by default).

The top type is used as a generic type, more so in languages without parametric polymorphism. For example, before generics were introduced in Java 5, collection classes in the Java library (excluding Java arrays) held references of type Object. In this way, any non-intrinsic type could be inserted into a collection. The top type is also often used to hold objects of unknown type.
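
The same pattern can be sketched in Python, using object as the element type. This is only an illustration of the idea, not a reproduction of the Java collection classes; ObjectStack is a hypothetical name.

```python
from typing import List


class ObjectStack:
    """A container in the pre-generics style: elements are typed only as object."""

    def __init__(self) -> None:
        self._items: List[object] = []

    def push(self, item: object) -> None:
        # Any value is accepted, because every value is an instance of object.
        self._items.append(item)

    def pop(self) -> object:
        # The declared result type is only object, so a caller has to recover
        # the specific type before relying on type-specific operations.
        return self._items.pop()


stack = ObjectStack()
stack.push("hello")
stack.push(7)

# Values of unrelated types share one container and come back in LIFO order.
assert stack.pop() == 7
assert stack.pop() == "hello"
```

The cost of this design is that the container's signature says nothing about its contents, which is the gap that parametric polymorphism closes.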

The top type may also be seen as the implied type of non-statically typed languages. Languages with run time typing often provide downcasting (or type refinement) to allow discovering a more specific type for an object at run time. In C++, by contrast, downcasting from void * cannot be done safely: the language run time does not detect failed downcasts.
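
A checked downcast of this kind might look as follows in Python. The helper checked_cast is a hypothetical name introduced for this sketch, not a standard library function; it relies only on the built-in isinstance.

```python
from typing import Type, TypeVar

T = TypeVar("T")


def checked_cast(value: object, target: Type[T]) -> T:
    """Refine a value of static type object to target, or raise TypeError."""
    if not isinstance(value, target):
        raise TypeError(
            f"cannot downcast {type(value).__name__} to {target.__name__}"
        )
    return value


unknown: object = "top type"

# Succeeds: the run time type of the value really is str.
text = checked_cast(unknown, str)
assert text.upper() == "TOP TYPE"

# Fails in a detectable way instead of silently misinterpreting the value.
try:
    checked_cast(unknown, int)
except TypeError as error:
    message = str(error)

assert message == "cannot downcast str to int"
```

The failure is reported by the run time check rather than left undefined, which is the property that a cast from void * lacks.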

In languages with a structural type system, the empty structure serves as a top type. For example, objects in OCaml are structurally typed; the empty object type (the type of objects with no methods), < >, is the top type of object types. Any OCaml object can be explicitly upcast to this type, although the result would be of no use. Go also uses structural typing, and all types implement the empty interface, interface{}, which has no methods but may still be downcast back to a more specific type.

In logic

The notion of top is also found in propositional calculus, where it corresponds to a formula that is true in every possible interpretation. It has a similar meaning in predicate calculus. In description logic, top refers to the most general concept, which subsumes all other concepts. This is intuitively like the use of the top type in programming languages. For example, in the Web Ontology Language (OWL), which supports various description logics, top corresponds to the class owl:Thing, and all classes are subclasses of owl:Thing. (The bottom type, or empty set, corresponds to owl:Nothing.)
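
The propositional reading can be sketched in Lean 4, where the proposition True plays the role of top. This is a small illustration under that assumption, not part of any cited source: top holds with no hypotheses, and every proposition implies it, much as every type is a subtype of the top type.

```lean
-- Top (True) is provable without any assumptions.
example : True := True.intro

-- Every proposition p implies top, whatever p says.
example (p : Prop) : p → True := fun _ => True.intro
```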

Notes

  1. "Class: BasicObject (Ruby 1.9.2)". Retrieved April 7, 2014.
  2. System.Object
  3. Python type/class unification
  4. Matilla, Hugo (2019-02-27). "Kotlin basics: types. Any, Unit and Nothing". Medium. Retrieved 2019-09-16.
  5. "An Overview of the Scala Programming Language" (PDF). 2006. Retrieved April 7, 2014.
  6. "Types — The Swift Programming Language (Swift 5.3)". docs.swift.org. Retrieved 2020-10-02.
  7. "Standard ECMA-367. Eiffel: Analysis, Design and Programming Language" (PDF). 2006. Retrieved March 10, 2016.
  8. https://2ality.com/2020/06/any-unknown-typescript.html
  9. https://mariusschulz.com/blog/the-unknown-type-in-typescript
