Top type

Last updated

In mathematical logic and computer science, some type theories and type systems include a top type that is commonly denoted with top or the symbol ⊤. The top type is sometimes called also universal type, or universal supertype as all other types in the type system of interest are subtypes of it, and in most cases, it contains every possible object of the type system. It is in contrast with the bottom type, or the universal subtype, which every other type is supertype of and it is often that the type contains no members at all.

Contents

Support in programming languages

Several typed programming languages provide explicit support for the top type.

In statically-typed languages, there are two different, often confused, concepts when discussing the top type.

  1. A universal base class or other item at the top of a run time class hierarchy (often relevant in object-oriented programming) or type hierarchy; it is often possible to create objects with this (run time) type, or it could be found when one examines the type hierarchy programmatically, in languages that support it
  2. A (compile time) static type in the code whose variables can be assigned any value (or a subset thereof, like any object pointer value), similar to dynamic typing

The first concept often implies the second, i.e., if a universal base class exists, then a variable that can point to an object of this class can also point to an object of any class. However, several languages have types in the second regard above (e.g., void * in C++, id in Objective-C, interface {} in Go), static types which variables can accept any object value, but which do not reflect real run time types that an object can have in the type system, so are not top types in the first regard.

In dynamically-typed languages, the second concept does not exist (any value can be assigned to any variable anyway), so only the first (class hierarchy) is discussed. This article tries to stay with the first concept when discussing top types, but also mention the second concept in languages where it is significant.

Most object-oriented programming languages include a universal base class:
NameLanguages
Object Smalltalk, JavaScript, Ruby (pre-1.9.2), [1] and some others.
java.lang.Object Java. Often written without the package prefix, as Object. Also, it is not a supertype of the primitive types; however, since Java 1.5, autoboxing allows implicit or explicit type conversion of a primitive value to Object, e.g., ((Object)42).toString()
System.Object [2] C#, Visual Basic .NET, and other .NET Framework languages
std::any C++ since C++17
object Python since the type/class unification [3] in version 2.2 (new-style objects only; old-style objects in 2.x lack this as a base class)
TObject Object Pascal
t Lisp, many dialects such as Common Lisp
Any? Kotlin [4]
Any Scala, [5] Swift, [6] Julia [7]
ANY Eiffel [8]
UNIVERSAL Perl 5
Variant Visual Basic up to version 6
interface{} Go
BasicObject Ruby (version 1.9.2 and beyond)
any and unknown [9] TypeScript (with unknown having been introduced in version 3.0 [10] )
mixed PHP (as of version 8.0)

The following object-oriented languages have no universal base class:

Other languages

Languages that are not object-oriented usually have no universal supertype, or subtype polymorphism support.

While Haskell purposefully lacks subtyping, it has several other forms of polymorphism including parametric polymorphism. The most generic type class parameter is an unconstrained parameter a (without a type class constraint). In Rust, <T: ?Sized> is the most generic parameter (<T> is not, as it implies the Sized trait by default).

The top type is used as a generic type, more so in languages without parametric polymorphism. For example, before introducing generics in Java 5, collection classes in the Java library (excluding Java arrays) held references of type Object. In this way, any non-intrinsic type could be inserted into a collection. The top type is also often used to hold objects of unknown type.

The top type may also be seen as the implied type of non-statically typed languages. Languages with run time typing often provide downcasting (or type refinement) to allow discovering a more specific type for an object at run time. In C++, downcasting from void * cannot be done in a safe way, where failed downcasts are detected by the language run time.

In languages with a structural type system, the empty structure serves as a top type. For example, objects in OCaml are structurally typed; the empty object type (the type of objects with no methods), < >, is the top type of object types. Any OCaml object can be explicitly upcasted to this type, although the result would be of no use. Go also uses structural typing; and all types implement the empty interface: interface {}, which has no methods, but may still be downcast back to a more specific type.

In logic

The notion of top is also found in propositional calculus, corresponding to a formula which is true in every possible interpretation. It has a similar meaning in predicate calculus. In description logic, top is used to refer to the set of all concepts. This is intuitively like the use of the top type in programming languages. For example, in the Web Ontology Language (OWL), which supports various description logics, top corresponds to the class owl:Thing, where all classes are subclasses of owl:Thing. (the bottom type or empty set corresponds to owl:Nothing).

See also

Notes

  1. "Class: BasicObject (Ruby 1.9.2)" . Retrieved April 7, 2014.
  2. System.Object
  3. Python type/class unification
  4. Matilla, Hugo (2019-02-27). "Kotlin basics: types. Any, Unit and Nothing". Medium. Retrieved September 16, 2019.
  5. "An Overview of the Scala Programming Language" (PDF). 2006. Retrieved April 7, 2014.
  6. "Types — The Swift Programming Language (Swift 5.3)". docs.swift.org. Retrieved November 2, 2020.
  7. "Types · The Julia Language" . Retrieved May 15, 2021.
  8. "Standard ECMA-367. Eiffel: Analysis, Design and Programming Language" (PDF). 2006. Retrieved March 10, 2016.
  9. "The top types 'any' and 'unknown' in TypeScript".
  10. "The unknown Type in TypeScript". 15 May 2019.

Related Research Articles

C++ General-purpose programming language

C++ is a general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significantly over time, and modern C++ now has object-oriented, generic, and functional features in addition to facilities for low-level memory manipulation. It is almost always implemented as a compiled language, and many vendors provide C++ compilers, including the Free Software Foundation, LLVM, Microsoft, Intel, Oracle, and IBM, so it is available on many platforms.

Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication. Such software entities are known as generics in Ada, C#, Delphi, Eiffel, F#, Java, Nim, Python, Go, Rust, Swift, TypeScript and Visual Basic .NET. They are known as parametric polymorphism in ML, Scala, Julia, and Haskell ; templates in C++ and D; and parameterized types in the influential 1994 book Design Patterns.

In computer science, an object can be a variable, a data structure, a function, or a method. As regions of memory, they contain value and are referenced by identifiers.

In programming languages, a type system is a logical system comprising a set of rules that assigns a property called a type to the various constructs of a computer program, such as variables, expressions, functions or modules. These types formalize and enforce the otherwise implicit categories the programmer uses for algebraic data types, data structures, or other components. The main purpose of a type system is to reduce possibilities for bugs in computer programs by defining interfaces between different parts of a computer program, and then checking that the parts have been connected in a consistent way. This checking can happen statically, dynamically, or as a combination of both. Type systems have other purposes as well, such as expressing business rules, enabling certain compiler optimizations, allowing for multiple dispatch, providing a form of documentation, etc.

In programming language theory, subtyping is a form of type polymorphism in which a subtype is a datatype that is related to another datatype by some notion of substitutability, meaning that program elements, typically subroutines or functions, written to operate on elements of the supertype can also operate on elements of the subtype. If S is a subtype of T, the subtyping relation is often written S <: T, to mean that any term of type S can be safely used in a context where a term of type T is expected. The precise semantics of subtyping crucially depends on the particulars of what "safely used in a context where" means in a given programming language. The type system of a programming language essentially defines its own subtyping relation, which may well be trivial, should the language support no conversion mechanisms.

In programming language theory and type theory, polymorphism is the provision of a single interface to entities of different types or the use of a single symbol to represent multiple different types. The concept is borrowed from a principle in biology where an organism or species can have many different forms or stages.

In database design, object-oriented programming and design, has-a is a composition relationship where one object "belongs to" another object, and behaves according to the rules of ownership. In simple words, has-a relationship in an object is called a member field of an object. Multiple has-a relationships will combine to form a possessive hierarchy.

In knowledge representation, object-oriented programming and design, is-a is a subsumption relationship between abstractions, wherein one class A is a subclass of another class B . In other words, type A is a subtype of type B when A's specification implies B's specification. That is, any object that satisfies A's specification also satisfies B's specification, because B's specification is weaker.

Liskov substitution principle Object-oriented programming principle

The Liskov substitution principle (LSP) is a particular definition of a subtyping relation, called strong behavioral subtyping, that was initially introduced by Barbara Liskov in a 1988 conference keynote address titled Data abstraction and hierarchy. It is based on the concept of "substitutability" – a principle in object-oriented programming stating that an object and a sub-object must be interchangeable without breaking the program. It is a semantic rather than merely syntactic relation, because it intends to guarantee semantic interoperability of types in a hierarchy, object types in particular. Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:

Subtype Requirement: Let be a property provable about objects of type T. Then should be true for objects of type S where S is a subtype of T.

In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use. It can be thought of as a type that has several "cases", each of which should be handled correctly when that type is manipulated. This is critical in defining recursive datatypes, in which some component of a value may have the same type as the value itself, for example in defining a type for representing trees, where it is necessary to distinguish multi-node subtrees and leaves. Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type, since only one is in use at a time.

In computer science, type conversion, type casting, type coercion, and type juggling are different ways of changing an expression from one data type to another. An example would be the conversion of an integer value into a floating point value or its textual representation as a string, and vice versa. Type conversions can take advantage of certain features of type hierarchies or data representations. Two important aspects of a type conversion are whether it happens implicitly (automatically) or explicitly, and whether the underlying data representation is converted from one representation into another, or a given representation is merely reinterpreted as the representation of another data type. In general, both primitive and compound data types can be converted.

In computer programming, run-time type information or run-time type identification (RTTI) is a feature of some programming languages that exposes information about an object's data type at runtime. Run-time type information may be available for all types or only to types that explicitly have it. Run-time type information is a specialization of a more general concept called type introspection.

In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that is, some facilities are type-safe and their usage will not result in type errors, while other facilities in the same language may be type-unsafe and a program using them may encounter type errors. The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values that are not of the appropriate data type, e.g., adding a string to an integer when there's no definition on how to handle this case. This classification is partly based on opinion.

In object-oriented programming, inheritance is the mechanism of basing an object or class upon another object or class, retaining similar implementation. Also defined as deriving new classes from existing ones such as super class or base class and then forming them into a hierarchy of classes. In most class-based object-oriented languages, an object created through inheritance, a "child object", acquires all the properties and behaviors of the "parent object", with the exception of: constructors, destructor, overloaded operators and friend functions of the base class. Inheritance allows programmers to create classes that are built upon existing classes, to specify a new implementation while maintaining the same behaviors, to reuse code and to independently extend original software via public classes and interfaces. The relationships of objects or classes through inheritance give rise to a directed acyclic graph.

In programming languages and type theory, parametric polymorphism is a way to make a language more expressive, while still maintaining full static type-safety. Using parametric polymorphism, a function or a data type can be written generically so that it can handle values identically without depending on their type. Such functions and data types are called generic functions and generic datatypes respectively and form the basis of generic programming.

Generics are a facility of generic programming that were added to the Java programming language in 2004 within version J2SE 5.0. They were designed to extend Java's type system to allow "a type or method to operate on objects of various types while providing compile-time type safety". The aspect compile-time type safety was not fully achieved, since it was shown in 2016 that it is not guaranteed in all cases.

In class-based programming, downcasting or type refinement is the act of casting a reference of a base class to one of its derived classes.

In computer programming, a constant is a value that should not be altered by the program during normal execution, i.e., the value is constant. When associated with an identifier, a constant is said to be "named," although the terms "constant" and "named constant" are often used interchangeably. This is contrasted with a variable, which is an identifier with a value that can be changed during normal execution, i.e., the value is variable. Constants are useful for both programmers and compilers: For programmers they are a form of self-documenting code and allow reasoning about correctness, while for compilers they allow compile-time and run-time checks that verify that constancy assumptions are not violated, and allow or simplify some compiler optimizations.

Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was selected by NeXT for its NeXTSTEP operating system. Objective-C was the standard programming language supported by Apple for developing macOS and iOS applications using their respective application programming interfaces (APIs), Cocoa and Cocoa Touch, until the introduction of Swift in 2014.

References