Type–token distinction

Although this flock is made of the same type of bird, each individual bird is a different token. (50 MB video of a flock of birds in Rome)

The type–token distinction is the difference between a word referring to a class of objects and the same word referring to an individual instance of an object. For example, the sentence "A rose is a rose is a rose" could be said to contain three words, the word types "a", "rose", and "is"; or to contain eight words, the word tokens "a", "rose", "is", "a", "rose", "is", "a", "rose". The distinction is important in disciplines such as logic, linguistics, metalogic, typography, and computer programming.
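The two counts in the example above can be reproduced with a minimal Python sketch. It assumes that words are separated by whitespace and that "A" and "a" count as the same word type; both are simplifications chosen for illustration, not part of the distinction itself.

```python
sentence = "A rose is a rose is a rose"

# Word tokens: every individual word in the sentence, in order.
# Lower-casing is an assumption so that "A" and "a" share one type.
tokens = sentence.lower().split()

# Word types: the distinct words, ignoring repetition.
types = set(tokens)

print(len(tokens))    # 8
print(sorted(types))  # ['a', 'is', 'rose']
```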


The sentence "they drive the same car" is ambiguous. Do they drive the same type of car (the same model) or the same instance of a car type (a single vehicle)? Clarity requires us to distinguish words that represent abstract types from words that represent objects that embody or exemplify types. The type–token distinction separates types (abstract descriptive concepts) from tokens (objects that instantiate concepts).

For example: "bicycle" represents a type: the concept of a bicycle; whereas "my bicycle" represents a token of that type: an object that instantiates that type. In the sentence "the bicycle is becoming more popular" the word "bicycle" represents a type that is a concept; whereas in the sentence "the bicycle is in the garage" the word "bicycle" represents a token: a particular object.

(The distinction in computer programming between classes and objects is related, though in this context "class" sometimes refers to a set of objects (with class-level attributes or operations) rather than to a description of an object in the set, as "type" would.)
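As a loose illustration of the "same car" ambiguity in programming terms, the sketch below uses a hypothetical `Car` class. The names `Car`, `same_type` and `same_token` are invented for this example, and treating "same model" as "same type" is an assumption made only to mirror the sentence above.

```python
class Car:
    """Hypothetical class: a description shared by many individual cars."""

    def __init__(self, model):
        self.model = model


def same_type(a, b):
    # "The same car" in the type sense: same class and same model.
    return type(a) is type(b) and a.model == b.model


def same_token(a, b):
    # "The same car" in the token sense: one and the same object.
    return a is b


alice_car = Car("hatchback")
bob_car = Car("hatchback")   # a second vehicle of the same model
shared_car = alice_car       # another name for Alice's vehicle

print(same_type(alice_car, bob_car))      # True
print(same_token(alice_car, bob_car))     # False
print(same_token(alice_car, shared_car))  # True
```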


The words type, concept, property, quality, feature and attribute (all used in describing things) tend to be used with different verbs. For example, suppose a rose bush is defined as a plant that is "thorny", "flowering" and "bushy". You might say a rose bush instantiates these three types, or embodies these three concepts, or exhibits these three properties, or possesses these three qualities, features or attributes.

Property types (e.g. "height in metres" or "thorny") are often understood ontologically as concepts. Property instances (e.g. height = 1.74) are sometimes understood as measured values, and sometimes understood as sensations or observations of reality.

Some types exist as descriptions of objects, but not as tangible physical objects. One can show someone a particular bicycle, but cannot show someone, explicitly, the type "bicycle", as in "the bicycle is popular". Such use of typologically similar yet different semantic properties appears in mental and documented models, and is often referenced in everyday conversation.

Some say tokens are tangible objects that exist in space and time as physical matter or energy. However, tokens can be intangible objects of types such as "thought", "tennis match", "government" and "act of kindness".


Closely connected with the type–token distinction is the distinction between an object, or type of object, and an occurrence of it. In this sense, an occurrence is not necessarily a token. Consider the sentence "A rose is a rose is a rose". We may equally correctly state that there are eight or three words in the sentence. There are, in fact, three word types in the sentence: "rose", "is" and "a". There are eight word tokens in a token copy of the line. The line itself is a type. There are not eight word types in the line. It contains (as stated) only the three word types, 'a', 'is' and 'rose', each of which is unique. So what do we call what there are eight of? They are occurrences of words. There are three occurrences of the word type 'a', two of 'is' and three of 'rose'.

The need to distinguish tokens of types from occurrences of types arises, not just in linguistics, but whenever types of things have other types of things occurring in them. [1] Reflection on the simple case of occurrences of numerals is often helpful.


In typography, the type–token distinction is used to determine the presence of a text printed by movable type: [2]

The defining criteria which a typographic print has to fulfill is that of the type identity of the various letter forms which make up the printed text. In other words: each letter form which appears in the text has to be shown as a particular instance ("token") of one and the same type which contains a reverse image of the printed letter.

Charles Sanders Peirce

There are only 26 letters in the English alphabet and yet there are more than 26 letters in this sentence. Moreover, every time a child writes the alphabet 26 new letters have been created.

The word 'letters' was used three times in the above paragraph, each time with a different meaning. The word 'letters' is one of many words having "type–token ambiguity". This section disambiguates 'letters' by separating the three senses using terminology standard in logic today. The key distinctions were first made by the American logician-philosopher Charles Sanders Peirce in 1906 using terminology that he established. [3]

The letters that are created by writing are physical objects that can be destroyed by various means: these are letter TOKENS or letter INSCRIPTIONS. The 26 letters of the alphabet are letter TYPES or letter FORMS.

Peirce's type–token distinction also applies to words, sentences, paragraphs, and so on: to anything in a universe of discourse of character-string theory, or concatenation theory. There is only one word type spelled el-ee-tee-tee-ee-ar, [4] namely, 'letter'; but every time that word type is written, a new word token has been created.

Some logicians consider a word type to be the class of its tokens. Other logicians counter that the word type has a permanence and constancy not found in the class of its tokens. The type remains the same while the class of its tokens is continually gaining new members and losing old members.

The word type 'letter' uses only four letter types: el, ee, tee, and ar. Nevertheless, it uses ee twice and tee twice. In standard terminology, the word type 'letter' has six letter OCCURRENCES and the letter type ee OCCURS twice in the word type 'letter'. Whenever a word type is inscribed, the number of letter tokens created equals the number of letter occurrences in the word type.
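The counts in this paragraph can be checked with a short Python sketch. The `LetterToken` class and the `inscribe` function are hypothetical names introduced here; modelling each written letter as a fresh object is an assumption meant only to mirror the idea that every inscription creates new tokens.

```python
from collections import Counter

word_type = "letter"

# Letter occurrences: how often each letter type occurs in the word type.
occurrences = Counter(word_type)
letter_types = set(word_type)

print(len(letter_types))          # 4 letter types: e, l, r, t
print(sum(occurrences.values()))  # 6 letter occurrences
print(occurrences["e"])           # 2


class LetterToken:
    """Hypothetical stand-in for one physical inscription of a letter."""

    def __init__(self, letter_type):
        self.letter_type = letter_type


def inscribe(word):
    # Writing the word creates one new token per letter occurrence.
    return [LetterToken(ch) for ch in word]


first = inscribe(word_type)
second = inscribe(word_type)

print(len(first))             # 6
print(first[0] is second[0])  # False: distinct tokens of the same type
```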

Peirce's original words are the following. "A common mode of estimating the amount of matter in a ... printed book is to count the number of words. There will ordinarily be about twenty 'thes' on a page, and, of course, they count as twenty words. In another sense of the word 'word,' however, there is but one word 'the' in the English language; and it is impossible that this word should lie visibly on a page, or be heard in any voice .... Such a ... Form, I propose to term a Type. A Single ... Object ... such as this or that word on a single line of a single page of a single copy of a book, I will venture to call a Token. .... In order that a Type may be used, it has to be embodied in a Token which shall be a sign of the Type, and thereby of the object the Type signifies." – Peirce 1906, Ogden-Richards, 1923, 280-1.

These distinctions are subtle but solid and easy to master. This section ends by using the new terminology to disambiguate the first paragraph.

There are 26 letter types in the English alphabet and yet there are more than 26 letter occurrences in this sentence type. Moreover, every time a child writes the alphabet 26 new letter tokens have been created.


  1. Stanford Encyclopedia of Philosophy, Types and Tokens
  2. Brekle, Herbert E.: Die Prüfeninger Weiheinschrift von 1119. Eine paläographisch-typographische Untersuchung, Scriptorium Verlag für Kultur und Wissenschaft, Regensburg 2005, ISBN 3-937527-06-0, p. 23
  3. Charles Sanders Peirce, Prolegomena to an apology for pragmaticism, Monist, vol. 16 (1906), pp. 492–546.
  4. Using a variant of Alfred Tarski's structural-descriptive naming found in John Corcoran, Schemata: the Concept of Schema in the History of Logic, Bulletin of Symbolic Logic, vol. 12 (2006), pp. 219–40.