
A lexicon is the vocabulary of a language or branch of knowledge (such as nautical or medical). In linguistics, a lexicon is a language's inventory of lexemes. The word lexicon derives from the Greek word λεξικόν (lexikon), neuter of λεξικός (lexikos), meaning 'of or for words'. [1]


Linguistic theories generally regard human languages as consisting of two parts: a lexicon, essentially a catalogue of a language's words (its wordstock); and a grammar, a system of rules which allow for the combination of those words into meaningful sentences. The lexicon is also thought to include bound morphemes, which cannot stand alone as words (such as most affixes). [2] In some analyses, compound words and certain classes of idiomatic expressions, collocations and other phrases are also considered to be part of the lexicon. Dictionaries represent attempts at listing, in alphabetical order, the lexicon of a given language; usually, however, bound morphemes are not included.

Size and organization

Items in the lexicon are called lexemes, or lexical items, or word forms. Lexemes are not atomic elements but contain both phonological and morphological components. When describing the lexicon, a reductionist approach is used, trying to remain general while using a minimal description. To describe the size of a lexicon, lexemes are grouped into lemmas. A lemma is a group of lexemes generated by inflectional morphology. Lemmas are represented in dictionaries by headwords which list the citation forms and any irregular forms, since these must be learned to use the words correctly. Lexemes derived from a word by derivational morphology are considered new lemmas. The lexicon is also organized according to open and closed categories. Closed categories, such as determiners or pronouns, are rarely given new lexemes; their function is primarily syntactic. Open categories, such as nouns and verbs, have highly active generation mechanisms and their lexemes are more semantic in nature.
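The grouping described above can be sketched in a few lines of Python. This is only an illustration: the `FORM_TO_LEMMA` table and the `group_by_lemma` helper are hypothetical names with hand-picked toy data, not part of any real lexical resource.

```python
from collections import defaultdict

# Hypothetical toy data: each word form is mapped to the lemma it inflects.
# "runner" comes from "run" by derivation, so it counts as a separate lemma.
FORM_TO_LEMMA = {
    "run": "run", "runs": "run", "ran": "run", "running": "run",
    "break": "break", "breaks": "break", "broke": "break",
    "runner": "runner", "runners": "runner",
}


def group_by_lemma(forms):
    """Group word forms under the lemma that represents them."""
    groups = defaultdict(list)
    for form in forms:
        groups[FORM_TO_LEMMA[form]].append(form)
    return dict(groups)


groups = group_by_lemma(FORM_TO_LEMMA)

# The size of this toy lexicon is the number of lemmas, not of word forms.
lexicon_size = len(groups)
```

Here nine word forms collapse into three lemmas, which is why counting lemmas gives a more compact measure of lexicon size than counting forms.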

Lexicalization and other mechanisms in the lexicon

A central role of the lexicon is the documenting of established lexical norms and conventions. Lexicalization is the process by which new words, having gained widespread usage, enter the lexicon. Since lexicalization [3] may modify lexemes phonologically and morphologically, a single etymological source may enter a single lexicon in two or more forms. Such pairs, called doublets, are often close semantically. Two examples are aptitude versus attitude and employ versus imply. [4]

The mechanisms described in the following sections are not mutually exclusive. [5]

Neologisms (new words)

Neologisms are new lexeme candidates which, if they gain wide usage over time, become part of a language's lexicon. Neologisms are often introduced by children, who produce erroneous forms. [7] Other common sources are slang and advertising.

Neologisms that maintain the sound of their external source

There are two types of borrowings (neologisms based on external sources) that retain the sound of the source language material:

  • Borrowing using the source language lexical item as the basic material for the neologization: guestwords, foreignisms and loanwords
  • Borrowing using target language lexical items as the basic material for the neologization: phono-semantic matching, semanticized phonetic matching and phonetic matching.

Guestwords, foreignisms and loanwords

The following are examples of external lexical expansion using the source language lexical item as the basic material for the neologization, listed in decreasing order of phonetic resemblance to the original lexical item (in the source language): [8]

  • Guestword (in German: Gastwort): unassimilated borrowing.
  • Foreignism (in German: Fremdwort): foreign word, e.g. phonetic adaptation.
  • Loanword (in German: Lehnwort): totally assimilated borrowing, e.g. morphemic adaptation.

Phono-semantic matches, semanticized phonetic matches and phonetic matches

The following are examples of simultaneous external and internal lexical expansion using target language lexical items as the basic material for the neologization but still resembling the sound of the lexical item in the source language: [9]

  • Phono-semantic matching (PSM): the target language material is originally similar to the source language lexical item both phonetically and semantically.
  • Semanticized phonetic matching (SPM): the target language material is originally similar to the source language lexical item phonetically, and only in a loose way semantically.
  • Phonetic matching (PM): the target language material is originally similar to the source language lexical item phonetically but not semantically.
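The three categories above differ only in how closely the target language material matches the source item in meaning, given that it already matches in sound. A minimal sketch of that decision, assuming a hypothetical three-level `SemanticFit` scale and a hypothetical `classify_match` helper (neither is taken from the cited source):

```python
from enum import Enum
from typing import Optional


class SemanticFit(Enum):
    """Hypothetical scale for how well the meanings correspond."""
    NONE = 0
    LOOSE = 1
    CLOSE = 2


def classify_match(sounds_similar: bool, fit: SemanticFit) -> Optional[str]:
    """Return 'PSM', 'SPM' or 'PM', or None when there is no phonetic match."""
    if not sounds_similar:
        return None  # all three categories require phonetic similarity
    if fit is SemanticFit.CLOSE:
        return "PSM"  # similar both phonetically and semantically
    if fit is SemanticFit.LOOSE:
        return "SPM"  # similar phonetically, only loosely semantically
    return "PM"       # similar phonetically but not semantically
```

In practice, judging phonetic and semantic similarity is itself a matter of linguistic analysis; the sketch only encodes the classification once those judgments have been made.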

Role of morphology

Another mechanism involves generative devices that combine morphemes according to a language's rules. For example, the suffix "-able" is usually only added to transitive verbs, as in "readable" but not "cryable".
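The "-able" restriction can be illustrated with a small sketch. It assumes a hypothetical toy lexicon, `IS_TRANSITIVE`, that records whether each verb is transitive, and it ignores spelling adjustments and the exceptions implied by "usually".

```python
from typing import Optional

# Hypothetical toy lexicon: verb -> whether it is (typically) transitive.
IS_TRANSITIVE = {
    "read": True,
    "wash": True,
    "break": True,
    "cry": False,
    "sleep": False,
}


def derive_able(verb: str) -> Optional[str]:
    """Attach the suffix "-able" only if the verb is listed as transitive."""
    if IS_TRANSITIVE.get(verb, False):
        return verb + "able"
    return None  # the rule does not apply, so no form is generated


readable = derive_able("read")  # transitive verb, so the rule applies
blocked = derive_able("cry")    # intransitive verb, so the rule is blocked
```

The point of the sketch is that the rule consults a grammatical property stored with the lexeme, rather than applying to any string of sounds.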


A compound word is a lexeme composed of several established lexemes, whose semantics is not the sum of that of its constituents. Compounds can be interpreted through analogy, common sense and, most commonly, context. [3] Compound words can have simple or complex morphological structures. Usually only the head requires inflection for agreement. Compounding may result in lexemes of unwieldy proportions, which is compensated for by mechanisms that reduce the length of words. A similar phenomenon has been observed on social media, where hashtags are compounded into longer hashtags that are at times more popular than the individual constituent hashtags. [10] Compounding is the most common word formation strategy cross-linguistically.

Diachronic mechanisms

Comparative historical linguistics studies the evolution of languages and takes a diachronic view of the lexicon. The evolution of lexicons in different languages occurs through parallel mechanisms. Over time, historical forces work to shape the lexicon, [11] making it simpler to acquire and often creating an illusion of great regularity in language.

Second-language lexicon

The term "lexicon" is generally used in the context of a single language. Therefore, multilingual speakers are generally thought to have multiple lexicons. Speakers of language variants (Brazilian Portuguese and European Portuguese, for example) may be considered to possess a single lexicon. Thus both cash dispenser (British English) and automatic teller machine or ATM (American English) would be understood by American and British speakers alike, despite each group using a different dialect.

When linguists study a lexicon, they consider such things as what constitutes a word; the word/concept relationship; lexical access and lexical access failure; how a word's phonology, syntax, and meaning intersect; the morphology-word relationship; vocabulary structure within a given language; language use (pragmatics); language acquisition; the history and evolution of words (etymology); and the relationships between words, often studied within philosophy of language.

Various models of how lexicons are organized and how words are retrieved have been proposed in psycholinguistics, neurolinguistics and computational linguistics.


  1. λεξικός in Henry George Liddell, Robert Scott, A Greek–English Lexicon (Perseus Digital Library). Sc. βιβλίον (biblion) 'book'.
  2. Sandra, Dominiek; Taft, Marcus (1994). Morphological structure, lexical representation, and lexical access. Lawrence Erlbaum Associates, Publishers. ISBN 9780863779268.
  3. Booij, Geert (2005). The grammar of words: an introduction to linguistic morphology. Oxford textbooks in linguistics. Oxford University Press. ISBN 0-19-928042-8.
  4. Skeat, Walter (2010-04-17). A Concise Etymological Dictionary of the English Language. Forgotten Books. p. 648. ISBN 978-1-4400-5722-9.
  5. Ornan, Uzzi (2003). The Final Word — Mechanism For Hebrew Word Generation (in Hebrew). Haifa: Haifa University Press.
  6. Metcalf, Allan (2002). Predicting New Words — The Secrets of Their Success. Boston: Houghton Mifflin Company. ISBN 0-618-13006-3.
  7. Jaeger, Jeri J. (2005). Kids' slips: what young children's slips of the tongue reveal about language development. Psychology Press. ISBN 978-0-8058-3579-3. Retrieved 8 April 2012.
  8. Page 8 in Language Contact and Lexical Enrichment in Israeli Hebrew, by Ghil'ad Zuckermann, Palgrave Macmillan, 2003.
  9. Page 8 in Language Contact and Lexical Enrichment in Israeli Hebrew, by Ghil'ad Zuckermann, Palgrave Macmillan, 2003.
  10. Suman Kalyan Maity, Ritvik Saraf and Animesh Mukherjee (2016). #Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds. In ACM CSCW, San Francisco, CA.
  11. Deutscher, Guy (May 19, 2005). The Unfolding of Language: An Evolutionary Tour of Mankind's Greatest Invention. Metropolitan Books.

Further reading