A lexicon is the vocabulary of a language or branch of knowledge (such as nautical or medical). In linguistics, a lexicon is a language's inventory of lexemes. The word lexicon derives from the Greek word λεξικόν (lexikon), neuter of λεξικός (lexikos), meaning 'of or for words'. [1]


Linguistic theories generally regard human languages as consisting of two parts: a lexicon, essentially a catalogue of a language's words (its wordstock); and a grammar, a system of rules which allow for the combination of those words into meaningful sentences. The lexicon is also thought to include bound morphemes, which cannot stand alone as words (such as most affixes). [2] In some analyses, compound words and certain classes of idiomatic expressions, collocations and other phrases are also considered to be part of the lexicon. Dictionaries represent attempts at listing, in alphabetical order, the lexicon of a given language; usually, however, bound morphemes are not included.

Size and organization

Items in the lexicon are called lexemes, lexical items, or word forms. Lexemes are not atomic elements but contain both phonological and morphological components. Descriptions of the lexicon take a reductionist approach, aiming to remain general while using a minimal description. To describe the size of a lexicon, lexemes are grouped into lemmas. A lemma is a group of lexemes generated by inflectional morphology. Lemmas are represented in dictionaries by headwords, which list the citation forms and any irregular forms, since these must be learned to use the words correctly. Lexemes derived from a word by derivational morphology are considered new lemmas. The lexicon is also organized according to open and closed categories. Closed categories, such as determiners or pronouns, are rarely given new lexemes; their function is primarily syntactic. Open categories, such as nouns and verbs, have highly active generation mechanisms, and their lexemes are more semantic in nature.
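
The grouping of word forms into lemmas can be illustrated with a minimal Python sketch. The mapping FORM_TO_LEMMA and the function group_by_lemma are hypothetical names introduced only for this illustration, and the word list is toy data rather than the output of a real morphological analyser. The sketch shows that inflected forms collapse into one lemma, while a derived form such as "runner" counts as a lemma of its own.

```python
from collections import defaultdict

# Hypothetical toy data: each word form is paired with its lemma (citation form).
# A real system would obtain this mapping from a morphological analyser.
FORM_TO_LEMMA = {
    "run": "run",
    "runs": "run",
    "ran": "run",          # irregular inflected form, same lemma
    "running": "run",
    "runner": "runner",    # derivational morphology: treated as a new lemma
    "runners": "runner",
    "the": "the",          # closed-category item
}


def group_by_lemma(form_to_lemma):
    """Return a dict mapping each lemma to the sorted list of its word forms."""
    groups = defaultdict(list)
    for form, lemma in form_to_lemma.items():
        groups[lemma].append(form)
    return {lemma: sorted(forms) for lemma, forms in groups.items()}


lemmas = group_by_lemma(FORM_TO_LEMMA)

# Under this description, the size of the lexicon is the number of lemmas,
# not the number of individual word forms.
lexicon_size = len(lemmas)
```

In this toy example the seven word forms reduce to three lemmas, which is the sense in which counting lemmas gives a more compact measure of lexicon size than counting forms.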

Lexicalization and other mechanisms in the lexicon

A central role of the lexicon is the documenting of established lexical norms and conventions. Lexicalization is the process by which new words, having gained widespread usage, enter the lexicon. Since lexicalization [3] may modify lexemes phonologically and morphologically, a single etymological source may enter a single lexicon in two or more forms. Such pairs, called doublets, are often close semantically. Two examples are aptitude versus attitude and employ versus imply. [4]

The mechanisms involved, which are not mutually exclusive, include the following. [5]

Neologisms (new words)

Neologisms are new lexeme candidates which, if they gain wide usage over time, become part of a language's lexicon. Neologisms are often introduced by children who produce erroneous forms. [7] Other common sources are slang and advertising.

Neologisms that maintain the sound of their external source

There are two types of borrowings (neologisms based on external sources) that retain the sound of the source language material:

  • Borrowing using the source language lexical item as the basic material for the neologization: guestwords, foreignisms and loanwords
  • Borrowing using target language lexical items as the basic material for the neologization: phono-semantic matching, semanticized phonetic matching and phonetic matching.

Guestwords, foreignisms and loanwords

The following are examples of external lexical expansion using the source language lexical item as the basic material for the neologization, listed in decreasing order of phonetic resemblance to the original lexical item (in the source language): [8]

  • Guestword (in German: Gastwort): unassimilated borrowing.
  • Foreignism (in German: Fremdwort): foreign word, e.g. phonetic adaptation.
  • Loanword (in German: Lehnwort): totally assimilated borrowing, e.g. morphemic adaptation.

Phono-semantic matches, semanticized phonetic matches and phonetic matches

The following are examples of simultaneous external and internal lexical expansion using target language lexical items as the basic material for the neologization but still resembling the sound of the lexical item in the source language: [9]

  • Phono-semantic matching (PSM): the target language material is originally similar to the source language lexical item both phonetically and semantically.
  • Semanticized phonetic matching (SPM): the target language material is originally similar to the source language lexical item phonetically, and only in a loose way semantically.
  • Phonetic matching (PM): the target language material is originally similar to the source language lexical item phonetically but not semantically.

Role of morphology

Another mechanism involves generative devices that combine morphemes according to a language's rules. For example, the suffix "-able" is usually only added to transitive verbs, as in "readable" but not "cryable".
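
A generative device of this kind can be sketched as a rule that checks a condition on the base before attaching a suffix. The following Python fragment is only an illustration: the dictionary IS_TRANSITIVE and the function derive_able are hypothetical, the verb list is toy data, and the rule is treated as strict even though the restriction described above is a tendency rather than an absolute.

```python
# Hypothetical toy lexicon: verbs marked as transitive (True) or not (False).
IS_TRANSITIVE = {
    "read": True,
    "wash": True,
    "cry": False,
    "sleep": False,
}


def derive_able(verb):
    """Return the '-able' adjective for a transitive verb, else None.

    Simplification: spelling adjustments (for example dropping a final 'e')
    are ignored, so the suffix is attached by plain concatenation.
    """
    if not IS_TRANSITIVE.get(verb, False):
        return None  # rule does not apply to intransitive or unknown verbs
    return verb + "able"


readable = derive_able("read")
cryable = derive_able("cry")
```

Here the call with "read" produces "readable", while the call with "cry" returns None because the toy lexicon marks that verb as intransitive.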


A compound word is a lexeme composed of several established lexemes, whose meaning is not simply the sum of the meanings of its constituents. Compounds can be interpreted through analogy, common sense and, most commonly, context. [3] Compound words can have simple or complex morphological structures. Usually only the head requires inflection for agreement. Compounding may result in lexemes of unwieldy proportion. This is compensated for by mechanisms that reduce the length of words. A similar phenomenon has recently been shown on social media, where hashtags are compounded into longer hashtags that are at times more popular than the individual constituent hashtags. [10] Compounding is the most common word-formation strategy cross-linguistically.

Diachronic mechanisms

Comparative historical linguistics studies the evolution of languages and takes a diachronic view of the lexicon. The evolution of lexicons in different languages occurs through parallel mechanisms. Over time, historical forces work to shape the lexicon, [11] making it simpler to acquire and often creating an illusion of great regularity in language.

Second-language lexicon

The term "lexicon" is generally used in the context of a single language. Therefore, multilingual speakers are generally thought to have multiple lexicons. Speakers of language variants (Brazilian Portuguese and European Portuguese, for example) may be considered to possess a single lexicon. Thus a cash dispenser (British English) and an automatic teller machine or ATM (American English) would both be understood by American and British speakers, even though the two groups speak different dialects.

When linguists study a lexicon, they consider such things as what constitutes a word; the word/concept relationship; lexical access and lexical access failure; how a word's phonology, syntax, and meaning intersect; the morphology-word relationship; vocabulary structure within a given language; language use (pragmatics); language acquisition; the history and evolution of words (etymology); and the relationships between words, often studied within philosophy of language.

Various models of how lexicons are organized and how words are retrieved have been proposed in psycholinguistics, neurolinguistics and computational linguistics.

