In semantics and lexical typology, colexification is the ability of a language to express different meanings with the same word.
When a language colexifies several concepts, this is generally understood as a sign that these concepts are semantically related. [1] Research in lexical typology has thus been using colexification as a tool to measure semantic similarity between concepts.
Colexification describes the case of different meanings being expressed by the same word (i.e., “co-lexified”) in a language. For example, the two senses which are distinguished in English as people and village are colexified in Spanish, which uses pueblo in both cases.
Colexification is meant as a neutral descriptive term that avoids distinguishing between vagueness, polysemy, and homonymy. Some cases of colexification are common across the world (e.g. ‘blue’ = ‘green’); others are typical of certain linguistic and cultural areas (e.g. ‘tree’ = ‘fire’ among Papuan and Australian languages; [2] or ‘thunder’ = ‘dragon’ in the Sino-Tibetan languages [3] ).
The opposite of “co-lexify” is “dis-lexify”, i.e. 'express two meanings using different lexical forms'. [4] Thus, Russian colexifies 'arm' and 'hand' using the single word рука, but Spanish dislexifies these two meanings using two distinct words, brazo and mano respectively.
Language | Word form | Sense 1 | Sense 2 | Sense 3 |
---|---|---|---|---|
Basque | herri | ‘village’ | ‘people’ | ‘country’ |
Spanish | pueblo | ‘village’ | ‘people’ | |
Catalan | sentir | ‘feel’ | ‘hear’ | |
French | femme | ‘woman’ | ‘wife’ | |
French | fille | ‘girl’ | ‘daughter’ | |
French | grand | ‘large’ | ‘tall (in size)’ | ‘grown up (in age)’ |
English | uncle | ‘mother's brother’ | ‘father's brother’ | ‘aunt's husband’ |
English | draw | ‘pull, drag’ | ‘depict with lines’ | |
Kriol | gilim | ‘hit’ | ‘kill’ | |
Chinese | 天 tiān | ‘sky’ | ‘heaven’ | ‘day’ |
Japanese | 木 ki | ‘tree’ | ‘wood’ | |
Mota | pane- | ‘arm’ | ‘hand’ | ‘wing’ |
Italian | ciao | ‘hello’ | ‘goodbye’ | |
Vietnamese | chào | ‘hello’ | ‘goodbye’ | |
LSF | (sign) | ‘hello’ | ‘thanks’ | |
LSF | (sign) | ‘(s.o.) kind, nice’ | ‘(s.th.) easy’ | |
“A given language is said to colexify two functionally distinct senses if, and only if, it can associate them with the same lexical form.”
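The definition quoted above can be read as a simple test over a lexicon. The following is a minimal sketch in Python, assuming a toy lexicon stored as a dictionary from word forms to sets of senses. The names `LEXICON`, `colexifies` and `colexified_pairs` are hypothetical and introduced only for illustration, and the data is limited to examples mentioned in this article.

```python
from itertools import combinations

# Toy lexicon: language -> {word form: set of senses}.
# The entries come from this article's examples; the data structure
# itself is an assumption made for illustration.
LEXICON = {
    "Spanish": {"pueblo": {"village", "people"}, "brazo": {"arm"}, "mano": {"hand"}},
    "Russian": {"рука": {"arm", "hand"}},
    "Basque": {"herri": {"village", "people", "country"}},
}


def colexifies(language, sense_a, sense_b, lexicon=LEXICON):
    """Return True if a single form in `language` carries both senses."""
    return any(
        sense_a in senses and sense_b in senses
        for senses in lexicon[language].values()
    )


def colexified_pairs(language, lexicon=LEXICON):
    """Return all sense pairs (as sorted tuples) that share a form."""
    pairs = set()
    for senses in lexicon[language].values():
        pairs.update(combinations(sorted(senses), 2))
    return pairs
```

Under these assumptions, `colexifies("Russian", "arm", "hand")` is true, while the same call for Spanish is false, since Spanish dislexifies the two senses as brazo and mano.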
The term was coined by the linguist Alexandre François in his 2008 article “Semantic maps and the typology of colexification”. This article illustrated the notion with various examples, including the semantic domains of { STRAIGHT }, { CALL }, { BREATHE }. The last of these is at the source of a colexification network that is attested in several languages, linking together such senses as ‘breath’, ‘life’, ‘soul’, ‘spirit’, ‘ghost’: Skr. आत्मन् ātmán; Anc. Gk ψυχή, πνεῦμα; Lat. animus, spīritus; Arab. روح rūḥ, etc. François built on that example to propose a method for constructing lexical semantic maps.
Several studies have taken up the concept of colexification and applied it to different semantic domains and various language families. [6]
Colexification is also the object of a dedicated database, known as CLiCS “Database of Cross-Linguistic Colexifications”. [7] Based on data from more than 2400 language varieties of the world, the database makes it possible to check the typological frequency of individual instances of colexification, [8] and to visualize semantic networks [9] based on empirical data from the world's languages.
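The two uses described above, checking how often a colexification recurs across languages and deriving a semantic network from those counts, can be sketched on toy data. The example below is illustrative only: it does not use CLiCS data or the CLiCS interface, the entries are taken from this article's own examples, and the names `ENTRIES`, `colexification_counts` and `to_network` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy data: (language, word form, senses), taken from this article's examples.
# This is NOT CLiCS data and does not reflect the CLiCS data format.
ENTRIES = [
    ("Basque", "herri", ["village", "people", "country"]),
    ("Spanish", "pueblo", ["village", "people"]),
    ("Chinese", "天 tiān", ["sky", "heaven", "day"]),
    ("Mota", "pane-", ["arm", "hand", "wing"]),
    ("Russian", "рука", ["arm", "hand"]),
]


def colexification_counts(entries):
    """Count, for each sense pair, how many languages colexify it."""
    languages_per_pair = {}
    for language, _form, senses in entries:
        for pair in combinations(sorted(set(senses)), 2):
            languages_per_pair.setdefault(pair, set()).add(language)
    return Counter({pair: len(langs) for pair, langs in languages_per_pair.items()})


def to_network(counts, min_languages=1):
    """Build a weighted adjacency dict, keeping edges at or above the threshold."""
    graph = {}
    for (a, b), weight in counts.items():
        if weight >= min_languages:
            graph.setdefault(a, {})[b] = weight
            graph.setdefault(b, {})[a] = weight
    return graph


counts = colexification_counts(ENTRIES)
network = to_network(counts, min_languages=2)
```

In this sketch, each edge weight is the number of languages in the sample that colexify the two senses, so raising `min_languages` keeps only the links that recur across the sample. With a sample this small the counts say nothing about real typological frequency.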
Natural semantic metalanguage (NSM) is a linguistic theory that reduces lexicons down to a set of semantic primitives. It is based on the conception of Polish professor Andrzej Bogusławski. The theory was formally developed by Anna Wierzbicka at Warsaw University and later at the Australian National University in the early 1970s, and by Cliff Goddard at Australia's Griffith University.
Polysemy is the capacity for a sign to have multiple related meanings. For example, a word can have several word senses. Polysemy is distinct from monosemy, where a word has a single meaning.
Hypernymy and hyponymy are the semantic relations between a generic term (hypernym) and a more specific term (hyponym). The hypernym is also called a supertype, umbrella term, or blanket term. The hyponym names a subtype of the hypernym. The semantic field of the hyponym is included within that of the hypernym. For example, pigeon, crow, and hen are all hyponyms of bird and animal; bird and animal are both hypernyms of pigeon, crow, and hen.
Lexical semantics, as a subfield of linguistic semantics, is the study of word meanings. It includes the study of how words structure their meaning, how they act in grammar and compositionality, and the relationships between the distinct senses and uses of a word.
Semantic change is a form of language change regarding the evolution of word usage—usually to the point that the modern meaning is radically different from the original usage. In diachronic linguistics, semantic change is a change in one of the meanings of a word. Every word has a variety of senses and connotations, which can be added, removed, or altered over time, often to the extent that cognates across space and time have very different meanings. The study of semantic change can be seen as part of etymology, onomasiology, semasiology, and semantics.
In linguistics, a word sense is one of the meanings of a word. For example, a dictionary may have over 50 different senses of the word "play", each of these having a different meaning based on the context of the word's usage in a sentence, as follows:
We went to see the play Romeo and Juliet at the theater.
The coach devised a great play that put the visiting team on the defensive.
The children went out to play in the park.
In linguistics, a semantic field is a lexical set of words grouped semantically that refers to a specific subject. The term is also used in anthropology, computational semiotics, and technical exegesis.
Heterosemy is a concept in linguistics: a word is heterosemous if it has two or more meanings or functions that are historically related, but belong to different morphosyntactic categories. An example is the English word peel, which functions as a noun in the expression “I threw the orange peel in the bin”, but as a verb in “Would you peel the orange for me?” Heterosemy can be seen as a special case of homonymy.
Alexandre François is a French linguist specialising in the description and study of the indigenous languages of Melanesia. He belongs to Lattice, a research centre of the CNRS and École Normale Supérieure dedicated to linguistics.
The Timor–Alor–Pantar (TAP) languages are a family of languages spoken in Timor, Kisar, and the Alor archipelago in Southern Indonesia. It is the westernmost Papuan language family that survives, and one of two such outlier families in east Nusantara.
The Kho-Bwa languages, also known as Kamengic, are a small family of languages, or pair of families, spoken in Arunachal Pradesh, northeast India. The name Kho-Bwa was originally proposed by George van Driem (2001). It is based on the reconstructed words *kho ("water") and *bwa ("fire"). Blench (2011) suggests the name Kamengic, from the Kameng area of Arunachal Pradesh. Alternatively, Anderson (2014) refers to Kho-Bwa as Northeast Kamengic.
The mental lexicon is a component of the human language faculty that contains information regarding the composition of words, such as their meanings, pronunciations, and syntactic characteristics. The mental lexicon is used in linguistics and psycholinguistics to refer to individual speakers' lexical, or word, representations. However, there is some disagreement as to the utility of the mental lexicon as a scientific construct.
In linguistics, an expression is semantically ambiguous when it can have multiple meanings. The higher the number of synonyms a word has, the higher the degree of ambiguity. Like other kinds of ambiguity, semantic ambiguities are often clarified by context or by prosody. One's comprehension of a sentence in which a semantically ambiguous word is used is strongly influenced by the general structure of the sentence. The language itself is sometimes a contributing factor in the overall effect of semantic ambiguity, in the sense that the level of ambiguity in the context can change depending on whether or not a language boundary is crossed.
Embodied bilingual language, also known as L2 embodiment, is the idea that people mentally simulate their actions, perceptions, and emotions when speaking and understanding a second language (L2) as with their first language (L1). It is closely related to embodied cognition and embodied language processing, both of which only refer to native language thinking and speaking. An example of embodied bilingual language would be a situation in which an L1 English speaker learning Spanish as a second language hears the word rápido ("fast") in Spanish while taking notes and then proceeds to take notes more quickly.
The Cross-Linguistic Linked Data (CLLD) project coordinates over a dozen linguistics databases covering the languages of the world. It is hosted by the Department of Linguistic and Cultural Evolution at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany.
Usage-based linguistics is an approach to linguistics within a broader functional/cognitive framework that has emerged since the late 1980s and assumes a profound relation between linguistic structure and usage. It challenges the dominant focus, in 20th-century linguistics, on considering language as an isolated system removed from its use in human interaction and human cognition. Rather, usage-based models posit that linguistic information is expressed via context-sensitive mental processing and mental representations, which have the cognitive ability to succinctly account for the complexity of actual language use at all levels. Broadly speaking, a usage-based model of language accounts for language acquisition and processing, synchronic and diachronic patterns, and both low-level and high-level structure in language, by looking at actual language use.
Lexibank is a linguistics database managed by the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. The database consists of over 100 standardized wordlists (datasets) that are independently curated.
PHOIBLE is a linguistic database, accessible through its website, that compiles phonological inventories from primary documents and tertiary databases into a single, easily searchable sample. The 2019 version 2.0 includes 3,020 inventories containing 3,183 segment types found in 2,186 distinct languages. It is edited by Steven Moran, Assistant Professor at the Institute of Biology at the University of Neuchâtel, and Daniel McCloy, Researcher at the Institute for Learning and Brain Sciences at the University of Washington.
Maria Koptjevskaja-Tamm is a Russian-born linguist and typologist who is Professor of General Linguistics at Stockholm University.