Leipzig–Jakarta list

The Leipzig–Jakarta list of 100 words is used by linguists to test the degree of chronological separation of languages by comparing words that are resistant to borrowing. The Leipzig–Jakarta list became available in 2009. [1] The word list is named after the cities of Leipzig, Germany, and Jakarta, Indonesia, the places where the list was conceived and created.

In the 1950s, the linguist Morris Swadesh published a list of 200 words called the Swadesh list, allegedly the 200 lexical concepts found in all languages that were least likely to be borrowed from other languages. Swadesh later whittled his list down to 100 items. The Swadesh list, however, was based mainly on intuition, according to Martin Haspelmath and Uri Tadmor. [2]

The Loanword Typology Project, with the World Loanword Database (WOLD), published by the Max Planck Digital Library, was established to rectify this problem. Experts on 41 languages from across the world were given a uniform vocabulary list and asked to provide the word for each item in their language of expertise, along with an assessment of how strong the evidence was that each word had been borrowed. [3] The 100 concepts that were found in most languages and were most resistant to borrowing formed the Leipzig–Jakarta list. Only 62 items appear on both the Leipzig–Jakarta list and the 100-word Swadesh list, so the two lists differ by 38%.
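The overlap arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `compare_concept_lists` is hypothetical, and the two lists below are placeholder labels constructed so that 62 items coincide, not the actual Leipzig–Jakarta or Swadesh entries.

```python
def compare_concept_lists(list_a, list_b):
    """Return (shared_count, percent_of_list_a_not_shared) for two concept lists."""
    set_a, set_b = set(list_a), set(list_b)
    shared = set_a & set_b  # concepts present in both lists
    percent_different = 100 * (len(set_a) - len(shared)) / len(set_a)
    return len(shared), percent_different


# Placeholder labels, NOT the real lists: two 100-item lists built so that
# exactly 62 items coincide, mirroring the figure quoted in the text.
leipzig_jakarta = [f"concept_{i}" for i in range(100)]   # concept_0 .. concept_99
swadesh_100 = [f"concept_{i}" for i in range(38, 138)]   # concept_38 .. concept_137

shared, percent_different = compare_concept_lists(leipzig_jakarta, swadesh_100)
print(shared, percent_different)  # 62 38.0
```

With real data, the same function could be applied to the concept labels of any two lists, provided the labels have first been normalized so that equivalent concepts (for example "leg/foot" and "foot") are matched consistently.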

A quarter of the words in the Leipzig–Jakarta list are body parts: mouth, eye, leg/foot, navel, liver, knee, etc. [4] Six animal words appear on the list: fish, bird, dog, louse, ant and fly – animals found everywhere humans can be found. [2]

The items house, name, rope and to tie are products of human culture, but are probably found in all present-day human societies. Haspelmath and Tadmor drew the conclusion that "rope is the most basic of human tools and tying is the most basic technology". [2]

List

Lexical items in the Leipzig–Jakarta list are ranked by semantic stability, that is, by how unlikely each word is to be replaced by another word as a language evolves. [5] [6] The right two columns indicate inclusion on the 100-word and 207-word Swadesh lists. [7]

Rank  Word meaning                                      100-word Swadesh list  207-word Swadesh list
1     fire
2     nose
3     to go
4     water
5     mouth
6     tongue
7     blood
8     bone
9     2nd-person singular pronoun (you)
10    root
11    to come
12    breast
13    rain
14    1st-person singular pronoun (I/me)
15    name
16    louse
17    wing
18    flesh/meat
19    arm/hand
20    fly
21    night
22    ear
23    neck
24    far
25    to do/make
26    house
27    stone/rock
28    bitter
29    to say
30    tooth
31    hair
32    big
33    one
34    who?
35    3rd-person singular pronoun (he/she/it/him/her)
36    to hit/beat
37    leg/foot
38    horn
39    this
40    fish
41    yesterday
42    to drink
43    black
44    navel
45    to stand
46    to bite
47    back
48    wind
49    smoke
50    what?
51    child (kin term)
52    egg
53    to give
54    new
55    to burn (intr.)
56    not
57    good
58    to know
59    knee
60    sand
61    to laugh
62    to hear
63    soil
64    leaf
65    red
66    liver
67    to hide
68    skin/hide
69    to suck
70    to carry
71    ant
72    heavy
73    to take
74    old
75    to eat
76    thigh
77    thick
78    long
79    to blow
80    wood
81    to run
82    to fall
83    eye
84    ash
85    tail
86    dog
87    to cry/weep
88    to tie
89    to see
90    sweet
91    rope
92    shade/shadow
93    bird
94    salt
95    small
96    wide
97    star
98    in
99    hard
100   to crush/grind

Other differences with the Swadesh list

Items on the 100-word Swadesh list but not on the Leipzig–Jakarta list: [7]

References

  1. Jeanette Sakel; Daniel L. Everett (2012). Linguistic Fieldwork: A Student Guide. Cambridge University Press. p. 116. ISBN 978-1-107-37702-8.
  2. Haspelmath & Tadmor, p. 72.
  3. "The World Loanword Database (WOLD)". wold.clld.org. Retrieved February 24, 2019.
  4. Haspelmath & Tadmor, p. 71.
  5. The Leipzig-Jakarta List of Basic Vocabulary. Source: Haspelmath, Martin and Uri Tadmor (eds.), 2009. Loanwords in the World's Languages: A Comparative Handbook. Berlin and New York: Mouton de Gruyter.
  6. Tadmor, Uri, Martin Haspelmath, and Bradley Taylor. 2010. Borrowability and the notion of basic vocabulary. Diachronica 27:2 (2010), 226–246. doi:10.1075/dia.27.2.04tad
  7. Haspelmath & Tadmor, p. 74.