Cross-Linguistic Linked Data

Producer: Max Planck Institute for Evolutionary Anthropology (Germany)
Languages: English
Access
Cost: Free
Coverage
Disciplines: Linguistics
Links
Website: clld.org

The Cross-Linguistic Linked Data (CLLD) project coordinates over a dozen linguistics databases covering the languages of the world. It is hosted by the Department of Linguistic and Cultural Evolution at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany (previously at the Max Planck Institute for the Science of Human History in Jena [1]).

Databases and projects

Related Research Articles

Chadic languages: Branch of the Afroasiatic languages

The Chadic languages form a branch of the Afroasiatic language family. They are spoken in parts of the Sahel. They include 150 languages spoken across northern Nigeria, southern Niger, southern Chad, the Central African Republic, and northern Cameroon. The most widely spoken Chadic language is Hausa, a lingua franca of much of inland Eastern West Africa.

Abau is a Papuan language spoken in southern Sandaun Province of Papua New Guinea, primarily along the border with Indonesia.

The Swadesh list is a classic compilation of tentatively universal concepts for the purposes of lexicostatistics. Translations of the Swadesh list into a set of languages allow researchers to quantify the interrelatedness of those languages. The Swadesh list is named after linguist Morris Swadesh. It is used in lexicostatistics and glottochronology. Because there are several different lists, some authors also refer to "Swadesh lists".
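As a rough illustration of how translated lists can be used to quantify relatedness, the sketch below computes the share of concepts for which two languages have cognate words. The language names, concepts, and cognate-class labels are invented placeholders, not real cognate judgements, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: one cognate-class label per concept and language.
# Equal labels for the same concept mean the two words are treated as cognate.
# All labels are invented for illustration only.
COGNATE_CLASSES = {
    "LangA": {"water": 1, "fire": 1, "stone": 1, "hand": 1},
    "LangB": {"water": 1, "fire": 1, "stone": 2, "hand": 1},
    "LangC": {"water": 2, "fire": 3, "stone": 2, "hand": 1},
}


def shared_cognate_ratio(a, b):
    """Fraction of concepts present in both lists that share a cognate class."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    return sum(a[c] == b[c] for c in common) / len(common)


def pairwise_ratios(data):
    """Shared-cognate ratio for every unordered pair of languages."""
    return {
        (x, y): shared_cognate_ratio(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


ratios = pairwise_ratios(COGNATE_CLASSES)
for pair, ratio in ratios.items():
    print(pair, ratio)
```

A higher ratio is read as closer relatedness under this simplified measure; actual lexicostatistic work depends on expert cognate judgements for each concept.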

The Lower Mamberamo languages are a recently proposed language family linking two languages spoken along the northern coast of Papua province, Indonesia, near the mouth of the Mamberamo River. They have variously been classified either as heavily Papuanized Austronesian languages belonging to the SHWNG branch, or as Papuan languages that have undergone heavy Austronesian influence. Glottolog 3.4 classifies Lower Mamberamo as Austronesian, while Donohue classifies it as Papuan. Kamholz (2014) classifies Warembori and Yoke each as coordinate primary subgroups of the South Halmahera–West New Guinea languages.

South Halmahera–West New Guinea languages: Subgroup of the Austronesian language family

The South Halmahera–West New Guinea (SHWNG) languages are a branch of the Malayo-Polynesian languages, found in the islands and along the shores of the Halmahera Sea in the Indonesian province of North Maluku and of Cenderawasih Bay in the provinces of Papua and West Papua. There are 38 languages.

Central Solomon languages: Papuan language family of Solomon Islands

The Central Solomon languages are the four Papuan languages spoken in the state of Solomon Islands.

Mariveleño is a Sambalic language. It has around 500 speakers and is spoken within an Aeta community in Mariveles in the Philippines.

West Bomberai languages: Family of Papuan languages

The West Bomberai languages are a family of Papuan languages spoken on the Bomberai Peninsula of western New Guinea and in East Timor and neighboring islands of Indonesia.

Duna–Pogaya languages: Proposed Trans–New Guinea language branch

The Duna–Pogaya (Duna–Bogaia) languages are a proposed small branch of the Trans–New Guinea (TNG) family in the classifications of Voorhoeve (1975), Ross (2005) and Usher (2018), consisting of two languages, Duna and Bogaya. Glottolog, although based largely on Usher, finds the connection between the two languages tenuous and the connection to TNG unconvincing.

Yam languages: Family of Papuan languages

The Yam languages, also known as the Morehead River languages, are a family of Papuan languages. They include many of the languages south and west of the Fly River in Papua New Guinea and Indonesian West Papua.

The Kho-Bwa languages, also known as Kamengic, are a small family of languages spoken in Arunachal Pradesh, northeast India. The name Kho-Bwa was originally proposed by George van Driem (2001). It is based on the reconstructed words *kho ("water") and *bwa ("fire"). Blench (2011) suggests the name Kamengic, from the Kameng area of Arunachal Pradesh. Alternatively, Anderson (2014) refers to Kho-Bwa as Northeast Kamengic.

The family of Northwest Solomonic languages is a branch of the Oceanic languages. It includes the Austronesian languages of Bougainville and Buka in Papua New Guinea, and of Choiseul, New Georgia, and Santa Isabel in Solomon Islands.

Languages of the Solomon Islands archipelago

Between 60 and 70 languages are spoken in the Solomon Islands archipelago, which covers a broader area than the nation state of Solomon Islands and includes the island of Bougainville, an autonomous province of Papua New Guinea (PNG). The lingua franca of the archipelago is Pidgin, and the official language in both countries is English.

Baduy language: Sundanesic language spoken by Baduy people

Baduy is one of the Sundanese-Baduy languages, spoken predominantly by the Baduy people. It is sometimes considered a dialect of Sundanese, but more often it is considered a separate language because of its divergent vocabulary and the cultural differences between the Baduy and the rest of the Sundanese people. Native speakers of the Baduy language are spread across regions around Mount Kendeng, Rangkasbitung district of Lebak Regency and Pandeglang Regency, Banten Province, Indonesia. There were an estimated 11,620 speakers as of 2015.

Colexification and its associated verb colexify are terms used in semantics and lexical typology. They refer to the ability of a language to express different meanings with the same word.
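The idea can be sketched in a few lines of code: given a wordlist, two meanings are colexified in a language when they are expressed by the same form. The wordlist below is an invented placeholder (hypothetical languages and made-up forms), and the function name is an assumption for illustration; it is not taken from any existing database or library.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical wordlist of (language, concept, form) entries.
# Languages and forms are invented placeholders, not attested data.
WORDLIST = [
    ("LangA", "TREE", "taka"),
    ("LangA", "WOOD", "taka"),
    ("LangA", "FIRE", "miru"),
    ("LangB", "TREE", "solo"),
    ("LangB", "WOOD", "solo"),
    ("LangB", "FIRE", "pen"),
    ("LangC", "TREE", "dun"),
    ("LangC", "WOOD", "kel"),
    ("LangC", "FIRE", "kel"),
]


def colexifications(wordlist):
    """Map each colexified concept pair to the set of languages showing it."""
    # Collect all concepts expressed by the same form within one language.
    concepts_by_form = defaultdict(set)
    for language, concept, form in wordlist:
        concepts_by_form[(language, form)].add(concept)

    # Every pair of concepts sharing a form counts as one colexification.
    pairs = defaultdict(set)
    for (language, _form), concepts in concepts_by_form.items():
        for a, b in combinations(sorted(concepts), 2):
            pairs[(a, b)].add(language)
    return dict(pairs)


result = colexifications(WORDLIST)
for pair, languages in sorted(result.items()):
    print(pair, sorted(languages))
```

Counting how many languages show each pair gives a simple measure of how widespread a given colexification is in the sample.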

Concepticon is an open-source online lexical database of linguistic concept lists. It links concept labels in concept lists to concept sets.

Lexibank is a linguistics database managed by the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. The database consists of over 100 standardized wordlists (datasets) that are independently curated.

Johann-Mattis List is a German scientist. He is known for his work on quantitative comparative linguistics. List is currently professor at the University of Passau, Germany, where he leads the Chair of Multilingual Computational Linguistics.

PHOIBLE

PHOIBLE is a linguistic database, accessible through its website, that compiles phonological inventories from primary documents and tertiary databases into a single, easily searchable sample. The 2019 version 2.0 includes 3,020 inventories containing 3,183 segment types found in 2,186 distinct languages. It is edited by Steven Moran, assistant professor at the Institute of Biology at the University of Neuchâtel, and Daniel McCloy, researcher at the Institute for Learning and Brain Sciences at the University of Washington.

North Coast Sundanese: Sundanese varieties of Indonesia

North Coast Sundanese, or Pakaléran Sundanese (sometimes shortened to Pantura Sundanese), is a geographical grouping of varieties of the Sundanese language spoken as a mother tongue by people living along the northern coast of the Sundanese-speaking area. The area includes several regencies, such as Serang Regency, Tangerang Regency, Bekasi Regency, Karawang Regency, Subang Regency, Indramayu Regency, and Cirebon Regency. The structure of North Coast Sundanese is more or less the same as that of standard Sundanese: its morphological, phonological and syntactic systems show little difference. The differences are found only in a small part of the vocabulary and in intonation. Some words have the same form but different meanings, and vice versa.

References

  1. Haspelmath, Martin. "Max Planck diversity linguistics redux: Welcome to 'Linguistic and Cultural Evolution' in Jena". Hypotheses: Diversity Linguistics Comment (blog). Retrieved 28 March 2015.
  2. Glottolog. doi:10.5281/zenodo.437430
  3. WALS Online. doi:10.5281/zenodo.11040
  4. WOLD. doi:10.5281/zenodo.11137
  5. APICS Online. doi:10.5281/zenodo.11135
  6. eWAVE. doi:10.5281/zenodo.11169
  7. AfBo. doi:10.5281/zenodo.11188
  8. SAILS Online. doi:10.5281/zenodo.11175
  9. PHOIBLE Online. doi:10.5281/zenodo.11706
  10. Tsammalex. doi:10.5281/zenodo.17571
  11. Comparative Siouan Dictionary. doi:10.5281/zenodo.19782
  12. Concepticon. doi:10.5281/zenodo.19782
  13. Dogon and Bangime Linguistics. doi:10.5281/zenodo.1193579
  14. Rzymski, Christoph and Tresoldi, Tiago et al. 2019. The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies. doi:10.1038/s41597-019-0341-x
  15. Glottobank
  16. List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; Gray, Russell D. (2022-06-16). "Lexibank, a public repository of standardized wordlists with computed phonological and lexical features". Scientific Data. 9 (1): 1–16. doi:10.1038/s41597-022-01432-0. ISSN 2052-4463. PMC 9203750.
  17. List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; Gray, Russell D. (2021-09-02), Lexibank: A public repository of standardized wordlists with computed phonological and lexical features, Research Square, doi:10.21203/rs.3.rs-870835/v1, S2CID 239629792
  18. Grambank. doi:10.5281/zenodo.7844558
  19. Skirgård, Hedvig; Haynie, Hannah J.; Blasi, Damián E.; Hammarström, Harald (2023-04-21). "Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss". Science Advances. American Association for the Advancement of Science (AAAS). 9 (16). doi:10.1126/sciadv.adg6175. hdl:10067/1958300151162165141. ISSN 2375-2548.
  20. Haspelmath, Martin & Stiebels, Barbara (eds). Dictionaria.
  21. Kelly, Piers (ed.). 2018. The Australian Message Stick Database.
  22. Language Description Heritage
  23. Forkel, R. et al. Cross-Linguistic Data Formats, advancing data sharing and reuse in comparative linguistics. Sci. Data. 5:180205 doi:10.1038/sdata.2018.205 (2018).
  24. Johann-Mattis List, Cormac Anderson, Tiago Tresoldi, Simon J. Greenhill, Christoph Rzymski, & Robert Forkel. (2019). Cross-Linguistic Transcription Systems (Version v1.2.0). Max Planck Institute for the Science of Human History: Jena. doi:10.5281/zenodo.2633838
  25. Language Description Heritage
  26. Heggarty, Paul & Anderson, Cormac & Scarborough, Matthew (eds). IE-CoR (Indo-European Cognate Relationships). doi:10.5281/zenodo.8089434

This article incorporates text available under the CC BY 3.0 license.