Cross-Linguistic Linked Data

Producer: Max Planck Institute for Evolutionary Anthropology (Germany)
Languages: English
Cost: Free
Disciplines: Linguistics
Website: clld.org

The Cross-Linguistic Linked Data (CLLD) project coordinated more than a dozen linguistics databases covering the languages of the world. It is hosted by the Department of Linguistic and Cultural Evolution at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany (previously at the Max Planck Institute for the Science of Human History in Jena [1]).


CLLD was a project for publishing linguistic databases on the web; it ended in 2016. clld, by contrast, is a web application framework, that is, a piece of software. Both clld and CLDF (Cross-Linguistic Data Formats) grew out of the CLLD project but are distinct from it. CLDF data interfaces smoothly with clld web applications.
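As a rough illustration of what tabular cross-linguistic data can look like, the following minimal Python sketch reads a small CSV table shaped like a CLDF values table and groups the values by language. It uses only the standard library. The column names `ID`, `Language_ID`, `Parameter_ID` and `Value` follow CLDF conventions, but the sample rows and the helper `values_by_language` are invented for this example and are not taken from any CLLD database or from the clld software.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample shaped like a CLDF values table.
# The rows are invented for illustration only.
VALUES_CSV = """ID,Language_ID,Parameter_ID,Value
v1,lang_a,word_order,SOV
v2,lang_b,word_order,SVO
v3,lang_a,has_tone,yes
"""


def values_by_language(csv_text):
    """Group Parameter_ID -> Value pairs under each Language_ID."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["Language_ID"]][row["Parameter_ID"]] = row["Value"]
    return dict(grouped)


if __name__ == "__main__":
    # Print one line per language with its feature values.
    for language, features in values_by_language(VALUES_CSV).items():
        print(language, features)
```

A real dataset would normally be read together with its JSON metadata file, for example through a dedicated package such as pycldf, rather than by parsing the CSV directly as done here.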

Databases and projects


References

  1. Haspelmath, Martin. "Max Planck diversity linguistics redux: Welcome to 'Linguistic and Cultural Evolution' in Jena". Hypotheses: Diversity Linguistics Comment (blog). Retrieved 28 March 2015.
  2. Glottolog. doi:10.5281/zenodo.437430
  3. WALS Online. doi:10.5281/zenodo.11040
  4. WOLD. doi:10.5281/zenodo.11137
  5. APICS Online. doi:10.5281/zenodo.11135
  6. eWAVE. doi:10.5281/zenodo.11169
  7. AfBo. doi:10.5281/zenodo.11188
  8. SAILS Online. doi:10.5281/zenodo.11175
  9. PHOIBLE Online. doi:10.5281/zenodo.11706
  10. Tsammalex. doi:10.5281/zenodo.17571
  11. Comparative Siouan Dictionary. doi:10.5281/zenodo.19782
  12. Concepticon. doi:10.5281/zenodo.19782
  13. Dogon and Bangime Linguistics. doi:10.5281/zenodo.1193579
  14. Rzymski, Christoph and Tresoldi, Tiago et al. 2019. The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies. doi:10.1038/s41597-019-0341-x
  15. Glottobank
  16. List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; Gray, Russell D. (2022-06-16). "Lexibank, a public repository of standardized wordlists with computed phonological and lexical features". Scientific Data. 9 (1): 1–16. doi:10.1038/s41597-022-01432-0. ISSN 2052-4463. PMC 9203750.
  17. List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; Gray, Russell D. (2021-09-02), Lexibank: A public repository of standardized wordlists with computed phonological and lexical features, Research Square, doi:10.21203/rs.3.rs-870835/v1, hdl:2292/62117, S2CID 239629792
  18. Grambank. doi:10.5281/zenodo.7844558
  19. Skirgård, Hedvig; Haynie, Hannah J.; Blasi, Damián E.; Hammarström, Harald (2023-04-21). "Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss". Science Advances. 9 (16). American Association for the Advancement of Science (AAAS). doi:10.1126/sciadv.adg6175. hdl:10067/1958300151162165141. ISSN 2375-2548.
  20. Haspelmath, Martin & Stiebels, Barbara (eds). Dictionaria.
  21. Kelly, Piers (ed.). 2018. The Australian Message Stick Database.
  22. Language Description Heritage
  23. Forkel, R. et al. Cross-Linguistic Data Formats, advancing data sharing and reuse in comparative linguistics. Sci. Data. 5:180205. doi:10.1038/sdata.2018.205 (2018).
  24. Johann-Mattis List, Cormac Anderson, Tiago Tresoldi, Simon J. Greenhill, Christoph Rzymski, & Robert Forkel. (2019). Cross-Linguistic Transcription Systems (Version v1.2.0). Max Planck Institute for the Science of Human History: Jena. doi:10.5281/zenodo.2633838
  25. Language Description Heritage
  26. Heggarty, Paul & Anderson, Cormac & Scarborough, Matthew (eds). IE-CoR (Indo-European Cognate Relationships). doi:10.5281/zenodo.8089434

This article incorporates text available under the CC BY 3.0 license.