Producer | Max Planck Institute for Evolutionary Anthropology (Germany) |
---|---|
Languages | English |
Access | |
Cost | Free |
Coverage | |
Disciplines | Linguistics, lexicography |
Lexibank is a linguistics database managed by the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. [1] The database consists of over 100 standardized wordlists (datasets) that are independently curated. [2] [3]
Lexibank datasets are presented in the Cross-Linguistic Data Format (CLDF). [4]
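As an illustration of what a CLDF wordlist looks like in practice, the following minimal sketch reads a forms table and groups the word forms by concept. It assumes the column names of the CLDF FormTable (`ID`, `Language_ID`, `Parameter_ID`, `Form`, `Segments`) with space-separated segments; the sample rows and the `forms_by_concept` helper are hypothetical and are not taken from any Lexibank dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the shape of a CLDF FormTable (forms.csv).
# Real Lexibank datasets contain many more rows and additional columns.
SAMPLE_FORMS = """ID,Language_ID,Parameter_ID,Form,Segments
lg1-hand-1,lg1,hand,hant,h a n t
lg2-hand-1,lg2,hand,hand,h a n d
lg1-water-1,lg1,water,vasa,v a s a
"""


def forms_by_concept(handle):
    """Group (language, form, segments) triples by concept (Parameter_ID)."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        # Segments are assumed to be stored as a space-separated string.
        segments = row["Segments"].split()
        grouped[row["Parameter_ID"]].append(
            (row["Language_ID"], row["Form"], segments)
        )
    return dict(grouped)


table = forms_by_concept(io.StringIO(SAMPLE_FORMS))
for concept, entries in table.items():
    print(concept, entries)
```

To try this on a real dataset, the in-memory sample could be replaced with an opened `forms.csv` file from a downloaded dataset, assuming it follows the same column layout.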
Lexibank automatically computes phonological and lexical features from its wordlists. [2]
The datasets are publicly available on GitHub [6] and are archived at Zenodo. [5] Lexibank is part of the Cross-Linguistic Linked Data project, and all of its datasets are released under the CC BY 4.0 license.
Applications of the database include historical linguistics and comparative phonology.
The following is a list of Lexibank (version 0.2) datasets as of 17 June 2022. [7]
ID | Languages | Zenodo | Citation |
---|---|---|---|
aaleykusunda | Kusunda | 5115947 | Uday Raj Aaley and Timotheus A. Bodt (2020): New Kusunda data: A list of 250 concepts. Computer-Assisted Language Comparison in Practice 3.4 (08/04/2020), URL: https://calc.hypotheses.org/2414. |
abrahammonpa | Monpa | 5115885 | Abraham, Binny, Kara Sako, Elina Kinny, and Isapdaile Zeliang (2018): Sociolinguistic Research among Selected Groups in Western Arunachal Pradesh: Highlighting Monpa. Dallas: SIL International. |
allenbai | Bai | 5115649 | Allen, Bryan (2007): Bai Dialect Survey. Dallas: SIL International. |
backstromnorthernpakistan | Northern Pakistan | 5116054 | Backstrom, Peter C. and Radloff, Carla F. (1992): Sociolinguistic Survey of Northern Pakistan, Volume 2. Languages of Northern Areas. Islamabad: National Institute of Pakistan Studies. |
bantubvd | Bantu | 5115982 | Simon Greenhill and Russell Gray, 2015. Bantu Basic Vocabulary Database. |
bdpa | | 5116087 | List, Johann-Mattis and Jelena Prokić. (2014). A benchmark database of phonetic alignments in historical linguistics and dialectology. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC), 26–31 May 2014, Reykjavik. 288-294. |
beidasinitic | Sinitic | 5119295 | Běijīng Dàxué 北京大学 (1964): Hànyǔ fāngyán cíhuì 汉语方言词汇 [Chinese dialect vocabularies]. Beijing: Wenzi Gaige. |
birchallchapacuran | Chapacuran | 5119306 | Birchall J, Dunn M, & Greenhill SJ. 2016. A Combined Comparative and Phylogenetic Analysis of the Chapacuran Language Family. International Journal of American Linguistics 82(3). 255–284. |
blustaustronesian | Austronesian | 5137392 | Greenhill, SJ; Blust, R and Gray, RD (2008): The Austronesian Basic Vocabulary Database: From bioinformatics to lexomics. Evolutionary Bioinformatics. 4. 271-283. |
bodtkhobwa | Kho-Bwa | 5119330 | Bodt, Timotheus Adrianus and List, Johann-Mattis (2019): Testing the predictive strength of the comparative method: An ongoing experiment on unattested words in Western Kho-Bwa languages. Papers in Historical Phonology 4.1: 22-44. |
bowernpny | Pama-Nyungan | 5119341 | Bowern, Claire, & Atkinson, Quentin. (2012). Computational Phylogenetics and the Internal Structure of Pama-Nyungan: Dataset [Data set]. Language. doi:10.1353/lan.2012.0081 |
cals | Turkic and Indo-European | 5121189 | Mennecier, P., Nerbonne, J., Heyer, E., & Manni, F. (2016). A Central Asian Language Survey, Language Dynamics and Change, 6(1), 57-98. doi:10.1163/22105832-00601015 |
carvalhopurus | Purus | 5121195 | de Carvalho, F. O. (2021): A comparative reconstruction of Proto-Purus (Arawakan) segmental phonology. IJAL. 87.1. 49-108. |
castrosui | Sui | 5121213 | Castro, Andy and Pan, Xingwen (2015): Sui dialect research. SIL: Guiyang. |
castroyi | Yi | 5121214 | Castro, Andy; Crook, Brian; Flaming, Royce (2010): A sociolinguistic survey of Kua-nsi and related Yi varieties in Heqing county, Yunnan province, China. SIL Electronic Survey Reports 2010-001. Dallas: SIL International. |
castrozhuang | Zhuang | 5121215 | Castro, Andy; Hansen, Bruce (2010): Hongshui He Zhuang dialect intelligibility survey. Dallas: SIL International. |
chaconarawakan | Arawakan | 5118556 | Chacon, Thiago C. (2017): Arawakan and Tukanoan contacts in Northwest Amazonia prehistory. PAPIA 27(2). 237-265. |
chaconbaniwa | Baniwa | 5118605 | Chacon, T. C.; Gonçalves, A. G.; and da Silva, L. F (2019): A diversidade linguística Aruák no Alto Rio Negro em gravações da década de 1950 [The diversity of Arawakan languages from the upper Rio Negro in recordings from the 1950s]. Forma y Función, 32.2, 41-67. doi:10.15446/fyf.v32n2.80814 |
chaconcolumbian | Colombian | 5118763 | Chacon, Thiago C. (2017): Arawakan and Tukanoan contacts in Northwest Amazonia prehistory. PAPIA 27(2). 237-265. |
chacontukanoan | Tukanoan | 5118723 | T. Chacon. (2014). A revised proposal of Proto-Tukanoan consonants and Tukanoan family classification. International Journal of American Linguistics 80.3, pp. 275–322. doi:10.1086/676393 |
chenhmongmien | Hmong-Mien | 5118744 | Chén, Qíguāng 陳其光 (2012): Miáoyáo yǔwén 苗瑤语文 [Miao and Yao language]. Zhōngyāng Mínzú Dàxué 中央民族大学 [China Minzu University Press]. |
chindialectsurvey | Chin | 5121280 | Language and Social Development Organization (2019): Chin dialect data collection. Yangon: LSDO. |
chingelong | Gelong | 5121324 | Chin, Andy C. (2015): The Gelong Language in the Multilingual Hub of Hainan. Bulletin of Chinese Linguistics. 8. 140-156. |
clarkkimmun | Kim Mun | 5121482 | Clark, E. R. (2008). A phonological analysis and comparison of two Kim Mun varieties in Laos and Vietnam. Payap University: Chiang Mai. |
clics1 | | 5121530 | List, Johann-Mattis, Thomas Mayer, Anselm Terhalle, and Matthias Urban (2014). CLICS: Database of Cross-Linguistic Colexifications. Marburg: Forschungszentrum Deutscher Sprachatlas (Version 1.0). |
constenlachibchan | Chibchan | 5121347 | Umaña, Adolfo Constenla. 2005. ¿Existe relación genealógica entre las lenguas misumalpas y las chibchenses?. Estudios de Lingüística Chibcha. |
davletshinaztecan | Aztecan | 5121382 | Davletshin, Albert (2012): Proto-Uto-Aztecans on their way to the Proto-Aztecan homeland: linguistic evidence. Journal of Language Relationship. 8. 1. 75-92. |
deepadungpalaung | Palaung | 5121402 | Deepadung, Sujaritlak; Buakaw, Supakit; and Rattanapitak, Ampica (2015): A lexical comparison of the Palaung dialects spoken in China, Myanmar, and Thailand. Mon-Khmer Studies 44. 19-38. |
diacl | | 5121561 | Carling, Gerd (ed.) 2017. Diachronic Atlas of Comparative Linguistics Online. Lund: Lund University. (URL: https://diacl.ht.lu.se/). Accessed on: 2019-02-07. |
dravlex | Dravidian | 5121580 | Kolipakam, Vishnupriya, Michael Dunn, Fiona M. Jordan & Annemarie Verkerk. (2018). DravLex: A Dravidian lexical database. Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands. |
dunnaslian | Aslian | 5121613 | Dunn, Michael, Nicole Kruspe, and Niclas Burenhult. 2013. "Time and Place in the Prehistory of the Aslian Languages." Human Biology 85: 383–400. |
dunnielex | Indo-European | 5121651 | Dunn, Michael (2012): Indo-European Lexical Cognacy Database. Max Planck Institute for Psycholinguistics: Nijmegen. |
duonglachi | Lachi | 5121663 | Duong, Thu Hang and Nguyen, Thu Quynh and Nguyen, Van Loi (2021): The Language of the La Chí People in Bản Díu Commune, Xín Mần District, Hà Giang Province, Vietnam. In: Studies in the Anthropology of Language in Mainland Southeast Asia. Ed. by N. J. Enfield, Jack Sidnell, and Charles H. P. Zuckermann. University of Hawaii Press: Honolulu. 124-138 |
felekesemitic | Semitic | 5126691 | Feleke, Tekabe Legesse (2021): Ethiosemitic languages: classifications and classification determinants. Ampersand. 2021. doi:10.1016/j.amper.2021.100074 |
galuciotupi | Tupian | 5121724 | Galucio, Ana Vilacy, Meira, Sérgio, Birchall, Joshua, Moore, Denny, Gabas Júnior, Nilson, Drude, Sebastian, Storto, Luciana, Picanço, Gessiane, & Rodrigues, Carmen Reis. (2015). Genealogical relations and lexical distances within the Tupian linguistic family. Boletim do Museu Paraense Emílio Goeldi. Ciências Humanas, 10(2), 229-274. doi:10.1590/1981-81222015000200004 |
gaotb | Tibeto-Burman | 5121776 | Gao, Tianjun (2020): Reconstruction and analysis of phylogenetic network on Tibeto-Burman languages in China. Journal of Chinese Linguistics, 48:1, 257-293. |
gerarditupi | Tupi–Guarani | 5127906 | Ferraz Gerardi, Fabrício and Reichert, Stanislav (2020) The Tupí-Guaraní Language Family: A Phylogenetic Classification. To appear in Diachronica. |
halenepal | Nepal | 5121540 | Hale, Austin (1973): Clause, sentences, and discourse patterns in selected languages of Nepal. Kathmandu: Institute of Nepal and Asiatic Studies. |
hantganbangime | Bangime | 5126441 | Hantgan, Abbie and List, Johann-Mattis (2018): Bangime. Secret language, language isolate, or language island? Journal of Language Contact. |
hattorijaponic | Japonic | 5126845 | Hattori, S. (1973): Japanese dialects. In: Diachronic, areal and typological linguistics. Edited by H. M. Hoenigswald and R. H. Langacre. 368-400. |
houchinese | Sinitic | 5126858 | Hóu, J. (2004): Xiàndài Hànyǔ fāngyán yīnkù 现代汉语方言音库 [Phonological database of Chinese dialects]. Shànghǎi: Shànghǎi Jiàoyù. |
hsiuhmongmien | Hmong-Mien | 5126451 | Hsiu, Andrew (2015): The classification of Na Meo, a Hmong-Mien language of Vietnam. Handout prepared for SEALS 25 (Chiang Mai, 2015/05/27-29). |
hubercolumbian | Colombian | 5121219 | Huber, R. Q. and Reed, R. B. 1992. Vocabulario comparativo: palabras selectas de lenguas indígenas de Colombia [Comparative vocabulary. Selected words from the indigenous languages of Colombia]. Santafé de Bogota: Asociación Instituto Lingüístico de Verano. |
huntergatherer | | 5126741 | Bowern, Claire, Patience Epps, Jane Hill, and Patrick McConvell. Hunter-Gatherer Language Database. https://huntergatherer.la.utexas.edu/ Accessed 2021-04-27. |
ids | | 5126899 | Key, Mary Ritchie & Comrie, Bernard (eds.) 2015. The Intercontinental Dictionary Series. Leipzig: Max Planck Institute for Evolutionary Anthropology. |
ivanisuansu | Suansu | 5126966 | Ivani, J. K. (2019): A first overview on Suansu, a Tibeto-Burman language from Northeastern India. Talk, held at the 29th conference of the Southeast Asian Linguistic Society (27-29 May, Tokyo). https://zenodo.org/record/3383006 |
johanssonsoundsymbolic | | 5127131 | Erben Johansson, N., Anikin, A., Carling, G., & Holmer, A. (2020). The typology of sound symbolism: Defining macro-concepts via their semantic and phonetic features, Linguistic Typology, 24(2), 253-310. doi:10.1515/lingty-2020-2034 |
joophonosemantic | | 5137230 | Joo, I. (2020). Phonosemantic biases found in Leipzig-Jakarta lists of 66 languages. Linguistic Typology, 24(1), 1–12. doi:10.1515/lingty-2019-0030 |
kesslersignificance | | 5127775 | Kessler, B. (2001): The Significance of Wordlists. CSLI: Stanford. |
kleinewillinghoeferbikwinjen | Bikwin-Jen | 5127404 | Kleinewillinghöfer, Ulrich (2015). Bikwin-Jen Group. https://www.blogs.uni-mainz.de/fb07-adamawa/adamawa-languages/bikwin-jen-group/. Accessed on: 2020-04-15. |
kraftchadic | Chadic | 5121222 | Kraft, Charles H. 1981. Chadic wordlists. Berlin: Dietrich Reimer. |
leejaponic | Japonic | 5126801 | Lee, Sean, Hasegawa, Toshikazu (2011). Bayesian phylogenetic analysis supports an agricultural origin of Japonic languages. Proceedings of the Royal Society B: Biological Sciences, 278(1725), 3662–3669. doi:10.1098/rspb.2011.0518 |
leeainu | Ainu | 5126890 | Lee Sean, Hasegawa Toshikazu (2013). Evolution of the Ainu Language in Space and Time. PLoS ONE 8(4): e62243. doi:10.1371/journal.pone.0062243 |
bremerberta | Berta | 5126757 | Bremer, Nate D. (2016): A Sociolinguistic Survey of Six Berta Speech Varieties in Ethiopia. SIL Electronic Survey Reports 2016-007. Dallas: SIL International. |
leekoreanic | Koreanic | 5126904 | Lee, Sean (2015). A Sketch of Language History in the Korean Peninsula. PLoS ONE 10(5): e0128448. doi:10.1371/journal.pone.0128448 |
lieberherrkhobwa | Kho-Bwa | 5127687 | Lieberherr, Ismail and Bodt, Timotheus Adrianus (2017): Sub-grouping Kho-Bwa based on shared core vocabulary. Himalayan Linguistics 16(2). 26-63. URL: https://escholarship.org/uc/item/4t27h5fg |
lindseyende | Ende | 5127829 | Kate Lynn Lindsey and Bernard Comrie. 2020. Ende (Papua New Guinea) dictionary. In: Key, Mary Ritchie & Comrie, Bernard (eds.) The Intercontinental Dictionary Series. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://ids.clld.org/) |
listsamplesize | | 5128050 | List, Johann-Mattis (2014): Investigating the impact of sample size on cognate detection. Journal of Language Relationship. 11. 91-102. doi:10.31826/jlr-2014-110111 |
liusinitic | Sinitic | 5131413 | Líu, L.; Wáng, H.; Bǎi, Y. (2007): Xiàndài Hànyǔ fāngyán héxīncí, tèzhēng cíjí 现代汉语方言核心词·特征词集 [Collection of basic vocabulary words and characteristic dialect words in modern Chinese dialects]. Nánjīng: Fènghuáng. |
lundgrenomagoa | Proto-Omagua-Kokama-Tupinambá | 5128097 | Lundgren, Olof (2020): A phonological reconstruction of Proto-Omagua-Kokama-Tupinambá. Master's thesis. Lund: Lund University. |
mannburmish | Burmish | 5131419 | Mann, Noel W. 1998. A phonological reconstruction of Proto Northern Burmic. (PhD Thesis). |
marrisonnaga | Naga | 5121317 | Marrison, Geoffrey Edward (1967): The classification of the Naga languages of North-East India. London: School of African and Oriental Sciences. |
mcelhanonhuon | Huon | 5127348 | McElhanon, K.A. 1967. Preliminary Observations on Huon Peninsula Languages. Oceanic Linguistics. 6, 1-45. |
mitterhoferbena | Bena | 5121327 | Mitterhofer, Bernadette. 2013. Lessons from a dialect survey of Bena: Analyzing wordlists. SIL International. |
naganorgyalrongic | rGyalrongic | 5126458 | Nagano, Yasuhiko and Prins, Marielle (2013): rGyalrongic Languages Database. Osaka: National Museum of Ethnology. |
nagarajakhasian | Khasian | 5131421 | Nagaraja KS, Sidwell P & Greenhill SJ. 2013. A Lexicostatistical Study of the Khasian Languages: Khasi, Pnar, Lyngngam, and War. Mon-Khmer Studies Journal, 42, 1-11. |
northeuralex | | 5121268 | Dellert, J., Daneyko, T., Münch, A. et al (2020). NorthEuraLex (Version 0.9). Lang Resources and Evaluation. doi:10.1007/s10579-019-09480-6 |
peirosaustroasiatic | Austroasiatic | 5127536 | Peiros, I. I. (2004): Генетическая классификация австроазиатских языков / Genetičeskaja klassifikacija avstroaziatskix jazykov [Genetic classification of Austro-Asiatic languages]. Russian State University for the Humanities, Moscow. |
pharaocoracholaztecan | Proto-Corachol, Proto-Náhuatl | 5136882 | Pharao Hansen, Magnus (2020): ¿Familia o vecinos? Investigando la relación entre el proto-náhuatl y el proto-corachol [Family or neighbors? Investigating the relation between Proto-Náhuatl and Proto-Corachol]. In: Lenguas yutoaztecas: historia, estructuras y contacto lingüístico. Homenaje a Karen Dakin. Rosa Yañez (ed.) Guadalajara: Universidad de Guadalajara. |
polyglottaafricana | | 5136890 | Koelle, Sigismund W. (1854). Polyglotta Africana or Comparative Vocabulary of Nearly Three Hundred Words and Phrases in more than One Hundred Distinct African Languages. London: Church Missionary House. |
ratcliffearabic | Arabic | 5136898 | Ratcliffe, Robert R. (2021): The glottometrics of Arabic. Language Dynamics and Change. 2021. doi:10.1163/22105832-01001100 |
robinsonap | Alor-Pantar | 5121340 | Robinson, Laura C. and Holton, Gary (2012): Internal Classification of the Alor-Pantar Language Family Using Computational Methods Applied to the Lexicon. Language Dynamics and Change 2.2. 123-149. |
saenkoromance | Romance | 5136900 | Saenko, M. (2015): Annotated Swadesh wordlists for the Romance group (Indo-European family). In: Starostin GS, editor. The Global Lexicostatistical Database. RGU; 2015. http://starling.rinet.ru/new100/tuj.xls |
sagartst | Sino-Tibetan | 5121409 | Laurent Sagart, Guillaume Jacques, Yunfan Lai, and Johann-Mattis List (2019): Sino-Tibetan Database of Lexical Cognates. Jena: Max Planck Institute for the Science of Human History. |
satterthwaitetb | Tibeto-Burman | 5136997 | Satterthwaite-Phillips, Damian (2011) Phylogenetic inference of the Tibeto-Burman languages or on the usefulness of lexicostatistics (and "megalo"-comparison) for the subgrouping of Tibeto-Burman. Stanford: Stanford University. |
savelyevturkic | Turkic | 5137274 | Savelyev, Alexander and Robbeets, Martine (2020): Bayesian phylolinguistics infers the internal structure and the time-depth of the Turkic language family. Journal of Language Evolution 5.1. 39-53. |
servamalagasy | Malagasy | 5137040 | Serva M., Pasquini M. (2020): Dialects of Madagascar, PLoS ONE 15(10). |
sidwellbahnaric | Bahnaric | 5137055 | Sidwell, Paul. 2015. Austroasiatic dataset for phylogenetic analysis: 2015 version. Mon-Khmer Studies (Notes, Reviews, Data-Papers) 44. lxviii-ccclvii. |
simsrma | Rma | 5166593 | Sims, Nathanial A. (2020): Reconsidering the diachrony of tone in Rma. Journal of the Southeast Asian Linguistics Society 13.1. 53-85. |
sohartmannchin | Chin | 5121813 | So-Hartmann, Helga (1988): Notes on the Southern Chin Languages. Linguistics of the Tibeto-Burman Area 11.2: 98-119. |
starostinpie | Proto-Indo-European | 5137281 | Starostin, S. A. (2005): Indo-European files in DBF/VAR. Moscow. |
suntb | Tibeto-Burman | 5121515 | Sūn, Hóngkāi 孙宏开 (1991): Zangmianyu yuyin he cihui 藏缅语音和词汇 [Tibeto-Burman phonology and lexicon]. Beijing: Chinese Social Sciences Press. |
syrjaenenuralic | Uralic | 5137236 | Syrjänen, K.; Honkola, T.; Korhonen, K.; Lehtinen, J.; Vesakoski, O. & Wahlberg, N. Shedding more light on language classification using basic vocabularies and phylogenetic methods. Diachronica, 2013, 30, 323-352 |
tls | Bantu | 5121819 | Nurse, Derek and Gérard Philippson (1975). The Tanzanian Language Survey. Department of Foreign Languages and Linguistics of the University of Dar es Salaam: Dar es Salaam. |
tppsr | | | |
transnewguineaorg | Trans-New Guinea | 5141620 | Greenhill, Simon J. (2015): TransNewGuinea.org: An Online Database of New Guinea Languages. PLoS ONE 10.10: e0141563. |
tuled | Tupian | | Ferraz Gerardi, Fabrício & Reichert, Stanislav & Aragon, Carolina. (2021) TuLeD (Tupían Lexical Database): Introducing a database of a South American language family. Language Resources and Evaluation. doi:10.1007/s10579-020-09521-5 |
visserkalamang | Kalamang | 5139559 | Eline Visser. 2021. Kalamang dictionary. In: Key, Mary Ritchie & Comrie, Bernard (eds.) The Intercontinental Dictionary Series. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://ids.clld.org/) |
walworthpolynesian | Polynesian | 5126932 | Walworth, Mary. (2018). Polynesian Segmented Data (Version 1) [Data set]. Zenodo. doi:10.5281/zenodo.1689909 |
wangbai | Bai | 5137407 | Wang, Feng (2004): Language contact and language comparison. The case of Bai. PhD thesis. Hong Kong: City University of Hong Kong. |
wangbcd | Sinitic | 5136930 | Wang, F. 2004. BCD: basic words of Chinese dialects. Unpublished dataset. [Digital version in: List, J.-M. (2015): Network perspectives on Chinese dialect history. Bulletin of Chinese Linguistics 8. 42-67.] |
wichmannmixezoquean | Mixe-Zoquean | 5126948 | Cysouw, M., Wichmann, S., & Kamholz, D. (2006). A critique of the separation base method for genealogical subgrouping, with data from Mixe-Zoquean. Journal of Quantitative Linguistics, 13(2-3), 225–264. doi:10.1080/09296170600850759 |
wold | | 5139859 | Haspelmath, Martin & Tadmor, Uri (eds.) 2009. World Loanword Database. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://wold.clld.org/) |
yanglalo | Lalo | 5121829 | Yang, Cathryn (2011): Lalo regional varieties: Phylogeny, dialectometry and sociolinguistics. Bundoora: La Trobe University. |
yangyi | Yi | 5167277 | Yang, Cathryn (2021): The phonetic tone change *high > rising: Evidence from the Ngwi dialect laboratory. |
yuchinese | Sinitic | 5139881 | Hsiao-jung Yu and Yifan Wang. 2021. Mandarin Chinese dictionary. In: Key, Mary Ritchie & Comrie, Bernard (eds.) The Intercontinental Dictionary Series. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://ids.clld.org/) |
zgraggenmadang | Madang | 5121535 | Z'graggen, J A. (1980) A comparative word list of the Northern Adelbert Range Languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics. |
zhaobai | Bai | 5136947 | Zhao, Yanzhen (2006): Zhàozhuāng Báiyǔ miáoxiě yánjiū 趙莊白語描寫研究 [Investigations of Zhaozhuang Bai]. Běijīng: Zhōngyāng Mínzú Dàxué. |
zhivlovobugrian | Ob-Ugrian | 5137439 | Zhivlov, M. (2011): Annotated Swadesh wordlists for the Ob-Ugrian group (Uralic family). The Global Lexicostatistical Database. Moscow: RGGU. |
zhoubizic | Bizic | 5140129 | Zhou, Yulou (2020): Proto-Bizic. A study of Tujia historical phonology. Bachelor Thesis. Stanford University. |
logos | | 5141379 | List, Johann-Mattis, Thomas Mayer, Anselm Terhalle, and Matthias Urban (2014). CLICS: Database of Cross-Linguistic Colexifications. Marburg: Forschungszentrum Deutscher Sprachatlas (Version 1.0). |
utoaztecan | Uto-Aztecan | 5173799 | Greenhill, Simon J., Hannah J. Haynie, Robert M. Ross, Angela M. Chira, List, Johann-Mattis, Lyle Campbell, Carlos A. Botero, and Russell D. Gray (2021): A recent northern origin for the Uto-Aztecan language family. Leipzig: Max Planck Institute for Evolutionary Anthropology. |
abvdoceanic | Oceanic | 5206553 | Greenhill, S.J., Blust. R, & Gray, R.D. (2008). The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4:271-283. |
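
The numeric identifiers in the Zenodo column can be turned into citable links. The sketch below assumes that each identifier is a Zenodo record number, that Zenodo DOIs follow the `10.5281/zenodo.<record>` pattern (as in the walworthpolynesian citation), and that records resolve under `https://zenodo.org/record/<record>` (as in the ivanisuansu citation). The `zenodo_links` helper is hypothetical and performs no network access.

```python
def zenodo_links(record_id):
    """Build the DOI and landing-page URLs for a Zenodo record number.

    Assumes the common Zenodo conventions described above; the function
    only formats strings and does not check that the record exists.
    """
    rid = str(record_id).strip()
    if not rid.isdigit():
        raise ValueError(f"not a numeric Zenodo record id: {record_id!r}")
    doi = f"10.5281/zenodo.{rid}"
    return {
        "doi": doi,
        "doi_url": f"https://doi.org/{doi}",
        "record_url": f"https://zenodo.org/record/{rid}",
    }


# Example with the identifier listed for the aaleykusunda dataset.
links = zenodo_links("5115947")
print(links["doi_url"])
```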