Unclassified language

An unclassified language is a language whose genetic affiliation to other languages has not been established. Languages can be unclassified for a variety of reasons, most often a lack of reliable data [1] but sometimes the confounding influence of language contact: different layers of a language's vocabulary or morphology may point in different directions, leaving it unclear which layer represents the ancestral form of the language. [2] Some poorly known extinct languages, such as Gutian and Cacán, are simply unclassifiable, and it is unlikely the situation will ever change.

A supposedly unclassified language may turn out not to be a language at all, or even a distinct dialect, but merely a family, tribal or village name, or an alternative name for a people or language that is classified.

If a language's genetic relationship has not been established after significant documentation of the language and comparison with other languages and families, as in the case of Basque in Europe, it is considered a language isolate – that is, it is classified as a language family of its own. An 'unclassified' language therefore is one which may still turn out to belong to an established family once better data is available or more thorough comparative research is done. Extinct unclassified languages for which little evidence has been preserved are likely to remain in limbo indefinitely, unless lost documents or a surviving speaking population are discovered.

Classification challenges

An example of a language that has caused multiple problems for classification is Mimi of Decorse in Chad. This language is attested only in a single word list collected ca. 1900. At first it was thought to be a Maban language, because of similarities to Maba, the first Maban language to be described. However, as other languages of the Maban family were described, it became clear that the similarities were solely with Maba itself, and the relationship was too distant for Mimi to be related specifically to Maba and not equally to the other Maban languages. The obvious similarities are therefore now thought to be due to borrowings from Maba, which is the socially dominant language in the area. When such loans are discounted, there is much less data with which to classify Mimi, and what does remain is not particularly similar to any other language or language family. Mimi might therefore be a language isolate, or perhaps a member of some other family related to Maban in the proposed but as yet undemonstrated Nilo-Saharan phylum. It would be easier to address the problem with better data, but no one has been able to find speakers of the language again.

A language may also be unclassified within an established family. That is, it may be obvious that it is, say, a Malayo-Polynesian language, but not clear to which branch of Malayo-Polynesian it belongs. When a family consists of many similar languages with a great degree of confusing contact, a large number of languages may be effectively unclassified in this manner. Families where this is a substantial problem include Malayo-Polynesian, Bantu, Pama–Nyungan, and Arawakan.

Examples by reason

There are hundreds of unclassified languages, most of them extinct, although a relatively small number are still spoken. In the following list, extinct languages are labeled with a dagger (†).

Absence of data

These languages are unclassifiable, not just unclassified: although there may be a record that a language existed, too little material in it may survive to analyze and classify, especially in the case of now-extinct languages. (See, for example, a list of unclassified languages of South America.)

Scarcity of data

Many of these languages are also considered unclassifiable, as the amount of data may not be enough to reveal close relatives even if any exist. For others there may be enough data to show that the language belongs to a particular family, but not where within it, or to show that the language has no close relatives, but not enough to conclude that it is a language isolate.

Unrelated to nearby languages and not commonly examined

Basic vocabulary unrelated to other languages

Languages of dubious existence

Some 'languages' turn out to be fabricated, such as the Kukurá language of Brazil.

Notes

  1. The Harappan 'script' that decipherers rely on for identification has so far not been deciphered, and is likely not actually a script.
  2. According to Rupert Moser, "The Hamba were hunters and gatherers who were resettled and scattered in the 1950s, when their hunting-and-gathering area [located northwest of Nachingwea south of the Mbemkuru River] was planned to be used for ground-nut-plantations. Though that project failed for climatical reasons, the Hamba vanished or were assimilated by neighbouring groups [such as the Matumbi and Yao in addition to those listed next]. Already before parts of them had been assimilated by invading Mwera, Ndonde, Ndendeule and Ngindo." [3]
  3. 'Okwa' is attested by one word collected in the 18th century, tschabee 'God' (in German orthography), and it is not known whether this is a native word or a loan. The language is not so much unclassified as unidentified. [4]

References

  1. Hasnain, Imtiaz (2013-07-16). Alternative Voices: (Re)searching Language, Culture, Identity …. Cambridge Scholars Publishing. p. 314. ISBN 9781443849982.
  2. Muysken, Pieter (2008). From Linguistic Areas to Areal Linguistics. John Benjamins Publishing. p. 168. ISBN 978-9027231000.
  3. Gabriele Sommer, 'A Survey on Language Death in Africa', in Brenzinger (2012) Language Death, p. 351.
    See Hammarström, Harald; Forkel, Robert; Haspelmath, Martin, eds. (2017). "Hamba". Glottolog 3.0. Jena, Germany: Max Planck Institute for the Science of Human History.
  4. Hammarström, Harald; Forkel, Robert; Haspelmath, Martin, eds. (2017). "Okwa". Glottolog 3.0. Jena, Germany: Max Planck Institute for the Science of Human History.