An unclassified language is a language whose genetic affiliation to other languages has not been established. A language can be unclassified for a variety of reasons, mostly due to a lack of reliable data [1] but sometimes due to the confounding influence of language contact, when different layers of its vocabulary or morphology point in different directions and it is not clear which represents the ancestral form of the language. [2] Some poorly known extinct languages, such as Gutian and Cacán, are simply unclassifiable, and it is unlikely the situation will ever change.
A supposedly unclassified language may turn out not to be a language at all, or even a distinct dialect, but merely a family, tribal or village name, or an alternative name for a people or language that is classified.
If a language's genetic relationship has not been established after significant documentation of the language and comparison with other languages and families, as in the case of Basque in Europe, it is considered a language isolate – that is, it is classified as a language family of its own. An 'unclassified' language therefore is one which may still turn out to belong to an established family once better data is available or more thorough comparative research is done. Extinct unclassified languages for which little evidence has been preserved are likely to remain in limbo indefinitely, unless lost documents or a surviving speaking population are discovered.
An example of a language that has caused multiple problems for classification is Mimi of Decorse in Chad. This language is only attested in a single word list collected ca. 1900. At first it was thought to be a Maban language, because of similarities to Maba, the first Maban language to be described. However, as other languages of the Maban family were described, it became clear that the similarities were solely with Maba itself, and the relationship was too distant for Mimi to be related specifically to Maba and not equally to the other Maban languages. The obvious similarities are therefore now thought to be due to borrowings from Maba, which is the socially dominant language in the area. When such loans are discounted, there is much less data to classify Mimi with, and what does remain is not particularly similar to any other language or language family. Mimi might therefore be a language isolate, or perhaps a member of some other family related to Maban in the proposed but as yet undemonstrated Nilo-Saharan phylum. It would be easier to address the problem with better data, but no one has been able to find speakers of the language again.
It also happens that a language may be unclassified within an established family. That is, it may be obvious that it is, say, a Malayo-Polynesian language, but not clear to which branch of Malayo-Polynesian it belongs. When a family consists of many similar languages with a great degree of confusing contact, a large number of languages may be effectively unclassified in this manner. Families where this is a substantial problem include Malayo-Polynesian, Bantu, Pama–Nyungan, and Arawakan.
There are hundreds of unclassified languages, most of them extinct, although there are some, albeit relatively few, that are still spoken; in the following list, the extinct languages are labeled with a dagger (†).
Some languages are unclassifiable, not just unclassified: although there may be a record that the language existed, there may not be enough material in it to analyze and classify, especially in the case of now-extinct languages. (See, for example, list of unclassified languages of South America.) Unclassifiable languages with an absence of data include:
Many unclassified languages are also considered unclassifiable due to the presence of some, but not enough, data to reveal close language relatives. For others there may be enough data to show the language belongs to a particular family, but not where within it, or to show the language has no close relatives, but not enough to conclude that it is a language isolate.
Some 'languages' turn out to be fabricated, such as the Kukurá language of Brazil.
A language family is a group of languages related through descent from a common ancestor, called the proto-language of that family. The term family is a metaphor borrowed from biology, with the tree model used in historical linguistics analogous to a family tree, or to phylogenetic trees of taxa used in evolutionary taxonomy. Linguists thus describe the daughter languages within a language family as being genetically related. The divergence of a proto-language into daughter languages typically occurs through geographical separation, with different regional dialects of the proto-language undergoing different language changes and thus becoming distinct languages over time.
The Nilo-Saharan languages are a proposed family of around 210 African languages spoken by somewhere around 70 million speakers, mainly in the upper parts of the Chari and Nile rivers, including historic Nubia, north of where the two tributaries of the Nile meet. The languages extend through 17 nations in the northern half of Africa: from Algeria to Benin in the west; from Libya to the Democratic Republic of the Congo in the centre; and from Egypt to Tanzania in the east.
The number of languages natively spoken in Africa is variously estimated at between 1,250 and 2,100, and by some counts at over 3,000. Nigeria alone has over 500 languages, one of the greatest concentrations of linguistic diversity in the world. The languages of Africa belong to many distinct language families, among which the largest are:
Asia is home to hundreds of languages comprising several families and some unrelated isolates. The most spoken language families on the continent include Austroasiatic, Austronesian, Japonic, Dravidian, Indo-European, Afroasiatic, Turkic, Sino-Tibetan, Kra–Dai and Koreanic. Many languages of Asia, such as Chinese, Persian, Sanskrit, Arabic, Tamil or Telugu, have a long history as a written language.
A language isolate is a language that has no demonstrable genetic relationship with any other language. Basque in Europe, Ainu in Asia, Sandawe in Africa, Haida and Zuni in North America, Kanoê in South America, Tiwi in Australia and Burushaski in Pakistan are all examples of such languages. The exact number of language isolates is as yet unknown due to insufficient data on several languages.
This is a list of different language classification proposals developed for the Indigenous languages of the Americas or Amerindian languages. The article is divided into North, Central, and South America sections; however, the classifications do not correspond to these divisions.
The Torricelli languages are a family of about fifty languages of the northern Papua New Guinea coast, spoken by about 80,000 people. They are named after the Torricelli Mountains. The most populous and best known Torricelli language is Arapesh, with about 30,000 speakers.
The South Halmahera–West New Guinea (SHWNG) languages are a branch of the Malayo-Polynesian languages, found in the islands and along the shores of the Halmahera Sea in the Indonesian province of North Maluku and of Cenderawasih Bay in the provinces of Papua and West Papua. There are 38 languages.
The pre-Indo-European languages are any of several ancient languages, not necessarily related to one another, that existed in Prehistoric Europe, Asia Minor, Ancient Iran and Southern Asia before the arrival of speakers of Indo-European languages. The oldest Indo-European language texts are Hittite and date from the 19th century BC in Kültepe, and while estimates vary widely, the spoken Indo-European languages are believed to have developed at the latest by the 3rd millennium BC. Thus, the pre-Indo-European languages must have developed earlier than or, in some cases, alongside the Indo-European languages that ultimately displaced almost all of them.
The family of Northwest Solomonic languages is a branch of the Oceanic languages. It includes the Austronesian languages of Bougainville and Buka in Papua New Guinea, and of Choiseul, New Georgia, and Santa Isabel in Solomon Islands.
The Marañón River basin, at a low point in the Andes which made it an attractive location for trade between the Inca Empire and the Amazon basin, once harbored numerous languages which have been poorly attested or not attested at all. Those of the middle reaches of the river, above the Amazon basin, were replaced in historical times by Aguaruna, a Jivaroan language from the Amazon which is still spoken there. The languages further upriver are difficult to identify, due to lack of data. The region was multilingual at the time of the Conquest, and the people largely switched to Spanish rather than to Quechua, though Quechua also expanded during Colonial times.
The East Kainji languages are spoken in a compact area of the Jos Plateau in Nigeria, near Jos. There are more than 20 of them, most of which are poorly studied.
Mimi of Decorse, also known as Mimi of Gaudefroy-Demombynes and Mimi-D, is a language of Chad that is attested only in a word list labelled "Mimi" that was collected ca. 1900 by G. J. Decorse and published by Maurice Gaudefroy-Demombynes. Joseph Greenberg (1960) classified it as a Maban language, like the rather remote Maban relative Mimi of Nachtigal. However, George Starostin (2011) rejects this classification, arguing that similarities to Maban are due to contact with locally dominant Maba, and provisionally regards it as a language isolate, though it is suggestive of Central Sudanic.
Kĕnaboi is an extinct unclassified language of Negeri Sembilan, Malaysia that may be a language isolate or an Austroasiatic language belonging to the Aslian branch. It is attested in what appears to be two dialects, based on word lists of about 250 lexical items, presumably collected around 1870–90.
Spurious languages are languages that have been reported as existing in reputable works, while other research has reported that the language in question did not exist. Some spurious languages have been proven to not exist. Others have very little evidence supporting their existence, and have been dismissed in later scholarship. Others still are of uncertain existence due to limited research.
The Negrito peoples of the Philippines speak various Philippine languages. They have more in common with neighboring languages than with each other, and are listed here merely as an aid to identification.
Glottolog is an open-access online bibliographic database of the world's languages. In addition to listing linguistic materials describing individual languages, the database also contains the most up-to-date language affiliations based on the work of expert linguists.
The Bawean dialect, also known as the Bawean language, is a dialect of the Madurese language spoken predominantly by the Bawean people on Bawean island. The dialect has four major sub-dialects, spoken predominantly in the villages of Daun and Suwari in Sangkapura and the village of Kepuhteluk in the district of Tambak, as well as additional sub-dialects of Bawean Creole.