Pre-Indo-European languages

A diagram showing Pre-Indo-European languages. Red dots indicate populations before the Indo-European peoples migrated from the steppes.

The pre-Indo-European languages are any of several ancient languages, not necessarily related to one another, that existed in prehistoric Europe and South Asia before the arrival of speakers of Indo-European languages. The oldest Indo-European language texts date from the 19th century BC in Kültepe (modern Turkey), and while estimates vary widely, the spoken Indo-European languages are believed to have developed at the latest by the 3rd millennium BC (see Proto-Indo-European Urheimat hypotheses). Thus, the pre-Indo-European languages must have developed earlier than or, in some cases, alongside the Indo-European languages that ultimately displaced them. [1] [2] [3]


A handful of the pre-Indo-European languages still survive. In Europe, Basque retains a localised strength, with fewer than a million native speakers, while the Dravidian languages remain very widespread in South Asia, with over 200 million native speakers. Some of the pre-Indo-European languages are attested only as linguistic substrates in Indo-European languages.


Before World War II, all the unclassified languages of Europe and the Near East were commonly referred to as Asianic languages. The term encompassed several languages that were later found to be Indo-European (such as Lydian), while others (such as Hurro-Urartian and Hattic) were later classified as distinct language families. In 1953, the linguist Johannes Hubschmid identified at least five pre-Indo-European language families in Western Europe: Eurafrican, which covered North Africa, Italy, Spain and France; Hispano-Caucasian, which replaced Eurafrican and stretched from northern Spain to the Caucasus Mountains; Iberian, which was spoken across most of Spain before the Roman conquest of the Iberian Peninsula; Libyan, which was spoken mostly in North Africa but encroached into Sardinia; and Etruscan, which was spoken in northern Italy. [4]

The term pre-Indo-European is not universally accepted: some linguists hold that the speakers of the unclassified languages arrived in Europe relatively late, possibly even after the Indo-European languages, and so prefer to speak of non-Indo-European languages. A newer term, Paleo-European, is not applicable to the languages that predated or coexisted with Indo-European outside Europe.

Surviving languages

Surviving pre-Indo-European languages are held to include the following: [5]

Languages that contributed substrates to Indo-European languages

Examples of suggested or known substrate influences on specific Indo-European languages include the following: [citation needed]

Other propositions are generally rejected by modern linguists:

Attested languages

Languages attested in inscriptions include the following: [citation needed]

Later Indo-European expansion

Indo-European languages have in turn been replaced by other Indo-European languages. Most prominently, most of the Celtic languages gave way to Germanic or Romance varieties as a result of Roman rule and the invasions of Germanic tribes.

However, languages replaced or engulfed by Indo-European in ancient times must be distinguished from those replaced or engulfed by Indo-European languages in more recent times. In particular, the vast majority of the major languages spread by colonialism have been Indo-European. Over the last few centuries this has produced superficially similar linguistic islands, formed, for example, by indigenous languages of the Americas (now surrounded by English, Spanish, Portuguese, Dutch, and French) and by several Uralic languages (now surrounded by Russian). [citation needed] Many creole languages have also arisen based upon Indo-European colonial languages.


