This article ranks human languages by their number of native speakers.
However, all such rankings should be used with caution, because it is not possible to devise a coherent set of linguistic criteria for distinguishing languages in a dialect continuum. For example, a language is often defined as a set of varieties that are mutually intelligible, but independent national standard languages may be considered to be separate languages even though they are largely mutually intelligible, as in the case of Danish and Norwegian. Conversely, many commonly accepted languages, including German, Italian and even English, encompass varieties that are not mutually intelligible. While Arabic is sometimes considered a single language centred on Modern Standard Arabic, other authors describe its mutually unintelligible varieties as separate languages. Similarly, Chinese is sometimes viewed as a single language because of a shared culture and common literary language. It is also common to describe various Chinese dialect groups, such as Mandarin, Wu and Yue, as languages, even though each of these groups contains many mutually unintelligible varieties.
There are also difficulties in obtaining reliable counts of speakers, which vary over time because of population change and language shift. In some areas, there is no reliable census data, the data is not current, or the census does not record the languages spoken or records them ambiguously. Sometimes speaker populations are exaggerated for political reasons, or speakers of minority languages may be under-reported in favour of a national language.
The following languages are listed as having at least 10 million first language speakers in the 2019 edition of Ethnologue, a language reference published by SIL International, which is based in the United States.
|Rank||Language||Speakers (millions)||% of the World population||Language family|
|1||Mandarin Chinese||918||11.922|| Sino-Tibetan |
|2||Spanish||480||5.994|| Indo-European |
|3||English||379||4.922|| Indo-European |
|4||Hindi (Sanskritised Hindustani)||341||4.429|| Indo-European |
|5||Bengali||228||2.961|| Indo-European |
|6||Portuguese||221||2.870|| Indo-European |
|7||Russian||154||2.000|| Indo-European |
|8||Japanese||128||1.662|| Japonic |
|9||Western Punjabi||92.7||1.204|| Indo-European |
|10||Marathi||83.1||1.079|| Indo-European |
|11||Telugu||82.0||1.065|| Dravidian |
|12||Wu Chinese||81.4||1.057|| Sino-Tibetan |
|13||Turkish||79.4||1.031|| Turkic |
|14||Korean||77.3||1.004|| Koreanic |
|15||French||77.2||1.003|| Indo-European |
|16||German||76.1||0.988|| Indo-European |
|17||Vietnamese||76.0||0.987|| Austroasiatic |
|18||Tamil||75.0||0.974|| Dravidian |
|19||Yue Chinese||73.1||0.949|| Sino-Tibetan |
|20||Urdu (Persianised Hindustani)||68.6||0.891|| Indo-European |
|21||Javanese||68.3||0.887|| Austronesian |
|22||Italian||64.8||0.842|| Indo-European |
|23||Egyptian Arabic||64.6||0.839|| Afroasiatic |
|24||Gujarati||56.4||0.732|| Indo-European |
|25||Iranian Persian||52.8||0.686|| Indo-European |
|26||Bhojpuri||52.2||0.678|| Indo-European |
|27||Min Nan Chinese||50.1||0.651|| Sino-Tibetan |
|28||Hakka Chinese||48.2||0.626|| Sino-Tibetan |
|29||Jin Chinese||46.9||0.609|| Sino-Tibetan |
|30||Hausa||43.9||0.570|| Afroasiatic |
|31||Kannada||43.6||0.566|| Dravidian |
|32||Indonesian (Indonesian Malay)||43.4||0.564|| Austronesian |
|33||Polish||39.7||0.516|| Indo-European |
|34||Yoruba||37.8||0.491|| Niger–Congo |
|35||Xiang Chinese||37.3||0.484|| Sino-Tibetan |
|36||Malayalam||37.1||0.482|| Dravidian |
|37||Odia||34.5||0.448|| Indo-European |
|38||Maithili||33.9||0.440|| Indo-European |
|39||Burmese||32.9||0.427|| Sino-Tibetan |
|40||Eastern Punjabi||32.6||0.423|| Indo-European |
|41||Sunda||32.4||0.421|| Austronesian |
|42||Sudanese Arabic||31.9||0.414|| Afroasiatic |
|43||Algerian Arabic||29.4||0.382|| Afroasiatic |
|44||Moroccan Arabic||27.5||0.357|| Afroasiatic |
|45||Ukrainian||27.3||0.355|| Indo-European |
|46||Igbo||27.0||0.351|| Niger–Congo |
|47||Northern Uzbek||25.1||0.326|| Turkic |
|48||Sindhi||24.6||0.319|| Indo-European |
|49||North Levantine Arabic||24.6||0.319|| Afroasiatic |
|50||Romanian||24.3||0.316|| Indo-European |
|51||Tagalog||23.6||0.306|| Austronesian |
|52||Dutch||23.1||0.300|| Indo-European |
|53||Saʽidi Arabic||22.4||0.291|| Afroasiatic |
|54||Gan Chinese||22.1||0.287|| Sino-Tibetan |
|55||Amharic||21.9||0.284|| Afroasiatic |
|56||Northern Pashto||20.9||0.271|| Indo-European |
|57||Magahi||20.7||0.269|| Indo-European |
|58||Thai||20.7||0.269|| Kra–Dai |
|59||Saraiki||20.0||0.260|| Indo-European |
|60||Khmer||16.6||0.216|| Austroasiatic |
|61||Chhattisgarhi||16.3||0.212|| Indo-European |
|62||Somali||16.2||0.210|| Afroasiatic |
|63||Malay (Malaysian Malay)||16.1||0.209|| Austronesian |
|64||Cebuano||15.9||0.206|| Austronesian |
|65||Nepali||15.8||0.205|| Indo-European |
|66||Mesopotamian Arabic||15.7||0.204|| Afroasiatic |
|67||Assamese||15.3||0.199|| Indo-European |
|68||Sinhalese||15.3||0.199|| Indo-European |
|69||Northern Kurdish||14.6||0.190|| Indo-European |
|70||Hejazi Arabic||14.5||0.188|| Afroasiatic |
|71||Nigerian Fulfulde||14.5||0.188|| Niger–Congo |
|72||Bavarian||14.1||0.183|| Indo-European |
|73||South Azerbaijani||13.8||0.179|| Turkic |
|74||Greek||13.1||0.170|| Indo-European |
|75||Chittagonian||13.0||0.169|| Indo-European |
|76||Kazakh||12.9||0.168|| Turkic |
|77||Deccan||12.8||0.166|| Indo-European |
|78||Hungarian||12.6||0.164|| Uralic |
|79||Kinyarwanda||12.1||0.157|| Niger–Congo |
|80||Zulu||12.1||0.157|| Niger–Congo |
|81||South Levantine Arabic||11.6||0.151|| Afroasiatic |
|82||Tunisian Arabic||11.6||0.151|| Afroasiatic |
|83||Sanaani Spoken Arabic||11.4||0.148|| Afroasiatic |
|84||Min Bei Chinese||11.0||0.143|| Sino-Tibetan |
|85||Southern Pashto||10.9||0.142|| Indo-European |
|86||Rundi||10.8||0.140|| Niger–Congo |
|87||Czech||10.7||0.139|| Indo-European |
|88||Taʽizzi-Adeni Arabic||10.5||0.136|| Afroasiatic |
|89||Uyghur||10.4||0.135|| Turkic |
|90||Min Dong Chinese||10.3||0.134|| Sino-Tibetan |
|91||Sylheti||10.3||0.134|| Indo-European |
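The percentage column in the table above can be approximated by dividing each speaker count by a world population figure. The sketch below is only an illustration: the source does not state which population figure was used, so the value of roughly 7.7 billion is an assumption inferred from rows such as Mandarin Chinese (918 million listed as 11.922%), and the function name `share_of_world` is hypothetical.

```python
# Assumption: a world population of about 7.7 billion, inferred from the table
# (for example, 918 million speakers listed as 11.922%). The source does not state it.
ASSUMED_WORLD_POPULATION_MILLIONS = 7700.0


def share_of_world(speakers_millions: float,
                   world_millions: float = ASSUMED_WORLD_POPULATION_MILLIONS) -> float:
    """Return the speakers' share of the world population, in percent (3 decimals)."""
    if world_millions <= 0:
        raise ValueError("world population must be positive")
    return round(100.0 * speakers_millions / world_millions, 3)


# A few rows copied from the table above: (language, speakers in millions).
rows = [("Mandarin Chinese", 918), ("English", 379), ("Japanese", 128)]

for language, speakers in rows:
    print(f"{language}: {share_of_world(speakers)}%")
```

Recomputing the column this way is a quick consistency check on a row: a percentage that does not match its speaker count under the assumed population suggests that one of the two figures was transcribed or updated separately.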
The following table contains the top 100 languages by estimated number of native speakers in the 2007 edition of the Swedish encyclopedia Nationalencyklopedin. As census methods in different countries vary to a considerable extent, and given that some countries do not record language in their censuses, any list of languages by native speakers, or total speakers, is effectively based on estimates. Updated estimates from 2010 are also provided.
The top eleven languages have additional figures from the 2010 edition of the Nationalencyklopedin. Numbers above 95 million are rounded off to the nearest 5 million.
|Rank||Language||Native speakers in millions, 2007 (2010)||% of world population (2007)|
|1||Mandarin (entire branch)||935 (955)||14.1%|
|13||Wu (inc. Shanghainese)||80||1.20%|
|14||Malay (inc. Indonesian and Malaysian)||77||1.16%|
|24||Yue (inc. Cantonese)||59||0.89%|
|28||Southern Min (inc. Hokkien and Teochew)||47||0.71%|
|80||Northern Min||10.9||0.16%|
|87||Eastern Min (inc. Fuzhou dialect)||9.5||0.14%|
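The rounding rule stated for this table (figures above 95 million rounded to the nearest 5 million) can be sketched as follows. This is a minimal illustration, not the encyclopedia's own procedure: the function name `round_reported` is hypothetical, and the treatment of exact halfway values (rounded up here) is an assumption, since the source does not say how ties are handled.

```python
import math


def round_reported(speakers_millions: float) -> float:
    """Round figures above 95 million to the nearest 5 million; keep the rest unchanged."""
    if speakers_millions > 95:
        # floor(x / 5 + 0.5) rounds halves up (an assumption about tie-breaking),
        # which avoids the round-half-to-even behaviour of Python's built-in round().
        return 5.0 * math.floor(speakers_millions / 5 + 0.5)
    return speakers_millions


# Hypothetical raw estimates in millions, chosen only to exercise the rule.
for raw in (933.0, 97.5, 80.0, 10.9):
    print(raw, "->", round_reported(raw))
```

One consequence of the rule is that differences of a few million between the largest languages are not meaningful in this table, while the smaller entries keep their original precision.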
Chinese is a group of languages that forms the Sinitic branch of the Sino-Tibetan languages. Chinese languages are spoken by the ethnic Chinese majority and many minority ethnic groups in China. About 1.2 billion people speak some form of Chinese as their first language.
The term dialect is used in two distinct ways to refer to two different types of linguistic phenomena:
Hindi, or more precisely Modern Standard Hindi, is a standardised and Sanskritised register of the Hindustani language, which itself is based primarily on the Khariboli dialect of Delhi and neighbouring areas of Northern India. Hindi, written in the Devanagari script, is one of the two official languages of the Government of India, along with the English language. It is one of the 22 scheduled languages of the Republic of India.
Urdu – also known as Lashkari – or Modern Standard Urdu is a Persianised standard register of the Hindustani language. Urdu is the official national language, and lingua franca, of Pakistan. In India, it is one of 22 constitutionally recognised official languages, having official status in the five states of Telangana, Uttar Pradesh, Bihar, Jharkhand and West Bengal, as well as the national capital territory of Delhi.
Hindustani, known in its literary forms as Hindi/Urdu and historically as Urdu, Hindavi, Rekhta, Hindi and Dehlavi, is the lingua franca of Northern India and Pakistan. It is an Indo-Aryan language, deriving its base primarily from the Western Hindi dialect of Delhi, also known as Khariboli. Hindustani is a pluricentric language, with two standardised registers, Modern Standard Hindi and Modern Standard Urdu.
The Indo-Aryan languages, or Indic languages, are a major language family native to the northern Indian subcontinent, and presently found all across South Asia. They constitute a branch of the Indo-Iranian languages, itself a branch of the Indo-European language family. In the early 21st century, Indo-Aryan languages were spoken by more than 800 million people, primarily in India, Bangladesh, Nepal, Pakistan and Sri Lanka. Moreover, there are large immigrant and expatriate Indo-Aryan speaking communities in Northwestern Europe, Western Asia, North America and Australia. There are about 219 known Indo-Aryan languages in the world.
Languages spoken in India belong to several language families, the major ones being the Indo-Aryan languages spoken by 78.05% of Indians and the Dravidian languages spoken by 19.64% of Indians. Languages spoken by the remaining 2.31% of the population belong to the Austroasiatic, Sino-Tibetan, Tai-Kadai and a few other minor language families and isolates. India (780) has the world's second highest number of languages, after Papua New Guinea (839).
A dialect continuum or dialect chain is a spread of language varieties spoken across some geographical area such that neighboring varieties differ only slightly, but the differences accumulate over distance so that widely separated varieties may not be mutually intelligible. This is a typical occurrence with widely spread languages and language families around the world; prominent examples are found across large parts of India; Iran, neighboring Afghanistan, and Tajikistan; the Arab world; Portugal, Spain, France, southern Belgium (Wallonia) and southern Italy; from Flanders to Germany to Austria; the Chinese languages or dialects; and several divisions of the Slavic languages. Leonard Bloomfield used the name dialect area. Charles F. Hockett used the term L-complex. It is analogous to a ring species in evolutionary biology.
Marwari is a Rajasthani language spoken in the Indian state of Rajasthan. Marwari is also found in the neighbouring states of Gujarat and Haryana, in eastern Pakistan and in some migrant communities in the Himalayan country of Nepal. With some 7.9 million speakers, it is one of the largest varieties of Rajasthani. Most speakers live in Rajasthan, with a quarter million in Sindh and a tenth that number in Nepal. There are two dozen dialects of Marwari.
In linguistics, mutual intelligibility is a relationship between languages or dialects in which speakers of different but related varieties can readily understand each other without prior familiarity or special effort. It is sometimes used as an important criterion for distinguishing languages from dialects, although sociolinguistic factors are often also used.
A pluricentric language or polycentric language is a language with several interacting codified standard forms, often corresponding to different countries. Examples include Chinese, English, French, German, Korean, Portuguese, Spanish, Swahili and Tamil. The opposite case is a monocentric language, which has only one formally standardized version. Examples include Japanese and Russian. In some cases, the different standards of a pluricentric language may be elaborated until they become autonomous languages, as happened with Malaysian and Indonesian, and with Hindi and Urdu. The same process is under way in Serbo-Croatian.
The Hindi Belt, also called the Hindi Heartland or Hindi Patti, is a linguistic region encompassing parts of northern, central, eastern and western India where Hindi is widely spoken. The term Hindi belt is sometimes also used to refer to the nine Indian states whose official language is Hindi and which have a Hindi-speaking majority, namely Bihar, Chhattisgarh, Haryana, Himachal Pradesh, Jharkhand, Madhya Pradesh, Rajasthan, Uttarakhand and Uttar Pradesh, together with the National Capital Territory of Delhi. It is also referred to as the Hindi-Urdu belt by some Muslim writers.
Linguistic demography is the statistical study of languages among all populations. Estimating the number of speakers of a given language is not straightforward, and various estimates may diverge considerably. This is first of all due to the question of defining "language" vs. "dialect". Identification of varieties as a single language or as distinct languages is often based on ethnic, cultural, or political considerations rather than mutual intelligibility. The second difficulty is multilingualism, complicating the definition of "native language". Finally, in many countries, insufficient census data add to the difficulties.
The Persian language historically influenced many of the modern languages and dialects of the Middle East, Central Asia, Eastern Europe, and South Asia including the standard register Urdu, the national language of Pakistan.
There are over 500 native languages spoken in Nigeria. The official language of Nigeria is English, the former language of colonial British Nigeria. As reported in 2003, Nigerian English and Nigerian Pidgin were spoken as a second language by 60 million people in Nigeria. Communication in the English language is much more popular in the country's urban communities than it is in the rural areas, due to globalization.
In linguistics, a koiné language, koiné dialect, or simply koiné is a standard or common language or dialect that has arisen to prestige or dominance as a result of the contact, mixing, and often simplifying of two or more mutually intelligible varieties of the same language.
A world language is one that is spoken internationally and learned and spoken by numerous people as a second language. A world language is characterized not only by the total number of speakers but also by geographical distribution and its use in international organizations and diplomatic relations.
The languages of Uttar Pradesh generally belong to two zones in the Indo-Aryan languages, Central and East. Hindi is the state's official language, and according to census data, it is spoken by 91.32% of the population. However, Hindi is a wide label that covers many dialects, which may or may not be considered separate languages and may or may not be fully mutually intelligible. These include Awadhi, Braj Bhasha, Bundeli, Bagheli, Kannauji, Hindustani and Bhojpuri. Bhojpuri belongs to the Bihari languages of the Eastern zone, and its status as a Hindi language is subject to debate.