List of languages by total number of speakers

Principal language families of the world (and in some cases geographic groups of families). For greater detail, see Distribution of languages in the world.

This is a list of languages by total number of speakers.

It is difficult to define what constitutes a language as opposed to a dialect. For example, Chinese and Arabic are sometimes considered single languages, but each includes several mutually unintelligible varieties, and so they are sometimes considered language families instead. Conversely, colloquial registers of Hindi and Urdu are almost completely mutually intelligible, and are sometimes classified as one language, Hindustani. Such rankings should be used with caution, because it is not possible to devise a coherent set of linguistic criteria for distinguishing languages in a dialect continuum. [1]

There is no single criterion for how much knowledge is sufficient to be counted as a second-language speaker. For example, English has about 450 million native speakers but, depending on the criterion chosen, can be said to have as many as two billion speakers. [2]

There are also difficulties in obtaining reliable counts of speakers, which vary over time because of population change and language shift. In some areas, census data is unreliable or out of date, or the census either does not record the languages spoken or records them ambiguously. Sometimes speaker populations are exaggerated for political reasons, or speakers of minority languages may be underreported in favor of a national language. [3]

Ethnologue (2023)

The following languages are listed as having 45 million or more total speakers in the 26th edition of Ethnologue published in 2023. [4] This section does not include entries that Ethnologue identifies as macrolanguages encompassing all their respective varieties, such as Arabic, Lahnda, Persian, Malay, Pashto, and Chinese.

Most spoken languages, Ethnologue, 2023 [4]

| Language | Family | Branch | First-language (L1) speakers | Second-language (L2) speakers | Total speakers (L1+L2) |
|---|---|---|---|---|---|
| English (excl. creole languages) | Indo-European | Germanic | 380 million | 1.077 billion [5] | 1.456 billion |
| Mandarin Chinese (incl. Standard Chinese, but excl. other varieties) | Sino-Tibetan | Sinitic | 939 million | 199 million [6] | 1.138 billion |
| Hindi (excl. Urdu) | Indo-European | Indo-Aryan | 345 million | 266 million [7] | 610 million |
| Spanish (excl. creole languages) | Indo-European | Romance | 485 million | 74 million [8] | 559 million |
| French (excl. creole languages) | Indo-European | Romance | 81 million | 229 million [9] | 310 million |
| Modern Standard Arabic (excl. dialects) | Afro-Asiatic | Semitic | 0 [note 1] | 274 million [11] | 274 million |
| Bengali | Indo-European | Indo-Aryan | 234 million | 39 million [12] | 273 million |
| Portuguese (excl. creole languages) | Indo-European | Romance | 236 million | 27 million [13] | 264 million |
| Russian | Indo-European | Balto-Slavic | 147 million | 108 million [14] | 255 million |
| Urdu (excl. Hindi) | Indo-European | Indo-Aryan | 71 million | 161 million [15] | 232 million |
| Indonesian (excl. other Malay) | Austronesian | Malayo-Polynesian | 44 million | 155 million [16] | 199 million |
| Standard German | Indo-European | Germanic | 75 million | 58 million [17] | 133 million |
| Japanese | Japonic | | 123 million | 0.2 million [18] | 123 million |
| Nigerian Pidgin | English Creole | Krio | 5 million | 116 million [19] | 121 million |
| Egyptian Arabic (excl. other Arabic dialects) | Afro-Asiatic | Semitic | 77 million | 25 million [20] | 102 million |
| Marathi | Indo-European | Indo-Aryan | 83 million | 16 million [21] | 99 million |
| Telugu | Dravidian | South-Central | 83 million | 13 million [22] | 96 million |
| Turkish | Turkic | Oghuz | 84 million | 6 million [23] | 90 million |
| Tamil | Dravidian | Southern | 79 million | 8 million [24] | 87 million |
| Yue Chinese (incl. Cantonese) | Sino-Tibetan | Sinitic | 86 million | 1 million [25] | 87 million |
| Vietnamese | Austroasiatic | Vietic | 85 million | 1 million [26] | 86 million |
| Wu Chinese (incl. Shanghainese) | Sino-Tibetan | Sinitic | 83 million | 0.1 million [27] | 83 million |
| Tagalog [note 2] | Austronesian | Malayo-Polynesian | 29 million | 54 million [28] | 83 million |
| Korean | Koreanic | | 82 million | [29] | 82 million |
| Iranian Persian (excl. other Persian dialects) | Indo-European | Iranian | 57 million | 21 million [30] | 79 million |
| Hausa | Afro-Asiatic | Chadic | 52 million | 27 million [31] | 79 million |
| Swahili | Niger–Congo | Bantu | 16 million | 55 million [32] | 72 million |
| Javanese | Austronesian | Malayo-Polynesian | | [33] | 68 million |
| Italian | Indo-European | Romance | 65 million | 3 million [34] | 68 million |
| Western Punjabi (excl. Eastern Punjabi) | Indo-European | Indo-Aryan | | [35] | 67 million |
| Gujarati | Indo-European | Indo-Aryan | 57 million | 5 million [36] | 62 million |
| Thai | Kra–Dai | Zhuang–Tai | 21 million | 40 million [37] | 61 million |
| Kannada | Dravidian | Southern | 44 million | 15 million [38] | 59 million |
| Amharic | Afro-Asiatic | Semitic | 32 million | 25 million [39] | 58 million |
| Bhojpuri | Indo-European | Indo-Aryan | 52 million | 0.2 million [40] | 52 million |
| Eastern Punjabi (excl. Western Punjabi) | Indo-European | Indo-Aryan | 48 million | 4 million [41] | 52 million |
| Min Nan Chinese (incl. Hokkien) | Sino-Tibetan | Sinitic | 50 million | 0.4 million [42] | 50 million |
| Jin Chinese | Sino-Tibetan | Sinitic | | [43] | 48 million |
| Levantine Arabic (excl. other Arabic dialects) | Afro-Asiatic | Semitic | 47 million | 0.4 million [44] | 48 million |
| Yoruba | Niger–Congo | Atlantic–Congo | 44 million | 2 million [45] | 46 million |
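
The listed totals do not always equal the sum of the rounded L1 and L2 figures; Hindi, for instance, is listed at 610 million although 345 + 266 = 611. The Python sketch below illustrates this for a handful of rows copied from the list above. It assumes the gaps come from each figure being rounded separately to the nearest million, which the source does not state, and the helper names `rounding_gap` and `check_rows` are illustrative only. Rows with a missing L1 or L2 figure are left out.

```python
# Figures in millions, copied from a few rows of the Ethnologue (2023) list.
# Each tuple is (language, L1, L2, listed_total).
ROWS = [
    ("English", 380, 1077, 1456),
    ("Mandarin Chinese", 939, 199, 1138),
    ("Hindi", 345, 266, 610),
    ("Spanish", 485, 74, 559),
    ("French", 81, 229, 310),
    ("Portuguese", 236, 27, 264),
    ("Swahili", 16, 55, 72),
]


def rounding_gap(l1, l2, listed_total):
    """Return listed_total minus (l1 + l2), in millions."""
    return listed_total - (l1 + l2)


def check_rows(rows, tolerance=1):
    """Map each language to its gap; raise if a gap exceeds the tolerance.

    The tolerance of 1 million is an assumption about independent rounding,
    not something the source specifies.
    """
    gaps = {}
    for name, l1, l2, listed_total in rows:
        gap = rounding_gap(l1, l2, listed_total)
        if abs(gap) > tolerance:
            raise ValueError(f"{name}: gap of {gap} million exceeds tolerance")
        gaps[name] = gap
    return gaps


if __name__ == "__main__":
    for name, gap in check_rows(ROWS).items():
        print(f"{name}: {gap:+d} million")
```

A gap of one million in either direction is what separate rounding of the three columns would produce, so these rows are consistent with that assumption without proving it.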

CIA World Factbook (2022)

The World Factbook, produced by the Central Intelligence Agency (CIA), estimates the ten most spoken languages (L1 + L2) in 2022 as follows: [46]

Most spoken languages, CIA, 2022 [46]

| Language | Percentage of world population (2022) |
|---|---|
| English | 18.8% |
| Mandarin Chinese | 13.8% |
| Hindi | 7.5% |
| Spanish | 6.9% |
| French | 3.4% |
| Arabic | 3.4% |
| Bengali | 3.4% |
| Russian | 3.2% |
| Portuguese | 3.2% |
| Urdu | 2.9% |
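
To compare these shares with head counts, each percentage can be multiplied by a world population figure. The Python sketch below does this under a loudly stated assumption: a round world population of 8 billion, which is not taken from the Factbook list and is used here only for illustration. The names `ASSUMED_WORLD_POPULATION` and `implied_speakers_millions` are likewise hypothetical.

```python
# ASSUMPTION: a round figure of 8 billion people, chosen for illustration.
# It does not come from the CIA list above.
ASSUMED_WORLD_POPULATION = 8_000_000_000

# Shares of world population (percent), copied from the CIA 2022 list.
CIA_SHARES = {
    "English": 18.8,
    "Mandarin Chinese": 13.8,
    "Hindi": 7.5,
    "Spanish": 6.9,
    "French": 3.4,
    "Arabic": 3.4,
    "Bengali": 3.4,
    "Russian": 3.2,
    "Portuguese": 3.2,
    "Urdu": 2.9,
}


def implied_speakers_millions(share_percent,
                              world_population=ASSUMED_WORLD_POPULATION):
    """Convert a percentage share into speakers, rounded to whole millions."""
    return round(share_percent / 100 * world_population / 1_000_000)


if __name__ == "__main__":
    for language, share in CIA_SHARES.items():
        millions = implied_speakers_millions(share)
        print(f"{language}: about {millions} million")
```

Under that assumed population, the 18.8% share for English corresponds to roughly 1.5 billion speakers. Because the shares count L1 and L2 speakers together, one person can be counted under several languages, so the percentages are not expected to sum to 100.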

Notes

  1. Modern Standard Arabic (MSA) is not an L1. Arabic speakers first learn their respective local dialect. MSA is acquired through formal education. [10]
  2. Tagalog and Filipino are defined as two different languages in the ISO 639 standard. Ethnologue considers that Filipino is a standardized variety of the Tagalog language with no speakers.

References

  1. Paolillo, John C.; Das, Anupam (31 March 2006). "Evaluating language statistics: the Ethnologue and beyond" (PDF). UNESCO Institute of Statistics. pp. 3–5. Retrieved 17 November 2018.
  2. Crystal, David (March 2008). "Two thousand million?". English Today. 24: 3–6. doi:10.1017/S0266078408000023. S2CID 145597019.
  3. Crystal, David (1988). The Cambridge Encyclopedia of Language. Cambridge University Press. pp. 286–287. ISBN 978-0-521-26438-9.
  4. "What are the top 200 most spoken languages?". Ethnologue. 2023. Retrieved 2023-08-25.
  5. English at Ethnologue (26th ed., 2023)
  6. Chinese, Mandarin at Ethnologue (26th ed., 2023)
  7. Hindi at Ethnologue (26th ed., 2023)
  8. Spanish at Ethnologue (26th ed., 2023)
  9. French at Ethnologue (26th ed., 2023)
  10. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  11. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  12. Bengali at Ethnologue (26th ed., 2023)
  13. Portuguese at Ethnologue (26th ed., 2023)
  14. Russian at Ethnologue (26th ed., 2023)
  15. Urdu at Ethnologue (26th ed., 2023)
  16. Indonesian at Ethnologue (26th ed., 2023)
  17. German, Standard at Ethnologue (26th ed., 2023)
  18. Japanese at Ethnologue (26th ed., 2023)
  19. Nigerian Pidgin at Ethnologue (26th ed., 2023)
  20. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  21. Marathi at Ethnologue (26th ed., 2023)
  22. Telugu at Ethnologue (26th ed., 2023)
  23. Turkish at Ethnologue (26th ed., 2023)
  24. Tamil at Ethnologue (26th ed., 2023)
  25. Chinese, Yue at Ethnologue (26th ed., 2023)
  26. Vietnamese at Ethnologue (26th ed., 2023)
  27. Chinese, Wu at Ethnologue (26th ed., 2023)
  28. Tagalog at Ethnologue (26th ed., 2023)
  29. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  30. Persian, Iranian at Ethnologue (26th ed., 2023)
  31. Hausa at Ethnologue (26th ed., 2023)
  32. Swahili at Ethnologue (26th ed., 2023)
  33. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  34. Italian at Ethnologue (26th ed., 2023)
  35. Western Punjabi at Ethnologue (26th ed., 2023)
  36. Gujarati at Ethnologue (26th ed., 2023)
  37. Thai at Ethnologue (26th ed., 2023)
  38. Kannada at Ethnologue (26th ed., 2023)
  39. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  40. Bhojpuri at Ethnologue (26th ed., 2023)
  41. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  42. Chinese, Min Nan at Ethnologue (26th ed., 2023)
  43. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  44. List of languages by total number of speakers at Ethnologue (26th ed., 2023)
  45. Yoruba at Ethnologue (26th ed., 2023)
  46. "Most spoken languages in the World". The World Factbook. CIA. Retrieved 2022-01-01.