Slightly over half of the homepages of the most visited websites on the World Wide Web are in English, with varying amounts of information available in many other languages. [1] [2] Other top languages are Chinese, Spanish, Russian, Persian, French, German and Japanese. [1]
Of the more than 7,000 existing languages, only a few hundred are recognized as being in use for Web pages on the World Wide Web. [3]
There is debate over the most-used languages on the Internet. A 2009 UNESCO report monitoring the languages of websites for 12 years, from 1996 to 2008, found a steady year-on-year decline in the percentage of webpages in English, from 75 percent in 1998 to 45 percent in 2005. [2] The authors found that English remained at 45 percent of content from 2005 to the end of the study, but they believe this was due to the bias of search engines indexing more English-language content rather than a true stabilization of the percentage of content in English on the World Wide Web. [2]
The number of non-English web pages is rapidly expanding. The use of English online increased by around 281 percent from 2001 to 2011, a lower rate of growth than that of Spanish (743 percent), Chinese (1,277 percent), Russian (1,826 percent) or Arabic (2,501 percent) over the same period. [4]
According to a 2000 study, the international auxiliary language Esperanto ranked 40th among all languages in search engine queries and 27th among languages that rely on the Latin script. [5]
W3Techs estimated percentages of the top 10 million websites on the World Wide Web using various content languages as of 18 April 2024: [1]
Rank | Language | 16 May 2023 | 18 April 2024 |
---|---|---|---|
1 | English | 55.5% | 50.5% |
2 | Spanish | 5.0% | 5.7% |
3 | Russian | 4.9% | 4.2% |
4 | German | 4.3% | 5.2% |
5 | French | 4.4% | 4.3% |
6 | Japanese | 3.7% | 4.7% |
7 | Portuguese | 2.4% | 3.5% |
8 | Turkish | 2.3% | 1.9% |
9 | Italian | 1.9% | 2.5% |
10 | Persian | 1.8% | 1.4% |
11 | Dutch, Flemish | 1.5% | 2.0% |
12 | Polish | 1.4% | 1.7% |
13 | Chinese | 1.4% | 1.3% |
14 | Vietnamese | 1.3% | 1.2% |
15 | Indonesian | 0.7% | 1.2% |
16 | Czech | 0.7% | 0.9% |
17 | Korean | 0.7% | 0.8% |
18 | Arabic | 0.7% | 0.6% |
19 | Ukrainian | 0.6% | 0.7% |
20 | Greek | 0.5% | 0.5% |
21 | Hebrew | 0.5% | 0.4% |
22 | Swedish | 0.5% | 0.5% |
23 | Romanian | 0.4% | 0.5% |
24 | Hungarian | 0.4% | 0.6% |
25 | Thai | 0.4% | 0.4% |
26 | Danish | 0.3% | 0.4% |
27 | Slovak | 0.3% | 0.3% |
28 | Finnish | 0.3% | 0.4% |
29 | Bulgarian | 0.2% | 0.3% |
30 | Serbian | 0.2% | 0.2% |
31 | Norwegian (Bokmål) | 0.1% | 0.2% |
32 | Croatian | 0.1% | 0.2% |
33 | Lithuanian | 0.1% | 0.2% |
34 | Slovenian | 0.1% | 0.1% |
35 | Catalan, Valencian | 0.1% | 0.1% |
36 | Norwegian | 0.1% | 0.1% |
37 | Estonian | 0.1% | 0.1% |
38 | Latvian | 0.1% | 0.1% |
All other languages are used in less than 0.1% of websites. Even including all languages, percentages may not sum to 100% because some websites contain multiple content languages.
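The effect of multi-language sites on the totals can be illustrated with a minimal sketch. The site names and language lists below are hypothetical sample data, not W3Techs figures, and the counting rule (each language counted once per site) is an assumption made for illustration rather than a description of the W3Techs method.

```python
from collections import Counter

# Hypothetical sample: the content languages detected on each site.
sites = {
    "site-a.example": ["English"],
    "site-b.example": ["English", "Spanish"],
    "site-c.example": ["German"],
    "site-d.example": ["English", "French"],
}


def language_shares(site_languages):
    """Return, for each language, the percentage of sites that use it."""
    counts = Counter()
    for languages in site_languages.values():
        # Count each language at most once per site.
        counts.update(set(languages))
    total_sites = len(site_languages)
    return {lang: 100 * n / total_sites for lang, n in counts.items()}


shares = language_shares(sites)
total_percentage = sum(shares.values())
```

In this sample two of the four sites carry two languages each, so the per-language percentages add up to more than 100 even though every site is counted exactly once in the denominator.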
The figures from the W3Techs study are based on the one million most visited websites (i.e., approximately 0.27 percent of all websites according to December 2011 figures) as ranked by Alexa.com, and language is identified using only the home page of the sites in most cases (e.g., all of Wikipedia is based on the language detection of http://www.wikipedia.org). [6] As a consequence, the figures show a significantly higher percentage for many languages (especially for English) as compared to the figures for all websites. [7] For all websites, estimates are between 20 and 50% for English. [8] [2] [9] [10]
Of the top 250 YouTube channels, 66% of the content is in English, 15% in Spanish, 7% in Portuguese, 5% in Hindi, 2% in Korean, while other languages make up 5%, [11] although other sources point to different percentages. [12] YouTube is available in over 80 languages with more than a hundred different local versions. [13] Of those popular YouTube channels that posted a video in the first week of 2019, just over half contained some content in a language other than English. [14]
InternetWorldStats estimates of the number of Internet users by language as of March 31, 2020: [15]
Rank | Language | Internet users | Percentage |
---|---|---|---|
1 | English | 1,186,451,052 | 25.9% |
2 | Chinese | 888,453,068 | 19.4% |
3 | Spanish | 363,684,593 | 7.9% |
4 | Arabic | 237,418,349 | 5.2% |
5 | Indonesian | 198,029,815 | 4.3% |
6 | Portuguese | 171,750,818 | 3.7% |
7 | French | 151,733,611 | 3.3% |
8 | Japanese | 118,626,672 | 2.6% |
9 | Russian | 116,353,942 | 2.5% |
10 | German | 92,525,427 | 2.0% |
1–10 | Top 10 languages | 3,525,027,347 | 76.9% |
– | Others | 1,060,551,371 | 23.1% |
Total | | 4,585,578,718 | 100% |
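The percentage column can be reproduced from the user counts with simple arithmetic. The sketch below uses the figures from the table; the names `users`, `others` and `share` are illustrative, and rounding to one decimal place is an assumption based on how the percentages are presented.

```python
# Internet users by language, copied from the table (31 March 2020).
users = {
    "English": 1_186_451_052,
    "Chinese": 888_453_068,
    "Spanish": 363_684_593,
    "Arabic": 237_418_349,
    "Indonesian": 198_029_815,
    "Portuguese": 171_750_818,
    "French": 151_733_611,
    "Japanese": 118_626_672,
    "Russian": 116_353_942,
    "German": 92_525_427,
}
others = 1_060_551_371

top_ten = sum(users.values())
total = top_ten + others


def share(count, total_users):
    """Percentage of all users, rounded to one decimal place."""
    return round(100 * count / total_users, 1)


english_share = share(users["English"], total)
top_ten_share = share(top_ten, total)
others_share = share(others, total)
```

The ten per-language counts sum to the "Top 10 languages" row, and adding the "Others" row gives the stated total.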
Wikimedia Statistics gives the number of page views of each edition of Wikipedia by language. [16]
Rank | Language | Daily page views (average over the preceding year, "Agent" = "User", as of 4 January 2021) |
---|---|---|
1 | English | 257,705,129 |
2 | Japanese | 37,286,466 |
3 | Spanish | 37,018,505 |
4 | German | 30,844,175 |
5 | Russian | 26,358,126 |
6 | French | 24,392,611 |
7 | Italian | 18,622,198 |
8 | Chinese | 13,371,571 |
9 | Portuguese | 11,506,680 |
10 | Polish | 8,810,420 |
11 | Arabic | 7,333,102 |
12 | Persian | 5,672,829 |
13 | Indonesian | 5,385,401 |
14 | Dutch | 4,935,611 |
15 | Turkish | 3,382,454 |
Wikipedia, a free-content online encyclopedia written and maintained by a community of volunteers known as Wikipedians, began with its first edit on 15 January 2001, two days after the domain was registered. It grew out of Nupedia, a more structured free encyclopedia, as a way to allow easier and faster drafting of articles and translations.
The Internet is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a network of networks that consists of private, public, academic, business, and government networks of local to global scope, linked by a broad array of electronic, wireless, and optical networking technologies. The Internet carries a vast range of information resources and services, such as the interlinked hypertext documents and applications of the World Wide Web (WWW), electronic mail, telephony, and file sharing.
Internet slang is a non-standard or unofficial form of language used by people on the Internet to communicate to one another. An example of Internet slang is "LOL" meaning "laugh out loud." Since Internet slang is constantly changing, it is difficult to provide a standardized definition. However, it can be understood to be any type of slang that Internet users have popularized, and in many cases, have coined. Such terms often originate with the purpose of saving keystrokes or to compensate for small character limits. Many people use the same abbreviations in texting, instant messaging, and social networking websites. Acronyms, keyboard symbols, and abbreviations are common types of Internet slang. New dialects of slang, such as leet or Lolspeak, develop as ingroup Internet memes rather than time savers. Many people also use Internet slang in face-to-face, real life communication.
A web browser is an application for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used on a range of devices, including desktops, laptops, tablets, and smartphones. In 2020, an estimated 4.9 billion people used a browser. The most-used browser is Google Chrome, with a 64% global market share on all devices, followed by Safari with 19%.
A website is a collection of web pages and related content that is identified by a common domain name and published on at least one web server. Websites are typically dedicated to a particular topic or purpose, such as news, education, commerce, entertainment, or social media. Hyperlinking between web pages guides the navigation of the site, which often starts with a home page. The most-visited sites are Google, YouTube, and Facebook.
The Great Firewall is the combination of legislative actions and technologies enforced by the People's Republic of China to regulate the Internet domestically. Its role in internet censorship in China is to block access to selected foreign websites and to slow down cross-border internet traffic. The Great Firewall operates by checking transmission control protocol (TCP) packets for keywords or sensitive words. If the keywords or sensitive words appear in the TCP packets, the connection is closed. If one link is closed, more links from the same machine will be blocked by the Great Firewall. Its effects include limiting access to foreign information sources, blocking foreign internet tools and mobile apps, and requiring foreign companies to adapt to domestic regulations.
The Chinese Wikipedia is the written vernacular Chinese edition of Wikipedia. It is run by the Wikimedia Foundation. Started on 11 May 2001, the Chinese Wikipedia currently has 1,414,349 articles and 3,498,432 registered users, of whom 63 have administrative privileges.
The English language is sometimes described as the lingua franca of computing. In comparison to other sciences, where Latin and Greek are often the principal sources of vocabulary, computer science borrows more extensively from English. In the past, due to the technical limitations of early computers, and the lack of international standardization on the Internet, computer users were limited to using English and the Latin alphabet. However, this historical limitation is less present today, due to innovations in internet infrastructure and increases in computer speed. Most software products are localized in numerous languages and the invention of the Unicode character encoding has resolved problems with non-Latin alphabets. Some limitations have only been changed recently, such as with domain names, which previously allowed only ASCII characters.
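As an illustration of how non-ASCII domain names coexist with infrastructure that expects ASCII, the sketch below converts a Unicode host name to its ASCII-compatible form and back. It relies on Python's built-in `idna` codec, which follows the older IDNA 2003 rules; the host name `bücher.example` is a commonly used illustrative name, not one taken from the text above.

```python
# A host name containing a non-ASCII character (illustrative only).
unicode_name = "bücher.example"

# Encode each label to its ASCII-compatible ("xn--") form.
ascii_name = unicode_name.encode("idna")

# Decoding restores the original Unicode form.
round_trip = ascii_name.decode("idna")

print(ascii_name)
print(round_trip)
```

Only the label containing a non-ASCII character is rewritten; the plain ASCII label `example` passes through unchanged.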
User-generated content (UGC), alternatively known as user-created content (UCC), is generally any form of content, such as images, videos, text, testimonials, and audio, that has been posted by users on online content aggregation platforms such as social media, discussion forums and wikis. It is a product consumers create to disseminate information about online products or the firms that market them.
The mobile web comprises mobile browser-based World Wide Web services accessed from handheld mobile devices, such as smartphones or feature phones, through a mobile or other wireless network.
Internet censorship is the legal control or suppression of what can be accessed, published, or viewed on the Internet. Censorship is most often applied to specific internet domains but exceptionally may extend to all Internet resources located outside the jurisdiction of the censoring state. Internet censorship may also put restrictions on what information can be made internet accessible. Organizations providing internet access – such as schools and libraries – may choose to preclude access to material that they consider undesirable, offensive, age-inappropriate or even illegal, and regard this as ethical behavior rather than censorship. Individuals and organizations may engage in self-censorship of material they publish, for moral, religious, or business reasons, to conform to societal norms, political views, due to intimidation, or out of fear of legal or other consequences.
The Cantonese Wikipedia is the Cantonese-language edition of Wikipedia, run by the Wikimedia Foundation. It was started on 25 March 2006.
Most Internet censorship in Thailand prior to the September 2006 military coup d'état was focused on blocking pornographic websites. The following years have seen a constant stream of sometimes violent protests, regional unrest, emergency decrees, a new cybercrimes law, and an updated Internal Security Act. Year by year Internet censorship has grown, with its focus shifting to lèse majesté, national security, and political issues. By 2010, estimates put the number of websites blocked at over 110,000. In December 2011, a dedicated government operation, the Cyber Security Operation Center, was opened. Between its opening and March 2014, the Center told ISPs to block 22,599 URLs.
Internet linguistics is a domain of linguistics advocated by the English linguist David Crystal. It studies new language styles and forms that have arisen under the influence of the Internet and of other new media, such as Short Message Service (SMS) text messaging. Since the beginning of human–computer interaction (HCI) leading to computer-mediated communication (CMC) and Internet-mediated communication (IMC), experts such as Gretchen McCulloch have acknowledged that linguistics has a contributing role in it, in terms of web interface and usability. Studying the emerging language on the Internet can help improve conceptual organization, translation and web usability. Such study aims to benefit both linguists and web users.
Uncyclopedia is a satirical online encyclopedia, existing as several forks, that parodies Wikipedia. Its logo, a hollow "puzzle potato", parodies Wikipedia's globe puzzle logo, and it styles itself as "the content-free encyclopedia", parodying Wikipedia's slogan of "the free encyclopedia". Founded in 2005 as an English-language wiki, the project spans more than 75 languages as well as several subprojects parodying other wikis. Uncyclopedia's name is a portmanteau of the prefix un- and the word encyclopedia.
Niconico, known before 2012 as Nico Nico Douga, is a Japanese video-sharing service based in Tokyo, Japan. "Niconico" or "nikoniko" is the Japanese ideophone for smiling. As of 2021, Niconico is the 34th most-visited website in Japan, according to Alexa Internet.
The history of wikis began in 1994, when Ward Cunningham gave the name "WikiWikiWeb" to the knowledge base, which ran on his company's website at c2.com, and the wiki software that powered it. The wiki went public in March 1995, the date used in anniversary celebrations of the wiki's origins. c2.com is thus the first true wiki, or a website with pages and links that can be easily edited via the browser, with a reliable version history for each page. He chose "WikiWikiWeb" as the name based on his memories of the "Wiki Wiki Shuttle" at Honolulu International Airport, and because "wiki" is the Hawaiian word for "quick".
Censorship of Wikipedia by governments has occurred widely in countries including China, Iran, Myanmar, Pakistan, Russia, Saudi Arabia, Syria, Tunisia, Turkey, Uzbekistan, and Venezuela. Some instances are examples of widespread Internet censorship in general that includes Wikipedia content. Others are indicative of measures to prevent the viewing of specific content deemed offensive. The duration of different blocks has varied from hours to years.
The Russian Internet, or Runet, is the part of the Internet that uses the Russian language, including the Russian-language community on the Internet and websites. Geographically, it reaches all continents, including Antarctica, but it is mostly based in Russia.
Media pluralism is the state of having a plurality of voices, opinions, and analyses in media systems, or the coexistence of different and diverse types of media and media support.