Arabic script

Script type: Abjad primarily, alphabet
Time period: 3rd century CE to the present [1]
Direction: Right-to-left script
Official script in: 20 sovereign states
Co-official script in: 9 sovereign states
Official script at regional level in: 2 sovereign states
Languages: See below
Related scripts
Child systems: N'Ko, Hanifi script, Persian alphabet
ISO 15924: Arab (160), Arabic
Unicode alias: Arabic
[Figure: Worldwide use of the Arabic script. The map distinguishes countries where the Arabic script is the sole official script; official alongside other scripts; and official at a provincial level (China, India, Tanzania) or a recognized second script of the official language (Malaysia, Tajikistan).]

The Arabic script is the writing system used for Arabic (Arabic alphabet) and several other languages of Asia and Africa. It is the second-most widely used alphabetic writing system in the world (after the Latin script), [2] the second-most widely used writing system in the world by number of countries using it, and the third-most by number of users (after the Latin and Chinese scripts). [3]

The script was first used to write texts in Arabic, most notably the Quran, the holy book of Islam. With the religion's spread, it came to be used as the primary script for many language families, leading to the addition of new letters and other symbols. Languages that still use it include Persian (Farsi and Dari), Urdu, Uyghur, Kurdish, Pashto, Punjabi (Shahmukhi), Sindhi, Azerbaijani (Torki in Iran), Malay (Jawi), Javanese and Indonesian (Pegon), Balti, Balochi, Luri, Kashmiri, Cham (Akhar Srak), [4] Rohingya, Somali, Mandinka, and Mooré, among others. [5] Until the 16th century, it was also used for some Spanish texts, and, prior to the script reform of 1928, it was the writing system of Turkish. [6]

The script is written from right to left in a cursive style, in which most of the letters are written in slightly different forms according to whether they stand alone or are joined to a following or preceding letter. The script does not have capital letters. [7] In most cases, the letters transcribe consonants, or consonants and a few vowels, so most Arabic alphabets are abjads; the versions used for some languages, such as the Sorani dialect of Kurdish, Uyghur, Mandarin, and Bosnian, are true alphabets. The script is the basis for the tradition of Arabic calligraphy.
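
In Unicode text, the right-to-left direction and the contextual letter forms described above are carried by character properties rather than stored per glyph. The following Python sketch, which uses only the standard `unicodedata` module, prints the bidirectional class of the letter bā' and shows how its four legacy presentation-form code points normalize back to the single letter U+0628. The helper name `contextual_forms` is illustrative only and is not part of any standard API.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH

# Legacy presentation-form code points for beh:
# isolated, final, initial, medial.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def contextual_forms(forms):
    """Return (code point, Unicode name, NFKC-normalized letter) per form."""
    rows = []
    for ch in forms:
        rows.append(
            (
                f"U+{ord(ch):04X}",
                unicodedata.name(ch),
                unicodedata.normalize("NFKC", ch),
            )
        )
    return rows


if __name__ == "__main__":
    # "AL" is the bidirectional class for right-to-left Arabic letters.
    print(unicodedata.bidirectional(BEH))
    for row in contextual_forms(BEH_FORMS):
        print(row)
```

In ordinary text only the base letter is stored, and the rendering engine chooses the joined form; the presentation-form code points exist mainly for compatibility with older encodings.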

History

The Arabic alphabet is derived either from the Nabataean alphabet [8] [9] or (less widely believed) directly from the Syriac alphabet, [10] both of which are derived from the Aramaic alphabet, which, in turn, descended from the Phoenician alphabet. The Phoenician script also gave rise to the Greek alphabet (and, therefore, to both the Cyrillic alphabet and the Latin alphabet used in the Americas and most European countries).

Origins

In the 6th and 5th centuries BCE, northern Arab tribes emigrated and founded a kingdom centred around Petra, Jordan. These people (now named Nabataeans from the name of one of the tribes, Nabatu) spoke Nabataean Arabic, a dialect of the Arabic language. In the 2nd or 1st centuries BCE, [11] [12] the first known records in the Nabataean alphabet were written in Aramaic (the language of communication and trade) but included some Arabic features: the Nabataeans did not write the language they spoke. They wrote in a form of the Aramaic alphabet, which continued to evolve and separated into two forms: one intended for inscriptions (known as "monumental Nabataean") and the other, more cursive, hurriedly written and with joined letters, for writing on papyrus. [13] This cursive form increasingly influenced the monumental form and gradually changed into the Arabic alphabet.

Overview

The Arabic alphabet (each line of letters is followed by the letter names in the same order)
خ ح ج ث ت ب ا
khā’, ḥā’, jīm, tha’, tā’, bā’, alif
ص ش س ز ر ذ د
ṣād, shīn, sīn, zāy / zayn, rā’, dhāl, dāl
ق ف غ ع ظ ط ض
qāf, fā’, ghayn, ‘ayn, ẓā’, ṭā’, ḍād
ي و ه ن م ل ك
yā’, wāw, hā’, nūn, mīm, lām, kāf
أ آ إ ئ ؠ ء
alif hamza↑, alif madda, alif hamza↓, yā’ hamza↑, kashmiri yā’, hamza, rohingya yā’
ى ٱ ی ە ◌ً ◌ٌ ◌ٍ
alif maksura, alif wasla, farsi yā’, ae, fathatan, dammatan, kasratan
◌َ ◌ُ ◌ِ ◌ّ ◌ْ ◌ٓ ◌ۤ
fatha, damma, kasra, shadda, sukun, maddah, madda
ں ٹ ٺ ٻ پ ٿ ڃ
nūn ghunna, ttā’, ttāhā’, bāā’, pā’, tāhā’, nyā’
ڄ چ ڇ ڈ ڌ ڍ ڎ
dyā’, tchā’, tchahā’, ddāl, dāhāl, ddāhāl, duul
ڑ ژ ڤ ڦ ک ڭ گ
rrā’, jā’, vā’, pāḥā’, kāḥā’, ng, gāf
ڳ ڻ ھ ہ ة ۃ ۅ
gueh, rnūn, hā’ doachashmee, hā’ goal, tā’ marbuta, tā’ marbuta goal, kirghiz oe
ۆ ۇ ۈ ۉ ۋ ې ے
oe, u, yu, kirghiz yu, ve, e, yā’ barree
(see below for other alphabets)

The Arabic script has been adapted for use in a wide variety of languages aside from Arabic, including Persian, Malay and Urdu, which are not Semitic. Such adaptations may feature altered or new characters to represent phonemes that do not appear in Arabic phonology. For example, the Arabic language lacks a voiceless bilabial plosive (the [p] sound), so many languages add their own letter to represent [p] in the script, though the specific letter used varies from language to language. These modifications tend to fall into groups: Indian and Turkic languages written in the Arabic script tend to use the Persian modified letters, whereas the languages of Indonesia tend to imitate those of Jawi. The modified version of the Arabic script originally devised for use with Persian is known to scholars as the Perso-Arabic script.
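
The point that different orthographies pick different letters for the same sound can be sketched as a small lookup table. The Python example below covers only a handful of orthographies, using the assignments given in this article's table of additional letters (پ for Persian, Urdu, Pashto and Kurdish; ڤ for Jawi and Pegon). The names `P_LETTER` and `letter_for_p` are hypothetical and chosen for illustration.

```python
import unicodedata

# Letter used for /p/ in a few Arabic-script orthographies,
# as listed in this article. The selection is deliberately partial.
P_LETTER = {
    "Persian": "\u067E",  # peh, the Persian-style letter
    "Urdu": "\u067E",
    "Pashto": "\u067E",
    "Kurdish": "\u067E",
    "Jawi": "\u06A4",  # feh with three dots above, named VEH in Unicode
    "Pegon": "\u06A4",
}


def letter_for_p(orthography):
    """Return (letter, code point, Unicode name) for the /p/ letter."""
    ch = P_LETTER[orthography]
    return ch, f"U+{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    for name in P_LETTER:
        print(name, letter_for_p(name))
```

Note that the Unicode character name reflects only one of the letter's uses: U+06A4 is named VEH even though Jawi and Pegon use it for /p/.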

When the Arabic script is used to write Serbo-Croatian, Sorani, Kashmiri, Mandarin Chinese, or Uyghur, vowels are mandatory. The Arabic script can, therefore, be used as a true alphabet as well as an abjad, although it is often strongly, if erroneously, connected to the latter due to it being originally used only for Arabic.

Use of the Arabic script in West African languages, especially in the Sahel, developed with the spread of Islam. To a certain degree the style and usage tend to follow those of the Maghreb (for instance the position of the dots in the letters fāʼ and qāf). [14] [15] Additional diacritics have come into use to facilitate the writing of sounds not represented in the Arabic language. The term ʻAjamī, which comes from the Arabic root for "foreign", has been applied to Arabic-based orthographies of African languages.

[Figure: Wikipedia in Arabic script of five languages]

Table of writing styles

Script or style | Alphabet(s) | Language(s) | Region | Derived from | Comment
Naskh | Arabic, Pashto, & others | Arabic, Pashto, Sindhi, & others | Every region where Arabic scripts are used | | Sometimes refers to a very specific calligraphic style, but sometimes used to refer more broadly to almost every font that is not Kufic or Nastaliq.
Nastaliq | Urdu, Shahmukhi, Persian, & others | Urdu, Punjabi, Persian, Kashmiri & others | Southern and Western Asia | Taliq | Used for almost all modern Urdu and Punjabi text, but only occasionally used for Persian. (The term "Nastaliq" is sometimes used by Urdu-speakers to refer to all Perso-Arabic scripts.)
Taliq | Persian | Persian | | | A predecessor of Nastaliq.
Kufic | Arabic | Arabic | Middle East and parts of North Africa | |
Rasm | Restricted Arabic alphabet | Arabic | Mainly historical | | Omits all diacritics including i'jam. Digital replication usually requires some special characters, such as ٮ, ڡ and ٯ.
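
The Rasm row above notes that digital replication needs special dotless characters. A minimal Python sketch of that idea follows: it drops combining marks and replaces a few dotted letters with the dotless code points U+066E, U+06A1 and U+066F. The mapping is deliberately partial and is an assumption made for illustration (a full treatment would also handle letters such as nūn and yā’, whose skeleton depends on position); `SKELETON` and `to_rasm` are hypothetical names.

```python
import unicodedata

# Partial, illustrative mapping from dotted letters to dotless skeletons.
SKELETON = {
    "\u0628": "\u066E",  # beh  -> dotless beh
    "\u062A": "\u066E",  # teh  -> dotless beh
    "\u062B": "\u066E",  # theh -> dotless beh
    "\u0641": "\u06A1",  # feh  -> dotless feh
    "\u0642": "\u066F",  # qaf  -> dotless qaf
}


def to_rasm(text):
    """Drop combining marks and replace mapped letters by their skeleton."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Mn":
            continue  # vowel signs and other nonspacing marks
        out.append(SKELETON.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    word = "\u0642\u064E\u0644\u0652\u0628"  # qaf+fatha, lam+sukun, beh
    print([f"U+{ord(c):04X}" for c in to_rasm(word)])
```

Letters that are not in the mapping pass through unchanged, so the output is only an approximation of a true rasm text.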

Table of alphabets

Alphabet | Letters | Additional characters | Script or style | Languages | Region | Derived from (or related to) | Note
Arabic | 28 | (see above) | Naskh, Kufi, Rasm, & others | Arabic | North Africa, West Asia | Phoenician, Aramaic, Nabataean |
Ajami script | 33 | ٻ تٜ تٰٜ | Naskh | Hausa, Yoruba, Swahili | West Africa, East Africa | Arabic | Abjad; documented use likely between the 15th to 18th century for Hausa, Mande, Pulaar, Swahili, Wolof, and Yoruba languages
Aljamiado | 28 | | Maghrebi, Andalusi variant; Kufic | Old Spanish, Andalusi Romance, Ladino, Aragonese, Valencian, Old Galician-Portuguese | Southwest Europe | Arabic | 8th–13th centuries for Andalusi Romance, 14th–16th centuries for the other languages
Arebica | 30 | ڄ ە اٖى ي ڵ ںٛ ۉ ۆ | Naskh | Serbo-Croatian | Southeastern Europe | Perso-Arabic | Latest stage has full vowel marking
Arwi alphabet | 41 | ڊ ڍ ڔ صٜ ۻ ڣ ڹ ݧ | Naskh | Tamil | Southern India, Sri Lanka | Perso-Arabic |
Belarusian Arabic alphabet | 32 | | Naskh | Belarusian | Eastern Europe | Perso-Arabic | 15th / 16th century
Balochi Standard Alphabet(s) | 29 | ٹ ڈ ۏ ݔ ے | Naskh and Nastaliq | Balochi | South-West Asia | Perso-Arabic, also borrows multiple glyphs from Urdu | This standardization is based on the previous orthography. For more information, see Balochi writing.
Berber Arabic alphabet(s) | 33 | چ ژ ڞ ݣ ء | | Various Berber languages | North Africa | Arabic |
Burushaski | 53 | ݳ ݴ ݼ څ ڎ ݽ ڞ ݣ ݸ ݹ ݶ ݷ ݺ ݻ (see note) | Nastaliq | Burushaski | South-West Asia (Pakistan) | Urdu | Also uses the additional letters shown for Urdu (see below). Sometimes written with just the Urdu alphabet, or with the Latin alphabet.
Chagatai alphabet | 32 | ݣ | Nastaliq and Naskh | Chagatai | Central Asia | Perso-Arabic | ݣ is interchangeable with نگ and ڭ.
Dobrujan Tatar | 32 | | Naskh | Dobrujan Tatar | Southeastern Europe | Chagatai |
Galal | 32 | | Naskh | Somali | Horn of Africa | Arabic |
Jawi | 36 | چ ڠ ڤ ݢ ڽ ۏ | Naskh | Malay | Peninsular Malaysia, Sumatra and part of Borneo | Arabic | Since 1303 AD (Trengganu Stone)
Kashmiri | 44 | ۆ ۄ ؠ ێ | Nastaliq | Kashmiri | South Asia | Urdu | This orthography is fully voweled. 3 out of the 4 (ۆ, ۄ, ێ) additional glyphs are actually vowels. Not all vowels are listed here since they are not separate letters. For further information, see Kashmiri writing.
Kazakh Arabic alphabet | 35 | ٵ ٶ ۇ ٷ ۋ ۆ ە ھ ى ٸ ي | Naskh | Kazakh | Central Asia, China | Chagatai | In use since 11th century, reformed in the early 20th century, now official only in China
Khowar | 45 | ݯ ݮ څ ځ ݱ ݰ ڵ | Nastaliq | Khowar | South Asia | Urdu, however, borrows multiple glyphs from Pashto |
Kyrgyz Arabic alphabet | 33 | ۅ ۇ ۉ ۋ ە ى ي | Naskh | Kyrgyz | Central Asia | Chagatai | In use since 11th century, reformed in the early 20th century, now official only in China
Pashto | 45 | ټ څ ځ ډ ړ ږ ښ ګ ڼ ۀ ي ې ۍ ئ | Naskh and occasionally, Nastaliq | Pashto | South-West Asia, Afghanistan and Pakistan | Perso-Arabic | ګ is interchangeable with گ. Also, the glyphs ی and ې are often replaced with ے in Pakistan.
Pegon script | 35 | چ ڎ ڟ ڠ ڤ ڮ ۑ | Naskh | Javanese, Sundanese, Madurese | South-East Asia (Indonesia) | Arabic |
Persian | 32 | پ چ ژ گ | Naskh and Nastaliq | Persian (Farsi) | West Asia (Iran etc.) | Arabic | Also known as Perso-Arabic.
Shahmukhi | 41 | ݪ ݨ | Nastaliq | Punjabi | South Asia (Pakistan) | Perso-Arabic |
Saraiki | 45 | ٻ ڄ ݙ ڳ | Nastaliq | Saraiki | South Asia (Pakistan) | Urdu |
Sindhi | 52 | ڪ ڳ ڱ گ ک پ ڀ ٻ ٽ ٿ ٺ ڻ ڦ ڇ چ ڄ ڃ ھ ڙ ڌ ڏ ڎ ڍ ڊ | Naskh | Sindhi | South Asia (Pakistan) | Perso-Arabic |
Sorabe | 28 | | Naskh | Malagasy | Madagascar | Arabic |
Soranî | 33 | ڕ ڤ ڵ ۆ ێ | Naskh | Kurdish languages | Middle-East | Perso-Arabic | Vowels are mandatory, i.e. alphabet
Swahili Arabic script | 28 | | Naskh | Swahili | Western and Southern Africa | Arabic |
İske imlâ | 35 | ۋ | Naskh | Tatar | Volga region | Chagatai | Used prior to 1920.
Ottoman Turkish | 32 | ئە ی | | Ottoman Turkish | Ottoman Empire | Chagatai | Official until 1928
Urdu | 39+ (see notes) | ٹ ڈ ڑ ں ہ ھ ے (see notes) | Nastaliq | Urdu | South Asia | Perso-Arabic | 58 [citation needed] letters including digraphs representing aspirated consonants: بھ پھ تھ ٹھ جھ چھ دھ ڈھ کھ گھ
Uyghur | 32 | ئا ئە ھ ئو ئۇ ئۆ ئۈ ۋ ئې ئى | Naskh | Uyghur | China, Central Asia | Chagatai | Reform of older Arabic-script Uyghur orthography that was used prior to the 1950s. Vowels are mandatory, i.e. alphabet
Wolofal | 33 | ݖ گ ݧ ݝ ݒ | Naskh | Wolof | West Africa | Arabic, however, borrows at least one glyph from Perso-Arabic |
Xiao'erjing | 36 | ٿ س﮲ ڞ ي | Naskh | Sinitic languages | China, Central Asia | Chagatai | Used to write Chinese languages by Muslims living in China such as the Hui people.
Yaña imlâ | 29 | ئا ئە ئی ئو ئۇ ئ ھ | Naskh | Tatar | Volga region | İske imlâ alphabet | 1920–1927 replaced with Cyrillic

Current use

Today Iran, Afghanistan, Pakistan, India, and China are the main non-Arabic speaking states using the Arabic alphabet to write one or more official national languages, including Azerbaijani, Baluchi, Brahui, Persian, Pashto, Central Kurdish, Urdu, Sindhi, Kashmiri, Punjabi and Uyghur. [citation needed]

An Arabic alphabet is currently used for the following languages: [citation needed]

Middle East and Central Asia

East Asia

South Asia

Southeast Asia

Europe

Africa

Former use

With the establishment of Muslim rule in the Indian subcontinent, one or more forms of the Arabic script were incorporated among the assortment of scripts used for writing native languages. [37] In the 20th century, the Arabic script was generally replaced by the Latin alphabet in the Balkans [dubious], parts of Sub-Saharan Africa, and Southeast Asia, while in the Soviet Union, after a brief period of Latinisation, [38] use of Cyrillic was mandated. Turkey changed to the Latin alphabet in 1928 as part of an internal Westernizing revolution. After the collapse of the Soviet Union in 1991, many of the Turkic languages of the ex-USSR attempted to follow Turkey's lead and convert to a Turkish-style Latin alphabet. However, renewed use of the Arabic alphabet has occurred to a limited extent in Tajikistan, whose language's close resemblance to Persian allows direct use of publications from Afghanistan and Iran. [39]

Africa

Europe

Central Asia and Caucasus

South and Southeast Asia

Middle East

Unicode

As of Unicode 15.1, the following ranges encode Arabic characters:
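
The main Arabic blocks in the Basic Multilingual Plane can be expressed as a small lookup table. The Python sketch below is illustrative rather than exhaustive: the block names and ranges follow the Unicode block list, supplementary-plane blocks (for example Arabic Mathematical Alphabetic Symbols) are left out, and `ARABIC_BLOCKS` and `arabic_block` are hypothetical names.

```python
# Main Arabic blocks in the Basic Multilingual Plane (not exhaustive).
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x0870, 0x089F, "Arabic Extended-B"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
]


def arabic_block(ch):
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for ch in ["\u0628", "\u067E", "\u0762", "\uFE91", "A"]:
        print(f"U+{ord(ch):04X}", arabic_block(ch))
```

Most letters used by the orthographies in this article fall in the first two blocks; the presentation-form blocks hold compatibility characters for positional glyph variants and ligatures.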

Additional letters used in other languages

Assignment of phonemes to graphemes

∅ = phoneme absent from language
Language family Austron. Dravid. Turkic Indo-European Niger–Con.
Language/script Pegon Jawi Arwi Azeri Kazakh Uyghur Uzbek Sindhi Punjabi Urdu Persian Pashto [a] Balochi Kurdish Swahili
/t͡ʃ/ چ
/ʒ/ ژ
/p/ ڤ ڣ پ
/g/ ؼ ݢ ق گ ڠ
/v/ ۏ و ۆ‎ ۋ و ڤ
/ŋ/ ڠ ڭ نگ‎ ڱ ن نݝ
/ɲ/ ۑ ڽ ݧ ڃ ن نْي
/ɳ/ ڹ ڻ ݨ ن ڼ
Table of additional letters in other languages
Columns: Letter [A]; Use & pronunciation; Unicode (U+); i'jam & other additions ([B], [C], above, below); Shape; Similar Arabic letter(s)
Additional letters with additional marks
پ Pe, used to represent the phoneme /p/ in Persian, Pashto, Punjabi, Khowar, Sindhi, Urdu, Kurdish, Kashmiri; it can be used in Arabic to write the phoneme /p/, which is otherwise written ب /b/. U+067E none 3 dots ٮ ب
ݐ used to represent the equivalent of the Latin letter Ƴ (palatalized glottal stop /ʔʲ/) in some African languages such as Fulfulde. U+0750 ﮳﮳﮳ none 3 dots (horizontal) ٮ ب
ٻ B̤ē, used to represent a voiced bilabial implosive /ɓ/ in Hausa, Sindhi and Saraiki. U+067B none2 dots
(vertically)
ٮ ب
ڀ represents an aspirated voiced bilabial plosive /bʱ/ in Sindhi. U+0680 none 4 dots ٮ ب
ٺ Ṭhē, represents the aspirated voiceless retroflex plosive /ʈʰ/ in Sindhi. U+067A 2 dots
(vertically)
none ٮ ت
ټ Ṭē, used to represent the phoneme /ʈ/ in Pashto. U+067C ﮿ 2 dotsring ٮ ت
ٽ Ṭe, used to represent the phoneme (a voiceless retroflex plosive /ʈ/) in Sindhi U+067D 3 dots
(inverted)
none ٮ ت
ٹ Ṭe, used to represent Ṭ (a voiceless retroflex plosive /ʈ/) in Punjabi, Kashmiri, Urdu. U+0679 ◌ؕ small ط none ٮ ت
ٿ Teheh, used in Sindhi and Rajasthani (when written in Sindhi alphabet); used to represent the phoneme /t͡ɕʰ/ (pinyin q) in Chinese Xiao'erjing. U+067F 4 dotsnone ٮ ت
ڄ represents the "c" voiceless dental affricate /t͡s/ phoneme in Bosnian U+0684 none2 dots
(vertically)
ح ج
ڃ represents the "ć" voiceless alveolo-palatal affricate /t͡ɕ/ phoneme in Bosnian. U+0683 none2 dots ح ج
چ Che, used to represent /t͡ʃ/ ("ch"). It is used in Persian, Pashto, Punjabi, Urdu, Kashmiri and Kurdish. It represents /ʒ/ in Egypt. U+0686 none 3 dots ح ج
څ Ce, used to represent the phoneme /t͡s/ in Pashto. U+0685 3 dotsnone ح خ
ݗ represents the "đ" voiced alveolo-palatal affricate /d͡ʑ/ phoneme in Bosnian. Also used to represent the letter X in Afrikaans. U+0757 2 dotsnone ح خ
ځ Źim, used to represent the phoneme /d͡z/ in Pashto. U+0681 ◌ٔ Hamza none ح خ
ݙ used in Saraiki to represent a Voiced alveolar implosive /ɗ̢/. U+0759 small
ط
2 dots
(vertically)
د د
ڊ used in Saraiki to represent a voiced retroflex implosive /ᶑ/. U+068A none 1 dot د د
ڈ Ḍal, used to represent a Ḍ (a voiced retroflex plosive /ɖ/) in Punjabi, Kashmiri and Urdu. U+0688 ◌ؕ small ط none د د
ڌ Dhal, used to represent the phoneme /d̪ʱ/ in Sindhi U+068C 2 dotsnone د د
ډ Ḍal, used to represent the phoneme /ɖ/ in Pashto. U+0689 ﮿ nonering د د
ڑ Ṛe, represents a retroflex flap /ɽ/ in Punjabi and Urdu. U+0691 ◌ؕ small ط none ر ر
ړ Ṛe, used to represent a retroflex lateral flap in Pashto. U+0693 ﮿ nonering ر ر
ݫ used in Ormuri to represent a voiced alveolo-palatal fricative /ʑ/, as well as in Torwali. U+076B 2 dots
(vertically)
none ر ر
ژ Že / zhe, used to represent the voiced postalveolar fricative /ʒ/ in, Persian, Pashto, Kurdish, Urdu, Punjabi and Uyghur. U+0698 3 dotsnone ر ز
ږ Ǵe / ẓ̌e, used to represent the phonemes /ʐ/, /ɡ/ or /ʝ/ in Pashto. U+0696 1 dot 1 dot ر ز
ڕ used in Kurdish to represent rr /r/ in Soranî dialect. U+0695 ٚ noneV pointing down ر ر
ݭ used in Kalami to represent a voiceless retroflex fricative /ʂ/, and in Ormuri to represent a voiceless alveolo-palatal fricative /ɕ/. U+076D 2 dots verticallynone س س
ݜ used in Shina to represent a voiceless retroflex fricative /ʂ/. U+075C 4 dotsnone س ش
ښ X̌īn / ṣ̌īn, used to represent the phonemes /x/, /ʂ/ or /ç/ in Pashto. U+069A 1 dot 1 dot س س
ڜ‎Used in Wakhi to represent the phoneme /ʂ/. U+069C 3 dots3 dots س ش
ڞ Used to represent the phoneme /tsʰ/ (pinyin c) in Chinese. U+069E 3 dotsnone ص ض
ڠ Nga/ŋ/ in the Jawi script and Pegon script. U+06A0 3 dotsnone ع غ
ڤ Ve, used in Kurdish to represent /v/, it can be used in Arabic to describe the phoneme /v/ otherwise it is written ف /f/. Pa, used in the Jawi script and Pegon script to represent /p/. U+06A4 3 dotsnone ڡ ف
ڥ Vi, used in Algerian Arabic and Tunisian Arabic when written in Arabic script to represent the sound /v/ if needed. U+06A5 none3 dots ڡ ف
ڨ Ga, used to represent the voiced velar plosive /ɡ/ in Algerian and Tunisian. U+06A8 3 dotsnone ٯ ق
ڭ Ng, used to represent the /ŋ/ phone in Ottoman Turkish, Kazakh, Kyrgyz, and Uyghur. Also used to represent /ɡ/ in Morocco and in many dialects of Algerian. U+06AD 3 dots none ك ك
ڬ Gaf, represents a voiced velar plosive /ɡ/ in the Jawi script of Malay. U+06AC 1 dotnone ك ك
ݢ U+0762 1 dotnone ک ك
گ Gaf, represents a voiced velar plosive /ɡ/ in Persian, Pashto, Punjabi, Kyrgyz, Kazakh, Kurdish, Uyghur, Mesopotamian Arabic, Urdu and Ottoman Turkish. U+06AF linehorizontal linenone ک ك
ګ Gaf, used to represent the phoneme /ɡ/ in Pashto. U+06AB ﮿ ringnone ک ك
ؼ Gaf, represents a voiced velar plosive /ɡ/ in the Pegon script of Indonesian. U+063C none 3 dots ک ك
ڱ represents the velar nasal /ŋ/ phoneme in Sindhi. U+06B1 2 dots + horizontal line none ک ك
ڳ represents a voiced velar implosive /ɠ/ in Sindhi and Saraiki. U+06B3 horizontal line 2 dots ک ك
ݣ used to represent the phoneme /ŋ/ (pinyin ng) in Chinese. U+0763 none3 dots ک ك
ݪ used in Marwari to represent a retroflex lateral flap /ɺ̢/, and in Kalami to represent a voiceless lateral fricative /ɬ/. U+076A linehorizontal
line
none ل ل
ࣇ (or alternately typeset as لؕ, U+0644 U+0615) is used in Punjabi to represent the voiced retroflex lateral approximant /ɭ/. [43] U+08C7 ◌ؕ small ط none ل ل
ڵ used in Kurdish to represent ll /ɫ/ in Soranî dialect. Represents the "lj" palatal lateral approximant /ʎ/ phoneme in Bosnian. U+06B5 ◌ٚ V pointing downnone ل ل
ڼ represents the retroflex nasal /ɳ/ phoneme in Pashto. U+06BC ﮿ 1 dotring ں ن
ڻ represents the retroflex nasal /ɳ/ phoneme in Sindhi. U+06BB ◌ؕ small ط none ں ن
ݨ used in Punjabi to represent /ɳ/ and Saraiki to represent /ɲ/. U+0768 1 dot + small ط none ں ن
ڽ Nya, /ɲ/ in the Jawi script. The isolated ڽ and final ـڽ forms resemble ڽ, while the initial ڽـ and medial ـڽـ forms resemble پ. U+06BD 3 dots none ں ن
ݩ represents the "nj" palatal nasal /ɲ/ phoneme in Bosnian. U+0769 ◌ٚ 1 dot
V pointing down
none ں ن
ۅ Ö, used to represent the phoneme /ø/ in Kyrgyz. U+06C5 ◌̵ Strikethrough [D] none و و
وٓ Uu, used to represent the phoneme /uː/ in Somali. U+0648 U+0653 ◌ٓ Madda none و + ◌ٓ
ۏ Va in the Jawi script. U+06CF 1 dotnone و و
ۋ represents a /v/ in Kyrgyz, Uyghur, and Old Tatar; and /w,ʊw,ʉw/ in Kazakh; also formerly used in Nogai. U+06CB 3 dotsnone و و
ۆ represents "o" // in Kurdish, "ü" /y/ in Azerbaijani, and /ø/ in Uyghur as part of the digraph ئۆ. It represents the "u" /u/ phoneme in Bosnian. U+06C6 ◌ٚ V pointing downnone و و
ۇ U, used to represent the /u/ phoneme in Azerbaijani, Kazakh, Kyrgyz and Uyghur. U+06C7 ◌ُ Damma [E] none و و
ۉ represents the "o" /ɔ/ phoneme in Bosnian. Also used to represent /ø/ in Kyrgyz. U+06C9 ◌ٛ V pointing upnone و و
ىٓ Ii, used to represent the phoneme /iː/ in Somali and Saraiki. U+0649 U+0653 ◌ٓ Madda none ى ي
ې Pasta Ye, used to represent the phoneme /e/ in Pashto and Uyghur. U+06D0 none2 dots vertical ى ي
ۍ X̌əźīna ye Ye, used to represent the phoneme [əi] in Pashto. U+06CD linehorizontal
line
none ى ي
ۑ Nya/ɲ/ in the Pegon script. U+06D1 none3 dots ى ي
ێ represents ê // in Kurdish. U+06CE ◌ٚ V pointing down 2 dots
(start + mid)
ى ي
Additional letters with shape alteration
ک Khē, represents /kʰ/ in Sindhi. U+06A9 none none none ک ك
ڪ "Swash kāf" is a stylistic variant of ك in Arabic, but represents unaspirated /k/ in Sindhi. U+06AA none none none ڪ ك
ھ
ھ
Do-chashmi he (two-eyed hāʼ), used in digraphs for aspiration /ʰ/ and breathy voice /ʱ/ in Punjabi and Urdu. Also used to represent /h/ in Kazakh, Sorani and Uyghur. [F] U+06BE nonenonenone ھ ه / هـ
ە Ae, used to represent /æ/ and /ɛ/ in Kazakh, Sorani and Uyghur. U+06D5 none none none ه ه / هـ
ے Baṛī ye ('big yāʼ') is a stylistic variant of ي in Arabic, but represents "ai" or "e" (/ɛː/, /eː/) in Urdu and Punjabi. U+06D2 none none none ے ي
Additional Digraph letters
أو Oo, used to represent the phoneme /oː/ in Somali. U+0623 U+0648 ◌ٔ Hamza none او أ + و
اٖى represents the "i" /i/ phoneme in Bosnian. U+0627 U+0656 U+0649 ◌ٖ Alefnone اى اٖ + ى
أي Ee, used to represent the phoneme /eː/ in Somali. U+0623 U+064A ◌ٔ Hamza 2 dots اى أ + ي
  A. Letter or digraph.
  B. Joined to the letter, closest to the letter, on the first letter, or above.
  C. Further away from the letter, or on the second letter, or below.
  D. A variant that ends with a loop also exists.
  E. Although the letter is also known as Waw with Damma, some publications and fonts feature a filled Damma that looks similar to a comma.
  F. Shown in Naskh (top) and Nastaliq (bottom) styles. The connected forms in the Nastaliq version are joined to each other, because the tatweel character U+0640 used to show the other forms does not work in many Nastaliq fonts.
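
Because the table above pairs each letter with a code point, the pairs can be checked mechanically. The Python sketch below does this for the four letters that the Persian alphabet adds (پ چ ژ گ); the row list is a small sample copied from this article, and `ROWS` and `check_row` are hypothetical names used only for illustration.

```python
import unicodedata

# (letter, code point as listed) for a small sample of table rows.
ROWS = [
    ("پ", "U+067E"),
    ("چ", "U+0686"),
    ("ژ", "U+0698"),
    ("گ", "U+06AF"),
]


def check_row(letter, listed):
    """Compare the listed code point with the letter's actual code point.

    Returns (matches, actual code point, Unicode character name).
    """
    actual = f"U+{ord(letter):04X}"
    return actual == listed, actual, unicodedata.name(letter)


if __name__ == "__main__":
    for letter, listed in ROWS:
        print(listed, check_row(letter, listed))
```

A check of this kind only works for single code points; rows that list a sequence (a base letter plus a combining mark) would need to be compared character by character.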

Letter construction

Most languages that use alphabets based on the Arabic alphabet use the same base shapes. Most additional letters in languages that use alphabets based on the Arabic alphabet are built by adding (or removing) diacritics to existing Arabic letters. Some stylistic variants in Arabic have distinct meanings in other languages. For example, variant forms of kāf ك ک ڪ are used in some languages and sometimes have specific usages. In Urdu and some neighbouring languages, the letter Hā has diverged into two forms ھ dō-čašmī hē and ہ ہـ ـہـ ـہ gōl hē, [44] while a variant form of ي referred to as baṛī yē ے is used at the end of some words. [44]
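
The variant forms mentioned above are encoded in Unicode as separate characters, which is what lets a language assign them distinct roles. As a rough illustration, the Python sketch below lists the code points and standard character names for the kāf, hā’ and yā’ variants; the grouping in `VARIANTS` and the helper `describe` are hypothetical and chosen for this example.

```python
import unicodedata

# Base letter followed by variant forms that some languages treat
# as distinct letters. The grouping is illustrative.
VARIANTS = {
    "kaf": ["\u0643", "\u06A9", "\u06AA"],
    "heh": ["\u0647", "\u06BE", "\u06C1"],
    "yeh": ["\u064A", "\u06D2"],
}


def describe(letters):
    """Return (code point, Unicode name) for each letter."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in letters]


if __name__ == "__main__":
    for group, letters in VARIANTS.items():
        print(group)
        for code_point, name in describe(letters):
            print("  ", code_point, name)
```

The character names (for example HEH DOACHASHMEE and YEH BARREE) echo the Urdu letter names given in the text.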

Table of letter components

See also

Notes

  a. Broadly speaking, there are two standards for Pashto orthography: the Afghan orthography in Afghanistan and the Peshawar orthography in Pakistan. In the latter, /g/ is represented by ګ instead of the Afghan گ.

Related Research Articles

Transliteration is a type of conversion of a text from one script to another that involves swapping letters in predictable ways, such as Greek ⟨α⟩⟨a⟩, Cyrillic ⟨д⟩⟨d⟩, Greek ⟨χ⟩ → the digraph ⟨ch⟩, Armenian ⟨ն⟩⟨n⟩ or Latin ⟨æ⟩⟨ae⟩.

Uyghur is a Turkic language with a long literary tradition spoken in Xinjiang, China by the Uyghurs. Today, the Uyghur Arabic alphabet is the official writing system used for Uyghur in Xinjiang, whereas other alphabets like the Uyghur Cyrillic alphabets are still in use outside China, especially in Central Asia, and Uyghur Latin is used in western countries.

Comorian is the name given to a group of four Bantu languages spoken in the Comoro Islands, an archipelago in the southwestern Indian Ocean between Mozambique and Madagascar. It is named as one of the official languages of the Union of the Comoros in the Comorian constitution. Shimaore, one of the languages, is spoken on the disputed island of Mayotte, a French department claimed by Comoros.

The Azerbaijani alphabet has three versions which includes the Arabic, Latin, and Cyrillic alphabets.

<span class="mw-page-title-main">Persian alphabet</span> Writing system used for the Persian language

The Persian alphabet, also known as the Perso-Arabic script, is the right-to-left alphabet used for the Persian language. It is a variation of the Arabic script with five additional letters: پ چ ژ گ, in addition to the obsolete ڤ that was used for the sound. This letter is no longer used in Persian, as the -sound changed to, e.g. archaic زڤان > زبان 'language'.

<span class="mw-page-title-main">Wakhi language</span> Eastern Iranian language spoken by the Wakhi people

Wakhi is an Indo-European language in the Eastern Iranian branch of the language family spoken today in Wakhan District, Northern Afghanistan, and neighboring areas of Tajikistan, Pakistan and China.

An official script is a writing system that is specifically designated to be official in the constitutions or other applicable laws of countries, states, and other jurisdictions. Akin to an official language, an official script is much rarer. It is used primarily where an official language is in practice written with two or more scripts. As, in these languages, use of script often has cultural or political connotations, proclamation of an official script is sometimes criticized as having a goal of influencing culture or politics or both. Desired effects also may include easing education, communication and some other aspects of life.

The Arabic alphabet is thought to be traced back to a Nabataean variation of the Aramaic alphabet, known as Nabataean Aramaic. This script itself descends from the Phoenician alphabet, an ancestral alphabet that additionally gave rise to the Hebrew and Greek alphabets. Nabataean Aramaic evolved into Nabataean Arabic, so-called because it represents a transitional phase between the known recognizably Aramaic and Arabic scripts. Nabataean Arabic was succeeded by Paleo-Arabic, termed as such because it dates to the pre-Islamic period in the fifth and sixth centuries CE, but is also recognizable in light of the Arabic script as expressed during the Islamic era. Finally, the standardization of the Arabic alphabet during the Islamic era led to the emergence of classical Arabic. The phase of the Arabic alphabet today is known as Modern Standard Arabic, although classical Arabic survives as a "high" variety as part of a diglossia.

<span class="mw-page-title-main">Kyrgyz alphabets</span> Alphabets used to write the Kyrgyz language

The Kyrgyz alphabets are the alphabets used to write the Kyrgyz language. Kyrgyz uses the following alphabets:

<span class="mw-page-title-main">Kazakh alphabets</span> Alphabets used to write the Kazakh language

Three alphabets are used to write Kazakh: the Cyrillic, Latin and Arabic scripts. The Cyrillic script is used in Kazakhstan, Russia, and Mongolia. An October 2017 Presidential Decree in Kazakhstan ordered that the transition from Cyrillic to a Latin script be completed by 2031. The Arabic script is used in Saudi Arabia, Iran, Afghanistan, and parts of China.

Che or cheem is a letter of the Persian alphabet, used to represent, and which derives from ǧīm by the addition of two dots. It is found with this value in other Arabic-derived scripts. It is based on the jim ج. It is used in Persian, Urdu, Pashto, Kurdish, Uyghur, Kashmiri, Azerbaijani, Ottoman Turkish, Malay (Jawi), Javanese (Pegon), and other Indo-Iranian languages. It is also one of the five letters the Persian alphabet added from the twenty-eight inherited from the Arabic alphabet. In name and shape, it is a variant of jim. Its numerical value is 3000.

<span class="mw-page-title-main">Ng (Arabic letter)</span>

Ng or Naf is an additional letter of the Arabic script, derived from kāf with the addition of three dots above the letter. The letter is used in Uyghur to represent a velar and is still used for when writing other Turkic languages, including the obsolete Ottoman Turkish.

Že or Zhe, used to represent the phoneme, is a letter in the Persian alphabet, based on zayn (ز) with two additional diacritic dots. It is one of the five letters that the Persian alphabet adds to the original Arabic script, others being چ ,پ and گ, in addition the obsolete ڤ. In name and shape, it is a variant of ze. Its numerical value is 4000.

<span class="mw-page-title-main">Pular language</span> Indigenous language widely spoken in Guinea and surrounding countries

Pular (𞤆𞤵𞤤𞤢𞤪) is a Fula language spoken primarily by the Fula people of Fouta Djallon, Guinea. It is also spoken in parts of Guinea-Bissau, Sierra Leone, and Senegal. There are a small number of speakers in Mali. Pular is spoken by 4.3 million Guineans, about 55% of the national population. This makes Pular the most widely spoken indigenous language in the country. Substantial numbers of Pular speakers have migrated to other countries in West Africa, notably Senegal.

Ajami or Ajamiyya, which comes from the Arabic root for 'foreign' or 'stranger', is an Arabic-derived script used for writing African languages, particularly Songhai, Mandé, Hausa and Swahili, although many other languages are also written using the script, including Mooré, Pulaar, Wolof, and Yoruba. It is an adaptation of the Arabic script to write sounds not found in Standard Arabic. Rather than adding new letters, modifications usually consist of additional dots or lines added to pre-existing letters.

<span class="mw-page-title-main">Pe (Persian letter)</span> Letter used to represent the [p] sound in Persian and Kurdish alphabet

Pe (پ) is a letter in the Persian alphabet and the Kurdish alphabet used to represent the voiceless bilabial plosive ⟨p⟩. It is based on bā' (ب) with two additional diacritic dots. It is one of the five letters that were created specifically for the Persian alphabet to symbolize sounds found in Persian but not in Standard Arabic, others being ژ, چ, and گ, in addition the obsolete ڤ. In name and shape, it is a variant of be (ب). It is used in Persian, Kurdish, Pashto, Balochi, and other Iranian languages, Uyghur, Urdu, Sindhi, Kashmiri, Shina, and Turkic languages. Its numerical value is 2000.

<span class="mw-page-title-main">Digraphia</span> Using multiple writing system for a language

In sociolinguistics, digraphia refers to the use of more than one writing system for the same language. Synchronic digraphia is the coexistence of two or more writing systems for the same language, while diachronic digraphia or sequential digraphia is the replacement of one writing system by another for a particular language.

<span class="mw-page-title-main">Uzbek alphabet</span> Scripts used to write the Uzbek language

The Uzbek language has been written in various scripts: Latin, Cyrillic and Arabic. The language traditionally used Arabic script, but the official Uzbek government under the Soviet Union started to use Cyrillic in 1940, which is when widespread literacy campaigns were initiated by the Soviet government across the Union. In 1992, Latin script was officially reintroduced in Uzbekistan along with Cyrillic. In the Xinjiang region of China, some Uzbek speakers write using Cyrillic, others with an alphabet based on the Uyghur Arabic alphabet. Uzbeks of Afghanistan also write the language using Arabic script, and the Arabic Uzbek alphabet is taught at some schools.

<span class="mw-page-title-main">Pashto alphabet</span> Writing system used for the Pashto language

The Pashto alphabet is a right-to-left, abjad-based alphabet developed from the Arabic script and used for the Pashto language in Pakistan and Afghanistan. It originated in the 16th century through the works of Pir Roshan.

Hausa Ajami script refers to the practice of writing the Hausa language in an alphabet derived from the Arabic script.

References

  1. Daniels, Peter T.; Bright, William, eds. (1996). The World's Writing Systems. Oxford University Press, Inc. p. 559. ISBN 978-0195079937.
  2. "Arabic Alphabet". Encyclopædia Britannica online. Archived from the original on 26 April 2015. Retrieved 16 May 2015.
  3. Vaughan, Don. "The World's 5 Most Commonly Used Writing Systems". Encyclopædia Britannica. Archived from the original on 29 July 2023. Retrieved 29 July 2023.
  4. Cham romanization table background. Library of Congress
  5. Mahinnaz Mirdehghan. 2010. Persian, Urdu, and Pashto: A comparative orthographic analysis. Writing Systems Research Vol. 2, No. 1, 9–23.
  6. "Exposición Virtual. Biblioteca Nacional de España". Bne.es. Archived from the original on 18 February 2012. Retrieved 6 April 2012.
  7. Ahmad, Syed Barakat (11 January 2013). Introduction to Qur'anic script. Routledge. ISBN 978-1-136-11138-9. OCLC 1124340016.
  8. Gruendler, Beatrice (1993). The Development of the Arabic Scripts: From the Nabatean Era to the First Islamic Century According to Dated Texts. Scholars Press. p. 1. ISBN 9781555407100.
  9. Healey, John F.; Smith, G. Rex (13 February 2012). "II - The Origin of the Arabic Alphabet". A Brief Introduction to The Arabic Alphabet. Saqi. ISBN 9780863568817.
  10. Senner, Wayne M. (1991). The Origins of Writing. U of Nebraska Press. p. 100. ISBN 0803291671.
  11. "Nabataean abjad". www.omniglot.com. Retrieved 8 March 2017.
  12. Naveh, Joseph. "Nabatean Language, Script and Inscriptions" (PDF).
  13. Taylor, Jane (2001). Petra and the Lost Kingdom of the Nabataeans. I.B.Tauris. p. 152. ISBN 9781860645082.
  14. Zribi, I., Boujelbane, R., Masmoudi, A., Ellouze, M., Belguith, L., & Habash, N. (2014). A Conventional Orthography for Tunisian Arabic. In Proceedings of the Language Resources and Evaluation Conference (LREC), Reykjavík, Iceland.
  15. Brustad, K. (2000). The syntax of spoken Arabic: A comparative study of Moroccan, Egyptian, Syrian, and Kuwaiti dialects. Georgetown University Press.
  16. "Sayad Zahoor Shah Hashmii". baask.com.
  17. Sarlak, Riz̤ā (2002). "Dictionary of the Bakhtiari dialect of Chahar-lang". google.com.eg.
  18. Iran, Mojdeh (5 February 2011). "Bakhtiari Language Video (bak) بختياري ها! خبری مهم" via Vimeo.
  19. "Pakistan should mind all of its languages!". tribune.com.pk. June 2011.
  20. "Ethnologue". Retrieved 1 February 2020.
  21. "Ethnologue". Retrieved 1 February 2020.
  22. "The Bible in Brahui". Worldscriptures.org. Archived from the original on 30 October 2016. Retrieved 5 August 2013.
  23. "Rohingya Language Book A-Z". Scribd.
  24. "Ida'an". scriptsource.org.
  25. "The Coptic Studies' Corner". stshenouda.com. Archived from the original on 19 April 2012. Retrieved 17 April 2012.
  26. "--The Cradle of Nubian Civilisation--". thenubian.net. Archived from the original on 24 April 2012. Retrieved 17 April 2012.
  27. "2 » AlNuba egypt". 19 July 2012. Archived from the original on 19 July 2012.
  28. "Zarma". scriptsource.org.
  29. "Tadaksahak". scriptsource.org.
  30. "Lost Language — Bostonia Summer 2009". bu.edu.
  31. "Dyula". scriptsource.org.
  32. "Jola-Fonyi". scriptsource.org.
  33. "African Arabic-Script Languages Title: From the 'Sacred' to the 'Profane': the Yoruba Ajami Script and the Challenges of a Standard Orthography". ResearchGate. October 2021.
  34. "Ibn Sayyid manuscript". Archived from the original on 8 September 2015. Retrieved 27 September 2018.
  35. "Muhammad Arabic letter". Archived from the original on 8 September 2015. Retrieved 27 September 2018.
  36. "Charno Letter". Muslims In America. Archived from the original on 20 May 2013. Retrieved 5 August 2013.
  37. Asani, Ali S. (2002). Ecstasy and enlightenment: the Ismaili devotional literature of South Asia. Institute of Ismaili Studies. London: I.B. Tauris. p. 124. ISBN 1-86064-758-8. OCLC 48193876.
  38. Alphabet Transitions – The Latin Script: A New Chronology – Symbol of a New Azerbaijan, by Tamam Bayatly. Archived 2007-04-03 at the Wayback Machine.
  39. Sukhail Siddikzoda. "Tajik Language: Farsi or Not Farsi?" (PDF). Archived from the original (PDF) on 13 June 2006.
  40. "Brief history of writing in Chechen". Archived from the original on 23 December 2008.
  41. p. 20, Samuel Noel Kramer. 1986. In the World of Sumer: An Autobiography. Detroit: Wayne State University Press.
  42. J. Blau. 2000. Hebrew written in Arabic characters: An instance of radical change in tradition. (In Hebrew, with English summary). In Heritage and Innovation in Judaeo-Arabic Culture: Proceedings of the Sixth Conference of the Society For Judaeo-Arabic Studies, p. 27-31. Ramat Gan.
  43. Lorna Priest Evans; M. G. Abbas Malik. "Proposal to encode ARABIC LETTER LAM WITH SMALL ARABIC LETTER TAH ABOVE in the UCS" (PDF). www.unicode.org. Retrieved 10 May 2020.
  44. "Urdu Alphabet". www.user.uni-hannover.de. Archived from the original on 11 September 2019. Retrieved 4 May 2020.