Arabic script

Script type: Abjad (abugida or true alphabet in some adaptations)
Time period: 400 CE to the present
Direction: right-to-left
Languages: See below
Related scripts
Parent systems
Child systems
Inspired the N'Ko alphabet and the Hanifi script
ISO 15924: Arab, 160, Arabic
Unicode alias: Arabic
This article contains phonetic transcriptions in the International Phonetic Alphabet (IPA). For an introductory guide on IPA symbols, see Help:IPA. For the distinction between [ ], / / and ⟨ ⟩, see IPA § Brackets and transcription delimiters.

The Arabic script is a writing system used for writing Arabic and several other languages of Asia and Africa, such as Persian (Farsi/Dari), Uyghur, Kurdish, Punjabi, Sindhi, Balti, Balochi, Pashto, Lurish, Urdu, Kashmiri, Rohingya, Somali and Mandinka, among others. [1] Until the 16th century, it was also used to write some texts in Spanish. Additionally, prior to the language reform in 1928, it was the writing system of Turkish. [2] It is the second-most widely used writing system in the world by the number of countries using it and the third by the number of users, after the Latin and Chinese scripts. [3]


The Arabic script is written from right to left in a cursive style, in which most of the letters are written in slightly different forms according to whether they stand alone or are joined to a following or preceding letter. The basic letter form remains unchanged. In most cases, the letters transcribe consonants or consonants and a few vowels, so most Arabic alphabets are abjads. It does not have capital letters. [4]
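
The point that the basic letter form stays the same across positions can be illustrated with Unicode's compatibility data. The following is only a sketch, not something taken from the article's sources: it assumes Python's standard `unicodedata` module and uses the legacy presentation-form code points for the letter bāʼ (U+FE8F to U+FE92) as an example; the helper name `positional_forms` is hypothetical.

```python
import unicodedata

# Legacy presentation-form code points for the letter beh (example letter):
# isolated, final, initial and medial shapes.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

def positional_forms(chars):
    """Return (position tag, base letter) for each presentation form."""
    result = []
    for ch in chars:
        # decomposition() yields a string such as "<initial> 0628"
        tag, code = unicodedata.decomposition(ch).split()
        result.append((tag.strip("<>"), chr(int(code, 16))))
    return result

forms = positional_forms(BEH_FORMS)
for position, base in forms:
    print(position, f"U+{ord(base):04X}", unicodedata.name(base))
```

All four shapes map back to the single letter U+0628. In ordinary text only the base letter is stored, and the rendering engine picks the positional shape.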

The script was first used to write texts in Arabic, most notably the Quran, the holy book of Islam. With the religion's spread, it came to be used as the primary script for many language families, leading to the addition of new letters and other symbols, with some versions, such as Kurdish, Uyghur and old Bosnian being abugidas or true alphabets. It is also the basis for the tradition of Arabic calligraphy.

Worldwide use of the Arabic script
Arabic alphabet world distribution
Countries where the Arabic script:
  is the only official script
  is the only official script, but other scripts are recognized for national or regional languages
  is official alongside other scripts
  is official at a sub-national level (China, India) or is a recognized alternative script (Malaysia)

History

The Arabic alphabet derives from the Nabataean alphabet [5] [6] or (less widely believed) directly from the Syriac alphabet, [7] both of which derive from the Aramaic alphabet, which in turn descended from the Phoenician alphabet. The Phoenician alphabet gave rise to, among others, the Arabic, Hebrew and Greek alphabets (and therefore the Cyrillic and Roman alphabets).

Origins

In the 6th and 5th centuries BCE, northern Arab tribes emigrated and founded a kingdom centred around Petra, Jordan. These people (now named Nabataeans from the name of one of the tribes, Nabatu) spoke Nabataean Arabic, a dialect of the Arabic language. In the 2nd or 1st centuries BCE, [8] [9] the first known records of the Nabataean alphabet were written in the Aramaic language (which was the language of communication and trade), but included some Arabic language features: the Nabataeans did not write the language which they spoke. They wrote in a form of the Aramaic alphabet, which continued to evolve; it separated into two forms: one intended for inscriptions (known as "monumental Nabataean") and the other, more cursive and hurriedly written and with joined letters, for writing on papyrus. [10] This cursive form influenced the monumental form more and more and gradually changed into the Arabic alphabet.

Overview

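The Arabic alphabet (letter followed by its name, in alphabetical order)
ا alif | ب bā’ | ت tā’ | ث ṯā’ / thā’ | ج jīm | ح ḥā’ | خ ḫā’ / khā’
د dāl | ذ ḏāl / dhāl | ر rā’ | ز zāy / zayn | س sīn | ش šīn / shīn | ص ṣād
ض ḍād | ط ṭā’ | ظ ẓā’ | ع ‘ayn | غ ġayn / ghayn | ف fā’ | ق qāf
ك kāf | ل lām | م mīm | ن nūn | ه hā’ | و wāw | ي yā’
(see below for other alphabets)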

The Arabic script has been adapted for use in a wide variety of languages besides Arabic, including Persian, Malay and Urdu, which are not Semitic. Such adaptations may feature altered or new characters to represent phonemes that do not appear in Arabic phonology. For example, the Arabic language lacks a voiceless bilabial plosive (the [p] sound), so many languages add their own letter to represent [p] in the script, though the specific letter used varies from language to language. These modifications tend to fall into groups: Indian and Turkic languages written in the Arabic script tend to use the Persian modified letters, whereas the languages of Indonesia tend to imitate those of Jawi. The modified version of the Arabic script originally devised for use with Persian is known to scholars as the Perso-Arabic script. [citation needed]
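
As a small illustration of how the adaptations diverge, the sketch below looks up the two letters that the tables later in this article give for /p/: پ (U+067E), used in the Persian-derived alphabets, and ڤ (U+06A4), used as pa in Jawi and Pegon. It assumes Python's standard `unicodedata` module; the names `P_LETTERS` and `describe` and the grouping labels are hypothetical and only for illustration.

```python
import unicodedata

# Letters the article lists for /p/; the grouping labels are illustrative.
P_LETTERS = {
    "Persian-derived alphabets (Persian, Urdu, Pashto, Kurdish, ...)": "\u067E",
    "Jawi and Pegon": "\u06A4",
}

def describe(letter):
    """Return the code point and Unicode character name of a letter."""
    return f"U+{ord(letter):04X} {unicodedata.name(letter)}"

for tradition, letter in P_LETTERS.items():
    print(f"{tradition}: {letter} {describe(letter)}")
```

Note that Unicode names U+06A4 "VEH" after its use for /v/, even though Jawi and Pegon use the same letter for /p/. A character name therefore does not fix the sound a letter has in a given language.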

In the cases of Bosnian, Kurdish, Kashmiri and Uyghur writing systems, vowels are mandatory. The Arabic script can therefore be used in both abugida and abjad forms, although it is often strongly, if erroneously, connected to the latter due to it being originally used only for Arabic.[ citation needed ]

Use of the Arabic script in West African languages, especially in the Sahel, developed with the spread of Islam. To a certain degree the style and usage tend to follow those of the Maghreb (for instance, the position of the dots in the letters fāʼ and qāf). Additional diacritics have come into use to facilitate the writing of sounds not represented in the Arabic language. The term ʻAjamī, which comes from the Arabic root for "foreign", has been applied to Arabic-based orthographies of African languages. [citation needed]

Wikipedia in Arabic script of five languages

Table of writing styles

Script or style | Alphabet(s) | Language(s) | Region | Derived from | Comment
Naskh | Arabic & others | Arabic & others | Every region where Arabic scripts are used | | Sometimes refers to a very specific calligraphic style, but sometimes used to refer more broadly to almost every font that is not Kufic or Nastaliq.
Nastaliq | Urdu, Persian, & others | Urdu, Persian, & others | Southern and Western Asia | Taliq | Used for almost all modern Urdu text, but only occasionally used for Persian. (The term "Nastaliq" is sometimes used by Urdu speakers to refer to all Perso-Arabic scripts.)
Taliq | Persian | Persian | | | A predecessor of Nastaliq.
Kufic | Arabic | Arabic | Middle East and parts of North Africa | |
Rasm | Restricted Arabic alphabet | Arabic | Mainly historical | | Omits all diacritics including i'jam. Digital replication usually requires some special characters, such as ٮ ڡ ٯ.
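
The Rasm entry in the table of writing styles says that this style omits all diacritics, including i'jam (the consonant dots). A rough sketch of that reduction follows. It is a simplification under stated assumptions, not a faithful model of manuscript practice: the mapping `SKELETON` and the function `to_rasm` are hypothetical names, only the 28 basic Arabic letters plus tāʼ marbūṭa are handled, and the positional merging of some shapes in real manuscripts is ignored.

```python
import unicodedata

# Dotted letter -> undotted skeleton shape (simplified assumption).
SKELETON = {
    "\u0628": "\u066E",  # beh   -> dotless beh
    "\u062A": "\u066E",  # teh   -> dotless beh
    "\u062B": "\u066E",  # theh  -> dotless beh
    "\u062C": "\u062D",  # jeem  -> hah
    "\u062E": "\u062D",  # khah  -> hah
    "\u0630": "\u062F",  # thal  -> dal
    "\u0632": "\u0631",  # zain  -> reh
    "\u0634": "\u0633",  # sheen -> seen
    "\u0636": "\u0635",  # dad   -> sad
    "\u0638": "\u0637",  # zah   -> tah
    "\u063A": "\u0639",  # ghain -> ain
    "\u0641": "\u06A1",  # feh   -> dotless feh
    "\u0642": "\u066F",  # qaf   -> dotless qaf
    "\u0646": "\u06BA",  # noon  -> dotless noon
    "\u064A": "\u0649",  # yeh   -> dotless yeh
    "\u0629": "\u0647",  # teh marbuta -> heh
}
TABLE = str.maketrans(SKELETON)

def to_rasm(text):
    """Drop combining marks (vowel signs etc.), then remove the dots."""
    no_marks = "".join(ch for ch in text if unicodedata.combining(ch) == 0)
    return no_marks.translate(TABLE)

# beh + fatha + yeh + sukun + teh becomes three undotted skeleton letters
print(to_rasm("\u0628\u064E\u064A\u0652\u062A"))
```

The three special characters mentioned in the table (ٮ ڡ ٯ) are exactly the undotted code points U+066E, U+06A1 and U+066F used as targets here.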

Table of alphabets

Alphabet | Letters | Additional characters | Script or style | Languages | Region | Derived from (or related to) | Note
Arabic | 28 | (see above) | Naskh, Kufi, Rasm, & others | Arabic | North Africa, West Asia | Aramaic, Syriac, Nabataean |
Ajami script | 33 | | | Hausa, Swahili | West Africa | Arabic | Abjad
Aljamiado | 28 | | | Old Spanish, Mozarabic, Ladino, Aragonese, Old Galician-Portuguese | Southwest Europe | Arabic | 8th-13th centuries for Mozarabic, 14th-16th centuries for the other languages
Arebica | 30 | | | Bosnian | Southeastern Europe | Perso-Arabic | Latest stage with full vowel marking
Arwi alphabet | 41 | | | Tamil | Southern India, Sri Lanka | Perso-Arabic |
Belarusian Arabic alphabet | 32 | | | Belarusian | Eastern Europe | Perso-Arabic | 15th / 16th century
Berber Arabic alphabet(s) | | | | Various Berber languages | North Africa | Arabic |
Burushaski | 53 | ݳ ݴ څ ݼ ڎ ݽ ڞ ݣ ݸ ݹ ݶ ݷ ݺ ݻ (see note) | | Burushaski | South-West Asia (Pakistan) | Urdu | Also uses the additional letters shown for Urdu (see below). Sometimes written with just the Urdu alphabet, or with the Latin alphabet.
Chagatai alphabet(s) | 32 | | | Chagatai | Central Asia | Perso-Arabic |
Galal | 32 | | | Somali | Horn of Africa | Arabic |
Jawi | 36 | ݘ ڠ ڤ ݢ ڽ ۏ | | Malay | Peninsular Malay | Perso-Arabic | Since 1303 AD (Trengganu Stone)
Kashmiri | 44 | أ ٲ إ ٳ وٗ ۆ ۄ ےٚ ؠ | Nastaliq | Kashmiri | South Asia | Perso-Arabic |
Kazakh Arabic alphabet | 35 | | | Kazakh | Central Asia, China | Perso-Arabic / Chagatai | Since 11th century, now official only in China
Khowar | 60 | | | Khowar | South Asia | Perso-Arabic |
Kyrgyz Arabic alphabet | 33 | | | Kyrgyz | Central Asia | Perso-Arabic | Now official only in China
Kuryan alphabet | 44 | | | Korean | East Asia, South Korea | Perso-Arabic | Invented by Korean Muslim in the 2000s
Pashto | 45 | | | Pashto | South-West Asia, Afghanistan and Pakistan | Perso-Arabic |
Pegon script | 35 | | | Javanese, Sundanese | South-East Asia (Indonesia) | Perso-Arabic |
Persian | 32 | پ ݘ ژ گ | Nastaliq or Naskh | Persian (Farsi) | West Asia (Iran etc.) | Arabic |
Saraiki | 45 | | | Saraiki | South-West Asia (Pakistan) | Perso-Arabic |
Shahmukhi | 41+ (see note) | ࣇ ݨ | Usually Nastaliq | Punjabi | South-West Asia (Pakistan) | Perso-Arabic | Similar to Urdu; 58 [citation needed] letters including digraphs for aspirated consonants.
Sindhi | 64 | ڪ ڳ ڱ گ ک پ ڀ ٻ ٽ ٿ ٺ ڻ ڦ ڇ چ ڄ ڃ ھ ڙ ڌ ڏ ڎ ڍ ڊ | Naskh-like | Sindhi | South-West Asia (Pakistan) | Perso-Arabic |
Sorabe | 33 | | | Malagasy | Madagascar | Arabic |
Soranî | 33 | | | Central Kurdish | Middle-East | Perso-Arabic | Vowels are mandatory, i.e. abugida
Swahili | | | | | | |
İske imlâ | 35 | | | Tatar | | Chagatai / Perso-Arabic | Before 1920
Ottoman Turkish | 32 | | | Ottoman Turkish | Ottoman Empire | Perso-Arabic | Official until 1928
Urdu | 39+ (see notes) | پ ژ ݘ ٹ ڈ ڑ ں ہ ھ ے (see notes) | Nastaliq | Urdu | South Asia | Perso-Arabic | 58 [citation needed] letters including digraphs representing aspirated consonants: بھ پھ تھ ٹھ جھ چھ دھ ڈھ کھ گھ
Uyghur | 32 | | | Uyghur | China, Central Asia | Perso-Arabic / Chagatai | Vowels are mandatory, i.e. abugida
Wolofal | 28 | | | Wolof | West Africa | Arabic |
Xiao'erjing | 36 | | | Sinitic languages | China, Central Asia | Perso-Arabic |
Yaña imlâ | 29 | | | Tatar | Russia | Perso-Arabic / Chagatai | 1920–1927 replaced with Cyrillic

Current use

Today Iran, Afghanistan, Pakistan, India, and China are the main non-Arabic speaking states using the Arabic alphabet to write one or more official national languages, including Azerbaijani, Baluchi, Brahui, Persian, Pashto, Central Kurdish, Urdu, Sindhi, Kashmiri, Punjabi and Uyghur.[ citation needed ]

An Arabic alphabet is currently used for the following languages:[ citation needed ]

Middle East and Central Asia

East Asia

South Asia

Southeast Asia

Africa

Former use

In the 20th century, the Arabic script was generally replaced by the Latin alphabet in the Balkans,[ dubious ] parts of Sub-Saharan Africa, and Southeast Asia, while in the Soviet Union, after a brief period of Latinisation, [39] use of Cyrillic was mandated. Turkey changed to the Latin alphabet in 1928 as part of an internal Westernizing revolution. After the collapse of the Soviet Union in 1991, many of the Turkic languages of the ex-USSR attempted to follow Turkey's lead and convert to a Turkish-style Latin alphabet. However, renewed use of the Arabic alphabet has occurred to a limited extent in Tajikistan, whose language's close resemblance to Persian allows direct use of publications from Afghanistan and Iran. [40]

Most languages of the Iranian language family continue to use the Arabic script, as do the Indo-Aryan languages of Pakistan and of Muslim populations in India. However, the Bengali language of India and Bangladesh was never written in the Arabic script; it has been written in the Bengali alphabet since its inception. [41]

Africa

Europe

Central Asia and Caucasus

Southeast Asia

Middle East

Unicode

As of Unicode 13.0, the following ranges encode Arabic characters:
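
The list of ranges is not reproduced in this copy of the text. As a hedged sketch, the block boundaries below are taken from the published Unicode block charts rather than from this article, and the function name `arabic_block` is hypothetical. The code shows how a character can be assigned to one of the Arabic-related blocks defined as of Unicode 13.0.

```python
# Block ranges as published in the Unicode block charts (an assumption
# here, since the article's own list is missing from this copy).
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
    (0x1EC70, 0x1ECBF, "Indic Siyaq Numbers"),
    (0x1ED00, 0x1ED4F, "Ottoman Siyaq Numbers"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]

def arabic_block(ch):
    """Return the name of the Arabic-related block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None

for sample in ["\u0628", "\u0750", "\u08B4", "\uFE91", "A"]:
    print(f"U+{ord(sample):04X}", arabic_block(sample))
```

A block lookup of this kind only says where a code point is encoded. It does not say which languages use the letter.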

Additional letters used in other languages

Assignment of phonemes to graphemes

∅ = phoneme absent from language
Language family Austron. Dravid Turkic Indic (Indo-European) Iranian (Indo-European) Arabic (Semitic)
Language/script Jawi Pegon Arwi Uyghur Sindhi Punjabi Urdu Persian Balochi Kurdish Pashto Moroccan Tunisian Algerian Hejazi Najdi Egyptian Israeli Levantine Iraqi Gulf
/p/ ڤ ڣ پ پ / ب
/g/ ݢ گ ګ ڭ / گ ڨ / ڧـ ـڧـ ـٯ / ق ق ج چ / ج ك / ج گ / ك ق / گ
/t͡ʃ/ چ چ ڜ تش چ
/v/ ۏ ف و ۋ و ڤ ڥ / ڢ / ف ڤ / ف
/ʒ/ ژ ژ ج چ / ج ج
/ŋ/ ڠ ڭ ڱ ن٘
/ɳ/ ڹ ڻ ݨ ن ڼ
/ɲ/ ڽ ۑ ݧ ڃ نج
Table of additional letters in other languages
Letter or Digraph [upper-alpha 1] Use & Pronunciation Unicode i'jam & other additionsShapeSimilar Arabic Letter(s)
U+ [upper-alpha 2] [upper-alpha 3] abovebelow
پ پـ ـپـ ـپ Pe, used to represent the phoneme /p/ in Persian, Pashto, Punjabi, Khowar, Sindhi, Urdu, Kurdish; it is not used in most Arabic varieties (except Mesopotamian and Gulf) and it is normalized as /b/; e.g., pepsi > bibsi. U+067E none3 dots ٮ ب
ݐ ݐـ ـݐـ ـݐ used to represent the equivalent of the Latin letter Ƴ (palatalized glottal stop /ʔʲ/) in some African languages such as Fulfulde. U+0750   none3 dots
(horizontal)
ٮ ب
ٻ ٻـ ـٻـ ـٻ B̤ē, used to represent a voiced bilabial implosive /ɓ/ in Hausa, Sindhi and Saraiki. U+067B none2 dots
(vertically)
ٮ ب
ڀ ڀـ ـڀـ ـڀ represents an aspirated voiced bilabial plosive // in Sindhi. U+0680 none4 dots ٮ ب
ٺ ٺـ ـٺـ ـٺ Ṭhē, represents the aspirated voiceless retroflex plosive /ʈʰ/ in Sindhi. U+067A 2 dots
(vertically)
none ٮ ت
ټ ټـ ـټـ ـټ Ṭē, used to represent the phoneme /ʈ/ in Pashto. U+067C ﮿ 2 dotsring ٮ ت
ٽ ٽـ ـٽـ ـٽ Ṭe, used to represent the phoneme (a voiceless retroflex plosive /ʈ/) in Sindhi U+067D 3 dots
(inverted)
none ٮ ت
ٹـ ـٹـ ـٹ Ṭe, used to represent Ṭ (a voiceless retroflex plosive /ʈ/) in Punjabi, Urdu. U+0679 ◌ؕ small
ط
none ٮ ت
ٿ ٿـ ـٿـ ـٿ Teheh, used in Sindhi and Rajasthani (when written in Sindhi alphabet); used to represent the phoneme /t͡ɕʰ/ (pinyin q) in Chinese Xiao'erjing. U+067F 4 dotsnone ٮ ت
ڄ ڄـ ـڄـ ـڄ represents the "ц" voiceless dental affricate /t͡s/ phoneme in Bosnian. U+0684 none2 dots
(vertically)
ح ج
ڃ ڃـ ـڃـ ـڃ represents the "ћ" voiceless alveolo-palatal affricate /t͡ɕ/ phoneme in Bosnian. U+0683 none2 dots ح ح ج
چ چـ ـچـ ـچ Che, used to represent /t͡ʃ/ ("ch") in Persian, Pashto, Punjabi, Urdu and Kurdish, and /ʒ/ in Egypt. U+0686 none3 dots ح ج
څ څـ ـڅـ ـڅ Ce, used to represent the phoneme /t͡s/ in Pashto. U+0685 3 dotsnone ح ج خ ح
ݗ ݗـ ـݗـ ـݗ represents the "ђ" voiced alveolo-palatal affricate /d͡ʑ/ phoneme in Bosnian. U+0757 2 dotsnone ح ح
ځ ځـ ـځـ ـځ Źim, used to represent the phoneme /d͡z/ in Pashto. U+0681 ◌ٔ Hamza none ح ج خ ح
ݙ ݙ ـݙ used in Saraiki to represent a Voiced alveolar implosive /ɗ̢/. U+0759 small
ط
2 dots
(vertically)
د د
ڊ ڊ ـڊ used in Saraiki to represent a voiced retroflex implosive //. U+068A none1 dot د د
ڈ ڈ ـڈ Ḍal, used to represent a Ḍ (a voiced retroflex plosive /ɖ/) in Punjabi and Urdu. U+0688 ◌ؕ small ط none د د
ڌ ڌ ـڌ Dhal, used to represent the phoneme /d̪ʱ/ in Sindhi U+068C 2 dotsnone د د
ډ ډ ـډ Ḍal, used to represent the phoneme /ɖ/ in Pashto. U+0689 ﮿ nonering د د
ڑ ڑ ـڑ Ṛe, represents a retroflex flap /ɽ/ in Punjabi and Urdu. U+0691 ◌ؕ small ط none ر ر
ړ ړ ـړ Ṛe, used to represent a retroflex lateral flap in Pashto. U+0693 ﮿ nonering ر _
ݫ ݫ ـݫ used in Ormuri to represent a voiced alveolo-palatal fricative /ʑ/, as well as in Torwali. U+076B 2 dots
(vertically)
none ر _
ژ ژ ـژ Že / zhe, used to represent the voiced postalveolar fricative /ʒ/ in Persian, Pashto, Kurdish, Urdu, Punjabi and Uyghur. U+0698 3 dotsnone ر ز
ږ ږ ـږ Ǵe / ẓ̌e, used to represent the phoneme /ʐ//ɡ//ʝ/ in Pashto. U+0696 1 dot1 dot ر ز
ڕ ڕ ـڕ used in Kurdish to represent rr /r/ in Soranî dialect. U+0695 ٚ noneV pointing down ر ر
ݭ ݭـ ـݭـ ـݭ used in Kalami to represent a voiceless retroflex fricative /ʂ/, and in Ormuri to represent a voiceless alveolo-palatal fricative /ɕ/. U+076D 2 dots verticallynone س س
ݜ ݜـ ـݜـ ـݜ used in Shina to represent a voiceless retroflex fricative /ʂ/. U+075C 4 dotsnone س ش س
ښ ښـ ـښـ ـښ X̌īn / ṣ̌īn, used to represent the phoneme /x//ʂ//ç/ in Pashto. U+069A 1 dot1 dot س ش س
ڜ ڜـ ـڜـ ـڜ used to represent Spanish words with /t͡ʃ/ in Morocco. U+069C 3 dots3 dots س ش س
ڨ ڨـ ـڨـ ـڨ Ga, used to represent the voiced velar plosive /ɡ/ in Algerian and Tunisian. U+06A8 3 dotsnone ٯ ق
گ گـ ـگـ ـگ Gaf, represents a voiced velar plosive /ɡ/ in Persian, Pashto, Punjabi, Kyrgyz, Kazakh, Kurdish, Uyghur, Mesopotamian, Urdu and Ottoman Turkish. U+06AF linehorizontal linenone گ ك
ګ ګـ ـګـ ـګ Gaf, used to represent the phoneme /ɡ/ in Pashto. U+06AB ﮿ ringnone ک ك
ݢ ݢـ ـݢـ ـݢ Gaf, represents a voiced velar plosive /ɡ/ in the Jawi script of Malay. U+0762 1 dotnone ک ك
ڬ ڬـ ـڬـ ـڬ U+06AC 1 dotnone ك ك
ࢴـ ـࢴـ ـࢴ Gaf, represents a voiced velar plosive /ɡ/ in the Pegon script of Indonesian. U+08B4 none1 dot ك ك
ڭ ڭـ ـڭـ ـڭ Ng, used to represent the /ŋ/ phone in Ottoman Turkish, Kazakh, Kyrgyz, and Uyghur, and to represent the /ɡ/ in Morocco and in many dialects of Algerian. U+06AD 3 dotsnone ك ك
أي أيـ ـأيـ ـأي Ee, used to represent the phoneme // in Somali. U+0623 U+064A ◌ٔ Hamza 2 dots اى أ + ي
ئ ئـ ـئـ ـئ E, used to represent the phoneme /e/ in Somali. U+0626 ◌ٔ Hamza none ى ي ی
ىٓ ىٓـ ـىٓـ ـىٓ Ii, used to represent the phoneme // in Somali and Saraiki. U+0649 U+0653 ◌ٓ Maddanone ى ي
ؤ ؤ ـؤ O, used to represent the phoneme /o/ in Somali. U+0624 ◌ٔ Hamza none و ؤ
ۅ ۅ ـۅ Ö, used to represent the phoneme /ø/ in Kyrgyz. U+06C5 ◌̵ Strikethrough [upper-alpha 4] none و و
ې ېـ ـېـ ـې Pasta Ye, used to represent the phoneme /e/ in Pashto and Uyghur. U+06D0 none2 dots vertical ى ي
ی یـ ـیـ ـی Nārīna Ye, used to represent the phoneme [ɑj] and phoneme /j/ in Pashto. U+06CC 2 dots
(start + mid)
none ى ي
ۍ ـۍ end
only
X̌əźīna ye Ye, used to represent the phoneme [əi] in Pashto. U+06CD linehorizontal
line
none ى ي
ئ ئـ ـئـ ـئ Fāiliya Ye, used to represent the phoneme [əi] and /j/ in Pashto, Punjabi, Saraiki and Urdu U+0626 ◌ٔ Hamza none ى ي ى
أو أو ـأو Oo, used to represent the phoneme // in Somali. U+0623 U+0648 ◌ٔ Hamza none او أ + و
ﻭٓ ﻭٓ ـﻭٓ Uu, used to represent the phoneme // in Somali. + ◌ٓ U+0648 U+0653 ◌ٓ Maddanone و + ◌ٓ
ڳ ڳـ ـڳـ ـڳ represents a voiced velar implosive /ɠ/ in Sindhi and Saraiki. U+06B3 horizontal line 2 dots گ ك
ڱ ڱـ ـڱـ ـڱ represents the Velar nasal /ŋ/ phoneme in Sindhi. U+06B1 2 dots + horizontal
line
none گ ك
ک کـ ـکـ ـک Khē, represents // in Sindhi. U+06A9 nonenonenone ک ك
ڪ ڪـ ـڪـ ـڪ "Swash kāf" is a stylistic variant of ك in Arabic, but represents un- aspirated /k/ in Sindhi. U+06AA nonenonenone ڪ ك or ڪ
ݣ ݣـ ـݣـ ـݣ used to represent the phoneme /ŋ/ (pinyin ng) in Chinese. U+0763 none3 dots ک ك
ڼ ڼـ ـڼـ ـڼ represents the retroflex nasal /ɳ/ phoneme in Pashto. U+06BC ں ﮿ 1 dotringن
ڻ ڻـ ـڻـ ـڻ represents the retroflex nasal /ɳ/ phoneme in Sindhi. U+06BB ◌ؕ small ط none ں ن
ݨ ݨـ ـݨـ ـݨ used in Punjabi to represent /ɳ/ and Saraiki to represent /ɲ/. U+0768 1 dot + small ط none ں ن
ڽ ڽـ ـڽـ ـڽ Nya/ɲ/ in the Jawi script. U+06BD 3 dotsnone ں ن
ۑ ۑـ ـۑـ ـۑ Nya/ɲ/ in the Pegon script. U+06D1 none3 dots ى _
ڠ ڠـ ـڠـ ـڠ Nga/ŋ/ in the Jawi script and Pegon script. U+06A0 3 dotsnone ع غ
ݪ ݪـ ـݪـ ـݪ used in Marwari to represent a retroflex lateral flap /ɺ̢/, and in Kalami to represent a voiceless lateral fricative /ɬ/. U+076A linehorizontal
line
none ل ل
ࣇ‍ ‍ࣇ‍ ‍ࣇ - or alternately typeset as لؕ - is used in Punjabi to represent voiced retroflex lateral approximant /ɭ/ [45] U+08C7 ◌ؕ small ط none ل ل
لؕ لؕـ ـلؕـ ـلؕ U+0644 U+0615
ڥ ڥـ ـڥـ ـڥ Vi, used in Algerian and Tunisian when written in Arabic script to represent the sound /v/. U+06A5 none3 dots ڡ ف
ڤ ڤـ ـڤـ ـڤ Ve, used by some Arabic speakers to represent the phoneme /v/ in loanwords, and in the Kurdish language when written in Arabic script to represent the sound /v/. Also used as pa /p/ in the Jawi script and Pegon script. U+06A4 3 dotsnone ڡ ف
ۏ ۏ ـۏ Va in the Jawi script. U+06CF 1 dotnone و و
ۋ ۋ ـۋ represents a voiced labiodental fricative /v/ in Kyrgyz, Uyghur, and Old Tatar; and /w, ʊw, ʉw/ in Kazakh; also formerly used in Nogai. U+06CB 3 dotsnone و و
ۆ ۆ ـۆ represents "O" /o/ in Kurdish, and in Uyghur it represents the sound similar to the French eu and œu/ø/ sound. It represents the "у" close back rounded vowel /u/ phoneme in Bosnian. U+06C6 ◌ٚ V pointing downnone و و
ۇ ۇ ـۇ U, used to represent the close back rounded vowel /u/ phoneme in Azerbaijani, Kazakh, Kyrgyz and Uyghur. U+06C7 ◌ُ Damma [upper-alpha 5] none و و
ێ ێـ ـێـ ـێ represents Ê or É /e/ in Kurdish. U+06CE ◌ٚ V pointing down 2 dots
(start + mid)
ى ي
ھ
ھ
ھـ ـھـ ـھ
ھھھ
Do-chashmi he (two-eyed hāʼ), used in digraphs for aspiration /ʰ/ and breathy voice /ʱ/ in Punjabi and Urdu. Also used to represent /h/ in Kazakh, Sorani and Uyghur. [upper-alpha 6] U+06BE nonenonenone ھ ه
ە ە ـە Ae, used to represent /æ/ and /ɛ/ in Kazakh, Sorani and Uyghur. U+06D5 nonenonenone ھ إ
ے ـے end
only
Baṛī ye ('big yāʼ'), is a stylistic variant of ي in Arabic, but represents "ai" or "e" /ɛː/, // in Urdu and Punjabi. U+06D2 nonenonenone ے ي
ڞ ڞـ ـڞـ ـڞ used to represent the phoneme /tsʰ/ (pinyin c) in Chinese. U+069E 3 dotsnone ص ص ض
ط طـ ـطـ ـط used to represent the phoneme /t͡s/ (pinyin z) in Chinese. U+0637 ط ط
ۉ ۉ ـۉ represents the "o" open-mid back rounded vowel /ɔ/ phoneme in Bosnian. U+06C9 ◌ٛ V pointing upnone و و
ݩ ݩـ ـݩـ ـݩ represents the "њ" palatal nasal /ɲ/ phoneme in Bosnian. U+0769 ◌ٚ 1 dot
V pointing down
none ں ن
ڵ ڵـ ـڵـ ـڵ used in Kurdish to represent ll /ɫ/ in Soranî dialect. U+06B5 ◌ٚ V pointing downnone ل ل
ڵ ڵـ ـڵـ ـڵ represents the "љ" palatal lateral approximant /ʎ/ phoneme in Bosnian. U+06B5 ◌ٚ V pointing downnone ل ل
اٖى اٖىـ ـاٖىـ ـاٖى represents the "и" close front unrounded vowel /i/ phoneme in Bosnian. U+0627 U+0656 U+0649 ◌ٖ Alefnone اى اٖ + ى
Footnotes:
  1. From right: start, middle, end, and isolated forms.
  2. Joined to the letter, closest to the letter, on the first letter, or above.
  3. Further away from the letter, or on the second letter, or below.
  4. A variant that ends in a loop also exists.
  5. Although the letter is also known as Waw with Damma, some publications and fonts feature a filled Damma that looks similar to a comma.
  6. Shown in Naskh (top) and Nastaliq (bottom) styles. The Nastaliq versions of the connected forms are connected to each other, because the tatweel character (U+0640) used to show the other forms does not work in many Nastaliq fonts.
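
The positional forms listed in the table above are typically produced by placing a tatweel (U+0640, the elongation character) next to the letter so that a renderer draws the joined shape. The sketch below shows that trick under stated assumptions: `display_forms` is a hypothetical helper, and `RIGHT_JOINING` is a deliberately partial list covering only a few basic Arabic letters that never join to a following letter.

```python
TATWEEL = "\u0640"

# Letters that join only to the preceding letter (partial, assumed list:
# alef, dal, thal, reh, zain, waw). Derived letters are not covered.
RIGHT_JOINING = set("\u0627\u062F\u0630\u0631\u0632\u0648")

def display_forms(letter):
    """Return strings that prompt a renderer to show each positional form."""
    forms = {"isolated": letter, "end": TATWEEL + letter}
    if letter not in RIGHT_JOINING:
        forms["start"] = letter + TATWEEL
        forms["middle"] = TATWEEL + letter + TATWEEL
    return forms

for name, text in display_forms("\u067E").items():  # the letter peh
    print(name, text)
```

As footnote 6 indicates, this approach depends on font support for the tatweel, which many Nastaliq fonts lack.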

Letter construction

Most languages that use alphabets based on the Arabic alphabet use the same base shapes. Most additional letters in these languages are built by adding diacritics to (or removing them from) existing Arabic letters. Some stylistic variants in Arabic have distinct meanings in other languages. For example, the variant forms of kāf (ك ک ڪ) are used in some languages and sometimes have specific usages. In Urdu and some neighbouring languages the letter hā has diverged into two forms, ھ (dō-čašmī hē) and ہ ہـ ـہـ ـہ (gōl hē), [46] while a variant form of ي referred to as baṛī yē (ے) is used at the end of some words. [46]
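
Unicode character names often mirror this construction, naming a base letter followed by the addition. The sketch below is illustrative only: it relies on the naming pattern "ARABIC LETTER <base> WITH <addition>" as exposed by Python's standard `unicodedata` module, and `split_name` is a hypothetical helper. Names are a cataloguing convention, so letters such as پ that have a name of their own report no addition.

```python
import unicodedata

PREFIX = "ARABIC LETTER "

def split_name(letter):
    """Split a Unicode name into (base letter name, addition or None)."""
    name = unicodedata.name(letter)
    if not name.startswith(PREFIX):
        raise ValueError(f"not an Arabic letter name: {name}")
    base, sep, addition = name[len(PREFIX):].partition(" WITH ")
    return base, (addition if sep else None)

# Pashto x̌īn, Jawi nga, Pashto ṭē, and Persian pe
for ch in ["\u069A", "\u06A0", "\u067C", "\u067E"]:
    print(f"U+{ord(ch):04X}", split_name(ch))
```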

Table of Letter Components

Abbreviations used below

A = The letter is used for most languages and dialects with writing systems based on Arabic.

MSA = Letters used in Modern Standard Arabic.

CA = Letters used in Classical Arabic.

AD = Letters used in some regional Arabic Dialects.

"Arabic" = Letters used in Classical Arabic, Modern Standard Arabic, and most regional dialects.

"Farsi" = Letters used in modern Persian.

FW = Foreign words: the letter is sometimes used to spell foreign words.

SV = Stylistic variant: the letter is used interchangeably with at least one other letter depending on the calligraphic style.

AW = Arabic words: the letter is used in additional languages to spell Arabic words.


Table

No additions
Letter Line Shapes (ii) ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
positional formsisolated ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
start ء ا none ىـ (YNB) ٮـ حـسـصـطـعـڡـ (QF) ٯـڪـكـکـگـلـمـدروھـہـهـلا
middle ء ـا none ـىـ (YNB) ـٮـ ـحــســصــطــعــڡـ (QF) ـٯــڪــكــکــگــلــمــدـرـوـھــہــهــلا
end ء ـاـےـىـںـٮـحـسـصـطـعـڡـٯـڪـكـکـگـلـمـدـرـوـھـہـهـلا
Unicode for above U+0621 U+0627 U+06D2 U+0649 U+06BA U+066E U+062D U+0633 U+0635 U+0637 U+0639 U+06A1 U+066F U+06AA U+0643 U+06A9 U+06AF U+0644 U+0645 U+062F U+0631 U+0648 U+06BE U+06C1 U+0647 ا + ل
Languages that use the letter shape with or without i'jam Arabic Arabic Urdu Arabic Arabic Arabic Arabic Arabic Arabic Arabic Arabic Arabic Arabic Sindhi, Arabic SV Arabic Farsi, Urdu, Arabic SV Farsi, Urdu Arabic Arabic Arabic Arabic Arabic Urdu, Arabic SV Urdu, Arabic SV Arabic, Farsi Arabic
diacritics (i) isolated ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
none (connected forms per letter shape: start, mid, end, where they exist)
ء | ـا ا | ـے ے | ىـ ـىـ ـى | ـں ں | ٮـ ـٮـ ـٮ | حـ ـحـ ـح | سـ ـسـ ـس | صـ ـصـ ـص | طـ ـطـ ـط | عـ ـعـ ـع | ڡـ ـڡـ ـڡ | ٯـ ـٯـ ـٯ | ڪـ ـڪـ ـڪ | كـ ـكـ ـك | کـ ـکـ ـک | گـ ـگـ ـگ | لـ ـلـ ـل | مـ ـمـ ـم | ـد د | ـر ر | ـو و | ھـ ـھـ ـھ | ہـ ـہـ ـہ | هـ ـهـ ـه | ـلا لا
Languages using the bare shape with no additions Arabic Arabic, Farsi, Urdu Urdu MSA Urdu, Arabic SVArabic SV Rasm Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic SV in Rasm Arabic SV in Rasm Sindhi, Arabic SV Arabic Farsi, Urdu, Arabic SV Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Arabic, Farsi, Urdu Urdu, Arabic SV Urdu, Arabic SV Arabic, Farsi MSA
Unicode for above U+0621 U+0627 U+06D2 U+0649 U+06BA U+066E U+062D U+0633 U+0635 U+0637 U+0639 U+06A1 U+066F U+06AA U+0643 U+06A9 U+06AF U+0644 U+0645 U+062F U+0631 U+0648 U+06BE U+06C1 U+0647 ل + ا
Bare line in isolated and end forms onlyisolatedءاے ی ٮحسصطع ڪكکگلمدروھہهلا
start, mid, end: یـ ـیـ ـی | ࢽـ ـࢽـ ـࢽ | ࢻـ ـࢻـ ـࢻ | ࢼـ ـࢼـ ـࢼ
Unicode: U+08BD, U+08BB, U+08BC
dots
1 dot
Diacritics (i) Letter Shapes:
(ii)
ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
1 dot below  ◌࣭   ◌ٜ   ــٜـ     isolatedءاےىں ب ج سصطعڡٯڪكکگلمدروھہهلا
connectedبـ ـبـ ـبجـ ـجـ ـج
image
Languages
U+FBB3 U+065C Unicode U+0628 U+062C
1 dot above + 1 dot below ﮲﮳ isolatedءاےى ڹ ٮح ښ ۻ ط ۼ ڣ ٯڪكکگلمد ږ وھہهلا
connected
image
Languages
Unicode
1 dot above ◌࣪    isolatedءاےىنٮخسضظغ ف ڧ ڪكکگلمذزوھہهلا
connected
image
Languages
U+FBB2 Unicode U+0646 U+062E U+0636 U+0638 U+063A U+0641 U+06A7 U+0630 U+0632
2 dots
diacritics (i) Letter Shapes (ii) ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
2 dots below
start and mid (iii)
isolated formءاے یـ ـیـ ـی ی حسصطعڡٯڪكکگلمدروھہهلا
image
Languages Farsi, Urdu, AD
U+FBB5 U+FBB5 Unicode U+06CC
2 dots below
all positions
isolated formءاے يـ ـيـ ـي ي حسصطعڡٯڪكکگلمدروھہهلا
image
Languages Arabic (iv)
U+FBB5 U+FBB5 Unicode U+064A
2 vertical dots below    isolated formءاےىںٮحسصطعڡٯڪكکگلمدروھہهلا
image
Languages
U+FBBE Unicode
2 vertical dots above    isolated formءاےىںٮحسصطعڡٯڪكکگلمدروھہهلا
image
Languages
U+FBBD Unicode
2 horizontal
dots above
isolated formءاےىںتحسصطعڡ ق ڪكکگلمدروھـۃـةلا
image
Languages
FBB4 Unicode U+062A U+0642 U+06C3 U+0629
diacritics (i) Letter Shapes (ii) ء ا ے ى ں ٮ ح س ص ط ع ڡ ٯ ڪ ك ک گ ل م د ر و ھ ہ ه لا
above: 2 dots
below: 2 dots
above: 2 dots
(vertically)
below: 2 dots
(vertically)
U+08EB (2283)◌࣫TWO DOTS ABOVE
U+08EE (2286)◌࣮TWO DOTS BELOW
3 dots
ث پ چ ژ ش Arabic & Persian
ݑ ڥ ڤ ڨ ڠ ڟ ڞ ۺ ڜ ڛ څ ڿ ۑ ۋ ڮ ڴ ڷ ڸ other pointing out
ݤ ڏ ݓ ݒ ݡ ݘ ݞ inverted
3 dots below (horizontal)

    

characterءاےىں ݐ حسصطعڡ