Many scripts in Unicode, such as Arabic, have special orthographic rules that require certain combinations of letterforms to be joined into special ligature forms. In English, the common ampersand (&) developed from a ligature of the handwritten Latin letters e and t (spelling et, Latin for "and"). [1] The rules governing ligature formation in Arabic can be quite complex, requiring special script-shaping technologies such as the Arabic Calligraphic Engine by Thomas Milo's DecoType. [2]
As of Unicode 15.1, the Arabic script is contained in the following blocks: [3]
The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits, but does not encode contextual forms (U+0621–U+0652 is based directly on ISO 8859-6). The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-B and Arabic Extended-A ranges encode additional Qur'anic annotations and letter variants used for various non-Arabic languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics, and more contextual letter forms. The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text. [4] The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions. The Indic Siyaq Numbers block contains a specialized subset of Arabic script that was in use for accounting in India under the Mughal Empire by the 17th century and remained so through the middle of the 20th century. [5] [6] The Ottoman Siyaq Numbers block contains a specialized subset of Arabic script, also known as Siyakat numbers, used for accounting in Ottoman Turkish documents. [6]
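As a rough illustration of how these ranges might be used in software, the following Python sketch maps a character to the Arabic-script block that contains it. The boundaries are the block ranges shown in the code charts; the names `ARABIC_BLOCKS` and `arabic_block` are illustrative only and do not come from any library. Membership in a block does not imply that the code point is assigned.

```python
# Block boundaries as published in the Unicode code charts (Unicode 15.1).
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x0870, 0x089F, "Arabic Extended-B"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
    (0x10EC0, 0x10EFF, "Arabic Extended-C"),
    (0x1EC70, 0x1ECBF, "Indic Siyaq Numbers"),
    (0x1ED00, 0x1ED4F, "Ottoman Siyaq Numbers"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]


def arabic_block(ch):
    """Return the name of the Arabic-script block containing `ch`, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for sample in ("\u0628", "\uFE91", "\U0001EE00", "A"):
        print(f"U+{ord(sample):04X}", arabic_block(sample))
```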
Below is a demonstration of the basic alphabet used in Modern Standard Arabic, illustrating how Arabic letters are expected to appear in different contexts. Code points listed as contextual forms "should not be used in general interchange". [4] Unicode has other methods of encoding the difference if necessary, such as the zero-width joiner.
General Unicode | Isolated form | Final form (end) | Medial form (middle) | Initial form (beginning) | Name |
---|---|---|---|---|---|
0627 ا | FE8D ﺍ | FE8E ﺎ | — | — | ʾalif |
0628 ب | FE8F ﺏ | FE90 ﺐ | FE92 ﺒ | FE91 ﺑ | bāʾ |
062A ت | FE95 ﺕ | FE96 ﺖ | FE98 ﺘ | FE97 ﺗ | tāʾ |
062B ث | FE99 ﺙ | FE9A ﺚ | FE9C ﺜ | FE9B ﺛ | ṯāʾ |
062C ج | FE9D ﺝ | FE9E ﺞ | FEA0 ﺠ | FE9F ﺟ | ǧīm |
062D ح | FEA1 ﺡ | FEA2 ﺢ | FEA4 ﺤ | FEA3 ﺣ | ḥāʾ |
062E خ | FEA5 ﺥ | FEA6 ﺦ | FEA8 ﺨ | FEA7 ﺧ | ḫāʾ |
062F د | FEA9 ﺩ | FEAA ﺪ | — | — | dāl |
0630 ذ | FEAB ﺫ | FEAC ﺬ | — | — | ḏāl |
0631 ر | FEAD ﺭ | FEAE ﺮ | — | — | rāʾ |
0632 ز | FEAF ﺯ | FEB0 ﺰ | — | — | zayn/zāy |
0633 س | FEB1 ﺱ | FEB2 ﺲ | FEB4 ﺴ | FEB3 ﺳ | sīn |
0634 ش | FEB5 ﺵ | FEB6 ﺶ | FEB8 ﺸ | FEB7 ﺷ | šīn |
0635 ص | FEB9 ﺹ | FEBA ﺺ | FEBC ﺼ | FEBB ﺻ | ṣād |
0636 ض | FEBD ﺽ | FEBE ﺾ | FEC0 ﻀ | FEBF ﺿ | ḍād |
0637 ط | FEC1 ﻁ | FEC2 ﻂ | FEC4 ﻄ | FEC3 ﻃ | ṭāʾ |
0638 ظ | FEC5 ﻅ | FEC6 ﻆ | FEC8 ﻈ | FEC7 ﻇ | ẓāʾ |
0639 ع | FEC9 ﻉ | FECA ﻊ | FECC ﻌ | FECB ﻋ | ʿayn |
063A غ | FECD ﻍ | FECE ﻎ | FED0 ﻐ | FECF ﻏ | ġayn |
0641 ف | FED1 ﻑ | FED2 ﻒ | FED4 ﻔ | FED3 ﻓ | fāʾ |
0642 ق | FED5 ﻕ | FED6 ﻖ | FED8 ﻘ | FED7 ﻗ | qāf |
0643 ك | FED9 ﻙ | FEDA ﻚ | FEDC ﻜ | FEDB ﻛ | kāf |
0644 ل | FEDD ﻝ | FEDE ﻞ | FEE0 ﻠ | FEDF ﻟ | lām |
0645 م | FEE1 ﻡ | FEE2 ﻢ | FEE4 ﻤ | FEE3 ﻣ | mīm |
0646 ن | FEE5 ﻥ | FEE6 ﻦ | FEE8 ﻨ | FEE7 ﻧ | nūn |
0647 ه | FEE9 ﻩ | FEEA ﻪ | FEEC ﻬ | FEEB ﻫ | hāʾ |
0648 و | FEED ﻭ | FEEE ﻮ | — | — | wāw |
064A ي | FEF1 ﻱ | FEF2 ﻲ | FEF4 ﻴ | FEF3 ﻳ | yāʾ |
0622 آ | FE81 ﺁ | FE82 ﺂ | — | — | ʾalif maddah |
0629 ة | FE93 ﺓ | FE94 ﺔ | — | — | tāʾ marbūṭah |
0649 ى | FEEF ﻯ | FEF0 ﻰ | — | — | ʾalif maqṣūrah |
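The contextual behaviour shown in the table can be sketched in Python. This is a simplified illustration only, not how real text should be encoded or rendered: actual shaping is done by the font and rendering engine, and the presentation-form code points should not be used in general interchange. The sketch assumes unvocalized text (no diacritics between letters), ignores ligatures such as lām-ʾalif, and relies on the Unicode character names following the pattern "ARABIC LETTER X INITIAL FORM"; the function names are hypothetical.

```python
import unicodedata


def _form(ch, form):
    """Return the presentation form of `ch` named '<name> <form> FORM', or None."""
    try:
        name = unicodedata.name(ch)
        if not name.startswith("ARABIC LETTER "):
            return None
        return unicodedata.lookup(f"{name} {form} FORM")
    except (KeyError, ValueError):
        # ValueError: character has no name; KeyError: no such form exists.
        return None


def to_presentation_forms(word):
    """Map each base letter to the contextual form it would take in `word`."""
    out = []
    for i, ch in enumerate(word):
        prev_ch = word[i - 1] if i > 0 else None
        next_ch = word[i + 1] if i + 1 < len(word) else None
        # A letter joins forward only if it has an initial form, and
        # joins backward only if it has a final form.
        joins_prev = (
            prev_ch is not None
            and _form(prev_ch, "INITIAL") is not None
            and _form(ch, "FINAL") is not None
        )
        joins_next = (
            next_ch is not None
            and _form(ch, "INITIAL") is not None
            and _form(next_ch, "FINAL") is not None
        )
        if joins_prev and joins_next:
            form = "MEDIAL"
        elif joins_prev:
            form = "FINAL"
        elif joins_next:
            form = "INITIAL"
        else:
            form = "ISOLATED"
        out.append(_form(ch, form) or ch)
    return "".join(out)


if __name__ == "__main__":
    shaped = to_presentation_forms("\u0643\u062A\u0628")  # kāf, tāʾ, bāʾ
    print(" ".join(f"{ord(c):04X}" for c in shaped))
```

Because ʾalif has no initial or medial form, a letter that follows it takes its isolated or initial form, which matches the two-column rows in the table.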
In regular Arabic-script typing, only the Arabic question mark ⟨؟⟩ and the Arabic comma ⟨،⟩ are commonly used. The Arabic comma is often replaced by the Latin-script comma ⟨,⟩, which also serves as the decimal separator when Eastern Arabic numerals are used (e.g. ⟨100.6⟩ compared to ⟨١٠٠,٦⟩).
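The digit and separator convention in the example above can be illustrated with a small Python sketch. It assumes the Latin comma is the decimal separator, as in the example, and also accepts the dedicated Arabic decimal separator (U+066B) when parsing; the function names are hypothetical.

```python
# Map ASCII digits 0-9 to ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669).
_TO_EASTERN = {ord("0") + i: 0x0660 + i for i in range(10)}
_TO_EASTERN[ord(".")] = ord(",")  # Latin comma as decimal separator

# Reverse mapping; both "," and U+066B are treated as the decimal separator.
_FROM_EASTERN = {0x0660 + i: ord("0") + i for i in range(10)}
_FROM_EASTERN[ord(",")] = ord(".")
_FROM_EASTERN[0x066B] = ord(".")


def to_eastern_arabic(number_text):
    """Rewrite a Western-style decimal string with Eastern Arabic digits."""
    return number_text.translate(_TO_EASTERN)


def parse_eastern_arabic(text):
    """Parse a decimal written with Eastern Arabic digits into a float."""
    return float(text.translate(_FROM_EASTERN))


if __name__ == "__main__":
    print(to_eastern_arabic("100.6"))
    print(parse_eastern_arabic("\u0661\u0660\u0660,\u0666"))
```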
Arabic Presentation Forms-A has a few characters defined as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typing. Likewise, the Rial is normally written out in full rather than with its ligature sign.
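One way to see that these word ligatures stand for ordinary letter sequences is compatibility normalization. The Python sketch below uses the standard `unicodedata` module to expand them with NFKC; the helper name is hypothetical. Note that NFKC also rewrites unrelated compatibility characters (for example full-width Latin letters), so it is not a targeted Arabic-only transformation.

```python
import unicodedata


def expand_presentation_forms(text):
    """Replace compatibility characters with their plain-letter equivalents (NFKC)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    rial_sign = "\uFDFC"       # RIAL SIGN
    allah_ligature = "\uFDF2"  # ARABIC LIGATURE ALLAH ISOLATED FORM
    for ch in (rial_sign, allah_ligature):
        expanded = expand_presentation_forms(ch)
        print(f"U+{ord(ch):04X} ->", " ".join(f"U+{ord(c):04X}" for c in expanded))
```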
Code | Result | Unicode name | |||
---|---|---|---|---|---|
U+0600 | | Arabic Number Sign | |||
U+0601 | | Arabic Sign Sanah | |||
U+0602 | | Arabic Footnote Marker | |||
U+0603 | | Arabic Sign Safha | |||
U+0604 | | Arabic Sign Samvat used for writing Samvat era dates in Urdu | |||
U+0605 | | Arabic Number Mark Above may be used with Coptic Epact numbers | |||
U+0606 | ؆ | Arabic-Indic Cube Root → U+221B ∛ Cube Root | |||
U+0607 | ؇ | Arabic-Indic Fourth Root → U+221C ∜ Fourth Root | |||
U+0608 | ؈ | Arabic Ray | |||
U+0609 | ؉ | Arabic-Indic Per Mille Sign → U+2030 ‰ Per Mille Sign | |||
U+060A | ؊ | Arabic-Indic Per Ten Thousand Sign → U+2031 ‱ Per Ten Thousand Sign | |||
U+060B | ؋ | Afghani Sign | |||
U+060C | ، | Arabic Comma also used with Thaana and Syriac in modern text → U+002C , Comma → U+2E32 ⸲ Turned Comma → U+2E41 ⹁ Reversed Comma | |||
U+060D | ؍ | Arabic Date Separator | |||
U+060E | ؎ | Arabic Poetic Verse Sign | |||
U+060F | ؏ | Arabic Sign Misra | |||
U+0610 | ؐ | Arabic Sign Sallallahou Alayhe Wassallam represents sallallahu alayhe wasallam "may God's peace and blessings be upon him" | |||
U+0611 | ؑ | Arabic Sign Alayhe Assallam represents alayhe assalam "upon him be peace" | |||
U+0612 | ؒ | Arabic Sign Rahmatullah Alayhe represents rahmatullah alayhe "may God have mercy upon him" | |||
U+0613 | ؓ | Arabic Sign Radi Allahou Anhu represents radi allahu 'anhu "may God be pleased with him" | |||
U+0614 | ؔ | Arabic Sign Takhallus sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names | |||
U+0615 | ؕ | Arabic Small High Tah marks a recommended pause position in some Qurans published in Iran and Pakistan should not be confused with the small TAH sign used as a diacritic for some letters such as 0679 | |||
U+0616 | ؖ | Arabic Small High Ligature Alef With Lam With Yeh early Persian Arabic Small High Ligature Alef With Yeh Barree | |||
U+0617 | ؗ | Arabic Small High Zain | |||
U+0618 | ؘ | Arabic Small Fatha should not be confused with 064E Fatha | |||
U+0619 | ؙ | Arabic Small Damma should not be confused with 064F Damma | |||
U+061A | ؚ | Arabic Small Kasra should not be confused with 0650 Kasra | |||
U+061B | ؛ | Arabic Semicolon also used with Thaana and Syriac in modern text → U+003B ; Semicolon → U+204F ⁏ Reversed Semicolon → U+2E35 ⸵ Turned Semicolon | |||
U+061C | | Arabic Letter Mark (Alm) | |||
U+061D | ؝ | Arabic End Of Text Mark | |||
U+061E | ؞ | Arabic Triple Dot Punctuation Mark | |||
U+061F | ؟ | Arabic Question Mark also used with Thaana and Syriac in modern text → U+003F ? Question Mark → U+2E2E ⸮ Reversed Question Mark | |||
U+0620 | ؠ | Arabic Letter Kashmiri Yeh | |||
U+0621 | ء | Arabic Letter Hamza → U+02BE ʾ Modifier Letter Right Half Ring | |||
U+0622 | آ | Arabic Letter Alef With Madda Above ≡ آ U+0627 U+0653 | |||
U+0623 | أ | Arabic Letter Alef With Hamza Above ≡ أ U+0627 U+0654 | |||
U+0624 | ؤ | Arabic Letter Waw With Hamza Above ≡ ؤ U+0648 U+0654 | |||
U+0625 | إ | Arabic Letter Alef With Hamza Below ≡ إ U+0627 U+0655 | |||
U+0626 | ئ | Arabic Letter Yeh With Hamza Above in Kyrgyz the hamza is consistently positioned to the top right in isolate and final forms ≡ ئ U+064A U+0654 | |||
U+0627 | ا | Arabic Letter Alef | |||
U+0628 | ب | Arabic Letter Beh | |||
U+0629 | ة | Arabic Letter Teh Marbuta | |||
U+062A | ت | Arabic Letter Teh | |||
U+062B | ث | Arabic Letter Theh | |||
U+062C | ج | Arabic Letter Jeem | |||
U+062D | ح | Arabic Letter Hah | |||
U+062E | خ | Arabic Letter Khah | |||
U+062F | د | Arabic Letter Dal | |||
U+0630 | ذ | Arabic Letter Thal | |||
U+0631 | ر | Arabic Letter Reh | |||
U+0632 | ز | Arabic Letter Zain | |||
U+0633 | س | Arabic Letter Seen | |||
U+0634 | ش | Arabic Letter Sheen | |||
U+0635 | ص | Arabic Letter Sad | |||
U+0636 | ض | Arabic Letter Dad | |||
U+0637 | ط | Arabic Letter Tah | |||
U+0638 | ظ | Arabic Letter Zah | |||
U+0639 | ع | Arabic Letter Ain → U+01B9 ƹ Latin Small Letter Ezh Reversed → U+02BF ʿ Modifier Letter Left Half Ring | |||
U+063A | غ | Arabic Letter Ghain | |||
U+063B | ػ | Arabic Letter Keheh With Two Dots Above | |||
U+063C | ؼ | Arabic Letter Keheh With Three Dots Below | |||
U+063D | ؽ | Arabic Letter Farsi Yeh With Inverted V Azerbaijani | |||
U+063E | ؾ | Arabic Letter Farsi Yeh With Two Dots Above | |||
U+063F | ؿ | Arabic Letter Farsi Yeh With Three Dots Above | |||
U+0640 | ـ | Arabic Tatweel (= kashida); inserted to stretch characters or to carry tashkil with no base letter; also used with Adlam, Hanifi Rohingya, Mandaic, Manichaean, Psalter Pahlavi, Sogdian, and Syriac | |||
U+0641 | ف | Arabic Letter Feh | |||
U+0642 | ق | Arabic Letter Qaf | |||
U+0643 | ك | Arabic Letter Kaf | |||
U+0644 | ل | Arabic Letter Lam | |||
U+0645 | م | Arabic Letter Meem Sindhi uses a shape with a short tail | |||
U+0646 | ن | Arabic Letter Noon | |||
U+0647 | ه | Arabic Letter Heh | |||
U+0648 | و | Arabic Letter Waw | |||
U+0649 | ى | Arabic Letter Alef Maksura represents YEH-shaped dual-joining letter with no dots in any positional form not intended for use in combination with 0654 → U+0626 ئ Arabic Letter Yeh With Hamza Above | |||
U+064A | ي | Arabic Letter Yeh loses its dots when used in combination with 0654 retains its dots when used in combination with other combining marks → U+08A8 ࢨ Arabic Letter Yeh With Two Dots Below And Hamza Above | |||
U+064B | ً | Arabic Fathatan | |||
U+064C | ٌ | Arabic Dammatan a common alternative form is written as two intertwined dammas, one of which is turned 180 degrees | |||
U+064D | ٍ | Arabic Kasratan | |||
U+064E | َ | Arabic Fatha | |||
U+064F | ُ | Arabic Damma | |||
U+0650 | ِ | Arabic Kasra | |||
U+0651 | ّ | Arabic Shadda | |||
U+0652 | ْ | Arabic Sukun; marks absence of a vowel after the base consonant; used in some Qurans to mark a long vowel as ignored; can have a variety of shapes, including a circular one and a shape that looks like '06E1' → U+06E1 ۡ Arabic Small High Dotless Head Of Khah | |||
U+0653 | ٓ | Arabic Maddah Above; used for madd jaa'iz in South Asian and Indonesian orthographies → U+089C ࢜ Arabic Madda Waajib → U+089E ࢞ Arabic Doubled Madda → U+089F ࢟ Arabic Half Madda Over Madda | |||
U+0654 | ٔ | Arabic Hamza Above restricted to hamza and ezafe semantics is not used as a diacritic to form new letters | |||
U+0655 | ٕ | Arabic Hamza Below | |||
U+0656 | ٖ | Arabic Subscript Alef | |||
U+0657 | ٗ | Arabic Inverted Damma Kashmiri, Urdu= ulta pesh | |||
U+0658 | ٘ | Arabic Mark Noon Ghunna Baluchi indicates nasalization in Urdu | |||
U+0659 | ٙ | Arabic Zwarakay Pashto | |||
U+065A | ٚ | Arabic Vowel Sign Small V Above African languages | |||
U+065B | ٛ | Arabic Vowel Sign Inverted Small V Above African languages | |||
U+065C | ٜ | Arabic Vowel Sign Dot Below African languages also used in Quranic text in African and other orthographies | |||
U+065D | ٝ | Arabic Reversed Damma African languages | |||
U+065E | ٞ | Arabic Fatha With Two Dots Kalami | |||
U+065F | ٟ | Arabic Wavy Hamza Below Kashmiri | |||
U+0660 | ٠ | Arabic-Indic Digit Zero | |||
U+0661 | ١ | Arabic-Indic Digit One | |||
U+0662 | ٢ | Arabic-Indic Digit Two | |||
U+0663 | ٣ | Arabic-Indic Digit Three | |||
U+0664 | ٤ | Arabic-Indic Digit Four | |||
U+0665 | ٥ | Arabic-Indic Digit Five | |||
U+0666 | ٦ | Arabic-Indic Digit Six | |||
U+0667 | ٧ | Arabic-Indic Digit Seven | |||
U+0668 | ٨ | Arabic-Indic Digit Eight | |||
U+0669 | ٩ | Arabic-Indic Digit Nine | |||
U+066A | ٪ | Arabic Percent Sign → U+0025 % Percent Sign | |||
U+066B | ٫ | Arabic Decimal Separator the ordinary comma is most commonly used instead → U+002C , Comma | |||
U+066C | ٬ | Arabic Thousands Separator the Arabic comma is most commonly used instead → U+060C ، Arabic Comma → U+0027 ' Apostrophe → U+2019 ’ Right Single Quotation Mark | |||
U+066D | ٭ | Arabic Five Pointed Star appearance rather variable → U+002A * Asterisk | |||
U+066E | ٮ | Arabic Letter Dotless Beh | |||
U+066F | ٯ | Arabic Letter Dotless Qaf | |||
U+0670 | ٰ | Arabic Letter Superscript Alef | |||
U+0671 | ٱ | Arabic Letter Alef Wasla Quranic Arabic | |||
U+0672 | ٲ | Arabic Letter Alef With Wavy Hamza Above Baluchi, Kashmiri | |||
U+0673 | ٳ | Arabic Letter Alef With Wavy Hamza Below (deprecated) [7] Kashmiri this character is deprecated and its use is strongly discouraged use the sequence 0627 065F instead | |||
U+0674 | ٴ | Arabic Letter High Hamza Kazakh, Jawi forms digraphs | |||
U+0675 | ٵ | Arabic Letter High Hamza Alef preferred spelling is ٴا U+0674 U+0627 | |||
U+0676 | ٶ | Arabic Letter High Hamza Waw preferred spelling is ٴو U+0674 U+0648 | |||
U+0677 | ٷ | Arabic Letter U With Hamza Above preferred spelling is ٴۇ U+0674 U+06C7 | |||
U+0678 | ٸ | Arabic Letter High Hamza Yeh preferred spelling is ٴی U+0674 U+06CC | |||
U+0679 | ٹ | Arabic Letter Tteh Urdu | |||
U+067A | ٺ | Arabic Letter Tteheh Sindhi | |||
U+067B | ٻ | Arabic Letter Beeh Sindhi | |||
U+067C | ټ | Arabic Letter Teh With Ring Pashto | |||
U+067D | ٽ | Arabic Letter Teh With Three Dots Above Downwards Sindhi | |||
U+067E | پ | Arabic Letter Peh Persian, Urdu, ... | |||
U+067F | ٿ | Arabic Letter Teheh Sindhi | |||
U+0680 | ڀ | Arabic Letter Beheh Sindhi | |||
U+0681 | ځ | Arabic Letter Hah With Hamza Above Pashto, Sarikoli represents the phoneme /dz/ | |||
U+0682 | ڂ | Arabic Letter Hah With Two Dots Vertical Above not used in modern Pashto | |||
U+0683 | ڃ | Arabic Letter Nyeh Sindhi | |||
U+0684 | ڄ | Arabic Letter Dyeh Sindhi, historically Bosnian | |||
U+0685 | څ | Arabic Letter Hah With Three Dots Above Pashto, Khwarazmian, Sarikoli represents the phoneme /ts/ in Pashto | |||
U+0686 | چ | Arabic Letter Tcheh Persian, Urdu, ... | |||
U+0687 | ڇ | Arabic Letter Tcheheh Sindhi | |||
U+0688 | ڈ | Arabic Letter Ddal Urdu | |||
U+0689 | ډ | Arabic Letter Dal With Ring Pashto | |||
U+068A | ڊ | Arabic Letter Dal With Dot Below Sindhi, early Persian, Pegon, Malagasy | |||
U+068B | ڋ | Arabic Letter Dal With Dot Below And Small Tah Lahnda | |||
U+068C | ڌ | Arabic Letter Dahal Sindhi | |||
U+068D | ڍ | Arabic Letter Ddahal Sindhi | |||
U+068E | ڎ | Arabic Letter Dul older shape for DUL, now obsolete in Sindhi Burushaski | |||
U+068F | ڏ | Arabic Letter Dal With Three Dots Above Downwards Sindhi current shape used for DUL | |||
U+0690 | ڐ | Arabic Letter Dal With Four Dots Above Old Urdu, not in current use | |||
U+0691 | ڑ | Arabic Letter Rreh Urdu | |||
U+0692 | ڒ | Arabic Letter Reh With Small V Kurdish | |||
U+0693 | ړ | Arabic Letter Reh With Ring Pashto | |||
U+0694 | ڔ | Arabic Letter Reh With Dot Below Kurdish, early Persian | |||
U+0695 | ڕ | Arabic Letter Reh With Small V Below Kurdish | |||
U+0696 | ږ | Arabic Letter Reh With Dot Below And Dot Above Pashto | |||
U+0697 | ڗ | Arabic Letter Reh With Two Dots Above Dargwa | |||
U+0698 | ژ | Arabic Letter Jeh Persian, Urdu, ... | |||
U+0699 | ڙ | Arabic Letter Reh With Four Dots Above Sindhi | |||
U+069A | ښ | Arabic Letter Seen With Dot Below And Dot Above Pashto | |||
U+069B | ڛ | Arabic Letter Seen With Three Dots Below early Persian | |||
U+069C | ڜ | Arabic Letter Seen With Three Dots Below And Three Dots Above Moroccan Arabic | |||
U+069D | ڝ | Arabic Letter Sad With Two Dots Below Turkic | |||
U+069E | ڞ | Arabic Letter Sad With Three Dots Above Berber, Burushaski | |||
U+069F | ڟ | Arabic Letter Tah With Three Dots Above Old Hausa | |||
U+06A0 | ڠ | Arabic Letter Ain With Three Dots Above Jawi | |||
U+06A1 | ڡ | Arabic Letter Dotless Feh Adighe | |||
U+06A2 | ڢ | Arabic Letter Feh With Dot Moved Below Maghrib Arabic | |||
U+06A3 | ڣ | Arabic Letter Feh With Dot Below Ingush | |||
U+06A4 | ڤ | Arabic Letter Veh Middle Eastern Arabic for foreign words Kurdish, Khwarazmian, early Persian, Jawi | |||
U+06A5 | ڥ | Arabic Letter Feh With Three Dots Below North African Arabic for foreign words | |||
U+06A6 | ڦ | Arabic Letter Peheh Sindhi | |||
U+06A7 | ڧ | Arabic Letter Qaf With Dot Above Maghrib Arabic, Uyghur | |||
U+06A8 | ڨ | Arabic Letter Qaf With Three Dots Above Tunisian and Algerian Arabic | |||
U+06A9 | ک | Arabic Letter Keheh Persian, Urdu, Sindhi, ...= kaf mashkula | |||
U+06AA | ڪ | Arabic Letter Swash Kaf represents a letter distinct from Arabic KAF (0643) in Sindhi | |||
U+06AB | ګ | Arabic Letter Kaf With Ring Pashto may appear like an Arabic KAF (0643) with a ring below the base | |||
U+06AC | ڬ | Arabic Letter Kaf With Dot Above use for the Jawi gaf is not recommended, although it may be found in some existing text data; recommended character for Jawi gaf is 0762 → U+0762 ݢ Arabic Letter Keheh With Dot Above | |||
U+06AD | ڭ | Arabic Letter Ng Uyghur, Kazakh, Moroccan Arabic, early Jawi, early Persian, ... | |||
U+06AE | ڮ | Arabic Letter Kaf With Three Dots Below Berber, early Persian Pegon alternative for 08B4 | |||
U+06AF | گ | Arabic Letter Gaf Persian, Urdu, ... | |||
U+06B0 | ڰ | Arabic Letter Gaf With Ring Lahnda | |||
U+06B1 | ڱ | Arabic Letter Ngoeh Sindhi | |||
U+06B2 | ڲ | Arabic Letter Gaf With Two Dots Below not used in Sindhi | |||
U+06B3 | ڳ | Arabic Letter Gueh Sindhi, Saraiki | |||
U+06B4 | ڴ | Arabic Letter Gaf With Three Dots Above not used in Sindhi, Karakalpak | |||
U+06B5 | ڵ | Arabic Letter Lam With Small V Kurdish, historically Bosnian | |||
U+06B6 | ڶ | Arabic Letter Lam With Dot Above Kurdish | |||
U+06B7 | ڷ | Arabic Letter Lam With Three Dots Above Kurdish | |||
U+06B8 | ڸ | Arabic Letter Lam With Three Dots Below Avar, Soqotri | |||
U+06B9 | ڹ | Arabic Letter Noon With Dot Below | |||
U+06BA | ں | Arabic Letter Noon Ghunna Urdu, archaic Arabic dotless in all four contextual forms | |||
U+06BB | ڻ | Arabic Letter Rnoon dotless in all four contextual forms Sindhi | |||
U+06BC | ڼ | Arabic Letter Noon With Ring Pashto | |||
U+06BD | ڽ | Arabic Letter Noon With Three Dots Above Jawi | |||
U+06BE | ھ | Arabic Letter Heh Doachashmee forms aspirate digraphs in Urdu and other languages of South Asia represents the glottal fricative /h/ in Uyghur | |||
U+06BF | ڿ | Arabic Letter Tcheh With Dot Above | |||
U+06C0 | ۀ | Arabic Letter Heh With Yeh Above for ezafe, use 0654 over the language-appropriate base letter actually a ligature, not an independent letter arabic letter hamzah on ha (1.0) ≡ ۀ U+06D5 U+0654 | |||
U+06C1 | ہ | Arabic Letter Heh Goal Urdu | |||
U+06C2 | ۂ | Arabic Letter Heh Goal With Hamza Above Urdu actually a ligature, not an independent letter ≡ ۂ U+06C1 U+0654 | |||
U+06C3 | ۃ | Arabic Letter Teh Marbuta Goal Urdu | |||
U+06C4 | ۄ | Arabic Letter Waw With Ring Kashmiri | |||
U+06C5 | ۅ | Arabic Letter Kirghiz Oe Kyrgyz a glyph variant occurs which replaces the looped tail with a horizontal bar through the tail | |||
U+06C6 | ۆ | Arabic Letter Oe Uyghur, Kurdish, Kazakh, Azerbaijani, historically Bosnian | |||
U+06C7 | ۇ | Arabic Letter U Azerbaijani, Kazakh, Kyrgyz, Uyghur | |||
U+06C8 | ۈ | Arabic Letter Yu Uyghur | |||
U+06C9 | ۉ | Arabic Letter Kirghiz Yu Kazakh, Kyrgyz, historically Bosnian | |||
U+06CA | ۊ | Arabic Letter Waw With Two Dots Above Kurdish | |||
U+06CB | ۋ | Arabic Letter Ve Uyghur, Kazakh | |||
U+06CC | ی | Arabic Letter Farsi Yeh Arabic, Persian, Urdu, Kashmiri, ... initial and medial forms of this letter have dots → U+0649 ى Arabic Letter Alef Maksura → U+064A ي Arabic Letter Yeh | |||
U+06CD | ۍ | Arabic Letter Yeh With Tail Pashto, Sindhi | |||
U+06CE | ێ | Arabic Letter Yeh With Small V Kurdish | |||
U+06CF | ۏ | Arabic Letter Waw With Dot Above Jawi | |||
U+06D0 | ې | Arabic Letter E Pashto, Uyghur used as the letter bbeh in Sindhi | |||
U+06D1 | ۑ | Arabic Letter Yeh With Three Dots Below Mende languages, Hausa | |||
U+06D2 | ے | Arabic Letter Yeh Barree Urdu | |||
U+06D3 | ۓ | Arabic Letter Yeh Barree With Hamza Above Urdu | |||
U+06D4 | ۔ | Arabic Full Stop Urdu | |||
U+06D5 | ە | Arabic Letter Ae Uyghur, Kazakh, Kyrgyz | |||
U+06D6 | ۖ | Arabic Small High Ligature Sad With Lam With Alef Maksura | |||
U+06D7 | ۗ | Arabic Small High Ligature Qaf With Lam With Alef Maksura | |||
U+06D8 | ۘ | Arabic Small High Meem Initial Form | |||
U+06D9 | ۙ | Arabic Small High Lam Alef | |||
U+06DA | ۚ | Arabic Small High Jeem | |||
U+06DB | ۛ | Arabic Small High Three Dots | |||
U+06DC | ۜ | Arabic Small High Seen | |||
U+06DD | | Arabic End of Ayah | |||
U+06DE | ۞ | Arabic Star of Rub El Hizb | |||
U+06DF | ۟ | Arabic Small High Rounded Zero smaller than the typical circular shape used for 0652 | |||
U+06E0 | ۠ | Arabic Small High Upright Rectangular Zero the term "rectangular zero" is a translation of the Arabic name of this sign | |||
U+06E1 | ۡ | Arabic Small High Dotless Head Of Khah presentation form of 0652, using font technology to select the variant is preferred used in some Qurans to mark absence of a vowel= Arabic jazm → U+0652 ْ Arabic Sukun | |||
U+06E2 | ۢ | Arabic Small High Meem Isolated Form | |||
U+06E3 | ۣ | Arabic Small Low Seen | |||
U+06E4 | ۤ | Arabic Small High Madda typically used with 06E5, 06E6, 06E7, and 08F3 | |||
U+06E5 | ۥ | Arabic Small Waw → U+08D3 ࣓ Arabic Small Low Waw → U+08F3 ࣳ Arabic Small High Waw | |||
U+06E6 | ۦ | Arabic Small Yeh | |||
U+06E7 | ۧ | Arabic Small High Yeh | |||
U+06E8 | ۨ | Arabic Small High Noon | |||
U+06E9 | ۩ | Arabic Place Of Sajdah there is a range of acceptable glyphs for this character | |||
U+06EA | ۪ | Arabic Empty Centre Low Stop | |||
U+06EB | ۫ | Arabic Empty Centre High Stop | |||
U+06EC | ۬ | Arabic Rounded High Stop With Filled Centre also used in Quranic text in African and other orthographies to represent wasla, ikhtilas, etc. | |||
U+06ED | ۭ | Arabic Small Low Meem | |||
U+06EE | ۮ | Arabic Letter Dal With Inverted V | |||
U+06EF | ۯ | Arabic Letter Reh With Inverted V also used in early Persian | |||
U+06F0 | ۰ | Extended Arabic-Indic Digit Zero | |||
U+06F1 | ۱ | Extended Arabic-Indic Digit One | |||
U+06F2 | ۲ | Extended Arabic-Indic Digit Two | |||
U+06F3 | ۳ | Extended Arabic-Indic Digit Three | |||
U+06F4 | ۴ | Extended Arabic-Indic Digit Four Persian has a different glyph than Sindhi and Urdu | |||
U+06F5 | ۵ | Extended Arabic-Indic Digit Five Persian, Sindhi, and Urdu share glyph different from Arabic | |||
U+06F6 | ۶ | Extended Arabic-Indic Digit Six Persian, Sindhi, and Urdu have glyphs different from Arabic | |||
U+06F7 | ۷ | Extended Arabic-Indic Digit Seven Urdu and Sindhi have glyphs different from Arabic | |||
U+06F8 | ۸ | Extended Arabic-Indic Digit Eight | |||
U+06F9 | ۹ | Extended Arabic-Indic Digit Nine | |||
U+06FA | ۺ | Arabic Letter Sheen With Dot Below | |||
U+06FB | ۻ | Arabic Letter Dad With Dot Below | |||
U+06FC | ۼ | Arabic Letter Ghain With Dot Below | |||
U+06FD | ۽ | Arabic Sign Sindhi Ampersand | |||
U+06FE | ۾ | Arabic Sign Sindhi Postposition Men | |||
U+06FF | ۿ | Arabic Letter Heh With Inverted V |
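The "≡" entries in the table above mark canonical equivalences: the precomposed letter and the base-plus-combining-mark sequence are treated as the same text under Unicode normalization. A minimal Python sketch using the standard `unicodedata` module (the helper name is hypothetical):

```python
import unicodedata


def canonically_equal(a, b):
    """True if two strings are identical after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    alef_madda = "\u0622"       # ARABIC LETTER ALEF WITH MADDA ABOVE
    sequence = "\u0627\u0653"   # ALEF followed by MADDAH ABOVE
    print(canonically_equal(alef_madda, sequence))
    # NFD splits the precomposed letter into base letter plus combining mark.
    decomposed = unicodedata.normalize("NFD", alef_madda)
    print(" ".join(f"U+{ord(c):04X}" for c in decomposed))
```

Comparing or searching Arabic text is therefore usually more reliable after normalizing both sides to the same form.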
Arabic [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+060x | | | | | | | ؆ | ؇ | ؈ | ؉ | ؊ | ؋ | ، | ؍ | ؎ | ؏ |
U+061x | ؐ | ؑ | ؒ | ؓ | ؔ | ؕ | ؖ | ؗ | ؘ | ؙ | ؚ | ؛ | ALM | ؝ | ؞ | ؟ |
U+062x | ؠ | ء | آ | أ | ؤ | إ | ئ | ا | ب | ة | ت | ث | ج | ح | خ | د |
U+063x | ذ | ر | ز | س | ش | ص | ض | ط | ظ | ع | غ | ػ | ؼ | ؽ | ؾ | ؿ |
U+064x | ـ | ف | ق | ك | ل | م | ن | ه | و | ى | ي | ً | ٌ | ٍ | َ | ُ |
U+065x | ِ | ّ | ْ | ٓ | ٔ | ٕ | ٖ | ٗ | ٘ | ٙ | ٚ | ٛ | ٜ | ٝ | ٞ | ٟ |
U+066x | ٠ | ١ | ٢ | ٣ | ٤ | ٥ | ٦ | ٧ | ٨ | ٩ | ٪ | ٫ | ٬ | ٭ | ٮ | ٯ |
U+067x | ٰ | ٱ | ٲ | ٳ | ٴ | ٵ | ٶ | ٷ | ٸ | ٹ | ٺ | ٻ | ټ | ٽ | پ | ٿ |
U+068x | ڀ | ځ | ڂ | ڃ | ڄ | څ | چ | ڇ | ڈ | ډ | ڊ | ڋ | ڌ | ڍ | ڎ | ڏ |
U+069x | ڐ | ڑ | ڒ | ړ | ڔ | ڕ | ږ | ڗ | ژ | ڙ | ښ | ڛ | ڜ | ڝ | ڞ | ڟ |
U+06Ax | ڠ | ڡ | ڢ | ڣ | ڤ | ڥ | ڦ | ڧ | ڨ | ک | ڪ | ګ | ڬ | ڭ | ڮ | گ |
U+06Bx | ڰ | ڱ | ڲ | ڳ | ڴ | ڵ | ڶ | ڷ | ڸ | ڹ | ں | ڻ | ڼ | ڽ | ھ | ڿ |
U+06Cx | ۀ | ہ | ۂ | ۃ | ۄ | ۅ | ۆ | ۇ | ۈ | ۉ | ۊ | ۋ | ی | ۍ | ێ | ۏ |
U+06Dx | ې | ۑ | ے | ۓ | ۔ | ە | ۖ | ۗ | ۘ | ۙ | ۚ | ۛ | ۜ | | ۞ | ۟ |
U+06Ex | ۠ | ۡ | ۢ | ۣ | ۤ | ۥ | ۦ | ۧ | ۨ | ۩ | ۪ | ۫ | ۬ | ۭ | ۮ | ۯ |
U+06Fx | ۰ | ۱ | ۲ | ۳ | ۴ | ۵ | ۶ | ۷ | ۸ | ۹ | ۺ | ۻ | ۼ | ۽ | ۾ | ۿ |
Arabic Supplement [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+075x | ݐ | ݑ | ݒ | ݓ | ݔ | ݕ | ݖ | ݗ | ݘ | ݙ | ݚ | ݛ | ݜ | ݝ | ݞ | ݟ |
U+076x | ݠ | ݡ | ݢ | ݣ | ݤ | ݥ | ݦ | ݧ | ݨ | ݩ | ݪ | ݫ | ݬ | ݭ | ݮ | ݯ |
U+077x | ݰ | ݱ | ݲ | ݳ | ݴ | ݵ | ݶ | ݷ | ݸ | ݹ | ݺ | ݻ | ݼ | ݽ | ݾ | ݿ |
Arabic Extended-B [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+087x | ࡰ | ࡱ | ࡲ | ࡳ | ࡴ | ࡵ | ࡶ | ࡷ | ࡸ | ࡹ | ࡺ | ࡻ | ࡼ | ࡽ | ࡾ | ࡿ |
U+088x | ࢀ | ࢁ | ࢂ | ࢃ | ࢄ | ࢅ | ࢆ | ࢇ | ࢈ | ࢉ | ࢊ | ࢋ | ࢌ | ࢍ | ࢎ | |
U+089x | | | ࢘ | ࢙ | ࢚ | ࢛ | ࢜ | ࢝ | ࢞ | ࢟ | ||||||
Arabic Extended-A [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+08Ax | ࢠ | ࢡ | ࢢ | ࢣ | ࢤ | ࢥ | ࢦ | ࢧ | ࢨ | ࢩ | ࢪ | ࢫ | ࢬ | ࢭ | ࢮ | ࢯ |
U+08Bx | ࢰ | ࢱ | ࢲ | ࢳ | ࢴ | ࢵ | ࢶ | ࢷ | ࢸ | ࢹ | ࢺ | ࢻ | ࢼ | ࢽ | ࢾ | ࢿ |
U+08Cx | ࣀ | ࣁ | ࣂ | ࣃ | ࣄ | ࣅ | ࣆ | ࣇ | ࣈ | ࣉ | ࣊ | ࣋ | ࣌ | ࣍ | ࣎ | ࣏ |
U+08Dx | ࣐ | ࣑ | ࣒ | ࣓ | ࣔ | ࣕ | ࣖ | ࣗ | ࣘ | ࣙ | ࣚ | ࣛ | ࣜ | ࣝ | ࣞ | ࣟ |
U+08Ex | ࣠ | ࣡ | | ࣣ | ࣤ | ࣥ | ࣦ | ࣧ | ࣨ | ࣩ | ࣪ | ࣫ | ࣬ | ࣭ | ࣮ | ࣯ |
U+08Fx | ࣰ | ࣱ | ࣲ | ࣳ | ࣴ | ࣵ | ࣶ | ࣷ | ࣸ | ࣹ | ࣺ | ࣻ | ࣼ | ࣽ | ࣾ | ࣿ |
The characters in Arabic Presentation Forms-A are mostly ligatures that can be created from the characters in the preceding charts, with the exception of the bracket-like graphemes ﴾ ﴿. Some of them are ligatures of common liturgical phrases.
Arabic Presentation Forms-A [1] [2] [3] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+FB5x | ﭐ | ﭑ | ﭒ | ﭓ | ﭔ | ﭕ | ﭖ | ﭗ | ﭘ | ﭙ | ﭚ | ﭛ | ﭜ | ﭝ | ﭞ | ﭟ |
U+FB6x | ﭠ | ﭡ | ﭢ | ﭣ | ﭤ | ﭥ | ﭦ | ﭧ | ﭨ | ﭩ | ﭪ | ﭫ | ﭬ | ﭭ | ﭮ | ﭯ |
U+FB7x | ﭰ | ﭱ | ﭲ | ﭳ | ﭴ | ﭵ | ﭶ | ﭷ | ﭸ | ﭹ | ﭺ | ﭻ | ﭼ | ﭽ | ﭾ | ﭿ |
U+FB8x | ﮀ | ﮁ | ﮂ | ﮃ | ﮄ | ﮅ | ﮆ | ﮇ | ﮈ | ﮉ | ﮊ | ﮋ | ﮌ | ﮍ | ﮎ | ﮏ |
U+FB9x | ﮐ | ﮑ | ﮒ | ﮓ | ﮔ | ﮕ | ﮖ | ﮗ | ﮘ | ﮙ | ﮚ | ﮛ | ﮜ | ﮝ | ﮞ | ﮟ |
U+FBAx | ﮠ | ﮡ | ﮢ | ﮣ | ﮤ | ﮥ | ﮦ | ﮧ | ﮨ | ﮩ | ﮪ | ﮫ | ﮬ | ﮭ | ﮮ | ﮯ |
U+FBBx | ﮰ | ﮱ | ﮲ | ﮳ | ﮴ | ﮵ | ﮶ | ﮷ | ﮸ | ﮹ | ﮺ | ﮻ | ﮼ | ﮽ | ﮾ | ﮿ |
U+FBCx | ﯀ | ﯁ | ﯂ | |||||||||||||
U+FBDx | ﯓ | ﯔ | ﯕ | ﯖ | ﯗ | ﯘ | ﯙ | ﯚ | ﯛ | ﯜ | ﯝ | ﯞ | ﯟ | |||
U+FBEx | ﯠ | ﯡ | ﯢ | ﯣ | ﯤ | ﯥ | ﯦ | ﯧ | ﯨ | ﯩ | ﯪ | ﯫ | ﯬ | ﯭ | ﯮ | ﯯ |
U+FBFx | ﯰ | ﯱ | ﯲ | ﯳ | ﯴ | ﯵ | ﯶ | ﯷ | ﯸ | ﯹ | ﯺ | ﯻ | ﯼ | ﯽ | ﯾ | ﯿ |
U+FC0x | ﰀ | ﰁ | ﰂ | ﰃ | ﰄ | ﰅ | ﰆ | ﰇ | ﰈ | ﰉ | ﰊ | ﰋ | ﰌ | ﰍ | ﰎ | ﰏ |
U+FC1x | ﰐ | ﰑ | ﰒ | ﰓ | ﰔ | ﰕ | ﰖ | ﰗ | ﰘ | ﰙ | ﰚ | ﰛ | ﰜ | ﰝ | ﰞ | ﰟ |
U+FC2x | ﰠ | ﰡ | ﰢ | ﰣ | ﰤ | ﰥ | ﰦ | ﰧ | ﰨ | ﰩ | ﰪ | ﰫ | ﰬ | ﰭ | ﰮ | ﰯ |
U+FC3x | ﰰ | ﰱ | ﰲ | ﰳ | ﰴ | ﰵ | ﰶ | ﰷ | ﰸ | ﰹ | ﰺ | ﰻ | ﰼ | ﰽ | ﰾ | ﰿ |
U+FC4x | ﱀ | ﱁ | ﱂ | ﱃ | ﱄ | ﱅ | ﱆ | ﱇ | ﱈ | ﱉ | ﱊ | ﱋ | ﱌ | ﱍ | ﱎ | ﱏ |
U+FC5x | ﱐ | ﱑ | ﱒ | ﱓ | ﱔ | ﱕ | ﱖ | ﱗ | ﱘ | ﱙ | ﱚ | ﱛ | ﱜ | ﱝ | ﱞ | ﱟ |
U+FC6x | ﱠ | ﱡ | ﱢ | ﱣ | ﱤ | ﱥ | ﱦ | ﱧ | ﱨ | ﱩ | ﱪ | ﱫ | ﱬ | ﱭ | ﱮ | ﱯ |
U+FC7x | ﱰ | ﱱ | ﱲ | ﱳ | ﱴ | ﱵ | ﱶ | ﱷ | ﱸ | ﱹ | ﱺ | ﱻ | ﱼ | ﱽ | ﱾ | ﱿ |
U+FC8x | ﲀ | ﲁ | ﲂ | ﲃ | ﲄ | ﲅ | ﲆ | ﲇ | ﲈ | ﲉ | ﲊ | ﲋ | ﲌ | ﲍ | ﲎ | ﲏ |
U+FC9x | ﲐ | ﲑ | ﲒ | ﲓ | ﲔ | ﲕ | ﲖ | ﲗ | ﲘ | ﲙ | ﲚ | ﲛ | ﲜ | ﲝ | ﲞ | ﲟ |
U+FCAx | ﲠ | ﲡ | ﲢ | ﲣ | ﲤ | ﲥ | ﲦ | ﲧ | ﲨ | ﲩ | ﲪ | ﲫ | ﲬ | ﲭ | ﲮ | ﲯ |
U+FCBx | ﲰ | ﲱ | ﲲ | ﲳ | ﲴ | ﲵ | ﲶ | ﲷ | ﲸ | ﲹ | ﲺ | ﲻ | ﲼ | ﲽ | ﲾ | ﲿ |
U+FCCx | ﳀ | ﳁ | ﳂ | ﳃ | ﳄ | ﳅ | ﳆ | ﳇ | ﳈ | ﳉ | ﳊ | ﳋ | ﳌ | ﳍ | ﳎ | ﳏ |
U+FCDx | ﳐ | ﳑ | ﳒ | ﳓ | ﳔ | ﳕ | ﳖ | ﳗ | ﳘ | ﳙ | ﳚ | ﳛ | ﳜ | ﳝ | ﳞ | ﳟ |
U+FCEx | ﳠ | ﳡ | ﳢ | ﳣ | ﳤ | ﳥ | ﳦ | ﳧ | ﳨ | ﳩ | ﳪ | ﳫ | ﳬ | ﳭ | ﳮ | ﳯ |
U+FCFx | ﳰ | ﳱ | ﳲ | ﳳ | ﳴ | ﳵ | ﳶ | ﳷ | ﳸ | ﳹ | ﳺ | ﳻ | ﳼ | ﳽ | ﳾ | ﳿ |
U+FD0x | ﴀ | ﴁ | ﴂ | ﴃ | ﴄ | ﴅ | ﴆ | ﴇ | ﴈ | ﴉ | ﴊ | ﴋ | ﴌ | ﴍ | ﴎ | ﴏ |
U+FD1x | ﴐ | ﴑ | ﴒ | ﴓ | ﴔ | ﴕ | ﴖ | ﴗ | ﴘ | ﴙ | ﴚ | ﴛ | ﴜ | ﴝ | ﴞ | ﴟ |
U+FD2x | ﴠ | ﴡ | ﴢ | ﴣ | ﴤ | ﴥ | ﴦ | ﴧ | ﴨ | ﴩ | ﴪ | ﴫ | ﴬ | ﴭ | ﴮ | ﴯ |
U+FD3x | ﴰ | ﴱ | ﴲ | ﴳ | ﴴ | ﴵ | ﴶ | ﴷ | ﴸ | ﴹ | ﴺ | ﴻ | ﴼ | ﴽ | ﴾ | ﴿ |
U+FD4x | ﵀ | ﵁ | ﵂ | ﵃ | ﵄ | ﵅ | ﵆ | ﵇ | ﵈ | ﵉ | ﵊ | ﵋ | ﵌ | ﵍ | ﵎ | ﵏ |
U+FD5x | ﵐ | ﵑ | ﵒ | ﵓ | ﵔ | ﵕ | ﵖ | ﵗ | ﵘ | ﵙ | ﵚ | ﵛ | ﵜ | ﵝ | ﵞ | ﵟ |
U+FD6x | ﵠ | ﵡ | ﵢ | ﵣ | ﵤ | ﵥ | ﵦ | ﵧ | ﵨ | ﵩ | ﵪ | ﵫ | ﵬ | ﵭ | ﵮ | ﵯ |
U+FD7x | ﵰ | ﵱ | ﵲ | ﵳ | ﵴ | ﵵ | ﵶ | ﵷ | ﵸ | ﵹ | ﵺ | ﵻ | ﵼ | ﵽ | ﵾ | ﵿ |
U+FD8x | ﶀ | ﶁ | ﶂ | ﶃ | ﶄ | ﶅ | ﶆ | ﶇ | ﶈ | ﶉ | ﶊ | ﶋ | ﶌ | ﶍ | ﶎ | ﶏ |
U+FD9x | ﶒ | ﶓ | ﶔ | ﶕ | ﶖ | ﶗ | ﶘ | ﶙ | ﶚ | ﶛ | ﶜ | ﶝ | ﶞ | ﶟ | ||
U+FDAx | ﶠ | ﶡ | ﶢ | ﶣ | ﶤ | ﶥ | ﶦ | ﶧ | ﶨ | ﶩ | ﶪ | ﶫ | ﶬ | ﶭ | ﶮ | ﶯ |
U+FDBx | ﶰ | ﶱ | ﶲ | ﶳ | ﶴ | ﶵ | ﶶ | ﶷ | ﶸ | ﶹ | ﶺ | ﶻ | ﶼ | ﶽ | ﶾ | ﶿ |
U+FDCx | ﷀ | ﷁ | ﷂ | ﷃ | ﷄ | ﷅ | ﷆ | ﷇ | ﷏ | |||||||
U+FDDx | ||||||||||||||||
U+FDEx | ||||||||||||||||
U+FDFx | ﷰ | ﷱ | ﷲ | ﷳ | ﷴ | ﷵ | ﷶ | ﷷ | ﷸ | ﷹ | ﷺ | ﷻ | ﷼ | ﷽ | ﷾ | ﷿ |
The characters in Arabic Presentation Forms-B can all be created from the basic chart's characters.
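The relationship between a presentation form and its basic letters is recorded in the Unicode Character Database as a tagged compatibility decomposition. The Python sketch below reads that data through the standard `unicodedata.decomposition` function; the helper name and return shape are illustrative choices, not an established API.

```python
import unicodedata


def describe_presentation_form(ch):
    """Return (tag, base_text) for a compatibility character, or None.

    `tag` is the decomposition tag without angle brackets, such as
    'isolated', 'final', 'medial' or 'initial'.
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields or not fields[0].startswith("<"):
        # No decomposition, or a canonical (untagged) one.
        return None
    tag = fields[0].strip("<>")
    base = "".join(chr(int(cp, 16)) for cp in fields[1:])
    return tag, base


if __name__ == "__main__":
    for cp in (0xFE8F, 0xFE90, 0xFE91, 0xFE92, 0xFEFB):
        tag, base = describe_presentation_form(chr(cp))
        print(f"U+{cp:04X}", tag, " ".join(f"U+{ord(c):04X}" for c in base))
```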
Arabic Presentation Forms-B [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+FE7x | ﹰ | ﹱ | ﹲ | ﹳ | ﹴ | ﹶ | ﹷ | ﹸ | ﹹ | ﹺ | ﹻ | ﹼ | ﹽ | ﹾ | ﹿ | |
U+FE8x | ﺀ | ﺁ | ﺂ | ﺃ | ﺄ | ﺅ | ﺆ | ﺇ | ﺈ | ﺉ | ﺊ | ﺋ | ﺌ | ﺍ | ﺎ | ﺏ |
U+FE9x | ﺐ | ﺑ | ﺒ | ﺓ | ﺔ | ﺕ | ﺖ | ﺗ | ﺘ | ﺙ | ﺚ | ﺛ | ﺜ | ﺝ | ﺞ | ﺟ |
U+FEAx | ﺠ | ﺡ | ﺢ | ﺣ | ﺤ | ﺥ | ﺦ | ﺧ | ﺨ | ﺩ | ﺪ | ﺫ | ﺬ | ﺭ | ﺮ | ﺯ |
U+FEBx | ﺰ | ﺱ | ﺲ | ﺳ | ﺴ | ﺵ | ﺶ | ﺷ | ﺸ | ﺹ | ﺺ | ﺻ | ﺼ | ﺽ | ﺾ | ﺿ |
U+FECx | ﻀ | ﻁ | ﻂ | ﻃ | ﻄ | ﻅ | ﻆ | ﻇ | ﻈ | ﻉ | ﻊ | ﻋ | ﻌ | ﻍ | ﻎ | ﻏ |
U+FEDx | ﻐ | ﻑ | ﻒ | ﻓ | ﻔ | ﻕ | ﻖ | ﻗ | ﻘ | ﻙ | ﻚ | ﻛ | ﻜ | ﻝ | ﻞ | ﻟ |
U+FEEx | ﻠ | ﻡ | ﻢ | ﻣ | ﻤ | ﻥ | ﻦ | ﻧ | ﻨ | ﻩ | ﻪ | ﻫ | ﻬ | ﻭ | ﻮ | ﻯ |
U+FEFx | ﻰ | ﻱ | ﻲ | ﻳ | ﻴ | ﻵ | ﻶ | ﻷ | ﻸ | ﻹ | ﻺ | ﻻ | ﻼ | ZW NBSP | ||
Rumi Numeral Symbols [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+10E6x | 𐹠 | 𐹡 | 𐹢 | 𐹣 | 𐹤 | 𐹥 | 𐹦 | 𐹧 | 𐹨 | 𐹩 | 𐹪 | 𐹫 | 𐹬 | 𐹭 | 𐹮 | 𐹯 |
U+10E7x | 𐹰 | 𐹱 | 𐹲 | 𐹳 | 𐹴 | 𐹵 | 𐹶 | 𐹷 | 𐹸 | 𐹹 | 𐹺 | 𐹻 | 𐹼 | 𐹽 | 𐹾 | |
Arabic Extended-C [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+10ECx | ||||||||||||||||
U+10EDx | ||||||||||||||||
U+10EEx | ||||||||||||||||
U+10EFx | 𐻽 | 𐻾 | 𐻿 | |||||||||||||
Indic Siyaq Numbers [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1EC7x | 𞱱 | 𞱲 | 𞱳 | 𞱴 | 𞱵 | 𞱶 | 𞱷 | 𞱸 | 𞱹 | 𞱺 | 𞱻 | 𞱼 | 𞱽 | 𞱾 | 𞱿 | |
U+1EC8x | 𞲀 | 𞲁 | 𞲂 | 𞲃 | 𞲄 | 𞲅 | 𞲆 | 𞲇 | 𞲈 | 𞲉 | 𞲊 | 𞲋 | 𞲌 | 𞲍 | 𞲎 | 𞲏 |
U+1EC9x | 𞲐 | 𞲑 | 𞲒 | 𞲓 | 𞲔 | 𞲕 | 𞲖 | 𞲗 | 𞲘 | 𞲙 | 𞲚 | 𞲛 | 𞲜 | 𞲝 | 𞲞 | 𞲟 |
U+1ECAx | 𞲠 | 𞲡 | 𞲢 | 𞲣 | 𞲤 | 𞲥 | 𞲦 | 𞲧 | 𞲨 | 𞲩 | 𞲪 | 𞲫 | 𞲬 | 𞲭 | 𞲮 | 𞲯 |
U+1ECBx | 𞲰 | 𞲱 | 𞲲 | 𞲳 | 𞲴 | |||||||||||
Ottoman Siyaq Numbers [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1ED0x | 𞴁 | 𞴂 | 𞴃 | 𞴄 | 𞴅 | 𞴆 | 𞴇 | 𞴈 | 𞴉 | 𞴊 | 𞴋 | 𞴌 | 𞴍 | 𞴎 | 𞴏 | |
U+1ED1x | 𞴐 | 𞴑 | 𞴒 | 𞴓 | 𞴔 | 𞴕 | 𞴖 | 𞴗 | 𞴘 | 𞴙 | 𞴚 | 𞴛 | 𞴜 | 𞴝 | 𞴞 | 𞴟 |
U+1ED2x | 𞴠 | 𞴡 | 𞴢 | 𞴣 | 𞴤 | 𞴥 | 𞴦 | 𞴧 | 𞴨 | 𞴩 | 𞴪 | 𞴫 | 𞴬 | 𞴭 | 𞴮 | 𞴯 |
U+1ED3x | 𞴰 | 𞴱 | 𞴲 | 𞴳 | 𞴴 | 𞴵 | 𞴶 | 𞴷 | 𞴸 | 𞴹 | 𞴺 | 𞴻 | 𞴼 | 𞴽 | ||
U+1ED4x | ||||||||||||||||
Arabic Mathematical Alphabetic Symbols [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1EE0x | 𞸀 | 𞸁 | 𞸂 | 𞸃 | 𞸅 | 𞸆 | 𞸇 | 𞸈 | 𞸉 | 𞸊 | 𞸋 | 𞸌 | 𞸍 | 𞸎 | 𞸏 | |
U+1EE1x | 𞸐 | 𞸑 | 𞸒 | 𞸓 | 𞸔 | 𞸕 | 𞸖 | 𞸗 | 𞸘 | 𞸙 | 𞸚 | 𞸛 | 𞸜 | 𞸝 | 𞸞 | 𞸟 |
U+1EE2x | 𞸡 | 𞸢 | 𞸤 | 𞸧 | 𞸩 | 𞸪 | 𞸫 | 𞸬 | 𞸭 | 𞸮 | 𞸯 | |||||
U+1EE3x | 𞸰 | 𞸱 | 𞸲 | 𞸴 | 𞸵 | 𞸶 | 𞸷 | 𞸹 | 𞸻 | |||||||
U+1EE4x | 𞹂 | 𞹇 | 𞹉 | 𞹋 | 𞹍 | 𞹎 | 𞹏 | |||||||||
U+1EE5x | 𞹑 | 𞹒 | 𞹔 | 𞹗 | 𞹙 | 𞹛 | 𞹝 | 𞹟 | ||||||||
U+1EE6x | 𞹡 | 𞹢 | 𞹤 | 𞹧 | 𞹨 | 𞹩 | 𞹪 | 𞹬 | 𞹭 | 𞹮 | 𞹯 | |||||
U+1EE7x | 𞹰 | 𞹱 | 𞹲 | 𞹴 | 𞹵 | 𞹶 | 𞹷 | 𞹹 | 𞹺 | 𞹻 | 𞹼 | 𞹾 | ||||
U+1EE8x | 𞺀 | 𞺁 | 𞺂 | 𞺃 | 𞺄 | 𞺅 | 𞺆 | 𞺇 | 𞺈 | 𞺉 | 𞺋 | 𞺌 | 𞺍 | 𞺎 | 𞺏 | |
U+1EE9x | 𞺐 | 𞺑 | 𞺒 | 𞺓 | 𞺔 | 𞺕 | 𞺖 | 𞺗 | 𞺘 | 𞺙 | 𞺚 | 𞺛 | ||||
U+1EEAx | 𞺡 | 𞺢 | 𞺣 | 𞺥 | 𞺦 | 𞺧 | 𞺨 | 𞺩 | 𞺫 | 𞺬 | 𞺭 | 𞺮 | 𞺯 | |||
U+1EEBx | 𞺰 | 𞺱 | 𞺲 | 𞺳 | 𞺴 | 𞺵 | 𞺶 | 𞺷 | 𞺸 | 𞺹 | 𞺺 | 𞺻 | ||||
U+1EECx | ||||||||||||||||
U+1EEDx | ||||||||||||||||
U+1EEEx | ||||||||||||||||
U+1EEFx | 𞻰 | 𞻱 | ||||||||||||||
Notes |
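The charts above follow the usual code-chart layout: the row label gives every hex digit of the code point except the last, and the column gives the last hex digit (0 to F). Where a row shows fewer than sixteen characters, the unassigned positions are simply left out, so a character's column is best recovered from its code point rather than from its position in the row. The following is a minimal Python sketch of that arithmetic; the function names `chart_cell` and `chart_row` are hypothetical and chosen only for illustration.

```python
def chart_cell(ch):
    """Return (row label, column digit) for one character in a code chart."""
    cp = ord(ch)
    row = f"U+{cp >> 4:03X}x"  # every hex digit except the last, e.g. U+1EE0x
    col = f"{cp & 0xF:X}"      # the last hex digit, 0 to F
    return row, col


def chart_row(chars):
    """Place characters into 16 cells by column, leaving gaps empty.

    Assumes that all characters in `chars` belong to the same chart row.
    """
    cells = [""] * 16
    for ch in chars:
        cells[ord(ch) & 0xF] = ch
    return cells


# U+1EE03 and U+1EE05 are neighbours in the listing, but column 4 lies between them.
print(chart_cell("\U0001EE05"))
print(chart_row("\U0001EE03\U0001EE05"))
```

Working from the code point keeps the columns aligned even where a block has unassigned positions inside a row.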
The Arabic alphabet, or Arabic abjad, is the Arabic script as specifically codified for writing the Arabic language. It is written from right to left in a cursive style, and includes 28 letters, most of which have contextual letterforms. The Arabic alphabet is considered an abjad, with only consonants required to be written; due to its optional use of diacritics to notate vowels, it is considered an impure abjad.
Devanagari is an Indic script used in the northern Indian subcontinent. Also simply called Nāgari, it is a left-to-right abugida, based on the ancient Brāhmi script. It is one of the official scripts of the Republic of India and Nepal. It was developed and in regular use by the 8th century CE and achieved its modern form by 1200 CE. The Devanāgari script, composed of 48 primary characters, including 14 vowels and 34 consonants, is the fourth most widely adopted writing system in the world, being used for over 120 languages.
Thaana, Tãnaa, Taana or Tāna is the present writing system of the Maldivian language spoken in the Maldives. Thaana has characteristics of both an abugida and a true alphabet, with consonants derived from indigenous and Arabic numerals, and vowels derived from the vowel diacritics of the Arabic abjad. Maldivian orthography in Thaana is largely phonemic.
Malayalam script is a Brahmic script used commonly to write Malayalam, which is the principal language of Kerala, India, spoken by 45 million people in the world. It is a Dravidian language spoken in the Indian state of Kerala and the union territories of Lakshadweep and Puducherry by the Malayali people. It is one of the official scripts of the Indian Republic. Malayalam script is also widely used for writing Sanskrit texts in Kerala.
In writing and typography, a ligature occurs where two or more graphemes or letters are joined to form a single glyph. Examples are the characters ⟨æ⟩ and ⟨œ⟩ used in English and French, in which the letters ⟨a⟩ and ⟨e⟩ are joined for the first ligature and the letters ⟨o⟩ and ⟨e⟩ are joined for the second ligature. For stylistic and legibility reasons, ⟨f⟩ and ⟨i⟩ are often merged to create ⟨fi⟩; the same is true of ⟨s⟩ and ⟨t⟩ to create ⟨st⟩. The common ampersand, ⟨&⟩, developed from a ligature in which the handwritten Latin letters ⟨e⟩ and ⟨t⟩ were combined.
The Tamil script is an abugida script that is used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore, Indonesia and elsewhere to write the Tamil language. It is one of the official scripts of the Indian Republic. Certain minority languages such as Saurashtra, Badaga, Irula and Paniya are also written in the Tamil script.
A precomposed character is a Unicode entity that can also be defined as a sequence of one or more other characters. A precomposed character may typically represent a letter with a diacritical mark, such as é. Technically, é (U+00E9) is a character that can be decomposed into an equivalent string of the base letter e (U+0065) and combining acute accent (U+0301). Similarly, ligatures are precompositions of their constituent letters or graphemes.
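The equivalence described above can be checked with Unicode normalization. The following is a minimal sketch using Python's standard `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

precomposed = "\u00e9"  # é as a single precomposed code point

# NFD (canonical decomposition) splits it into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC (canonical composition) merges the sequence back into one code point.
recomposed = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(c):04X}" for c in decomposed])
print(recomposed == precomposed)
```

Because the two spellings are canonically equivalent, text is usually normalized to one form before it is compared or searched.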
A ring diacritic may appear above or below letters. It may be combined with some letters of the extended Latin alphabets in various contexts.
The Ol Chiki script, also known as Ol Chemetʼ, Ol Ciki, Ol, and sometimes as the Santhali alphabet, is the official writing system for Santhali, an Austroasiatic language recognized as an official regional language in India. It was invented by Pandit Raghunath Murmu in 1925, and is one of the official scripts of the Indian Republic. It has 30 letters, the design of which is intended to evoke natural shapes. The script is written from left to right, and has two styles. Unicode does not maintain a distinction between these two, as is typical for print and cursive variants of a script. In both styles, the script is unicameral.
The shapes of the letters are not arbitrary, but reflect the names for the letters, which are words, usually the names of objects or actions representing conventionalized form in the pictorial shape of the characters.
The Persian alphabet, also known as the Perso-Arabic script, is the right-to-left alphabet used for the Persian language. It is a variation of the Arabic script with four additional letters, پ چ ژ گ, plus the obsolete ڤ, which was used for the sound /β/. This letter is no longer used in Persian, as the /β/ sound changed to /b/, e.g. archaic زڤان > زبان 'language'.
Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use Arabic script, such as Persian and Urdu.
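As a rough illustration of what a single-byte code page means in practice, the sketch below uses Python's built-in `cp1256` codec to encode a short Arabic-script string and to check whether a given character is representable. The helper name `fits_cp1256` is hypothetical.

```python
text = "\u0633\u0644\u0627\u0645"  # سلام (seen, lam, alef, meem)

# Each letter becomes exactly one byte in Windows-1256.
encoded = text.encode("cp1256")
decoded = encoded.decode("cp1256")


def fits_cp1256(s):
    """Return True if every character of `s` has a Windows-1256 byte."""
    try:
        s.encode("cp1256")
        return True
    except UnicodeEncodeError:
        return False


print(len(encoded), decoded == text)
print(fits_cp1256("\u067e"))  # Persian peh
print(fits_cp1256("\u0915"))  # Devanagari ka, outside the code page
```

Only the logical letters are stored; choosing the contextual form of each letter is left to the rendering software.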
Dhikr is a form of Islamic worship in which phrases or prayers are repeatedly recited for the purpose of remembering God. It plays a central role in Sufism, and each Sufi order typically adopts a specific dhikr, accompanied by specific posture, breathing, and movement. In Sufism, dhikr refers to both the act of this remembrance as well as the prayers used in these acts of remembrance. Dhikr usually includes the names of God or supplication from the Quran or hadith. It may be counted with either one's fingers or prayer beads, and may be performed alone or with a collective group. A person who recites dhikr is called a dhākir.
Complex text layout (CTL) or complex text rendering is the typesetting of writing systems in which the shape or positioning of a grapheme depends on its relation to other graphemes. The term is used in the field of software internationalization, where each grapheme is a character.
Virama is a Sanskrit phonological concept to suppress the inherent vowel that otherwise occurs with every consonant letter, commonly used as a generic term for a codepoint in Unicode.
In Unicode and the UCS, a compatibility character is a character that is encoded solely to maintain round-trip convertibility with other, often older, standards. As the Unicode Glossary says:
A character that would not have been encoded except for compatibility and round-trip convertibility with other standards
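The Arabic presentation forms listed in the charts above are compatibility characters of this kind: each carries a compatibility decomposition that maps it back to the ordinary letters. The following is a minimal sketch using Python's standard `unicodedata` module; it assumes NFKC folding is acceptable for the text in question, since NFKC also folds other compatibility characters.

```python
import unicodedata

lam_alef_isolated = "\ufefb"  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
lam_medial = "\ufee0"         # ARABIC LETTER LAM MEDIAL FORM

# The tag in angle brackets marks this as a compatibility decomposition.
print(unicodedata.decomposition(lam_alef_isolated))

# NFKC replaces presentation forms with the basic letters they stand for.
folded_ligature = unicodedata.normalize("NFKC", lam_alef_isolated)
folded_medial = unicodedata.normalize("NFKC", lam_medial)

# NFC applies only canonical mappings, so the presentation form is kept.
unchanged = unicodedata.normalize("NFC", lam_alef_isolated)

print([f"U+{ord(c):04X}" for c in folded_ligature])
```

After folding, the contextual shape is no longer stored in the text and is chosen again by the rendering engine when the text is displayed.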
There are various systems of romanization of the Armenian alphabet.
Unicode contains a number of characters that represent various cultural, political, and religious symbols. Most, but not all, of these symbols are in the Miscellaneous Symbols block.
Word heaping is a technique used for text justification in Arabic script, in which one word can be placed over another to save space on the line.
Scheherazade New, formerly Scheherazade, is a traditional Naskh-style font for Arabic script created by SIL, freely available under the Open Font License. It supports a wide range of Arabic-based writing systems encoded in Unicode. The font offers two family members: regular and bold.