Arabic Supplement | |
---|---|
Range | U+0750..U+077F (48 code points) |
Plane | BMP |
Scripts | Arabic |
Major alphabets | Khowar, Torwali, Burushaski, Shahmukhi, Arwi, Jawi script, Ajami script, Early Persian |
Assigned | 48 code points |
Unused | 0 reserved code points |
Unicode version history | |
4.1 (2005) | 30 (+30) |
5.1 (2008) | 48 (+18) |
Arabic Supplement is a Unicode block that encodes Arabic letter variants used for writing non-Arabic languages, including languages of Pakistan and Africa, and early Persian.
Arabic Supplement: official Unicode Consortium code chart

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+075x | ݐ | ݑ | ݒ | ݓ | ݔ | ݕ | ݖ | ݗ | ݘ | ݙ | ݚ | ݛ | ݜ | ݝ | ݞ | ݟ |
U+076x | ݠ | ݡ | ݢ | ݣ | ݤ | ݥ | ݦ | ݧ | ݨ | ݩ | ݪ | ݫ | ݬ | ݭ | ݮ | ݯ |
U+077x | ݰ | ݱ | ݲ | ݳ | ݴ | ݵ | ݶ | ݷ | ݸ | ݹ | ݺ | ݻ | ݼ | ݽ | ݾ | ݿ |
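The chart layout follows directly from the code point values: the row label is the code point with its last hexadecimal digit dropped, and the column is that last digit. A minimal Python sketch that rebuilds the grid from the range U+0750..U+077F is shown here; the name `chart_rows` is illustrative and not part of any Unicode tooling.

```python
FIRST, LAST = 0x0750, 0x077F  # Arabic Supplement range, as stated for this block


def chart_rows(first=FIRST, last=LAST):
    """Yield (row_label, [16 characters]) for each row of 16 code points."""
    for base in range(first, last + 1, 16):
        label = f"U+{base >> 4:03X}x"  # e.g. 0x0750 -> "U+075x"
        yield label, [chr(base + col) for col in range(16)]


if __name__ == "__main__":
    # Header row with the sixteen column digits 0..F.
    print(" | ".join(["      "] + [f"{col:X}" for col in range(16)]))
    for label, cells in chart_rows():
        print(" | ".join([label] + cells))
```

Whether the letters render correctly depends on the terminal font and its support for right-to-left text; the code only produces the code points.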
The following Unicode-related documents record the purpose and process of defining specific characters in the Arabic Supplement block:
| Version | Final code points | Count | L2 ID | WG2 ID | Document |
|---|---|---|---|---|---|
| 4.1 | U+0750..0769 | 26 | L2/02-274 | | Kew, Jonathan (2002-07-16), Proposal for extensions to the Arabic block |
| | | | L2/03-168 | | Kew, Jonathan (2003-06-02), Proposal to encode Arabic-script letters for African languages |
| | | | L2/03-176 | | Kew, Jonathan (2003-06-03), Proposal to encode Jawi and Moroccan Arabic GAF characters |
| | | | L2/03-210 | | Kew, Jonathan (2003-06-12), Draft chart showing UTC #95 additions to Arabic blocks |
| | | | L2/03-223 | N2598 | Kew, Jonathan (2003-07-10), Proposal to encode additional Arabic-script characters |
| | U+076A | 1 | L2/03-228R2 | N2627 | Kew, Jonathan (2003-09-29), Proposal to encode Marwari LAM WITH BAR Character |
| | | | L2/03-240R3 | | Moore, Lisa (2003-10-21), "Marwari Lam with Bar (B.14.6)", UTC #96 Minutes |
| | U+076B..076D | 3 | L2/04-025R | N2723 | Kew, Jonathan (2004-03-15), Proposal to encode Additional Arabic script characters |
| 5.1 | U+076E..077D | 16 | | N3117 | Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-07-27), Proposal to add characters needed for Khowar, Torwali, and Burushaski |
| | | | L2/06-150 | | Bashir, Elena (2006-05-05), Letters of support for characters needed for Khowar, Torwali, and Burushaski |
| | | | L2/06-149 | | Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-05-09), Proposal to add characters needed for Khowar, Torwali, and Burushaski |
| | | | L2/06-108 | | Moore, Lisa (2006-05-25), "C.18", UTC #107 Minutes |
| | | | | N3153 | Umamaheswaran, V. S. (2007-02-16), "M49.8", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29 |
| | | | L2/06-328 | | Pournader, Roozbeh (2006-10-11), Proposal to change the previously decided name of some Arabic characters |
| | | | L2/06-324R2 | | Moore, Lisa (2006-11-29), "Consensus 109-C27", UTC #109 Minutes |
| | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.4e", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27, Names of characters in the range 0773 to 077D are changed by replacing the word 'EASTERN' with 'EXTENDED' in them. |
| | | | L2/07-264 | | Anderson, Deborah (2007-08-06), Shaping behavior of Burushaski characters and other Arabic additions in L2/06-149 |
| | | | L2/07-225 | | Moore, Lisa (2007-08-21), "Burushaski Shaping Behavior", UTC #112 Minutes |
| | | | L2/10-158 | | Mansour, Kamal (2010-05-04), Shaping Behavior of U+0777 |
| | | | L2/10-108 | | Moore, Lisa (2010-05-19), "Action item 123-A50", UTC #123 / L2 #220 Minutes, Suggest clarifying text in section 8.2 of TUS 5.2 pp 248-249 regarding Yeh and Farsi Yeh joining groups. |
| | U+077E..077F | 2 | L2/06-345R | N3180R | Everson, Michael; Pournader, Roozbeh; Sarbar, Elnaz (2006-10-24), Proposal to encode eight Arabic characters for Persian and Azerbaijani in the UCS |
| | | | L2/06-324R2 | | Moore, Lisa (2006-11-29), "C.12", UTC #109 Minutes |
| | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.15", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27 |
A Unicode block is one of several contiguous ranges of numeric character codes of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
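Because blocks are contiguous, non-overlapping ranges, finding the block of a code point reduces to a search over sorted range starts. The following is a minimal Python sketch under stated assumptions: the three-entry `BLOCKS` table is a hypothetical excerpt built only from ranges quoted in this article, and `block_of` is an illustrative name. A complete table would normally be loaded from the Unicode Character Database file `Blocks.txt`.

```python
from bisect import bisect_right

# (first, last, name) triples; an illustrative excerpt, not the full block list.
BLOCKS = sorted([
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x1F100, 0x1F1FF, "Enclosed Alphanumeric Supplement"),
])
STARTS = [first for first, _, _ in BLOCKS]


def block_of(ch):
    """Return the block name for a one-character string, or None if not in the table."""
    cp = ord(ch)
    # Index of the last block whose first code point is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        first, last, name = BLOCKS[i]
        if cp <= last:
            return name
    return None
```

For example, `block_of("\u0750")` returns `"Arabic Supplement"`, while `block_of("\u0780")` returns `None` because U+0780 lies just past the end of that range and is not covered by this excerpt.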
The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1, 80 (U+0080) to FF (U+00FF). The block contains 128 characters: the C1 controls (U+0080–U+009F), which are not graphic characters, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters, and 2 mathematical operators.
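The correspondence with ISO 8859-1 can be checked directly: decoding each byte 0x80..0xFF with Python's `latin-1` codec yields the code point with the same numeric value. The sketch below also picks out the C1 controls by their general category; `upper_half` and `c1_controls` are illustrative names.

```python
import unicodedata

# Every byte 0x80..0xFF decodes under ISO 8859-1 ("latin-1") to the code
# point with the same numeric value, so the result spans U+0080..U+00FF.
upper_half = bytes(range(0x80, 0x100)).decode("latin-1")


def c1_controls(text):
    """Return the characters whose general category is Cc (control)."""
    return [ch for ch in text if unicodedata.category(ch) == "Cc"]
```

Applied to `upper_half`, `c1_controls` returns the 32 characters U+0080 through U+009F, leaving 96 graphic or format characters in the rest of the block.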
Cyrillic Supplement is a Unicode block containing Cyrillic letters for writing several minority languages, including Abkhaz, Kurdish, Komi, Mordvin, Aleut, Azerbaijani, and Jakovlev's Chuvash orthography.
Enclosed Alphanumerics is a Unicode block of typographical symbols, each consisting of an alphanumeric character within a circle, within brackets or another non-closed enclosure, or followed by a full stop.
Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.
Arabic Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. This block also allocates 32 noncharacters in Unicode, designed specifically for internal use.
Arabic Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.
Arabic Presentation Forms-B is a Unicode block encoding spacing forms of Arabic diacritics, and contextual letter forms. The special code point ZWNBSP (zero width no-break space, U+FEFF) is also in this block; it is meant to be used only as a byte order mark. The block name in Unicode 1.0 was Basic Glyphs for Arabic Language; its characters were re-ordered in the process of merging with ISO 10646 in Unicode 1.0.1 and 1.1.
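As a byte order mark, ZWNBSP (U+FEFF) appears as the first character of an encoded stream. A minimal Python sketch of how it is typically handled in UTF-8 follows, using the standard `codecs` constants and the `utf-8-sig` codec; `strip_bom` is an illustrative helper name.

```python
import codecs

BOM = "\ufeff"  # U+FEFF ZERO WIDTH NO-BREAK SPACE


def strip_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping one leading byte order mark if present."""
    return data.decode("utf-8-sig")


# The UTF-8 encoding of U+FEFF is exactly the three-byte UTF-8 BOM.
assert BOM.encode("utf-8") == codecs.BOM_UTF8
```

Decoding with plain `utf-8` instead keeps the mark as a leading U+FEFF character in the resulting string, which is a common source of invisible-character bugs.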
Syriac is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam.
Georgian is a Unicode block containing the Mkhedruli and Asomtavruli Georgian characters used to write Modern Georgian, Svan, and Mingrelian languages. Another lower case, Nuskhuri, is encoded in a separate Georgian Supplement block, which is used with the Asomtavruli to write the ecclesiastical Khutsuri Georgian script.
Ethiopic Supplement is a Unicode block containing extra Geʽez characters for writing the Sebatbeit language, and Ethiopic tone marks.
Tamil is a Unicode block containing characters for the Tamil and Saurashtra languages of Tamil Nadu (India), Sri Lanka, Singapore, and Malaysia. In its original incarnation, the code points U+0B82..U+0BCD were a direct copy of the Tamil characters A2-ED from the 1988 ISCII standard. The Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.
Cherokee is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee block contains all the uppercase letters plus six lowercase letters. The Cherokee Supplement block, added in version 8.0, contains the rest of the lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
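The unusual folding direction can be observed with Python's `str.casefold()`, which applies Unicode case folding. This is a minimal sketch, assuming an interpreter whose Unicode database is version 8.0 or later; `fold`, `upper_a`, and `lower_a` are illustrative names.

```python
def fold(text):
    """Apply Unicode case folding via str.casefold()."""
    return text.casefold()


upper_a = "\u13a0"  # CHEROKEE LETTER A (uppercase, Cherokee block)
lower_a = "\uab70"  # CHEROKEE SMALL LETTER A (Cherokee Supplement block)

# For Latin, folding maps to lowercase; for Cherokee, it maps to uppercase.
latin_folded = fold("A")
cherokee_folded = fold(lower_a)
```

Under that assumption, `latin_folded` is `"a"`, whereas `cherokee_folded` equals `upper_a`, so text stored before lowercase Cherokee letters existed still compares equal after folding.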
Hiragana is a Unicode block containing hiragana characters for the Japanese language.
Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.
Bamum is a Unicode block containing the characters of stage-G Bamum script, used for modern writing of the Bamum language of western Cameroon. Characters for writing earlier orthographies are contained in a Bamum Supplement block.
Cherokee Supplement is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee Supplement block contains lowercase letters only, whereas the Cherokee block contains all the uppercase letters, together with six lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
Kana Extended-A is a Unicode block containing hentaigana and historic kana characters. Additional hentaigana characters are encoded in the Kana Supplement block.
Tangut Supplement is a Unicode block containing characters from the Tangut script, which was used for writing the Tangut language spoken by the Tangut people in the Western Xia Empire, and in China during the Yuan dynasty and early Ming dynasty. This block is a supplement to the main Tangut block.
Lisu Supplement is a Unicode block containing supplementary characters of the Fraser alphabet, which is used to write the Lisu language. This is a supplement to the main Lisu block, with currently only a single character used for the Naxi language assigned to it.