Arabic Supplement | |
---|---|
Range | U+0750..U+077F (48 code points) |
Plane | BMP |
Scripts | Arabic |
Major alphabets | Khowar, Torwali, Burushaski, African languages, Early Persian |
Assigned | 48 code points |
Unused | 0 reserved code points |
Unicode version history | |
4.1 (2005) | 30 (+30) |
5.1 (2008) | 48 (+18) |
Arabic Supplement is a Unicode block that encodes Arabic letter variants used for writing non-Arabic languages, including languages of Pakistan and Africa, and early Persian.
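As a small illustration of the range given in the infobox, the following sketch tests whether a character belongs to the block. The names `ARABIC_SUPPLEMENT` and `in_arabic_supplement` are hypothetical helpers introduced here for illustration, not part of any library; only the range U+0750..U+077F comes from the article.

```python
# Block range as stated in the infobox: U+0750..U+077F (inclusive).
# range() excludes its end point, hence 0x0780.
ARABIC_SUPPLEMENT = range(0x0750, 0x0780)


def in_arabic_supplement(ch: str) -> bool:
    """Return True if the single character ch lies in the Arabic Supplement block."""
    return ord(ch) in ARABIC_SUPPLEMENT


print(len(ARABIC_SUPPLEMENT))          # 48
print(in_arabic_supplement("\u0750"))  # True
print(in_arabic_supplement("\u0628"))  # False: U+0628 is in the main Arabic block
```

A plain `range` is used because the block is contiguous and fully assigned, so a membership test on the code point is enough.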
Arabic Supplement (official Unicode Consortium code chart, PDF)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+075x | ݐ | ݑ | ݒ | ݓ | ݔ | ݕ | ݖ | ݗ | ݘ | ݙ | ݚ | ݛ | ݜ | ݝ | ݞ | ݟ |
U+076x | ݠ | ݡ | ݢ | ݣ | ݤ | ݥ | ݦ | ݧ | ݨ | ݩ | ݪ | ݫ | ݬ | ݭ | ݮ | ݯ |
U+077x | ݰ | ݱ | ݲ | ݳ | ݴ | ݵ | ݶ | ݷ | ݸ | ݹ | ݺ | ݻ | ݼ | ݽ | ݾ | ݿ |
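The chart layout above (16 code points per row, each row labelled by the code point with its last hex digit replaced by `x`) can be reproduced programmatically. This is a minimal sketch; the function name `chart_rows` and its defaults are assumptions made for illustration.

```python
def chart_rows(start: int = 0x0750, end: int = 0x077F) -> list[str]:
    """Build pipe-separated chart rows, 16 code points per row."""
    rows = []
    for base in range(start, end + 1, 16):
        # One cell per code point in this row of 16.
        cells = " | ".join(chr(cp) for cp in range(base, base + 16))
        # Row label: the upper hex digits followed by a literal 'x'.
        rows.append(f"U+{base >> 4:03X}x | {cells} |")
    return rows


for row in chart_rows():
    print(row)
```

The sketch assumes `start` is aligned to a multiple of 16, which holds for this block.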
The following Unicode-related documents record the purpose and process of defining specific characters in the Arabic Supplement block:
Version | Final code points | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
4.1 | U+0750..0769 | 26 | L2/02-274 | | Kew, Jonathan (2002-07-16), Proposal for extensions to the Arabic block |
 | | | L2/03-168 | | Kew, Jonathan (2003-06-02), Proposal to encode Arabic-script letters for African languages |
 | | | L2/03-176 | | Kew, Jonathan (2003-06-03), Proposal to encode Jawi and Moroccan Arabic GAF characters |
 | | | L2/03-210 | | Kew, Jonathan (2003-06-12), Draft chart showing UTC #95 additions to Arabic blocks |
 | | | L2/03-223 | N2598 | Kew, Jonathan (2003-07-10), Proposal to encode additional Arabic-script characters |
 | U+076A | 1 | L2/03-228R2 | N2627 | Kew, Jonathan (2003-09-29), Proposal to encode Marwari LAM WITH BAR Character |
 | | | L2/03-240R3 | | Moore, Lisa (2003-10-21), "Marwari Lam with Bar (B.14.6)", UTC #96 Minutes |
 | U+076B..076D | 3 | L2/04-025R | N2723 | Kew, Jonathan (2004-03-15), Proposal to encode Additional Arabic script characters |
5.1 | U+076E..077D | 16 | | N3117 | Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-07-27), Proposal to add characters needed for Khowar, Torwali, and Burushaski |
 | | | L2/06-150 | | Bashir, Elena (2006-05-05), Letters of support for characters needed for Khowar, Torwali, and Burushaski |
 | | | L2/06-149 | | Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-05-09), Proposal to add characters needed for Khowar, Torwali, and Burushaski |
 | | | L2/06-108 | | Moore, Lisa (2006-05-25), "C.18", UTC #107 Minutes |
 | | | | N3153 | Umamaheswaran, V. S. (2007-02-16), "M49.8", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29 |
 | | | L2/06-328 | | Pournader, Roozbeh (2006-10-11), Proposal to change the previously decided name of some Arabic characters |
 | | | L2/06-324R2 | | Moore, Lisa (2006-11-29), "Consensus 109-C27", UTC #109 Minutes |
 | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.4e", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27, Names of characters in the range 0773 to 077D are changed by replacing the word 'EASTERN' with 'EXTENDED' in them. |
 | | | L2/07-264 | | Anderson, Deborah (2007-08-06), Shaping behavior of Burushaski characters and other Arabic additions in L2/06-149 |
 | | | L2/07-225 | | Moore, Lisa (2007-08-21), "Burushaski Shaping Behavior", UTC #112 Minutes |
 | | | L2/10-158 | | Mansour, Kamal (2010-05-04), Shaping Behavior of U+0777 |
 | | | L2/10-108 | | Moore, Lisa (2010-05-19), "Action item 123-A50", UTC #123 / L2 #220 Minutes, Suggest clarifying text in section 8.2 of TUS 5.2 pp 248-249 regarding Yeh and Farsi Yeh joining groups. |
 | U+077E..077F | 2 | L2/06-345R | N3180R | Everson, Michael; Pournader, Roozbeh; Sarbar, Elnaz (2006-10-24), Proposal to encode eight Arabic characters for Persian and Azerbaijani in the UCS |
 | | | L2/06-324R2 | | Moore, Lisa (2006-11-29), "C.12", UTC #109 Minutes |
 | | | L2/07-268 | N3253 | Umamaheswaran, V. S. (2007-07-26), "M50.15", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27 |
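The Count column of the history table can be cross-checked against the per-version totals in the infobox (30 code points added in 4.1, 18 in 5.1). The following sketch does that arithmetic; the helper `range_size` and the `additions` dictionary are hypothetical names, and the range strings are copied from the "Final code points" column.

```python
def range_size(spec: str) -> int:
    """Count code points in a 'U+XXXX..YYYY' range or a single 'U+XXXX' entry."""
    first, _, last = spec.replace("U+", "").partition("..")
    lo = int(first, 16)
    hi = int(last, 16) if last else lo  # single code point: range of one
    return hi - lo + 1


# Final code point ranges per Unicode version, as listed in the table.
additions = {
    "4.1": ["U+0750..0769", "U+076A", "U+076B..076D"],
    "5.1": ["U+076E..077D", "U+077E..077F"],
}

totals = {
    version: sum(range_size(spec) for spec in specs)
    for version, specs in additions.items()
}
print(totals)                # {'4.1': 30, '5.1': 18}
print(sum(totals.values()))  # 48
```

The two totals add up to 48, which matches the block size and confirms that no code points in the block are left unassigned.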
The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1, 0x80 to 0xFF, as U+0080..U+00FF. The block contains 128 characters: the C1 controls (U+0080..U+009F, which are not graphic), Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters, and 2 mathematical operators.
Cyrillic Supplement is a Unicode block containing Cyrillic letters for writing several minority languages, including Abkhaz, Kurdish, Komi, Mordvin, Aleut, Azerbaijani, and Jakovlev's Chuvash orthography.
Enclosed Alphanumerics is a Unicode block of typographical symbols, each consisting of an alphanumeric character within a circle, within brackets or another not-closed enclosure, or followed by a full stop.
Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.
Arabic Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.
Arabic Presentation Forms-B is a Unicode block encoding spacing forms of Arabic diacritics and contextual letter forms. The special code point ZWNBSP (U+FEFF), which is used as a byte order mark, is also encoded here. The block's name in Unicode 1.0 was Basic Glyphs for Arabic Language; its characters were re-ordered in the process of merging with ISO 10646 in Unicode 1.0.1 and 1.1.
Syriac is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam.
Georgian is a Unicode block containing the Mkhedruli and Asomtavruli Georgian characters used to write Modern Georgian, Svan, and Mingrelian languages. Another lower case, Nuskhuri, is encoded in a separate Georgian Supplement block, which is used with the Asomtavruli to write the ecclesiastical Khutsuri Georgian script.
Ethiopic Supplement is a Unicode block containing extra Geʽez characters for writing the Sebatbeit language, and Ethiopic tone marks.
Tamil is a Unicode block containing characters for the Tamil and Saurashtra languages of Tamil Nadu (India), Sri Lanka, Singapore, and Malaysia. In its original incarnation, the code points U+0B02..U+0BCD were a direct copy of the Tamil characters A2-ED from the 1988 ISCII standard. The Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.
Cherokee is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee block contains all the uppercase letters plus six lowercase letters. The Cherokee Supplement block, added in version 8.0, contains the rest of the lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
Hiragana is a Unicode block containing hiragana characters for the Japanese language.
Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.
Bamum is a Unicode block containing the characters of stage-G Bamum script, used for modern writing of the Bamum language of western Cameroon. Characters for writing earlier orthographies are contained in a Bamum Supplement block.
Bamum Supplement is a Unicode block containing the characters of the historic stage A-F of the Bamum script, used for writing the Bamum language of western Cameroon. The modern stage G characters, which include many characters used for stage A-F orthographies, are included in the Bamum block.
Sundanese is a Unicode block containing modern characters for writing the Sundanese script of the Sundanese language of the island of Java, Indonesia.
Lisu is a Unicode block containing characters of the Fraser alphabet, which is used to write the Lisu language. This alphabet consists of glyphs resembling capital letters in the basic Latin alphabet in their standard form and horizontally or vertically mirrored.
Cherokee Supplement is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee Supplement block contains lowercase letters only, whereas the Cherokee block contains all the uppercase letters, together with six lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
Tangut Supplement is a Unicode block containing characters from the Tangut script, which was used for writing the Tangut language spoken by the Tangut people in the Western Xia Empire, and in China during the Yuan dynasty and early Ming dynasty. This block is a supplement to the main Tangut block.
Arabic Extended-B is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages. The block also includes currency symbols and an abbreviation mark.