Arabic Supplement

Range	U+0750..U+077F (48 code points)
Plane	BMP
Scripts	Arabic
Major alphabets	Khowar, Torwali, Burushaski, African languages, Early Persian
Assigned	48 code points
Unused	0 reserved code points
Unicode version history
4.1 (2005)	30 (+30)
5.1 (2008)	48 (+18)
Code chart Note: ^[1]^[2]

Last updated February 26, 2022

Arabic Supplement is a Unicode block that encodes Arabic letter variants used for writing non-Arabic languages, including languages of Pakistan and Africa, and early Persian.

Block

Arabic Supplement^[1]
Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+075x	ݐ	ݑ	ݒ	ݓ	ݔ	ݕ	ݖ	ݗ	ݘ	ݙ	ݚ	ݛ	ݜ	ݝ	ݞ	ݟ
U+076x	ݠ	ݡ	ݢ	ݣ	ݤ	ݥ	ݦ	ݧ	ݨ	ݩ	ݪ	ݫ	ݬ	ݭ	ݮ	ݯ
U+077x	ݰ	ݱ	ݲ	ݳ	ݴ	ݵ	ݶ	ݷ	ݸ	ݹ	ݺ	ݻ	ݼ	ݽ	ݾ	ݿ
Notes
1. ^ As of Unicode version 14.0
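The range stated for the block can be checked mechanically. The following is a minimal sketch, assuming only the boundaries U+0750..U+077F given above; the names `in_arabic_supplement` and `chart_rows` are illustrative and do not come from any Unicode library.

```python
# Block boundaries as stated for Arabic Supplement: U+0750..U+077F.
BLOCK_START = 0x0750
BLOCK_END = 0x077F


def in_arabic_supplement(ch):
    """Return True if the single character ch lies in U+0750..U+077F."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def chart_rows():
    """Rebuild the three chart rows (U+075x, U+076x, U+077x) as tab-separated text."""
    rows = []
    for row_start in range(BLOCK_START, BLOCK_END + 1, 16):
        # Drop the low hex digit to get the row label, e.g. 0x0750 -> "U+075x".
        label = f"U+{row_start >> 4:03X}x"
        cells = [chr(row_start + col) for col in range(16)]
        rows.append("\t".join([label] + cells))
    return rows


if __name__ == "__main__":
    print(BLOCK_END - BLOCK_START + 1)  # number of code points in the range
    for row in chart_rows():
        print(row)
```

Because every code point in the block is assigned, a plain range comparison is enough here; blocks with reserved code points would need a lookup against the Unicode character database instead.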

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Arabic Supplement block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
4.1	U+0750..0769	26	L2/02-274		Kew, Jonathan (2002-07-16), Proposal for extensions to the Arabic block
			L2/03-168		Kew, Jonathan (2003-06-02), Proposal to encode Arabic-script letters for African languages
			L2/03-176		Kew, Jonathan (2003-06-03), Proposal to encode Jawi and Moroccan Arabic GAF characters
			L2/03-210		Kew, Jonathan (2003-06-12), Draft chart showing UTC #95 additions to Arabic blocks
			L2/03-223	N2598	Kew, Jonathan (2003-07-10), Proposal to encode additional Arabic-script characters
	U+076A	1	L2/03-228R2	N2627	Kew, Jonathan (2003-09-29), Proposal to encode Marwari LAM WITH BAR Character
			L2/03-240R3		Moore, Lisa (2003-10-21), "Marwari Lam with Bar (B.14.6)", UTC #96 Minutes
	U+076B..076D	3	L2/04-025R	N2723	Kew, Jonathan (2004-03-15), Proposal to encode Additional Arabic script characters
5.1	U+076E..077D	16		N3117	Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-07-27), Proposal to add characters needed for Khowar, Torwali, and Burushaski
			L2/06-150		Bashir, Elena (2006-05-05), Letters of support for characters needed for Khowar, Torwali, and Burushaski
			L2/06-149		Bashir, Elena; Hussain, Sarmad; Anderson, Deborah (2006-05-09), Proposal to add characters needed for Khowar, Torwali, and Burushaski
			L2/06-108		Moore, Lisa (2006-05-25), "C.18", UTC #107 Minutes
				N3153 (pdf, doc)	Umamaheswaran, V. S. (2007-02-16), "M49.8", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29
			L2/06-328		Pournader, Roozbeh (2006-10-11), Proposal to change the previously decided name of some Arabic characters
			L2/06-324R2		Moore, Lisa (2006-11-29), "Consensus 109-C27", UTC #109 Minutes
			L2/07-268	N3253 (pdf, doc)	Umamaheswaran, V. S. (2007-07-26), "M50.4e", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27, Names of characters in the range 0773 to 077D are changed by replacing the word 'EASTERN' with 'EXTENDED' in them.
			L2/07-264		Anderson, Deborah (2007-08-06), Shaping behavior of Burushaski characters and other Arabic additions in L2/06-149
			L2/07-225		Moore, Lisa (2007-08-21), "Burushaski Shaping Behavior", UTC #112 Minutes
			L2/10-158		Mansour, Kamal (2010-05-04), Shaping Behavior of U+0777
			L2/10-108		Moore, Lisa (2010-05-19), "Action item 123-A50", UTC #123 / L2 #220 Minutes, Suggest clarifying text in section 8.2 of TUS 5.2 pp 248-249 regarding Yeh and Farsi Yeh joining groups.
	U+077E..077F	2	L2/06-345R	N3180R	Everson, Michael; Pournader, Roozbeh; Sarbar, Elnaz (2006-10-24), Proposal to encode eight Arabic characters for Persian and Azerbaijani in the UCS
			L2/06-324R2		Moore, Lisa (2006-11-29), "C.12", UTC #109 Minutes
			L2/07-268	N3253 (pdf, doc)	Umamaheswaran, V. S. (2007-07-26), "M50.15", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27
a. ^ Proposed code points and character names may differ from final code points and names.
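The per-version counts in the table can be cross-checked against the version history (30 code points in 4.1, 18 more in 5.1). The sketch below is illustrative only: the `RANGES` list is transcribed by hand from the "Final code points" column, and `count_by_version` and `is_contiguous` are hypothetical helper names.

```python
# (version, first, last) for each table row that lists final code points.
RANGES = [
    ("4.1", 0x0750, 0x0769),
    ("4.1", 0x076A, 0x076A),
    ("4.1", 0x076B, 0x076D),
    ("5.1", 0x076E, 0x077D),
    ("5.1", 0x077E, 0x077F),
]


def count_by_version(ranges):
    """Sum the size of each inclusive range, grouped by Unicode version."""
    totals = {}
    for version, first, last in ranges:
        totals[version] = totals.get(version, 0) + (last - first + 1)
    return totals


def is_contiguous(ranges):
    """Check that each range starts right after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1 for prev, nxt in zip(ranges, ranges[1:])
    )


if __name__ == "__main__":
    totals = count_by_version(RANGES)
    print(totals)                 # {'4.1': 30, '5.1': 18}
    print(sum(totals.values()))   # 48
    print(is_contiguous(RANGES))  # True
```

The totals agree with the infobox: 26 + 1 + 3 = 30 for version 4.1 and 16 + 2 = 18 for version 5.1, covering all 48 code points with no gaps.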

References

1. ^ "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
