Cham (Unicode block)

Cham
Range	U+AA00..U+AA5F (96 code points)
Plane	BMP
Scripts	Cham
Major alphabets	Eastern Cham
Assigned	83 code points
Unused	13 reserved code points
Unicode version history
5.1 (2008)	83 (+83)
Chart
	Code chart
	Note: ^[1]^[2]

Last updated May 30, 2023

Cham is a Unicode block containing characters of the Cham script, which is used to write the Cham language, primarily its Eastern dialect, in Cambodia and Vietnam.

A separate block for Western Cham, used in Cambodia, was first proposed to Unicode in 2016. As of May 2022, it was still being finalized.^[3]

Cham ^[1]^[2] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+AA0x	ꨀ	ꨁ	ꨂ	ꨃ	ꨄ	ꨅ	ꨆ	ꨇ	ꨈ	ꨉ	ꨊ	ꨋ	ꨌ	ꨍ	ꨎ	ꨏ
U+AA1x	ꨐ	ꨑ	ꨒ	ꨓ	ꨔ	ꨕ	ꨖ	ꨗ	ꨘ	ꨙ	ꨚ	ꨛ	ꨜ	ꨝ	ꨞ	ꨟ
U+AA2x	ꨠ	ꨡ	ꨢ	ꨣ	ꨤ	ꨥ	ꨦ	ꨧ	ꨨ	ꨩ	ꨪ	ꨫ	ꨬ	ꨭ	ꨮ	ꨯ
U+AA3x	ꨰ	ꨱ	ꨲ	ꨳ	ꨴ	ꨵ	ꨶ
U+AA4x	ꩀ	ꩁ	ꩂ	ꩃ	ꩄ	ꩅ	ꩆ	ꩇ	ꩈ	ꩉ	ꩊ	ꩋ	ꩌ	ꩍ
U+AA5x	꩐	꩑	꩒	꩓	꩔	꩕	꩖	꩗	꩘	꩙			꩜	꩝	꩞	꩟
Notes
1. ^ As of Unicode version 15.0
2. ^ Empty cells indicate non-assigned code points
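
The counts in the chart can be checked programmatically. The following is a minimal Python sketch, not part of the Unicode documentation, that walks the block range U+AA00..U+AA5F and treats a code point as assigned when the local Unicode database has a character name for it. The function name `classify_block` is hypothetical, and the result depends on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Block range as given in the infobox: U+AA00..U+AA5F.
BLOCK_START, BLOCK_END = 0xAA00, 0xAA5F


def classify_block(start=BLOCK_START, end=BLOCK_END):
    """Split a block into assigned and unassigned code points.

    Assumption: a code point is "assigned" if the interpreter's Unicode
    database returns a non-empty character name for it.
    """
    assigned, reserved = [], []
    for cp in range(start, end + 1):
        if unicodedata.name(chr(cp), ""):
            assigned.append(cp)
        else:
            reserved.append(cp)
    return assigned, reserved


assigned, reserved = classify_block()
print(len(assigned), "assigned;", len(reserved), "reserved")
print("Reserved:", " ".join(f"U+{cp:04X}" for cp in reserved))
```

With a Unicode database matching the chart, the reserved list should correspond to the empty cells in rows U+AA3x, U+AA4x, and U+AA5x.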

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Cham block:

Version	Final code points^{[lower-alpha 1]}	Count	L2 ID	WG2 ID	Document
5.1	U+AA00..AA36, AA40..AA4D, AA50..AA59, AA5C..AA5F	83		N1126	Cham script [proposal summary form], 1994-10-14
				N1203	Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), "6.1.2.2", Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva
			L2/97-143	N1578	Everson, Michael (1997-04-06), Cham encoding discussion
			L2/97-124	N1559	Everson, Michael (1997-05-01), Proposal for encoding the Cham script in ISO/IEC 10646
			L2/97-288	N1603	Umamaheswaran, V. S. (1997-10-24), "8.22", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997
			L2/99-081	N1960	Everson, Michael (1999-02-01), Response to Ngo Trung Viet on feedback from Cham experts
				N1997	Nhan, Ngo Than (1999-02-26), Response to Michael Everson
			L2/06-257	N3120	Everson, Michael (2006-08-06), Proposal for encoding the Cham script in the BMP of the UCS
			L2/06-231		Moore, Lisa (2006-08-17), "C.14", UTC #108 Minutes
				N3153	Umamaheswaran, V. S. (2007-02-16), "M49.18", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29
↑ Proposed code points and character names may differ from final code points and names
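
The "Final code points" entry and the "Count" entry in the table can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names `parse_ranges` and `count_code_points` are hypothetical, and the parser assumes the compact notation used in the table, where only the first range carries the `U+` prefix and each range is written `first..last` in hexadecimal.

```python
# Range list copied from the "Final code points" column of the table.
FINAL_CODE_POINTS = "U+AA00..AA36, AA40..AA4D, AA50..AA59, AA5C..AA5F"


def parse_ranges(spec):
    """Parse 'U+AA00..AA36, AA40..AA4D' into (first, last) integer pairs."""
    ranges = []
    for part in spec.replace("U+", "").split(","):
        first, _, last = part.strip().partition("..")
        # A lone code point (no '..') is treated as a one-element range.
        ranges.append((int(first, 16), int(last or first, 16)))
    return ranges


def count_code_points(ranges):
    """Count code points covered by inclusive (first, last) ranges."""
    return sum(last - first + 1 for first, last in ranges)


ranges = parse_ranges(FINAL_CODE_POINTS)
total = count_code_points(ranges)
for first, last in ranges:
    print(f"U+{first:04X}..U+{last:04X}: {last - first + 1}")
print("Total:", total)
```

The four ranges contribute 55, 14, 10, and 4 code points, which sum to the 83 listed in the table and leave 13 of the block's 96 positions reserved.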

References

1. ^ "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
3. ^ https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=z6dlrcd64h

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
