Warang Citi (Unicode block)

Warang Citi
Range	U+118A0..U+118FF (96 code points)
Plane	SMP
Scripts	Warang Citi
Major alphabets	Warang Citi (Varang Kshiti)
Assigned	84 code points
Unused	12 reserved code points
Unicode version history
7.0 (2014)	84 (+84)
Unicode documentation
	Code chart ∣ Web page
	Note: ^[1]^[2]

Last updated August 11, 2024

Warang Citi is a Unicode block containing characters of the Warang Citi (Varang Kshiti) script, which is used by some to write the Ho language.^[3]

Warang Citi ^[1]^[2] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+118Ax	𑢠‎	𑢡‎	𑢢‎	𑢣‎	𑢤‎	𑢥‎	𑢦‎	𑢧‎	𑢨‎	𑢩‎	𑢪‎	𑢫‎	𑢬‎	𑢭‎	𑢮‎	𑢯‎
U+118Bx	𑢰‎	𑢱‎	𑢲‎	𑢳‎	𑢴‎	𑢵‎	𑢶‎	𑢷‎	𑢸‎	𑢹‎	𑢺‎	𑢻‎	𑢼‎	𑢽‎	𑢾‎	𑢿‎
U+118Cx	𑣀‎	𑣁‎	𑣂‎	𑣃‎	𑣄‎	𑣅‎	𑣆‎	𑣇‎	𑣈‎	𑣉‎	𑣊‎	𑣋‎	𑣌‎	𑣍‎	𑣎‎	𑣏‎
U+118Dx	𑣐‎	𑣑‎	𑣒‎	𑣓‎	𑣔‎	𑣕‎	𑣖‎	𑣗‎	𑣘‎	𑣙‎	𑣚‎	𑣛‎	𑣜‎	𑣝‎	𑣞‎	𑣟‎
U+118Ex	𑣠‎	𑣡‎	𑣢‎	𑣣‎	𑣤‎	𑣥‎	𑣦‎	𑣧‎	𑣨‎	𑣩‎	𑣪‎	𑣫‎	𑣬‎	𑣭‎	𑣮‎	𑣯‎
U+118Fx	𑣰‎	𑣱‎	𑣲‎													𑣿‎
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points
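
The figures quoted for this block (96 code points, 84 assigned, 12 reserved) can be checked with a little arithmetic. The following is a minimal sketch in Python, not an official Unicode API: the constant and function names are hypothetical, and the assigned ranges are simply copied from the history table in this article (U+118A0..118F2 and U+118FF).

```python
# Block boundaries and assigned ranges as stated in this article.
# All names here are illustrative and not part of any library.
BLOCK_START, BLOCK_END = 0x118A0, 0x118FF
ASSIGNED_RANGES = [(0x118A0, 0x118F2), (0x118FF, 0x118FF)]


def in_warang_citi_block(ch: str) -> bool:
    """Return True if the single character ch falls inside the block range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def is_assigned(cp: int) -> bool:
    """Return True if code point cp is in one of the assigned ranges above."""
    return any(lo <= cp <= hi for lo, hi in ASSIGNED_RANGES)


# Size of the block, number of assigned code points, and the remainder.
block_size = BLOCK_END - BLOCK_START + 1
assigned = sum(hi - lo + 1 for lo, hi in ASSIGNED_RANGES)
reserved = block_size - assigned

# The plane number is the code point divided by 65,536 (0x10000);
# plane 1 is the Supplementary Multilingual Plane (SMP).
plane = BLOCK_START >> 16

print(block_size, assigned, reserved, plane)  # 96 84 12 1
```

Hard-coding the ranges keeps the check independent of whichever Unicode version a given runtime's character database happens to ship.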

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Warang Citi block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
7.0	U+118A0..118F2, 118FF	84	L2/99-058	N1958	Everson, Michael (1999-01-29), Proposal for encoding the Varang Kshiti script in the BMP of the UCS
			L2/07-137		Harrison, K. David; Anderson, Gregory (2007-04-22), Review of Proposal for Encoding Warang Chiti (Ho orthography) in Unicode
			L2/08-130	N3411	Everson, Michael (2008-04-08), Preliminary proposal for encoding the Varang Kshiti script in the UCS
			L2/09-291	N3668	Everson, Michael (2009-08-05), Proposal for encoding the Warang Citi script in the BMP of the UCS
			L2/11-444	N4176	Everson, Michael (2011-12-31), Revised proposal for encoding the Warang Citi script in the SMP
			L2/12-031		Anderson, Deborah; McGowan, Rick; Whistler, Ken (2012-01-27), "VIII. WARANG CITI", Review of Indic-related L2 documents and Recommendations to the UTC
			L2/12-118	N4259	Everson, Michael (2012-04-19), Final proposal for encoding the Warang Citi script
			L2/12-147		Anderson, Deborah; McGowan, Rick; Whistler, Ken (2012-04-25), "X. WARANG CITI", Review of Indic-related L2 documents and Recommendations to the UTC
			L2/12-112		Moore, Lisa (2012-05-17), "D.11", UTC #131 / L2 #228 Minutes
a. ^ Proposed code points and character names may differ from final code points and names

References

[1] "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
[2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
[3] Everson, Michael (2012-04-19). "N4259: Final proposal for encoding the Warang Citi script in the SMP of the UCS" (PDF). Retrieved 2014-07-03.

