Khitan Small Script (Unicode block)

Khitan Small Script
Range	U+18B00..U+18CFF (512 code points)
Plane	SMP
Scripts	Khitan small script
Assigned	471 code points
Unused	41 reserved code points
Unicode version history
13.0 (2020)	470 (+470)
16.0 (2024)	471 (+1)
Unicode documentation
	Code chart ∣ Web page
	Note: ^[1]^[2]

Last updated September 11, 2024

Khitan Small Script is a Unicode block containing characters from the Khitan small script, which was used for writing the Khitan language spoken by the Khitan people in northern China during the Liao dynasty.
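
As a rough illustration of what the Range and Plane fields mean in practice, the following Python sketch tests whether a character falls inside the block and derives its plane number. The helper names `in_block` and `plane_of` are hypothetical and chosen only for this example; they are not part of any library. Note that the range test covers reserved code points as well as assigned ones.

```python
# Block boundaries as listed in the infobox (U+18B00..U+18CFF).
BLOCK_START = 0x18B00
BLOCK_END = 0x18CFF


def in_block(ch: str) -> bool:
    """Return True if the single character ch lies in the block's range.

    This checks the range only, so reserved code points also return True.
    """
    return BLOCK_START <= ord(ch) <= BLOCK_END


def plane_of(ch: str) -> int:
    """Return the Unicode plane number; each plane spans 0x10000 code points."""
    return ord(ch) >> 16


first = chr(0x18B00)                # first code point of the block
print(in_block(first))              # True
print(plane_of(first))              # 1, i.e. the SMP
print(len(first.encode("utf-8")))   # 4 bytes, as for any supplementary-plane character
```

Plane 1 is the Supplementary Multilingual Plane (SMP), which is why every character in this block needs four bytes in UTF-8.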

Block

Khitan Small Script ^[1]^[2] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+18B0x	𘬀‎	𘬁‎	𘬂‎	𘬃‎	𘬄‎	𘬅‎	𘬆‎	𘬇‎	𘬈‎	𘬉‎	𘬊‎	𘬋‎	𘬌‎	𘬍‎	𘬎‎	𘬏‎
U+18B1x	𘬐‎	𘬑‎	𘬒‎	𘬓‎	𘬔‎	𘬕‎	𘬖‎	𘬗‎	𘬘‎	𘬙‎	𘬚‎	𘬛‎	𘬜‎	𘬝‎	𘬞‎	𘬟‎
U+18B2x	𘬠‎	𘬡‎	𘬢‎	𘬣‎	𘬤‎	𘬥‎	𘬦‎	𘬧‎	𘬨‎	𘬩‎	𘬪‎	𘬫‎	𘬬‎	𘬭‎	𘬮‎	𘬯‎
U+18B3x	𘬰‎	𘬱‎	𘬲‎	𘬳‎	𘬴‎	𘬵‎	𘬶‎	𘬷‎	𘬸‎	𘬹‎	𘬺‎	𘬻‎	𘬼‎	𘬽‎	𘬾‎	𘬿‎
U+18B4x	𘭀‎	𘭁‎	𘭂‎	𘭃‎	𘭄‎	𘭅‎	𘭆‎	𘭇‎	𘭈‎	𘭉‎	𘭊‎	𘭋‎	𘭌‎	𘭍‎	𘭎‎	𘭏‎
U+18B5x	𘭐‎	𘭑‎	𘭒‎	𘭓‎	𘭔‎	𘭕‎	𘭖‎	𘭗‎	𘭘‎	𘭙‎	𘭚‎	𘭛‎	𘭜‎	𘭝‎	𘭞‎	𘭟‎
U+18B6x	𘭠‎	𘭡‎	𘭢‎	𘭣‎	𘭤‎	𘭥‎	𘭦‎	𘭧‎	𘭨‎	𘭩‎	𘭪‎	𘭫‎	𘭬‎	𘭭‎	𘭮‎	𘭯‎
U+18B7x	𘭰‎	𘭱‎	𘭲‎	𘭳‎	𘭴‎	𘭵‎	𘭶‎	𘭷‎	𘭸‎	𘭹‎	𘭺‎	𘭻‎	𘭼‎	𘭽‎	𘭾‎	𘭿‎
U+18B8x	𘮀‎	𘮁‎	𘮂‎	𘮃‎	𘮄‎	𘮅‎	𘮆‎	𘮇‎	𘮈‎	𘮉‎	𘮊‎	𘮋‎	𘮌‎	𘮍‎	𘮎‎	𘮏‎
U+18B9x	𘮐‎	𘮑‎	𘮒‎	𘮓‎	𘮔‎	𘮕‎	𘮖‎	𘮗‎	𘮘‎	𘮙‎	𘮚‎	𘮛‎	𘮜‎	𘮝‎	𘮞‎	𘮟‎
U+18BAx	𘮠‎	𘮡‎	𘮢‎	𘮣‎	𘮤‎	𘮥‎	𘮦‎	𘮧‎	𘮨‎	𘮩‎	𘮪‎	𘮫‎	𘮬‎	𘮭‎	𘮮‎	𘮯‎
U+18BBx	𘮰‎	𘮱‎	𘮲‎	𘮳‎	𘮴‎	𘮵‎	𘮶‎	𘮷‎	𘮸‎	𘮹‎	𘮺‎	𘮻‎	𘮼‎	𘮽‎	𘮾‎	𘮿‎
U+18BCx	𘯀‎	𘯁‎	𘯂‎	𘯃‎	𘯄‎	𘯅‎	𘯆‎	𘯇‎	𘯈‎	𘯉‎	𘯊‎	𘯋‎	𘯌‎	𘯍‎	𘯎‎	𘯏‎
U+18BDx	𘯐‎	𘯑‎	𘯒‎	𘯓‎	𘯔‎	𘯕‎	𘯖‎	𘯗‎	𘯘‎	𘯙‎	𘯚‎	𘯛‎	𘯜‎	𘯝‎	𘯞‎	𘯟‎
U+18BEx	𘯠‎	𘯡‎	𘯢‎	𘯣‎	𘯤‎	𘯥‎	𘯦‎	𘯧‎	𘯨‎	𘯩‎	𘯪‎	𘯫‎	𘯬‎	𘯭‎	𘯮‎	𘯯‎
U+18BFx	𘯰‎	𘯱‎	𘯲‎	𘯳‎	𘯴‎	𘯵‎	𘯶‎	𘯷‎	𘯸‎	𘯹‎	𘯺‎	𘯻‎	𘯼‎	𘯽‎	𘯾‎	𘯿‎
U+18C0x	𘰀‎	𘰁‎	𘰂‎	𘰃‎	𘰄‎	𘰅‎	𘰆‎	𘰇‎	𘰈‎	𘰉‎	𘰊‎	𘰋‎	𘰌‎	𘰍‎	𘰎‎	𘰏‎
U+18C1x	𘰐‎	𘰑‎	𘰒‎	𘰓‎	𘰔‎	𘰕‎	𘰖‎	𘰗‎	𘰘‎	𘰙‎	𘰚‎	𘰛‎	𘰜‎	𘰝‎	𘰞‎	𘰟‎
U+18C2x	𘰠‎	𘰡‎	𘰢‎	𘰣‎	𘰤‎	𘰥‎	𘰦‎	𘰧‎	𘰨‎	𘰩‎	𘰪‎	𘰫‎	𘰬‎	𘰭‎	𘰮‎	𘰯‎
U+18C3x	𘰰‎	𘰱‎	𘰲‎	𘰳‎	𘰴‎	𘰵‎	𘰶‎	𘰷‎	𘰸‎	𘰹‎	𘰺‎	𘰻‎	𘰼‎	𘰽‎	𘰾‎	𘰿‎
U+18C4x	𘱀‎	𘱁‎	𘱂‎	𘱃‎	𘱄‎	𘱅‎	𘱆‎	𘱇‎	𘱈‎	𘱉‎	𘱊‎	𘱋‎	𘱌‎	𘱍‎	𘱎‎	𘱏‎
U+18C5x	𘱐‎	𘱑‎	𘱒‎	𘱓‎	𘱔‎	𘱕‎	𘱖‎	𘱗‎	𘱘‎	𘱙‎	𘱚‎	𘱛‎	𘱜‎	𘱝‎	𘱞‎	𘱟‎
U+18C6x	𘱠‎	𘱡‎	𘱢‎	𘱣‎	𘱤‎	𘱥‎	𘱦‎	𘱧‎	𘱨‎	𘱩‎	𘱪‎	𘱫‎	𘱬‎	𘱭‎	𘱮‎	𘱯‎
U+18C7x	𘱰‎	𘱱‎	𘱲‎	𘱳‎	𘱴‎	𘱵‎	𘱶‎	𘱷‎	𘱸‎	𘱹‎	𘱺‎	𘱻‎	𘱼‎	𘱽‎	𘱾‎	𘱿‎
U+18C8x	𘲀‎	𘲁‎	𘲂‎	𘲃‎	𘲄‎	𘲅‎	𘲆‎	𘲇‎	𘲈‎	𘲉‎	𘲊‎	𘲋‎	𘲌‎	𘲍‎	𘲎‎	𘲏‎
U+18C9x	𘲐‎	𘲑‎	𘲒‎	𘲓‎	𘲔‎	𘲕‎	𘲖‎	𘲗‎	𘲘‎	𘲙‎	𘲚‎	𘲛‎	𘲜‎	𘲝‎	𘲞‎	𘲟‎
U+18CAx	𘲠‎	𘲡‎	𘲢‎	𘲣‎	𘲤‎	𘲥‎	𘲦‎	𘲧‎	𘲨‎	𘲩‎	𘲪‎	𘲫‎	𘲬‎	𘲭‎	𘲮‎	𘲯‎
U+18CBx	𘲰‎	𘲱‎	𘲲‎	𘲳‎	𘲴‎	𘲵‎	𘲶‎	𘲷‎	𘲸‎	𘲹‎	𘲺‎	𘲻‎	𘲼‎	𘲽‎	𘲾‎	𘲿‎
U+18CCx	𘳀‎	𘳁‎	𘳂‎	𘳃‎	𘳄‎	𘳅‎	𘳆‎	𘳇‎	𘳈‎	𘳉‎	𘳊‎	𘳋‎	𘳌‎	𘳍‎	𘳎‎	𘳏‎
U+18CDx	𘳐‎	𘳑‎	𘳒‎	𘳓‎	𘳔‎	𘳕‎
U+18CEx
U+18CFx																𘳿‎
Notes
1. ^ As of Unicode version 16.0
2. ^ Grey areas (shown here as empty cells) indicate non-assigned code points

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Khitan Small Script block:

Version	Final code points^{[lower-alpha 1]}	Count	L2 ID	WG2 ID	Document
13.0	U+18B00..18CD5	470	L2/10-130	N3820	Sun, Bojun; Jing, Yongshi; Li, Yang (2010-04-05), Preliminary Proposal for Encoding Khitan Characters in UCS
			L2/10-369	N3918	Sun, Bojun; Jing, Yongshi; Li, Yang (2010-09-16), Proposal of Encode the Khitan Characters to UCS plane
			L2/10-400	N3942	Anderson, Deborah (2010-10-06), Ad hoc report on Khitan Small Script
				N3903 (pdf, doc)	"M57.27", Unconfirmed minutes of WG2 meeting 57, 2011-03-31
			L2/16-113R	N4725R	West, Andrew; Zaytsev, Viacheslav; Everson, Michael (2016-05-21), Towards an Encoding of the Khitan Small Script
			L2/16-156		Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu (2016-05-06), "9. Khitan", Recommendations to UTC #147 May 2016 on Script Proposals
			L2/16-243	N4736	Anderson, Deborah (2016-09-06), "For Khitan Small Script", Summary of Meeting on Khitan Scripts, 20 August 2016 (Yinchuan, China) - Ad Hoc Report #1
			L2/16-244	N4737	Anderson, Deborah (2016-09-06), Summary of Meeting on Khitan Scripts, 22 August 2016 (Yinchuan, China) - Ad Hoc Report #2
			L2/16-245R2	N4738R2	Wu, Yingzhe; Sun, Bojun; Jing, Yongshi; Zaytsev, Viacheslav; West, Andrew; Everson, Michael (2016-09-17), Final proposal to encode the Small Khitan Script in the SMP
			L2/16-266	N4763	Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa (2016-09-26), "2. Khitan Small Script", Comments on Mongolian, Small Khitan, and other WG2 #65 documents
			L2/16-271	N4771	Everson, Michael (2016-09-29), Khitan Small Script code chart based on the ad-hoc in San Jose
				N4873R (pdf, doc)	"10.2.5", Unconfirmed minutes of WG 2 meeting 65, 2018-03-16
			L2/16-277	N4765	Zaytsev, Viacheslav; West, Andrew (2016-10-12), Discussion of 29 proposed Khitan Small Script characters
			L2/16-296	N4775	West, Andrew; Everson, Michael; Zaytsev, Viacheslav (2016-11-04), Discussion of Cluster Formation in Khitan Small Script
			L2/16-338	N4768	Moore, Lisa (2016-11-04), Summary of Ad Hoc Meeting on Khitan Small Script, 28 September 2016
			L2/16-342		Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu (2016-11-07), "10", Recommendations to UTC #149 November 2016 on Script Proposals
			L2/16-376		Listener, Snow (2016-11-17), Layman's comments on the encoding proposal Khitan small script
			L2/17-037		Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa; Liang, Hai; Ishida, Richard; Misra, Karan; McGowan, Rick (2017-01-21), "15", Recommendations to UTC #150 January 2017 on Script Proposals
			L2/17-016		Moore, Lisa (2017-02-08), "Consensus 150-C20", UTC #150 Minutes
			L2/17-161	N4794	Suignard, Michel (2017-05-08), "China T2, Ireland T1, UK T5", Draft disposition of comments on PDAM1.2 to ISO/IEC 10646 5th edition
				N4953 (pdf, doc)	"M66.03b, c, and f, M66.07l", Unconfirmed minutes of WG 2 meeting 66, 2018-03-23
			L2/18-121R	N4943R	West, Andrew; Zaytsev, Viacheslav; Everson, Michael (2018-05-19), Cluster Formation Model for Khitan Small Script
			L2/18-168		Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Chapman, Chris; Cook, Richard (2018-04-28), "14. Khitan Small Script", Recommendations to UTC #155 April-May 2018 on Script Proposals
			L2/18-115		Moore, Lisa (2018-05-09), "C.12", UTC #155 Minutes
			L2/18-210	N4977	Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Constable, Peter; Moore, Lisa; Jeziorek, Marek; Yang, Ben (2018-06-09), "1", Comments on WG2 #67 documents (June 2018)
			L2/18-213	N5002	Anderson, Deborah; Constable, Peter (2018-06-20), Khitan Small Script Ad Hoc Report (London)
			L2/18-241		Anderson, Deborah; et al. (2018-07-20), "9", Recommendations to UTC # 156 July 2018 on Script Proposals
			L2/18-285		Anderson, Deborah (2018-08-31), Further information on Khitan Small Script clusters
			L2/18-300		Anderson, Deborah; et al. (2018-09-14), "9. a.", Recommendations to UTC #157 on Script Proposals
			L2/18-183		Moore, Lisa (2018-11-20), "C.12 Cluster Formation Model for Khitan Small Script", UTC #156 Minutes
				N5020 (pdf, doc)	Umamaheswaran, V. S. (2019-01-11), "9.2.3", Unconfirmed minutes of WG 2 meeting 67
			L2/20-015R		Moore, Lisa (2020-05-14), "Consensus 162-C16", Draft Minutes of UTC Meeting 162
			L2/21-182		Chan, Eiso; You, Jerry; Yu, Fitzgerald; Wong, Victor (2021-08-16), Request to modify U+18CCA glyph in Khitan Small Script block
			L2/21-174		Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Liang, Hai (2021-10-01), "14. Khitan Small Script", Recommendations to UTC #169 October 2021 on Script Proposals
			L2/21-167		Cummings, Craig (2022-01-27), "Consensus 169-C18", Approved Minutes of UTC Meeting 169, Accept a glyph change for U+18CCA
			L2/23-199		West, Andrew (2023-07-29), Glyph correction for Khitan Small Script U+18BD2
			L2/23-238R		Anderson, Deborah; Kučera, Jan; Whistler, Ken; Pournader, Roozbeh; Constable, Peter (2023-11-01), "6 Khitan Small Script", Recommendations to UTC #177 November 2023 on Script Proposals
			L2/23-231		Constable, Peter (2023-12-08), "Consensus 177-C29", UTC #177 Minutes, Accept the glyph change for U+18BD2
16.0	U+18CFF	1	L2/23-065	N5205	West, Andrew (2023-03-01), Proposal to encode a blank character for Khitan Small Script
			L2/23-083		Anderson, Deborah; Kučera, Jan; Whistler, Ken; Pournader, Roozbeh; Constable, Peter (2023-04-21), "3 Khitan Small Script", Recommendations to UTC #175 April 2023 on Script Proposals
			L2/23-076		Constable, Peter (2023-05-01), "Consensus 175-C15", UTC #175 Minutes, Provisionally assign U+18CFF KHITAN SMALL SCRIPT CHARACTER-18CFF
			L2/23-231		Constable, Peter (2023-12-08), "Consensus 177-C30", UTC #177 Minutes, Accept the provisionally assigned character U+18CFF
a. ^ Proposed code points and character names may differ from final code points and names
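
The counts quoted in the infobox can be cross-checked against the "Final code points" column of the history table. The following is a minimal sketch of that arithmetic; the names `ASSIGNED_RANGES` and `span` are hypothetical and used only for illustration.

```python
# Block boundaries from the infobox.
BLOCK_START, BLOCK_END = 0x18B00, 0x18CFF

# Assigned ranges from the "Final code points" column of the history table.
ASSIGNED_RANGES = [
    (0x18B00, 0x18CD5),  # Unicode 13.0
    (0x18CFF, 0x18CFF),  # Unicode 16.0
]


def span(lo: int, hi: int) -> int:
    """Number of code points in the inclusive range lo..hi."""
    return hi - lo + 1


block_size = span(BLOCK_START, BLOCK_END)
per_version = [span(lo, hi) for lo, hi in ASSIGNED_RANGES]
assigned = sum(per_version)
reserved = block_size - assigned

print(block_size)   # 512
print(per_version)  # [470, 1]
print(assigned)     # 471
print(reserved)     # 41
```

The 41 reserved code points are the contiguous gap U+18CD6..U+18CFE between the two assigned ranges.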

References

1. ^ "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
