Khitan Small Script (Unicode block)

Khitan Small Script
Range: U+18B00..U+18CFF (512 code points)
Plane: SMP
Scripts: Khitan small script
Assigned: 470 code points
Unused: 42 reserved code points
Unicode version history: 13.0 (2020): 470 (+470)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]

Khitan Small Script is a Unicode block containing characters from the Khitan small script, which was used for writing the Khitan language spoken by the Khitan people in northern China during the Liao dynasty.

Khitan Small Script characters do not have descriptive character names, but have names derived algorithmically from their code point value (e.g. U+18B00 is named KHITAN SMALL SCRIPT CHARACTER-18B00).
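The naming rule described above can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation: the function name `khitan_small_script_name` and the constant names are hypothetical, and the assigned range U+18B00..U+18CD5 is the one this article gives as of Unicode 15.0.

```python
# Assigned range of the block as stated in this article (Unicode 15.0).
# The names below are illustrative assumptions, not part of any library.
FIRST_ASSIGNED = 0x18B00
LAST_ASSIGNED = 0x18CD5


def khitan_small_script_name(code_point: int) -> str:
    """Derive the character name from the code point value."""
    if not FIRST_ASSIGNED <= code_point <= LAST_ASSIGNED:
        raise ValueError(
            f"U+{code_point:04X} is not an assigned Khitan Small Script character"
        )
    # The name is a fixed prefix followed by the uppercase hexadecimal code point.
    return f"KHITAN SMALL SCRIPT CHARACTER-{code_point:X}"


print(khitan_small_script_name(0x18B00))  # KHITAN SMALL SCRIPT CHARACTER-18B00
```

A Python build whose `unicodedata` module is based on Unicode 13.0 or later should return the same string from `unicodedata.name(chr(0x18B00))`, but that depends on the interpreter version, so the sketch avoids relying on it.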

Block

Khitan Small Script [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+18B0x𘬀𘬁𘬂𘬃𘬄𘬅𘬆𘬇𘬈𘬉𘬊𘬋𘬌𘬍𘬎𘬏
U+18B1x𘬐𘬑𘬒𘬓𘬔𘬕𘬖𘬗𘬘𘬙𘬚𘬛𘬜𘬝𘬞𘬟
U+18B2x𘬠𘬡𘬢𘬣𘬤𘬥𘬦𘬧𘬨𘬩𘬪𘬫𘬬𘬭𘬮𘬯
U+18B3x𘬰𘬱𘬲𘬳𘬴𘬵𘬶𘬷𘬸𘬹𘬺𘬻𘬼𘬽𘬾𘬿
U+18B4x𘭀𘭁𘭂𘭃𘭄𘭅𘭆𘭇𘭈𘭉𘭊𘭋𘭌𘭍𘭎𘭏
U+18B5x𘭐𘭑𘭒𘭓𘭔𘭕𘭖𘭗𘭘𘭙𘭚𘭛𘭜𘭝𘭞𘭟
U+18B6x𘭠𘭡𘭢𘭣𘭤𘭥𘭦𘭧𘭨𘭩𘭪𘭫𘭬𘭭𘭮𘭯
U+18B7x𘭰𘭱𘭲𘭳𘭴𘭵𘭶𘭷𘭸𘭹𘭺𘭻𘭼𘭽𘭾𘭿
U+18B8x𘮀𘮁𘮂𘮃𘮄𘮅𘮆𘮇𘮈𘮉𘮊𘮋𘮌𘮍𘮎𘮏
U+18B9x𘮐𘮑𘮒𘮓𘮔𘮕𘮖𘮗𘮘𘮙𘮚𘮛𘮜𘮝𘮞𘮟
U+18BAx𘮠𘮡𘮢𘮣𘮤𘮥𘮦𘮧𘮨𘮩𘮪𘮫𘮬𘮭𘮮𘮯
U+18BBx𘮰𘮱𘮲𘮳𘮴𘮵𘮶𘮷𘮸𘮹𘮺𘮻𘮼𘮽𘮾𘮿
U+18BCx𘯀𘯁𘯂𘯃𘯄𘯅𘯆𘯇𘯈𘯉𘯊𘯋𘯌𘯍𘯎𘯏
U+18BDx𘯐𘯑𘯒𘯓𘯔𘯕𘯖𘯗𘯘𘯙𘯚𘯛𘯜𘯝𘯞𘯟
U+18BEx𘯠𘯡𘯢𘯣𘯤𘯥𘯦𘯧𘯨𘯩𘯪𘯫𘯬𘯭𘯮𘯯
U+18BFx𘯰𘯱𘯲𘯳𘯴𘯵𘯶𘯷𘯸𘯹𘯺𘯻𘯼𘯽𘯾𘯿
U+18C0x𘰀𘰁𘰂𘰃𘰄𘰅𘰆𘰇𘰈𘰉𘰊𘰋𘰌𘰍𘰎𘰏
U+18C1x𘰐𘰑𘰒𘰓𘰔𘰕𘰖𘰗𘰘𘰙𘰚𘰛𘰜𘰝𘰞𘰟
U+18C2x𘰠𘰡𘰢𘰣𘰤𘰥𘰦𘰧𘰨𘰩𘰪𘰫𘰬𘰭𘰮𘰯
U+18C3x𘰰𘰱𘰲𘰳𘰴𘰵𘰶𘰷𘰸𘰹𘰺𘰻𘰼𘰽𘰾𘰿
U+18C4x𘱀𘱁𘱂𘱃𘱄𘱅𘱆𘱇𘱈𘱉𘱊𘱋𘱌𘱍𘱎𘱏
U+18C5x𘱐𘱑𘱒𘱓𘱔𘱕𘱖𘱗𘱘𘱙𘱚𘱛𘱜𘱝𘱞𘱟
U+18C6x𘱠𘱡𘱢𘱣𘱤𘱥𘱦𘱧𘱨𘱩𘱪𘱫𘱬𘱭𘱮𘱯
U+18C7x𘱰𘱱𘱲𘱳𘱴𘱵𘱶𘱷𘱸𘱹𘱺𘱻𘱼𘱽𘱾𘱿
U+18C8x𘲀𘲁𘲂𘲃𘲄𘲅𘲆𘲇𘲈𘲉𘲊𘲋𘲌𘲍𘲎𘲏
U+18C9x𘲐𘲑𘲒𘲓𘲔𘲕𘲖𘲗𘲘𘲙𘲚𘲛𘲜𘲝𘲞𘲟
U+18CAx𘲠𘲡𘲢𘲣𘲤𘲥𘲦𘲧𘲨𘲩𘲪𘲫𘲬𘲭𘲮𘲯
U+18CBx𘲰𘲱𘲲𘲳𘲴𘲵𘲶𘲷𘲸𘲹𘲺𘲻𘲼𘲽𘲾𘲿
U+18CCx𘳀𘳁𘳂𘳃𘳄𘳅𘳆𘳇𘳈𘳉𘳊𘳋𘳌𘳍𘳎𘳏
U+18CDx𘳐𘳑𘳒𘳓𘳔𘳕
U+18CEx
U+18CFx
Notes
1. As of Unicode version 15.0
2. Grey areas indicate non-assigned code points
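The figures quoted for this block (512 code points, 470 assigned, 42 reserved, located in the SMP) follow directly from the range boundaries. The sketch below recomputes them and shows how a code point maps to a row and column of the chart. All function and constant names are hypothetical, and the last assigned code point reflects Unicode 15.0 as stated in the notes.

```python
# Boundaries taken from this article; LAST_ASSIGNED is valid as of Unicode 15.0.
BLOCK_START = 0x18B00
BLOCK_END = 0x18CFF
LAST_ASSIGNED = 0x18CD5


def in_khitan_block(code_point: int) -> bool:
    """True if the code point lies anywhere in the block, assigned or reserved."""
    return BLOCK_START <= code_point <= BLOCK_END


def is_assigned_khitan(code_point: int) -> bool:
    """True if the code point is an assigned character of the block."""
    return BLOCK_START <= code_point <= LAST_ASSIGNED


def plane_of(code_point: int) -> int:
    """Each plane holds 65,536 code points, so the plane is the value above bit 15."""
    return code_point >> 16


def chart_cell(code_point: int) -> tuple:
    """Return the chart row label (such as 'U+18B0x') and the column digit."""
    return f"U+{code_point >> 4:X}x", f"{code_point & 0xF:X}"


block_size = BLOCK_END - BLOCK_START + 1      # 512
assigned = LAST_ASSIGNED - BLOCK_START + 1    # 470
reserved = block_size - assigned              # 42

print(block_size, assigned, reserved)
print(plane_of(BLOCK_START))   # 1, the Supplementary Multilingual Plane
print(chart_cell(0x18CD5))     # ('U+18CDx', '5'), the last filled cell
```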

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Khitan Small Script block:

Version: 13.0
Final code points [a]: U+18B00..18CD5
Count: 470
Documents (L2 ID, WG2 ID, document):
L2/10-130 N3820 Sun, Bojun; Jing, Yongshi; Li, Yang (2010-04-05), Preliminary Proposal for Encoding Khitan Characters in UCS
L2/10-369 N3918 Sun, Bojun; Jing, Yongshi; Li, Yang (2010-09-16), Proposal of Encode the Khitan Characters to UCS plane
L2/10-400 N3942 Anderson, Deborah (2010-10-06), Ad hoc report on Khitan Small Script
N3903 (pdf, doc) "M57.27", Unconfirmed minutes of WG2 meeting 57, 2011-03-31
L2/16-113R N4725R West, Andrew; Zaytsev, Viacheslav; Everson, Michael (2016-05-21), Towards an Encoding of the Khitan Small Script
L2/16-156 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu (2016-05-06), "9. Khitan", Recommendations to UTC #147 May 2016 on Script Proposals
L2/16-243 N4736 Anderson, Deborah (2016-09-06), "For Khitan Small Script", Summary of Meeting on Khitan Scripts, 20 August 2016 (Yinchuan, China) - Ad Hoc Report #1
L2/16-244 N4737 Anderson, Deborah (2016-09-06), Summary of Meeting on Khitan Scripts, 22 August 2016 (Yinchuan, China) - Ad Hoc Report #2
L2/16-245R2 N4738R2 Wu, Yingzhe; Sun, Bojun; Jing, Yongshi; Zaytsev, Viacheslav; West, Andrew; Everson, Michael (2016-09-17), Final proposal to encode the Small Khitan Script in the SMP
L2/16-266 N4763 Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa (2016-09-26), "2. Khitan Small Script", Comments on Mongolian, Small Khitan, and other WG2 #65 documents
L2/16-271 N4771 Everson, Michael (2016-09-29), Khitan Small Script code chart based on the ad-hoc in San Jose
N4873R (pdf, doc) "10.2.5", Unconfirmed minutes of WG 2 meeting 65, 2018-03-16
L2/16-277 N4765 Zaytsev, Viacheslav; West, Andrew (2016-10-12), Discussion of 29 proposed Khitan Small Script characters
L2/16-296 N4775 West, Andrew; Everson, Michael; Zaytsev, Viacheslav (2016-11-04), Discussion of Cluster Formation in Khitan Small Script
L2/16-338 N4768 Moore, Lisa (2016-11-04), Summary of Ad Hoc Meeting on Khitan Small Script, 28 September 2016
L2/16-342 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu (2016-11-07), "10", Recommendations to UTC #149 November 2016 on Script Proposals
L2/16-376 Listener, Snow (2016-11-17), Layman's comments on the encoding proposal Khitan small script
L2/17-037 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa; Liang, Hai; Ishida, Richard; Misra, Karan; McGowan, Rick (2017-01-21), "15", Recommendations to UTC #150 January 2017 on Script Proposals
L2/17-016 Moore, Lisa (2017-02-08), "Consensus 150-C20", UTC #150 Minutes
L2/17-161 N4794 Suignard, Michel (2017-05-08), "China T2, Ireland T1, UK T5", Draft disposition of comments on PDAM1.2 to ISO/IEC 10646 5th edition
N4953 (pdf, doc) "M66.03b, c, and f, M66.07l", Unconfirmed minutes of WG 2 meeting 66, 2018-03-23
L2/18-121R N4943R West, Andrew; Zaytsev, Viacheslav; Everson, Michael (2018-05-19), Cluster Formation Model for Khitan Small Script
L2/18-168 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Chapman, Chris; Cook, Richard (2018-04-28), "14. Khitan Small Script", Recommendations to UTC #155 April-May 2018 on Script Proposals
L2/18-115 Moore, Lisa (2018-05-09), "C.12", UTC #155 Minutes
L2/18-210 N4977 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Constable, Peter; Moore, Lisa; Jeziorek, Marek; Yang, Ben (2018-06-09), "1", Comments on WG2 #67 documents (June 2018)
L2/18-213 N5002 Anderson, Deborah; Constable, Peter (2018-06-20), Khitan Small Script Ad Hoc Report (London)
L2/18-241 Anderson, Deborah; et al. (2018-07-25), "9", Recommendations to UTC # 156 July 2018 on Script Proposals
L2/18-285 Anderson, Deborah (2018-08-31), Further information on Khitan Small Script clusters
L2/18-300 Anderson, Deborah; et al. (2018-09-14), "9. a.", Recommendations to UTC #157 on Script Proposals
L2/18-183 Moore, Lisa (2018-11-20), "C.12 Cluster Formation Model for Khitan Small Script", UTC #156 Minutes
N5020 (pdf, doc) Umamaheswaran, V. S. (2019-01-11), "9.2.3", Unconfirmed minutes of WG 2 meeting 67
L2/20-015R Moore, Lisa (2020-05-14), "Consensus 162-C16", Draft Minutes of UTC Meeting 162
L2/21-182 Chan, Eiso; You, Jerry; Yu, Fitzgerald; Wong, Victor (2021-08-16), Request to modify U+18CCA glyph in Khitan Small Script block
L2/21-174 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Liang, Hai (2021-10-01), "14. Khitan Small Script", Recommendations to UTC #169 October 2021 on Script Proposals
L2/21-167 Cummings, Craig (2022-01-27), "Consensus 169-C18", Approved Minutes of UTC Meeting 169, Accept a glyph change for U+18CCA
  a. Proposed code points and character names may differ from the final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.