Chorasmian (Unicode block)

Chorasmian
Range: U+10FB0..U+10FDF (48 code points)
Plane: SMP
Scripts: Chorasmian
Assigned: 28 code points
Unused: 20 reserved code points
Unicode version history
13.0: 28 (+28)
Note: [1] [2]

Chorasmian is a Unicode block containing characters from the Chorasmian script, which was used for writing the Khwarezmian language in Transoxiana during the 8th century.
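
The counts quoted for the block follow from simple range arithmetic, which the short sketch below checks. It is only an illustration, assuming the block range U+10FB0..U+10FDF and the assigned range U+10FB0..U+10FCB recorded in this article; the constant and function names are hypothetical and do not come from any Unicode library.

```python
# Boundaries as stated in this article (both ends inclusive).
BLOCK_START = 0x10FB0
BLOCK_END = 0x10FDF
LAST_ASSIGNED = 0x10FCB  # last code point assigned as of Unicode 13.0


def in_chorasmian_block(ch: str) -> bool:
    """Return True if the single character ch lies anywhere in the block."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def is_assigned_chorasmian(ch: str) -> bool:
    """Return True if ch is one of the code points assigned in Unicode 13.0."""
    return BLOCK_START <= ord(ch) <= LAST_ASSIGNED


block_size = BLOCK_END - BLOCK_START + 1    # total code points in the block
assigned = LAST_ASSIGNED - BLOCK_START + 1  # contiguous assigned run
reserved = block_size - assigned            # remaining unassigned code points

print(block_size, assigned, reserved)  # 48 28 20
```

The assigned characters form one contiguous run at the start of the block, which is why a single upper bound is enough here; a block with gaps would need a lookup against the Unicode Character Database instead.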

Block

Chorasmian [1] [2]
Official Unicode Consortium code chart (PDF)
          0 1 2 3 4 5 6 7 8 9 A B C D E F
U+10FBx   𐾰 𐾱 𐾲 𐾳 𐾴 𐾵 𐾶 𐾷 𐾸 𐾹 𐾺 𐾻 𐾼 𐾽 𐾾 𐾿
U+10FCx   𐿀 𐿁 𐿂 𐿃 𐿄 𐿅 𐿆 𐿇 𐿈 𐿉 𐿊 𐿋 - - - -
U+10FDx   - - - - - - - - - - - - - - - -
Notes
1. As of Unicode version 13.0
2. Grey areas (shown here as "-") indicate non-assigned code points
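
Every code point in the chart lies above U+FFFF, so each character needs a surrogate pair in UTF-16 and four bytes in UTF-8. The following sketch works this out for the first code point of the block, U+10FB0, using only Python built-ins; the helper names `plane_of` and `utf16_surrogates` are hypothetical and introduced here purely for illustration.

```python
from typing import Tuple


def plane_of(cp: int) -> int:
    """Return the Unicode plane number (0 = BMP, 1 = SMP, ...)."""
    return cp >> 16


def utf16_surrogates(cp: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits of the offset
    return high, low


first = 0x10FB0  # first code point of the Chorasmian block
ch = chr(first)

print(plane_of(first))                            # 1
print([hex(u) for u in utf16_surrogates(first)])  # ['0xd803', '0xdfb0']
print(ch.encode("utf-8").hex())                   # f090beb0
print(ch.encode("utf-16-be").hex())               # d803dfb0
```

The manual surrogate calculation agrees with Python's own `utf-16-be` codec, which serves as a quick cross-check.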

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Chorasmian block:

Version: 13.0; final code points [a]: U+10FB0..10FCB; count: 28

L2 ID | WG2 ID | Document
L2/17-054R | | Pandey, Anshuman (2017-01-31), Proposal to encode the Khwarezmian script in Unicode
L2/17-255 | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2017-07-28), "19. Khwarezmian", Recommendations to UTC #152 July-August 2017 on Script Proposals
L2/18-010R | | Pandey, Anshuman (2018-03-26), Proposal to encode the Khwarezmian script in Unicode
L2/18-039 | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Cook, Richard (2018-01-19), "13. Khwarezmian", Recommendations to UTC #154 January 2018 on Script Proposals
L2/18-168 | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Chapman, Chris; Cook, Richard (2018-04-28), "13. Khwarezmian", Recommendations to UTC #155 April-May 2018 on Script Proposals
L2/18-164R2 | N5010 | Pandey, Anshuman (2018-07-26), Proposal to encode the Chorasmian script in Unicode
L2/18-115 | | Moore, Lisa (2018-05-09), "D.9", UTC #155 Minutes
L2/18-241 | | Anderson, Deborah; et al. (2018-07-25), "8", Recommendations to UTC #156 July 2018 on Script Proposals
L2/18-183 | | Moore, Lisa (2018-11-20), "D.9", UTC #156 Minutes

  a. Proposed code points and character names may differ from final code points and names.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2020-03-11.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2020-03-11.