Ahom (Unicode block)

Ahom
Range: U+11700..U+1174F (80 code points)
Plane: SMP
Scripts: Ahom
Major alphabets: Ahom
Assigned: 65 code points
Unused: 15 reserved code points
Unicode version history:
8.0 (2015): 57 (+57)
11.0 (2018): 58 (+1)
14.0 (2021): 65 (+7)
Unicode documentation: Code chart ∣ Web page
Note: The Ahom block was expanded by 16 code points in Unicode 14.0. [1] [2]

Ahom is a Unicode block containing characters used for writing the Ahom alphabet, which was used to write the Ahom language spoken by the Ahom people in Assam between the 13th and the 18th centuries. [3]

In Unicode version 14.0 the block was expanded by 16 code points (its last code point moved from U+1173F in version 13.0 to U+1174F), and 7 more characters were defined. [4] This is the first block to be expanded since Unicode version 1.1.
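The size arithmetic behind the expansion can be checked with a few lines of Python. This is a minimal sketch: the constant and function names (`AHOM_START`, `block_size`, `in_ahom_block`) are illustrative assumptions, not part of any Unicode library, and the range boundaries are taken from the text above.

```python
# Block boundaries as stated in the article (hypothetical constant names).
AHOM_START = 0x11700
AHOM_END_V13 = 0x1173F  # last code point of the block up to Unicode 13.0
AHOM_END_V14 = 0x1174F  # last code point of the block from Unicode 14.0


def block_size(start: int, end: int) -> int:
    """Number of code points in the inclusive range start..end."""
    return end - start + 1


def in_ahom_block(ch: str) -> bool:
    """True if the single character ch falls inside the current Ahom block."""
    return AHOM_START <= ord(ch) <= AHOM_END_V14


old_size = block_size(AHOM_START, AHOM_END_V13)
new_size = block_size(AHOM_START, AHOM_END_V14)
print(old_size, new_size, new_size - old_size)
```

Note that block membership says nothing about whether a code point is assigned; a reserved code point such as U+1174F is still inside the block.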

Ahom [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+1170x𑜀𑜁𑜂𑜃𑜄𑜅𑜆𑜇𑜈𑜉𑜊𑜋𑜌𑜍𑜎𑜏
U+1171x𑜐𑜑𑜒𑜓𑜔𑜕𑜖𑜗𑜘𑜙𑜚𑜝𑜞𑜟
U+1172x𑜠𑜡𑜢𑜣𑜤𑜥𑜦𑜧𑜨𑜩𑜪𑜫
U+1173x𑜰𑜱𑜲𑜳𑜴𑜵𑜶𑜷𑜸𑜹𑜺𑜻𑜼𑜽𑜾𑜿
U+1174x𑝀𑝁𑝂𑝃𑝄𑝅𑝆
Notes
1. ^ As of Unicode version 15.0
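2. ^ Unassigned code points (U+1171B..U+1171C, U+1172C..U+1172F and U+11747..U+1174F) are omitted from the rows above; they appear as grey areas in the official code chart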
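The assigned and reserved counts for the block can be derived from the assigned ranges. The following is a sketch only: the range list is transcribed from the final code points listed in this article (as of Unicode 15.0), and the names `ASSIGNED_RANGES` and `is_assigned` are hypothetical helpers rather than an existing API.

```python
# Inclusive block boundaries and assigned ranges, transcribed from the article.
BLOCK_FIRST, BLOCK_LAST = 0x11700, 0x1174F
ASSIGNED_RANGES = [
    (0x11700, 0x1171A),
    (0x1171D, 0x1172B),
    (0x11730, 0x11746),
]


def is_assigned(cp: int) -> bool:
    """True if cp lies in one of the assigned ranges listed above."""
    return any(first <= cp <= last for first, last in ASSIGNED_RANGES)


# Collect every code point of the block that is not covered by a range.
unassigned = [cp for cp in range(BLOCK_FIRST, BLOCK_LAST + 1) if not is_assigned(cp)]
assigned_count = (BLOCK_LAST - BLOCK_FIRST + 1) - len(unassigned)
labels = [f"U+{cp:04X}" for cp in unassigned]

print(assigned_count, len(unassigned))
print(", ".join(labels))
```

A lookup against a real character database would be more authoritative, but its answer depends on the Unicode version the runtime ships with, so the ranges are hard-coded here.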

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Ahom block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
8.0 | U+11700..11719, 1171D..1172B, 11730..1173F | 57 | L2/10-359 | N3928 | Hosken, Martin; Morey, Stephen (2010-09-17), Preliminary Proposal to add the Ahom Script in the SMP of the UCS
 |  |  | L2/12-222 | N4290 | Hosken, Martin; Morey, Stephen (2012-07-02), Proposal to add the Ahom Script in the SMP of the UCS
 |  |  | L2/12-267 |  | Anderson, Deborah; McGowan, Rick; Whistler, Ken (2012-07-21), "V. AHOM", Review of Indic-related documents and Recommendations to the UTC
 |  |  | L2/12-309R | N4321R | Hosken, Martin (2012-10-23), Revised Proposal to add the Ahom Script in the SMP of the UCS
 |  |  | L2/12-343R2 |  | Moore, Lisa (2012-12-04), "Consensus 133-C16", UTC #133 Minutes
 |  |  |  | N4353 | "M60.09", Unconfirmed minutes of WG 2 meeting 60, 2013-05-23
 |  |  | L2/21-123 |  | Cummings, Craig (2021-08-03), "Consensus 168-C3", Draft Minutes of UTC Meeting 168, Remove the assignment of GCB=SpacingMark for U+11720..U+11721 AHOM VOWEL SIGN A and AA, letting them default to GCB=Other, for Unicode version 14.0.
11.0 | U+1171A | 1 | L2/15-272 |  | Hosken, Martin; Morey, Stephen (2015-10-26), Proposal to add one extra character to the Ahom Block
 |  |  | L2/15-254 |  | Moore, Lisa (2015-11-16), "D.13", UTC #145 Minutes
14.0 | U+11740..11746 | 7 | L2/20-258 |  | Morey, Stephen (2020-09-29), Proposal to encode additional signs for the Tai Ahom script
 |  |  | L2/20-250 |  | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Constable, Peter; Liang, Hai (2020-10-01), "9. Ahom", Recommendations to UTC #165 October 2020 on Script Proposals
 |  |  | L2/20-237 |  | Moore, Lisa (2020-10-27), "Consensus 165-C17", UTC #165 Minutes
  a. Proposed code points and character names may differ from final code points and names
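The per-version counts in the history table follow directly from the final code point ranges. As a rough cross-check, the sketch below recomputes them; the `ADDITIONS` mapping is transcribed from the table, and the helper names are illustrative assumptions.

```python
# Final code point ranges added in each Unicode version (inclusive),
# transcribed from the history table; names here are hypothetical.
ADDITIONS = {
    "8.0": [(0x11700, 0x11719), (0x1171D, 0x1172B), (0x11730, 0x1173F)],
    "11.0": [(0x1171A, 0x1171A)],
    "14.0": [(0x11740, 0x11746)],
}


def count_code_points(ranges):
    """Total number of code points covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


def running_totals(additions):
    """Map each version to (code points added, cumulative total so far)."""
    totals = {}
    total = 0
    for version, ranges in additions.items():  # dicts keep insertion order
        added = count_code_points(ranges)
        total += added
        totals[version] = (added, total)
    return totals


history = running_totals(ADDITIONS)
for version, (added, total) in history.items():
    print(f"{version}: {total} (+{added})")
```

The printed lines reproduce the version history figures of 57, 58 and 65 assigned code points.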

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. Hosken, Martin; Morey, Stephen (2012-10-23). "N4321R: Revised Proposal to add the Ahom Script in the SMP of the UCS" (PDF). Working Group Document, ISO/IEC JTC1/SC2/WG2.
  4. "BETA Unicode 14.0.0". The Unicode Standard. Retrieved 2022-09-17.