CJK Unified Ideographs Extension A

Range: U+3400..U+4DBF (6,592 code points)
Plane: BMP
Scripts: Han
Assigned: 6,592 code points
Unused: 0 reserved code points
Unicode version history:
3.0 (1999): 6,582 (+6,582)
13.0 (2020): 6,592 (+10)
Note: [1] [2]
Range used for Hangul syllables prior to Unicode 2.0 (see Hangul (obsolete Unicode block)).

CJK Unified Ideographs Extension A is a Unicode block containing rare Han ideographs submitted to the Ideographic Research Group between 1992 and 1998, plus ten ideographs added in Unicode 13.0 that had previously been mistakenly unified with other characters. [3]

The block has dozens of variation sequences defined for standardized variants. [4]

It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD). [5] [6] These sequences specify the desired glyph variant for a given Unicode character.
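As an illustration of how such a sequence is formed, the following minimal sketch builds a variation sequence by appending a variation selector to a base character from this block. The selector ranges (VS1 to VS16 at U+FE00..U+FE0F, VS17 to VS256 at U+E0100..U+E01EF) come from the Unicode Standard; the function names are hypothetical, and the sketch does not check whether a given pair is actually standardized or registered in the IVD, which would require consulting the published data files.

```python
# Variation selector ranges defined by the Unicode Standard.
VS1_TO_VS16 = range(0xFE00, 0xFE10)      # used by standardized variation sequences
VS17_TO_VS256 = range(0xE0100, 0xE01F0)  # used by ideographic variation sequences

# CJK Unified Ideographs Extension A: U+3400..U+4DBF.
EXT_A = range(0x3400, 0x4DC0)


def selector(n: int) -> str:
    """Return the variation selector VSn for 1 <= n <= 256."""
    if 1 <= n <= 16:
        return chr(0xFE00 + n - 1)
    if 17 <= n <= 256:
        return chr(0xE0100 + n - 17)
    raise ValueError("variation selector number must be in 1..256")


def make_variation_sequence(base: str, n: int) -> str:
    """Append VSn to a single Extension A ideograph.

    This only builds the code point pair; it does not verify that the
    pair is a defined or registered sequence.
    """
    if len(base) != 1 or ord(base) not in EXT_A:
        raise ValueError("base must be one Extension A code point")
    return base + selector(n)


def split_variation_sequence(seq: str) -> tuple[str, str]:
    """Describe a two-code-point sequence as ('U+XXXX', 'VSn')."""
    if len(seq) != 2:
        raise ValueError("expected a base character plus one selector")
    base, vs = seq
    cp = ord(vs)
    if cp in VS1_TO_VS16:
        n = cp - 0xFE00 + 1
    elif cp in VS17_TO_VS256:
        n = cp - 0xE0100 + 17
    else:
        raise ValueError("second code point is not a variation selector")
    return f"U+{ord(base):04X}", f"VS{n}"
```

A renderer that does not support a given sequence is expected to fall back to the default glyph of the base character, so the pair degrades gracefully in plain text.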

Block

CJK Unified Ideographs Extension A [1]
Official Unicode Consortium code chart (PDF)
Code chart: 412 rows, U+340x through U+4DBx, each covering 16 code points in columns 0 to F (glyphs not reproduced here).
Notes
1. As of Unicode version 16.0
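The layout of the chart for this block (one row per 16 code points, columns 0 to F) can be regenerated from the block range alone. The sketch below assumes only the range U+3400..U+4DBF given above; the helper names are hypothetical, and whether the glyphs display depends on the fonts installed.

```python
FIRST, LAST = 0x3400, 0x4DBF  # CJK Unified Ideographs Extension A


def in_extension_a(ch: str) -> bool:
    """Return True if the single character ch lies in the block."""
    return FIRST <= ord(ch) <= LAST


def chart_rows():
    """Yield (row_label, characters) pairs, 16 code points per row."""
    for start in range(FIRST, LAST + 1, 16):
        label = f"U+{start >> 4:03X}x"  # e.g. 0x3400 -> "U+340x"
        yield label, [chr(cp) for cp in range(start, start + 16)]


rows = list(chart_rows())
total = sum(len(chars) for _, chars in rows)
```

With this range, `rows` has 412 entries and `total` is 6,592, matching the assigned count for the block.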

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension A block:

Related Research Articles

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese.

Biangbiang noodles, alternatively known as youpo chemian in Chinese, are a type of Chinese noodle originating from Shaanxi cuisine. The noodles, touted as one of the "eight curiosities" of Shaanxi (陕西八大怪), are described as being like a belt, owing to their thickness and length.

The Chinese, Japanese and Korean (CJK) scripts share a common background, and their characters are collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 16.0, Unicode defines a total of 97,680 CJK Unified Ideographs.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.

A variant form is an alternate glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.

CJK Unified Ideographs is a Unicode block containing the most common CJK ideographs used in modern Chinese, Japanese, Korean and Vietnamese writing. When contrasted with other blocks containing CJK Unified Ideographs, it is also referred to as the Unified Repertoire and Ordering (URO).

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 1998 and 2000, plus seven gongche characters for kunqu added in Unicode 13.0, and two characters for the Macao Supplementary Character Set added in Unicode 14.0.

CJK Unified Ideographs Extension C is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 2002 and 2006, plus five "urgently needed" characters added in Unicode versions 14.0 and 15.0, some of which had previously been mistakenly unified with other characters.

CJK Unified Ideographs Extension D is a Unicode block containing uncommon CJK ideographs for Chinese, Japanese, Korean, and Vietnamese, some of which are in current use. Much smaller than most Unicode blocks for CJK unified ideographs, Extension D consists of characters which were submitted to the Ideographic Research Group as "urgently needed characters" between 2006 and 2009. Characters submitted during the same period which were needed less urgently were included in CJK Unified Ideographs Extension E instead.

CJK Compatibility Ideographs is a Unicode block created to contain mostly Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. However, it also contains 12 unified ideographs sourced from Japanese character sets from IBM.

Ideographic Description Characters is a Unicode block containing graphic characters used for describing CJK ideographs. They are used in Ideographic Description Sequences (IDS) to provide a description of an ideograph, in terms of what other ideographs make it up and how they are laid out relative to one another. An IDS provides the reader with a description of an ideograph that cannot be represented properly, usually because it is not encoded in Unicode; rendering systems are not intended to automatically compose the pieces into a complete ideograph, and the descriptions are not standardized.
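To make the prefix structure of an IDS concrete, here is a minimal parsing sketch. It assumes only the twelve operators in U+2FF0..U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest take two; any other character is treated as a component. The function name `parse_ids` is hypothetical, and the sketch does not validate that a description is meaningful.

```python
# Operand count for each Ideographic Description Character.
# Assumption: only the operators U+2FF0..U+2FFB are handled here.
ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
ARITY["\u2FF2"] = 3  # left, middle, right
ARITY["\u2FF3"] = 3  # top, middle, bottom


def parse_ids(ids: str):
    """Parse a prefix-notation IDS into a nested tuple.

    Components are returned as plain strings; each operator becomes
    a tuple of (operator, operand, operand, ...).
    """
    def parse(pos: int):
        if pos >= len(ids):
            raise ValueError("truncated IDS")
        ch = ids[pos]
        pos += 1
        if ch not in ARITY:
            return ch, pos  # a plain component
        parts = []
        for _ in range(ARITY[ch]):
            part, pos = parse(pos)
            parts.append(part)
        return (ch, *parts), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters after IDS")
    return tree
```

For example, `parse_ids("⿰氵每")` yields `("⿰", "氵", "每")`, a left-to-right arrangement of two components. As the text notes, this is a description for readers, not an instruction for a renderer to compose a glyph.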

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Halfwidth and Fullwidth Forms is a Unicode block (U+FF00–FFEF) provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to and from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Variation Selectors is a Unicode block containing 16 variation selectors used to specify a glyph variant for a preceding character. They are currently used to specify standardized variation sequences for mathematical symbols, emoji symbols, 'Phags-pa letters, and CJK unified ideographs corresponding to CJK compatibility ideographs. At present only standardized variation sequences with VS1–VS4, VS7, VS15 and VS16 have been defined; VS15 and VS16 are reserved to request that a character should be displayed as text or as an emoji respectively.

CJK Unified Ideographs Extension E is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 2006 and 2013, excluding the characters submitted as "urgently needed" between 2006 and 2009, which were included in CJK Unified Ideographs Extension D.

CJK Unified Ideographs Extension F is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese, as well as more than a thousand Sawndip characters for writing the Zhuang language, which were submitted to the Ideographic Research Group between 2012 and 2015.

CJK Unified Ideographs Extension G is a Unicode block containing rare and historic CJK Unified Ideographs for Chinese, Japanese, Korean, and Vietnamese which were submitted to the Ideographic Research Group during 2015. It is the first block to be allocated to the Tertiary Ideographic Plane.

CJK Unified Ideographs Extension H is a Unicode block containing rare and historic CJK Unified Ideographs for Chinese, Japanese, Korean, Sawndip, and Vietnamese submitted to the Ideographic Research Group during 2017.

CJK Unified Ideographs Extension I is a Unicode block comprising CJK Unified Ideographs included in drafts of an amendment to China's GB 18030 standard circulated in 2022 and 2023, which were fast-tracked into Unicode in 2023.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "18.1: Han (§ Blocks Containing Han Ideographs)" (PDF). The Unicode Standard: Core Specification. Version 15.0. Unicode Consortium. 2022. pp. 741–744. ISBN 978-1-936213-32-0.
  4. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
  5. "Ideographic Variation Database". Unicode Consortium.
  6. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.