CJK Unified Ideographs Extension A

CJK Unified Ideographs Extension A
Range: U+3400..U+4DBF (6,592 code points)
Plane: BMP
Scripts: Han
Assigned: 6,592 code points
Unused: 0 reserved code points
Unicode version history:
3.0 (1999): 6,582 (+6,582)
13.0 (2020): 6,592 (+10)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]
Range used for Hangul syllables prior to Unicode 2.0 (see Hangul (obsolete Unicode block)).

CJK Unified Ideographs Extension A is a Unicode block containing rare Han ideographs. It occupies U+3400..U+4DBF in the Basic Multilingual Plane (BMP).
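
As a rough illustration, membership in the block can be tested with a plain range comparison. The sketch below uses only the range U+3400..U+4DBF given for the block; the names `EXT_A_FIRST`, `EXT_A_LAST` and `in_extension_a` are made up for this example and do not come from any library.

```python
# Block boundaries as stated for CJK Unified Ideographs Extension A.
EXT_A_FIRST = 0x3400
EXT_A_LAST = 0x4DBF


def in_extension_a(ch: str) -> bool:
    """Return True if the single character ch lies in U+3400..U+4DBF."""
    return EXT_A_FIRST <= ord(ch) <= EXT_A_LAST


# The block size follows directly from the two boundaries.
block_size = EXT_A_LAST - EXT_A_FIRST + 1

print(block_size)                 # 6592
print(in_extension_a("\u3400"))   # True  (first code point of the block)
print(in_extension_a("\u4e00"))   # False (CJK Unified Ideographs, not Extension A)
```

The computed size matches the 6,592 assigned code points reported for the block, which is consistent with the block having no reserved positions.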

Contents

The block has dozens of variation sequences defined for standardized variants. [3]

It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD). [4] [5] These sequences specify the desired glyph variant for a given Unicode character.
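
A variation sequence is simply a base character followed by one variation selector. The sketch below shows how such a sequence could be assembled for an Extension A character. It assumes the usual selector ranges (VS1 to VS16 at U+FE00..U+FE0F, VS17 to VS256 at U+E0100..U+E01EF); the function name `make_variation_sequence` is hypothetical. It does not consult the IVD or the standardized variants list, so it cannot tell whether a given sequence is actually defined or registered.

```python
def make_variation_sequence(base: str, selector_number: int) -> str:
    """Append variation selector VS<selector_number> to a base character.

    Assumes VS1..VS16 map to U+FE00..U+FE0F and VS17..VS256 map to
    U+E0100..U+E01EF. No check is made that the resulting sequence
    is defined or registered anywhere.
    """
    if not 1 <= selector_number <= 256:
        raise ValueError("variation selector number must be in 1..256")
    if selector_number <= 16:
        selector = 0xFE00 + (selector_number - 1)
    else:
        selector = 0xE0100 + (selector_number - 17)
    return base + chr(selector)


# Illustrative only: U+3400 followed by VS17 (U+E0100).
seq = make_variation_sequence("\u3400", 17)
print([f"U+{ord(c):04X}" for c in seq])  # ['U+3400', 'U+E0100']
```

A renderer that does not support the sequence is expected to fall back to the default glyph of the base character, so the text stays readable either way.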

Block

CJK Unified Ideographs Extension A [1]
Official Unicode Consortium code chart (PDF)
Rows U+340x through U+4DBx, columns 0 through F (412 rows of 16 code points each; every position is assigned). The glyphs are not reproduced here; see the official code chart.
Notes
1. As of Unicode version 15.0
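
The chart above arranges code points sixteen to a row, with row labels such as U+340x and column headings 0 through F. A minimal sketch of that mapping follows; `chart_position` is a name invented for this example.

```python
def chart_position(code_point: int) -> tuple[str, str]:
    """Return the (row label, column heading) of a code point in the chart."""
    if not 0x3400 <= code_point <= 0x4DBF:
        raise ValueError("code point is outside U+3400..U+4DBF")
    row = f"U+{code_point >> 4:03X}x"   # all but the last hex digit
    column = f"{code_point & 0xF:X}"    # the last hex digit
    return row, column


# Number of chart rows needed for the whole block.
row_count = (0x4DBF - 0x3400 + 1) // 16

print(chart_position(0x393F))  # ('U+393x', 'F')
print(row_count)               # 412
```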

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension A block:

| Version | Final code points [a] | Count | L2 ID | WG2 ID | IRG ID | Document |
| 3.0 | U+3400..4DB5 | 6,582 | | N1423 | N364 | Proposal Summary Form: CJK Unified Ideographs Extension A, 1996-08-01 |
| | | | | N1424 | N382 | CJK Unified Ideograph Extension A Version 1.1, 1996-08-01 |
| | | | | N1425 | N378 | Draft Text of General Description for Extension A, 1996-08-01 |
| | | | | N1426 | N379 | Draft Text of CJK Annex for Extension A, 1996-08-01 |
| | | | | N1455 | | Lu, Chin (1996-08-06), More Evidence on CJK Extension A |
| | | | | N1439 | | Korea's comments on IRG Proposal, 1996-08-12 |
| | | | | N1449 | | Wang, Xiaoming (1996-08-12), The Great Chinese Word Dictionary with CJK Extensions |
| | | | | N1453 | | Ksar, Mike; Umamaheswaran, V. S. (1996-12-06), "9. IRG status and reports", WG 2 Minutes - Quebec Meeting 31 |
| | | | | N1479 | | Ksar, Mike (1997-01-10), Results of SC2 questionnaire on allocation of 6585 Han characters in BMP |
| | | | | N1487 | | IRG position concerning Japanese comments on extension A, 1997-01-15 |
| | | | L2/97-030 | N1503 | | Umamaheswaran, V. S.; Ksar, Mike (1997-04-01), "9.2", Unconfirmed Minutes of WG 2 Meeting #32, Singapore; 1997-01-20--24 |
| | | | L2/97-254R | | | Orita, Tetsuji (1997-11-20), Comment on CJK Unified Ideograph Extension-A by IRG |
| | | | L2/97-279 | | | Kobayashi, Tatsuo (1997-12-01), Comment on Contribution by Tet Orita titled "Comment on CJK Unified Ideograph Extension-A by IRG" |
| | | | L2/98-039 | | | Aliprand, Joan; Winkler, Arnold (1998-02-24), "3.B.1. Existing compatibility characters also in CJK Extension A", Preliminary Minutes - UTC #74 & L2 #171, Mountain View, CA - December 5, 1997 |
| | | | L2/98-096 | N1723 | | CJK Unified Ideographs Extension A, High Quality Printing, Version 2.0, 1998-03-02 |
| | | | L2/98-103 | N1733 | | Text for FPDAM #17 - CJK Unified Ideographs Extension A, 1998-03-20 |
| | | | L2/98-171 | N1776 | | Text for PDAM registration and FPDAM ballot for ISO 10646-1 Amendment 17 - CJK Unified Ideographs Extension A, 1998-05-01 |
| | | | L2/98-170 | | | Winkler, Arnold (1998-05-11), PDAM registration and FPDAM ballot for ISO 10646-1 Amendment 17 - CJK Extension A |
| | | | L2/98-286 | N1703 | | Umamaheswaran, V. S.; Ksar, Mike (1998-07-02), "9.1.2", Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20 |
| | | | L2/98-334 | | | Revised Text of ISO/IEC 10646-1/FPDAM 17, AMENDMENT 17: CJK Unified Ideograph Extension A, 1998-11-02 |
| | | | L2/98-347 | | | Disposition of comments report on SC2 N3090, ISO 10646 Amd. 17: CJK unified ideograph extension A, 1998-11-02 |
| | | | L2/99-010 | N1903 | | Umamaheswaran, V. S. (1998-12-30), "6.7.2", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 |
| | | | L2/99-023 | N1943 | | Paterson, Bruce; Sato, T. K. (1999-01-08), Revision of 10646-1 Annex T for CJK Unified Ideographs Extension A (Draft) |
| | | | L2/99-156 | | | Table of replies on ISO/IEC 10646-1/FDAM 17 - CJK Unified Ideographs Extension A, 1999-05-17 |
| | | | L2/99-232 | N2003 | | Umamaheswaran, V. S. (1999-08-03), "7.1.3 Revision of Annex T for CJK Extension A", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15 |
| | | | L2/99-287 | | | Notice of publication for ISO/IEC 10646-1/Amd. 17, CJK Unified Ideographs Extension A, 1999-08-19 |
| | | | L2/99-335 | N2109 | N674 | Zhang, Zhoucai (1999-09-03), SuperCJK, version 9.0 with Kangxi and HYD data |
| | | | L2/03-287 | | | Cook, Richard (2003-08-24), 16 UniHan.txt errors |
| | | | L2/03-301 | | | Cook, Richard (2003-08-27), 24 more UniHan.txt errors |
| | | | L2/03-311 | | | West, Andrew (2003-09-17), Unicode 4.0.1 Beta Review, comments from Andrew C. West |
| | | | L2/03-399 | | | Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries |
| | | | L2/03-367 | N2667 | | Suignard, Michel; Muller, Eric; Jenkins, John (2003-10-22), CJK Ideograph source references corrections |
| | | | L2/03-398 | | | Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles |
| | | | L2/03-453 | | | Minutes of the Editorial Group Ad Hoc Discussion, 2003-12-17 |
| | | | L2/04-208 | N2774R | N1064 | Proposal to add 6 KP source references to existing CJK Unified Ideographs, 2004-05-25 |
| | | | | N3353 | | Umamaheswaran, V. S. (2007-10-10), "M51.9", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27 |
| | | | L2/07-208 | N3285 | | Proposal to replace 11 KP source references to existing ISO/IEC 10646:2003, 2007-07-18 |
| | | | L2/10-215 | | | Lunde, Ken (2010-06-22), "Hanyo-Denshi" IVD Collection (PRI 167) to Adobe-Japan1-6 Mapping Table |
| | | | L2/13-016 | | | Suignard, Michel (2013-01-24), CJK Ext-A fix |
| | | | L2/13-011 | | | Moore, Lisa (2013-02-04), "CJK — Extension-A Fix", UTC #134 Minutes |
| | | | | N4403 | | Umamaheswaran, V. S. (2014-01-28), "Resolution M61.02 item b", Unconfirmed minutes of WG 2 meeting 61, Holiday Inn, Vilnius, Lithuania; 2013-06-10/14 |
| | | | L2/14-149 | N4544 | | Suignard, Michel (2014-07-18), CJK ideographs glyphs representation and sources references |
| | | | | N4553 | | Umamaheswaran, V. S. (2014-09-16), "9.1.3", Minutes of WG 2 meeting 62 Adobe, San Jose, CA, USA |
| | | | L2/14-260 | N4621 | | Suignard, Michel (2014-10-23), CJK chart and source references update |
| | | | L2/16-052 | N4603 | | Umamaheswaran, V. S. (2015-09-01), "M63.05", Unconfirmed minutes of WG 2 meeting 63 |
| | | | L2/17-180 | | N2202 | Chan, Eiso (2017-06-02), Request for consideration to add kIRG_GSource values to thirteen ideographs and change two G-source glyphs for the Table of General Standard Chinese Characters [Affects U+37C3 and 3FE0] |
| | | | | N4974 | N2301 | Request of TCA's Horizontal Extension for Chemical Terminology [Affects U+44EC], 2018-06-12 |
| | | | | N4987 | | Proposal on China's Horizontal Extension for 14 CJK Ideographs [Affects U+37C3 and 3FE0], 2018-06-13 |
| | | | | N4988 | | Proposal on Updating 11 G glyphs of CJK Unified Ideographs to ISO/IEC 10646 [Affects U+3B9D, 3CFD and 4A76], 2018-06-13 |
| | | | | N5016 | N2349 | Shin, Sanghyun; Cho, Sungduk; Pyo, Seungju; Kim, Kyongsok (2018-12-13), Request to move character K6-1022 in Horizontal Extension of KS X 1027-5 from U+3EAC to U+248F2 |
| | | | | N5020 | | Umamaheswaran, V. S. (2019-01-11), "10.4.6, 10.4.8 and 10.4.9", Unconfirmed minutes of WG 2 meeting 67 |
| | | | L2/19-242 | N5094 | N2370 | Chan, Eiso (2019-02-14), 20 questionable V4-Source characters in Ext. C and Ext. E [Affects U+440B] |
| | | | | | N2369 | Chan, Eiso (2019-05-06), Feedback on IRGN2369 [Affects U+38C7 and 46E9] |
| | | | | N5086 | N2379 | Proposal of China's horizontal extension for technical used characters [Affects U+3472, 3DB8 and 3FE0], 2019-05-10 |
| | | | L2/19-237 | N5068 | | Editorial Report on Miscellaneous Issues (meeting IRG#52) [Affects U+3EAC and 440B], 2019-05-17 |
| | | | L2/19-241 | N5083 | N2391 | Errata report for WG2 submission_TCA [Affects U+440B], 2019-05-31 |
| | | | L2/22-258 | | | Shin, SangHyun; Kim, Kyongsok (2022-10-14), Changing glyphs and IDSs of 97 KR Hanja chars containing '叱 (U+53F1)' [Affects U+357E, 358B–358E, 3599–359D, 35AF, 35B0, 35B2, 35B3, 35DF–35E1, 35EF, 360F, and 3612] |
| | | | L2/22-247 | | | Lunde, Ken (2022-11-01), "25) L2/22-258", CJK & Unihan Group Recommendations for UTC #173 Meeting |
| | | | L2/22-241 | | | Constable, Peter (2022-11-09), "E.1 25) L2/22-258", Approved Minutes of UTC Meeting 173 |
| | | | L2/23-011 | | | Lunde, Ken (2023-01-11), "07) 2022-12-27 19:01:03 CST [Affects U+44D5]", CJK & Unihan Group Recommendations for UTC #174 Meeting |
| | | | L2/23-005 | | | Constable, Peter (2023-02-01), "E.1 Section 07) 2022-12-27 19:01:03 CST [Affects U+44D5]", UTC #174 Minutes |
| 13.0 | U+4DB6..4DBF | 10 | L2/19-242 | N5094 | N2370 | Chan, Eiso (2019-02-14), 20 questionable V4-Source characters in Ext. C and Ext. E [Affects U+4DB6] |
| | | | L2/19-239 | N5080 | N2338 | TCA's feedback to IRG N2338 [Affects U+4DB7, 4DB9, 4DBE and 4DBF], 2019-05-14 |
| | | | L2/19-240 | N5081 | N2337 | Disunification of U+2F83B/U+5406 [Affects U+4DB8], 2019-05-14 |
| | | | L2/19-241 | N5083 | N2391 | Errata report for WG2 submission_TCA [Affects U+4DBA, 4DBB, 4DBC and 4DBD], 2019-05-31 |
| | | | | N5122 | | "M68.07", Unconfirmed minutes of WG 2 meeting 68, 2019-12-31 |
| | | | L2/19-270 | | | Moore, Lisa (2019-10-07), "Consensus 160-C14", UTC #160 Minutes |
  a. Proposed code points and character names may differ from final code points and names.
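
The two version rows of the table above amount to a small lookup from code point to the Unicode version that added it. The following sketch encodes just those two ranges; `ADDITIONS` and `version_added` are names chosen for this example, not part of any Unicode library.

```python
# (version, first code point, last code point), taken from the history table.
ADDITIONS = [
    ("3.0", 0x3400, 0x4DB5),
    ("13.0", 0x4DB6, 0x4DBF),
]


def version_added(code_point: int) -> str:
    """Return the Unicode version in which an Extension A code point was added."""
    for version, first, last in ADDITIONS:
        if first <= code_point <= last:
            return version
    raise ValueError("code point is outside CJK Unified Ideographs Extension A")


# Per-version counts derived from the ranges.
counts = {version: last - first + 1 for version, first, last in ADDITIONS}

print(version_added(0x4DB5))  # 3.0
print(version_added(0x4DB6))  # 13.0
print(counts)                 # {'3.0': 6582, '13.0': 10}
```

The derived counts agree with the 6,582 and 10 characters listed in the table, which together give the block's 6,592 assigned code points.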

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
  4. "Ideographic Variation Database". Unicode Consortium.
  5. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.