CJK Compatibility Ideographs Supplement

Range: U+2F800..U+2FA1F (544 code points)
Plane: SIP
Scripts: Han
Assigned: 542 code points
Unused: 2 reserved code points
Source standards: CNS 11643-1992
Unicode version history: 3.1 (2001), 542 (+542)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]

CJK Compatibility Ideographs Supplement is a Unicode block containing Han characters used only for round-trip compatibility mapping with planes 3, 4, 5, 6, 7, and 15 of CNS 11643-1992.
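
Because these characters exist only for round-trip mapping, each one has a canonical decomposition to a single ordinary ideograph, so Unicode normalization replaces it. The following is a minimal Python sketch of that effect using the standard `unicodedata` module. The helper names `in_block` and `to_unified` are hypothetical, and the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2F800, 0x2FA1F  # range given for this block


def in_block(ch: str) -> bool:
    """True if ch lies in the CJK Compatibility Ideographs Supplement range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def to_unified(ch: str) -> str:
    """Hypothetical helper: apply NFC, which replaces a compatibility
    ideograph with the character it canonically decomposes to."""
    return unicodedata.normalize("NFC", ch)


compat = "\U0002F800"          # first code point of the block
unified = to_unified(compat)

print(f"U+{ord(compat):04X} -> U+{ord(unified):04X}")
print(in_block(compat), in_block(unified))
```

Once text has been normalized this way, the original compatibility code point cannot be recovered from the text alone, so a converter that needs the round trip has to avoid normalizing first.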

Block

CJK Compatibility Ideographs Supplement [1] [2]
Official Unicode Consortium code chart (PDF)
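        0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
U+2F80x 丽 丸 乁 𠄢 你 侮 侻 倂 偺 備 僧 像 㒞 𠘺 免 兔
U+2F81x 兤 具 𠔜 㒹 內 再 𠕋 冗 冤 仌 冬 况 𩇟 凵 刃 㓟
U+2F82x 刻 剆 割 剷 㔕 勇 勉 勤 勺 包 匆 北 卉 卑 博 即
U+2F83x 卽 卿 卿 卿 𠨬 灰 及 叟 𠭣 叫 叱 吆 咞 吸 呈 周
U+2F84x 咢 哶 唐 啓 啣 善 善 喙 喫 喳 嗂 圖 嘆 圗 噑 噴
U+2F85x 切 壮 城 埴 堍 型 堲 報 墬 𡓤 売 壷 夆 多 夢 奢
U+2F86x 𡚨 𡛪 姬 娛 娧 姘 婦 㛮 㛼 嬈 嬾 嬾 𡧈 寃 寘 寧
U+2F87x 寳 𡬘 寿 将 当 尢 㞁 屠 屮 峀 岍 𡷤 嵃 𡷦 嵮 嵫
U+2F88x 嵼 巡 巢 㠯 巽 帨 帽 幩 㡢 𢆃 㡼 庰 庳 庶 廊 𪎒
U+2F89x 廾 𢌱 𢌱 舁 弢 弢 㣇 𣊸 𦇚 形 彫 㣣 徚 忍 志 忹
U+2F8Ax 悁 㤺 㤜 悔 𢛔 惇 慈 慌 慎 慌 慺 憎 憲 憤 憯 懞
U+2F8Bx 懲 懶 成 戛 扝 抱 拔 捐 𢬌 挽 拼 捨 掃 揤 𢯱 搢
U+2F8Cx 揅 掩 㨮 摩 摾 撝 摷 㩬 敏 敬 𣀊 旣 書 晉 㬙 暑
U+2F8Dx 㬈 㫤 冒 冕 最 暜 肭 䏙 朗 望 朡 杞 杓 𣏃 㭉 柺
U+2F8Ex 枅 桒 梅 𣑭 梎 栟 椔 㮝 楂 榣 槪 檨 𣚣 櫛 㰘 次
U+2F8Fx 𣢧 歔 㱎 歲 殟 殺 殻 𣪍 𡴋 𣫺 汎 𣲼 沿 泍 汧 洖
U+2F90x 派 海 流 浩 浸 涅 𣴞 洴 港 湮 㴳 滋 滇 𣻑 淹 潮
U+2F91x 𣽞 𣾎 濆 瀹 瀞 瀛 㶖 灊 災 灷 炭 𠔥 煅 𤉣 熜 𤎫
U+2F92x 爨 爵 牐 𤘈 犀 犕 𤜵 𤠔 獺 王 㺬 玥 㺸 㺸 瑇 瑜
U+2F93x 瑱 璅 瓊 㼛 甤 𤰶 甾 𤲒 異 𢆟 瘐 𤾡 𤾸 𥁄 㿼 䀈
U+2F94x 直 𥃳 𥃲 𥄙 𥄳 眞 真 真 睊 䀹 瞋 䁆 䂖 𥐝 硎 碌
U+2F95x 磌 䃣 𥘦 祖 𥚚 𥛅 福 秫 䄯 穀 穊 穏 𥥼 𥪧 𥪧 竮
U+2F96x 䈂 𥮫 篆 築 䈧 𥲀 糒 䊠 糨 糣 紀 𥾆 絣 䌁 緇 縂
U+2F97x 繅 䌴 𦈨 𦉇 䍙 𦋙 罺 𦌾 羕 翺 者 𦓚 𦔣 聠 𦖨 聰
U+2F98x 𣍟 䏕 育 脃 䐋 脾 媵 𦞧 𦞵 𣎓 𣎜 舁 舄 辞 䑫 芑
U+2F99x 芋 芝 劳 花 芳 芽 苦 𦬼 若 茝 荣 莭 茣 莽 菧 著
U+2F9Ax 荓 菊 菌 菜 𦰶 𦵫 𦳕 䔫 蓱 蓳 蔖 𧏊 蕤 𦼬 䕝 䕡
U+2F9Bx 𦾱 𧃒 䕫 虐 虜 虧 虩 蚩 蚈 蜎 蛢 蝹 蜨 蝫 螆 䗗
U+2F9Cx 蟡 蠁 䗹 衠 衣 𧙧 裗 裞 䘵 裺 㒻 𧢮 𧥦 䚾 䛇 誠
U+2F9Dx 諭 變 豕 𧲨 貫 賁 贛 起 𧼯 𠠄 跋 趼 跰 𠣞 軔 輸
U+2F9Ex 𨗒 𨗭 邔 郱 鄑 𨜮 鄛 鈸 鋗 鋘 鉼 鏹 鐕 𨯺 開 䦕
U+2F9Fx 閷 𨵷 䧦 雃 嶲 霣 𩅅 𩈚 䩮 䩶 韠 𩐊 䪲 𩒖 頋 頋
U+2FA0x 頩 𩖶 飢 䬳 餩 馧 駂 駾 䯎 𩬰 鬒 鱀 鳽 䳎 䳭 鵧
U+2FA1x 𪃎 䳸 𪄅 𪈎 𪊑 麻 䵖 黹 黾 鼅 鼏 鼖 鼻 𪘀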
Notes
1. As of Unicode version 15.0
2. Grey areas indicate non-assigned code points
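
The block's counts and the chart layout follow from simple arithmetic on the range. The Python sketch below, which uses the standard `unicodedata` module, recomputes the block size, maps a chart row and column to a code point, and lists the code points that the interpreter's Unicode data treats as unassigned (general category `Cn`). The name `cell` is illustrative only, and the assigned count reflects whichever Unicode version the interpreter ships.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2F800, 0x2FA1F


def cell(row_label: str, column: int) -> int:
    """Code point for a chart cell, e.g. row 'U+2F80x', column 0x4."""
    # Strip the 'U+' prefix and trailing 'x', then append a zero nibble.
    base = int(row_label[2:-1], 16) << 4
    return base + column


block = range(BLOCK_START, BLOCK_END + 1)
unassigned = [cp for cp in block if unicodedata.category(chr(cp)) == "Cn"]

print(len(block))                          # size of the range
print(len(block) - len(unassigned))        # assigned code points
print([f"U+{cp:X}" for cp in unassigned])  # reserved code points
print(f"U+{cell('U+2F80x', 0x4):X}")       # one chart cell
print(BLOCK_START >> 16)                   # plane number (2 is the SIP)
```

Shifting a code point right by 16 bits gives its plane number because each plane spans 65,536 code points.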

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Compatibility Ideographs Supplement block:

Version Final code points [a] Count L2 ID WG2 ID IRG ID Document
3.1 U+2F800..2FA1D 542 L2/00-032 Jenkins, John (2000-02-01), Compatibility Ideographs in the Unicode Standard
L2/00-005R2 Moore, Lisa (2000-02-14), "Compatibility Characters", Minutes of UTC #82 in San Jose
L2/00-049 N2159 Tseng, Shih-shyeng (2000-02-15), Proposal for Compatibility Ideographs (in Plane 2)
L2/00-087 N2159R Tseng, Shih-shyeng (2000-03-10), Proposal for Compatibility Ideographs (in plane 2)
L2/00-099 N2196 Sato, T. K. (2000-03-15), CJK COMPATIBILITY IDEOGRAPH
L2/00-146 N2222R Sato, T. K. (2000-04-21), More information needed for COMPATIBILITY IDEOGRAPHS ?
L2/00-147 N2223 Sato, T. K. (2000-04-21), Beyond the request of JIS COMPATIBILITY IDEOGRAPHS
L2/00-233 N2223R Sato, T. K. (2000-04-21), Beyond the request of JIS COMPATIBILITY IDEOGRAPHS
L2/00-234 N2203 Umamaheswaran, V. S. (2000-07-21), "7.3", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24
L2/00-270 Suignard, Michel (2000-08-08), 10646-2 FCD: Comments on TCA CJK Compatibility
L2/00-187 Moore, Lisa (2000-08-23), "Motion 84-M9", UTC minutes -- Boston, August 8-11, 2000
L2/00-272 N2237 Suignard, Michel (2000-08-23), "T.10 Clause 11: Compatibility characters; T.12 Annex A.1", Comments accompanying the US positive vote on the CD ISO/IEC 10646-2
N2284 CJK Unified Ideographs Extension B for preDIS 10646-2, 2000-09-14
L2/00-334 N2270 Shih-Shyeng, Tseng (2000-09-15), Updated CJK Compatibility Ideographs sets from TCA
L2/00-369 Whistler, Ken (2000-10-06), "1.f CJK Compatibility Ideographs Supplement (Plane 2)", WG2 in Vouliagmeni (Athens)
L2/01-050 N2253 Umamaheswaran, V. S. (2001-01-21), "7.18 and 8.1", Minutes of the SC2/WG2 meeting in Athens, September 2000
L2/03-399 Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries
L2/03-367 N2667 Suignard, Michel; Muller, Eric; Jenkins, John (2003-10-22), CJK Ideograph source references corrections
L2/03-398 Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles
L2/04-209 N2775R N1063 Proposal to add 13 KP source reference to existing CJK Compatibility Characters, 2004-05-25
L2/10-100 N3787 Request for disunifying U+2F89F from U+5FF9, 2010-04-07
L2/10-218 N1666 Error report on U+225D6 AND U+2F89F, 2010-06-24
L2/11-243 N4111 Sources for Orphaned CJK Ideographs, 2011-06-14
L2/11-254 Constable, Peter (2011-06-20), "Update to UTR #45 U-Source Ideographs requested", UTC Liaison Report from WG2
N4103 "Resolution 58.05", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
L2/14-260 N4621 Suignard, Michel (2014-10-23), CJK chart and source references update
L2/16-052 N4603 Umamaheswaran, V. S. (2015-09-01), "M63.05", Unconfirmed minutes of WG 2 meeting 63
L2/19-239 N5080 N2338 TCA's feedback to IRG N2338 [Affects U+2F878, U+2F8F0, and U+2FA02], 2019-05-14
L2/19-240 N5081 N2337 Disunification of U+2F83B/U+5406, 2019-05-14
L2/19-237 N5068 Editorial Report on Miscellaneous Issues (meeting IRG#52) [Affects U+2F83B, U+2F878, U+2F8D6, U+2F8D7, U+2F8DA, U+2F8F0, U+2F984, and U+2FA02], 2019-05-17
L2/19-241 N5083 N2391 Errata report for WG2 submission_TCA [Affects U+2F8D6, U+2F8D7, U+2F8DA, and U+2F984], 2019-05-31
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.