CJK Compatibility Forms

Range: U+FE30..U+FE4F (32 code points)
Plane: BMP
Scripts: Common
Assigned: 32 code points
Unused: 0 reserved code points
Source standards: CNS 11643
Unicode version history:
1.0.0 (1991): 28 (+28)
3.2 (2002): 30 (+2)
4.0 (2003): 32 (+2)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]

CJK Compatibility Forms is a Unicode block containing vertical glyph variants encoded for compatibility with East Asian character sets. Its block name in Unicode 1.0 was CNS 11643 Compatibility, in reference to CNS 11643. [3]

CJK Compatibility Forms [1]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+FE3x  ︰ ︱ ︲ ︳ ︴ ︵ ︶ ︷ ︸ ︹ ︺ ︻ ︼ ︽ ︾ ︿
U+FE4x  ﹀ ﹁ ﹂ ﹃ ﹄ ﹅ ﹆ ﹇ ﹈ ﹉ ﹊ ﹋ ﹌ ﹍ ﹎ ﹏
Notes
1. As of Unicode version 15.0
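
A chart of this shape can be regenerated from the code point range alone. This is a rough sketch, not the Consortium's tooling; the function name `chart_rows` and its parameters are hypothetical, and correct display still depends on a font that covers the block.

```python
BLOCK_START = 0xFE30  # first code point of the block
ROW_WIDTH = 16        # one chart row per 16 code points


def chart_rows(start=BLOCK_START, count=32):
    """Return (row label, list of 16 characters) pairs for the block."""
    rows = []
    for row_start in range(start, start + count, ROW_WIDTH):
        # Drop the last hex digit to form labels such as "U+FE3x".
        label = f"U+{row_start >> 4:03X}x"
        cells = [chr(row_start + i) for i in range(ROW_WIDTH)]
        rows.append((label, cells))
    return rows


print("        " + " ".join(f"{i:X}" for i in range(ROW_WIDTH)))
for label, cells in chart_rows():
    print(f"{label}  " + " ".join(cells))
```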

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Compatibility Forms block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
1.0.0 | U+FE30..FE44, FE49..FE4F | 28 | | | (to be determined)
3.2 | U+FE45..FE46 | 2 | L2/99-238 | | Consolidated document containing 6 Japanese proposals, 1999-07-15
 | | | | N2092 | Addition of forty eight characters, 1999-09-13
 | | | L2/00-024 | | Shibano, Kohji (2000-01-31), JCS proposal revised
 | | | L2/00-098, L2/00-098-page5 | N2195 | Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15
 | | | L2/00-234 | N2203 (rtf, txt) | Umamaheswaran, V. S. (2000-07-21), "8.20", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21–24
 | | | L2/01-114 | N2328 | Summary of Voting on SC 2 N 3503, ISO/IEC 10646-1: 2000/PDAM 1, 2001-03-09
4.0 | U+FE47..FE48 | 2 | L2/99-353 | N2056 | "3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5, 1999-07-29
 | | | L2/99-380 | | Proposal for a New Work item (NP) to amend the Korean part in ISO/IEC 10646-1:1993, 1999-12-07
 | | | L2/99-380.3 | | Annex B, Special characters compatible with KPS 9566-97 (To be extended), 1999-12-07
 | | | L2/00-084 | N2182 | "3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5 (Cover page and outline of proposal L2/99-380), 1999-12-07
 | | | L2/99-382 | | Whistler, Ken (1999-12-09), "2.3", Comments to accompany a U.S. NO vote on JTC1 N5999, SC2 N3393, New Work item proposal (NP) for an amendment of the Korean part of ISO/IEC 10646-1:1993
 | | | L2/00-066 | N2170 (pdf, doc) | "3", The technical justification of the proposal to amend the Korean character part of ISO/IEC 10646-1 (proposed addition of 79 symbolic characters), 2000-02-10
 | | | L2/00-073 | N2167 | Karlsson, Kent (2000-03-02), Comments on DPRK New Work Item proposal on Korean characters
 | | | L2/00-285 | N2244 | Proposal for the Addition of 82 Symbols to ISO/IEC 10646-1:2000, 2000-08-10
 | | | L2/00-291 | | Everson, Michael (2000-08-30), Comments to Korean proposals (L2/00-284 - 289)
 | | | | N2282 | Report of the meeting of the Korean script ad hoc group, 2000-09-21
 | | | L2/01-349 | N2374R | Proposal to add of 70 symbols to ISO/IEC 10646-1:2000, 2001-09-03
 | | | L2/01-387 | N2390 | Kim, Kyongsok (2001-10-13), ROK's Comments about DPRK's proposal, WG2 N 2374, to add 70 symbols to ISO/IEC 10646-1:2000
 | | | L2/01-388 | N2392 | Kim, Kyongsok (2001-10-16), A Report of Korean Script ad hoc group meeting on Oct. 15, 2001
 | | | L2/01-420 | | Whistler, Ken (2001-10-30), "f. Miscellaneous symbol additions from DPRK standard", WG2 (Singapore) Resolution Consent Docket for UTC
 | | | L2/01-458 | N2407 | Umamaheswaran, V. S. (2001-11-16), Request to Korean ad hoc group to generate mapping tables between ROK and DPRK national standards
 | | | L2/02-372 | N2453 (pdf, doc) | Umamaheswaran, V. S. (2002-10-30), "T.12", Unconfirmed minutes of WG 2 meeting 42
  a. Proposed code points and character names may differ from the final code points and names.
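
The per-version counts follow directly from the code point ranges. As a small illustrative sketch (the function `count_code_points` and the `HISTORY` table are hypothetical names, with the ranges copied from the table above), the ranges can be parsed and summed to check the running totals of 28, 30, and 32:

```python
def count_code_points(spec: str) -> int:
    """Count code points in a spec such as 'U+FE30..FE44, FE49..FE4F'."""
    total = 0
    for part in spec.split(","):
        part = part.strip().upper()
        if part.startswith("U+"):
            part = part[2:]
        # A part is either 'FIRST..LAST' or a single code point.
        first, _, last = part.partition("..")
        last = last or first
        total += int(last, 16) - int(first, 16) + 1
    return total


# Ranges as listed per Unicode version in the history table.
HISTORY = {
    "1.0.0": "U+FE30..FE44, FE49..FE4F",
    "3.2": "U+FE45..FE46",
    "4.0": "U+FE47..FE48",
}

running = 0
for version, spec in HISTORY.items():
    added = count_code_points(spec)
    running += added
    print(f"{version}: {running} (+{added})")
```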

See also

Related Research Articles

Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters.

In internationalization, CJK characters is a collective term for the characters used to write the Chinese, Japanese, and Korean languages, all of which include Chinese characters and derivatives in their writing systems, sometimes paired with other scripts. Collectively, the CJK characters often include Hànzì in Chinese, Kanji and Kana in Japanese, and Hanja and Hangul in Korean. Vietnamese can be included, making the abbreviation CJKV, as Vietnamese was historically written with Chinese characters, known in Vietnamese as chữ Hán and chữ Nôm.

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese.

The CNS 11643 character set, also officially known as the Chinese Standard Interchange Code or CSIC, is the official standard character set of Taiwan. In practice, variants of the related Big5 character set are the de facto standard.

The Chinese, Japanese and Korean (CJK) scripts share a common background, and their shared characters are collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 15.0, Unicode defines a total of 97,058 such characters.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Kangxi Radicals is a Unicode block introduced in version 3.0 (1999) that encodes the 214 Kangxi radicals in sequence, at U+2F00–2FD5. These are specific code points intended to represent the radical qua radical, as opposed to the character consisting of the unaugmented radical; thus, U+2F00 represents radical 1 while U+4E00 represents the character meaning "one". In addition, the CJK Radicals Supplement block (2E80–2EFF) was introduced, encoding alternative forms taken by Kangxi radicals as they appear within specific characters. For example, ⺁ "CJK RADICAL CLIFF" (U+2E81) is a variant of ⼚ radical 27 (U+2F1A), itself identical in shape to the character consisting of unaugmented radical 27, 厂 "cliff" (U+5382).

Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese.

CJK Compatibility Ideographs Supplement is a Unicode block containing Han characters used only for roundtrip compatibility mapping with planes 3, 4, 5, 6, 7, and 15 of CNS 11643-1992.

CJK Compatibility Ideographs is a Unicode block created to contain Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. Such encodings include the South Korean KS X 1001:1998, Taiwanese Big5, Japanese IBM 32, South Korean KS X 1001:2004, Japanese JIS X 0213, Japanese ARIB STD-B24 and the North Korean KPS 10721-2000 source standards.

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

CJK Compatibility is a Unicode block containing square symbols encoded for compatibility with East Asian character sets. In Unicode 1.0, it was divided into two blocks, named CJK Squared Words (U+3300–U+337F) and CJK Squared Abbreviations (U+3380–U+33FF).

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Small Form Variants is a Unicode block containing small punctuation characters for compatibility with the Chinese National Standard CNS 11643. Its block name in Unicode 1.0 was simply Small Variants.

Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Vertical Forms is a Unicode block containing vertical punctuation characters for compatibility with the Chinese standard GB 18030.

Variation Selectors is the block name of a Unicode code point block containing 16 variation selectors used to specify a glyph variant for a preceding character. They are currently used to specify standardized variation sequences for mathematical symbols, emoji symbols, 'Phags-pa letters, and CJK unified ideographs corresponding to CJK compatibility ideographs. At present only standardized variation sequences with VS1, VS2, VS3, VS15 and VS16 have been defined; VS15 and VS16 are reserved to request that a character should be displayed as text or as an emoji respectively.

GB 12345, entitled Code of Chinese ideogram set for information interchange supplementary set, is a Traditional Chinese character set standard established by China, and can be thought of as the traditional counterpart of GB 2312. It is used as an encoding of traditional Chinese characters, although it is not as commonly used as Big5. It has 6,866 characters, and is neither related to nor compatible with Big5 or CNS 11643.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.