CJK Compatibility Forms

CJK Compatibility Forms
Range	U+FE30..U+FE4F (32 code points)
Plane	BMP
Scripts	Common
Assigned	32 code points
Unused	0 reserved code points
Source standards	CNS 11643
Unicode version history
1.0.0 (1991)	28 (+28)
3.2 (2002)	30 (+2)
4.0 (2003)	32 (+2)
Unicode documentation
	Code chart ∣ Web page
	Note: ^[1]^[2]

Last updated July 29, 2023

CJK Compatibility Forms is a Unicode block containing vertical glyph variants for East Asian compatibility. Its block name in Unicode 1.0 was CNS 11643 Compatibility, in reference to CNS 11643.^[3]
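
As a rough illustration of what the range U+FE30..U+FE4F means in practice, the Python sketch below tests block membership and looks up each character's name and compatibility decomposition with the standard `unicodedata` module. The helper names (`in_cjk_compatibility_forms`, `describe_block`) are hypothetical and chosen only for this example, and the data reported depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Block boundaries as given in the infobox: U+FE30..U+FE4F.
BLOCK_FIRST = 0xFE30
BLOCK_LAST = 0xFE4F


def in_cjk_compatibility_forms(ch: str) -> bool:
    """Return True if the single character ch lies inside the block."""
    return BLOCK_FIRST <= ord(ch) <= BLOCK_LAST


def describe_block():
    """Return (code point, name, decomposition) for every code point in the block."""
    rows = []
    for cp in range(BLOCK_FIRST, BLOCK_LAST + 1):
        ch = chr(cp)
        # name() falls back to a placeholder if the interpreter's data lacks the character.
        name = unicodedata.name(ch, "<unnamed>")
        # decomposition() returns an empty string when there is no mapping.
        decomposition = unicodedata.decomposition(ch)
        rows.append((cp, name, decomposition))
    return rows


if __name__ == "__main__":
    for cp, name, decomposition in describe_block():
        print(f"U+{cp:04X}\t{name}\t{decomposition or '-'}")
```

For example, U+FE35 is named PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS and carries the decomposition `<vertical> 0028`, so NFKC normalization maps it to the ordinary left parenthesis.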

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Compatibility Forms block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
1.0.0	U+FE30..FE44, FE49..FE4F	28			(to be determined)
3.2	U+FE45..FE46	2	L2/99-238		Consolidated document containing 6 Japanese proposals, 1999-07-15
				N2092	Addition of forty eight characters, 1999-09-13
			L2/00-024		Shibano, Kohji (2000-01-31), JCS proposal revised
			L2/00-098, L2/00-098-page5	N2195	Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15
			L2/00-234	N2203 (rtf, txt)	Umamaheswaran, V. S. (2000-07-21), "8.20", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21–24
			L2/01-114	N2328	Summary of Voting on SC 2 N 3503, ISO/IEC 10646-1: 2000/PDAM 1, 2001-03-09
4.0	U+FE47..FE48	2	L2/99-353	N2056	"3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5, 1999-07-29
			L2/99-380		Proposal for a New Work item (NP) to amend the Korean part in ISO/IEC 10646-1:1993, 1999-12-07
			L2/99-380.3		Annex B, Special characters compatible with KPS 9566-97 (To be extended), 1999-12-07
			L2/00-084	N2182	"3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5 (Cover page and outline of proposal L2/99-380), 1999-12-07
			L2/99-382		Whistler, Ken (1999-12-09), "2.3", Comments to accompany a U.S. NO vote on JTC1 N5999, SC2 N3393, New Work item proposal (NP) for an amendment of the Korean part of ISO/IEC 10646-1:1993
			L2/00-066	N2170 (pdf, doc)	"3", The technical justification of the proposal to amend the Korean character part of ISO/IEC 10646-1 (proposed addition of 79 symbolic characters), 2000-02-10
			L2/00-073	N2167	Karlsson, Kent (2000-03-02), Comments on DPRK New Work Item proposal on Korean characters
			L2/00-285	N2244	Proposal for the Addition of 82 Symbols to ISO/IEC 10646-1:2000, 2000-08-10
			L2/00-291		Everson, Michael (2000-08-30), Comments to Korean proposals (L2/00-284 - 289)
				N2282	Report of the meeting of the Korean script ad hoc group, 2000-09-21
			L2/01-349	N2374R	Proposal to add of 70 symbols to ISO/IEC 10646-1:2000, 2001-09-03
			L2/01-387	N2390	Kim, Kyongsok (2001-10-13), ROK's Comments about DPRK's proposal, WG2 N 2374, to add 70 symbols to ISO/IEC 10646-1:2000
			L2/01-388	N2392	Kim, Kyongsok (2001-10-16), A Report of Korean Script ad hoc group meeting on Oct. 15, 2001
			L2/01-420		Whistler, Ken (2001-10-30), "f. Miscellaneous symbol additions from DPRK standard", WG2 (Singapore) Resolution Consent Docket for UTC
			L2/01-458	N2407	Umamaheswaran, V. S. (2001-11-16), Request to Korean ad hoc group to generate mapping tables between ROK and DPRK national standards
			L2/02-372	N2453 (pdf, doc)	Umamaheswaran, V. S. (2002-10-30), "T.12", Unconfirmed minutes of WG 2 meeting 42
↑ Proposed code points and character names may differ from final code points and names.
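
The "Final code points" and "Count" columns above can be cross-checked mechanically. The following Python sketch parses the range notation used in the table and confirms that the per-version counts add up to the 32 code points of the block. The function names and the `HISTORY` dictionary are hypothetical; the range strings are copied from the table.

```python
# Range strings as they appear in the "Final code points" column.
HISTORY = {
    "1.0.0": "U+FE30..FE44, FE49..FE4F",
    "3.2": "U+FE45..FE46",
    "4.0": "U+FE47..FE48",
}


def parse_ranges(spec: str):
    """Parse a string such as 'U+FE30..FE44, FE49..FE4F' into (first, last) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip().replace("U+", "")
        first, _, last = part.partition("..")
        lo = int(first, 16)
        # A lone code point with no '..' is treated as a one-element range.
        hi = int(last, 16) if last else lo
        ranges.append((lo, hi))
    return ranges


def count_code_points(spec: str) -> int:
    """Count the code points covered by a range string (ranges are inclusive)."""
    return sum(hi - lo + 1 for lo, hi in parse_ranges(spec))


if __name__ == "__main__":
    running_total = 0
    for version, spec in HISTORY.items():
        added = count_code_points(spec)
        running_total += added
        print(f"{version}\t+{added}\ttotal {running_total}")
```

Run as a script, this reproduces the cumulative totals listed in the version history (28, 30, 32).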

Related Research Articles

Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters.

In internationalization, CJK characters is a collective term for the characters used to write the Chinese, Japanese, and Korean languages, all of which include Chinese characters and derivatives in their writing systems, sometimes paired with other scripts. Collectively, the CJK characters often include Hànzì in Chinese, Kanji and Kana in Japanese, and Hanja and Hangul in Korean. Vietnamese can be included, making the abbreviation CJKV, because Vietnamese was historically written with Chinese characters, known in Vietnamese as chữ Hán and chữ Nôm.

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are a feature shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese.

The CNS 11643 character set, also officially known as the Chinese Standard Interchange Code (CSIC), is the official standard character set of Taiwan. In practice, variants of the related Big5 character set are the de facto standard.

The Chinese, Japanese and Korean (CJK) scripts share a common background in Chinese characters, collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 15.0, Unicode defines a total of 97,058 such characters.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Kangxi Radicals is a Unicode block introduced in version 3.0 (1999) that encodes the 214 Kangxi radicals in sequence, at U+2F00–2FD5. These are specific code points intended to represent the radical qua radical, as opposed to the character consisting of the unaugmented radical; thus, U+2F00 represents radical 1 while U+4E00 represents the character yī meaning "one". In addition, the CJK Radicals Supplement block (2E80–2EFF) was introduced, encoding alternative forms taken by Kangxi radicals as they appear within specific characters. For example, ⺁ "CJK RADICAL CLIFF" (U+2E81) is a variant of ⼚ radical 27 (U+2F1A), itself identical in shape to the character consisting of unaugmented radical 27, 厂 "cliff" (U+5382).

Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese.

CJK Compatibility Ideographs Supplement is a Unicode block containing Han characters used only for roundtrip compatibility mapping with planes 3, 4, 5, 6, 7, and 15 of CNS 11643-1992.

CJK Compatibility Ideographs is a Unicode block created to contain Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. Such encodings include the South Korean KS X 1001:1998, Taiwanese Big5, Japanese IBM 32, South Korean KS X 1001:2004, Japanese JIS X 0213, Japanese ARIB STD-B24 and the North Korean KPS 10721-2000 source standards.

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

CJK Compatibility is a Unicode block containing square symbols encoded for compatibility with East Asian character sets. In Unicode 1.0, it was divided into two blocks, named CJK Squared Words (U+3300–U+337F) and CJK Squared Abbreviations (U+3380–U+33FF).

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Small Form Variants is a Unicode block containing small punctuation characters for compatibility with the Chinese National Standard CNS 11643. Its block name in Unicode 1.0 was simply Small Variants.

Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Vertical Forms is a Unicode block containing vertical punctuation characters encoded for compatibility with the Chinese standard GB 18030.

Variation Selectors is the block name of a Unicode code point block containing 16 variation selectors used to specify a glyph variant for a preceding character. They are currently used to specify standardized variation sequences for mathematical symbols, emoji symbols, 'Phags-pa letters, and CJK unified ideographs corresponding to CJK compatibility ideographs. At present only standardized variation sequences with VS1, VS2, VS3, VS15 and VS16 have been defined; VS15 and VS16 are reserved to request that a character should be displayed as text or as an emoji respectively.

GB 12345, entitled Code of Chinese ideogram set for information interchange supplementary set, is a Traditional Chinese character set standard established by China, and can be thought of as the traditional counterpart of GB 2312. It is used as an encoding of traditional Chinese characters, although it is not as commonly used as Big5. It has 6,866 characters, and is neither related to nor compatible with Big5 and CNS 11643.

References

[1] "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
[2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
[3] "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.


CJK Compatibility Forms^[1]
Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+FE3x	︰	︱	︲	︳	︴	︵	︶	︷	︸	︹	︺	︻	︼	︽	︾	︿
U+FE4x	﹀	﹁	﹂	﹃	﹄	﹅	﹆	﹇	﹈	﹉	﹊	﹋	﹌	﹍	﹎	﹏
Notes
1. ^ As of Unicode version 15.0
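
A grid in the same layout as the chart above can be regenerated from the block boundaries alone. The Python sketch below builds one tab-separated row per 16 code points, labelled in the `U+FE3x` style used here. The function names are hypothetical, and whether the glyphs display correctly depends on the fonts available in the viewing environment.

```python
BLOCK_FIRST = 0xFE30
BLOCK_LAST = 0xFE4F
COLUMNS = 16


def chart_rows(first: int = BLOCK_FIRST, last: int = BLOCK_LAST):
    """Return one tab-separated line per row of 16 code points."""
    rows = []
    for row_base in range(first, last + 1, COLUMNS):
        # Drop the final hex digit and replace it with 'x', e.g. 0xFE30 -> 'U+FE3x'.
        label = f"U+{row_base >> 4:03X}x"
        cells = [chr(row_base + col) for col in range(COLUMNS)]
        rows.append("\t".join([label] + cells))
    return rows


def render_chart() -> str:
    """Return the full chart, with a header row of column digits 0..F."""
    header = "\t".join([""] + [f"{col:X}" for col in range(COLUMNS)])
    return "\n".join([header] + chart_rows())


if __name__ == "__main__":
    print(render_chart())
```

Because the block starts and ends on 16-code-point boundaries and has no unassigned positions, no special handling for empty cells is needed in this sketch.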
