Enclosed CJK Letters and Months

Range: U+3200..U+32FF (256 code points)
Plane: BMP
Scripts: Hangul (62 char.), Katakana (47 char.), Common (146 char.)
Assigned: 255 code points
Unused: 1 reserved code point
Source standards: ARIB STD-B24
Unicode version history:
1.0.0 (1991): 191 (+191)
1.0.1 (1992): 190 (-1)
1.1 (1993): 202 (+12)
3.2 (2002): 232 (+30)
4.0 (2003): 241 (+9)
4.1 (2005): 242 (+1)
5.2 (2009): 254 (+12)
12.1 (2019): 255 (+1)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]
In Unicode 1.0.1, during the process of unifying with ISO 10646, one character from the Enclosed CJK Letters and Months block was relocated to the CJK Symbols and Punctuation block, and the encircled katakana letters were re-arranged. [3]

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares (representing speed limit signs).

Its block name in Unicode 1.0 was Enclosed CJK Letters and Ideographs. [4] As part of the process of unification with ISO 10646 for version 1.1, Unicode version 1.0.1 relocated the Japanese Industrial Standard Symbol from the code point U+32FF at the end of the block to U+3004, and re-arranged the encircled katakana letters (U+32D0–U+32FE) from iroha order to gojūon order. [3]
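
The resulting gojūon arrangement can be checked against the character data that ships with Python. The following is a minimal sketch, assuming the `unicodedata` module of a reasonably recent Python build; the helper name `circled_katakana_bases` is illustrative only. It applies NFKC normalization to each circled katakana letter, which strips the circle and leaves the plain kana.

```python
import unicodedata

# Range of the circled katakana letters, as given in the text above.
FIRST, LAST = 0x32D0, 0x32FE


def circled_katakana_bases():
    """Return the plain katakana underlying each circled letter, in code point order."""
    return [unicodedata.normalize("NFKC", chr(cp)) for cp in range(FIRST, LAST + 1)]


bases = circled_katakana_bases()
print(len(bases))       # number of circled katakana letters in the range
print("".join(bases))   # the kana sequence, in code point order

# The relocated symbol now lives in CJK Symbols and Punctuation.
print(unicodedata.name("\u3004"))
```

The sequence starts with ア, イ, ウ, エ, オ and ends with ヲ, which is the gojūon order rather than the iroha order used in Unicode 1.0.0.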

The Reiwa symbol (㋿) was added to Enclosed CJK Letters and Months in Unicode 12.1, continuing from the existing era symbols in the (fully allocated by that point) CJK Compatibility block (Meiji ㍾, Taishō ㍽, Shōwa ㍼, Heisei ㍻).
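
As a rough illustration of how the era symbols relate to ordinary text, the sketch below expands each one with NFKC normalization. It assumes a Python build whose `unicodedata` tables include Unicode 12.1 or later (Python 3.8 and up); on older builds the Reiwa symbol is left unchanged. The names `ERA_SYMBOLS` and `expand` are illustrative.

```python
import unicodedata

# Era symbols named in the paragraph above, keyed by era name.
ERA_SYMBOLS = {
    "Meiji": "\u337E",
    "Taishō": "\u337D",
    "Shōwa": "\u337C",
    "Heisei": "\u337B",
    "Reiwa": "\u32FF",
}


def expand(symbol):
    """Expand a squared era symbol into its two underlying ideographs."""
    return unicodedata.normalize("NFKC", symbol)


def in_enclosed_cjk_block(symbol):
    """True if the symbol falls in Enclosed CJK Letters and Months (U+3200..U+32FF)."""
    return 0x3200 <= ord(symbol) <= 0x32FF


for era, symbol in ERA_SYMBOLS.items():
    print(era, f"U+{ord(symbol):04X}", expand(symbol), in_enclosed_cjk_block(symbol))
```

Only the Reiwa symbol reports membership in this block; the four older symbols sit in CJK Compatibility (U+3300..U+33FF).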

Block

Enclosed CJK Letters and Months [1] [2]
Official Unicode Consortium code chart (PDF)
(Chart grid: rows U+320x through U+32Fx, columns 0 through F. U+321F is the single unassigned cell.)
Notes
1. As of Unicode version 15.1
2. Grey area indicates non-assigned code point
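
A chart with this layout can be regenerated locally. The sketch below is one way to do it, assuming Python's standard `unicodedata` module; the function names are illustrative, and the result reflects whichever Unicode version the local Python build bundles, which may differ from 15.1. Unassigned code points are shown as a middle dot in place of the grey area.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x3200, 0x32FF  # block range as stated in the article


def chart_rows():
    """Yield (row_label, cells) for each 16-code-point row of the block."""
    for row_start in range(BLOCK_START, BLOCK_END + 1, 16):
        cells = []
        for cp in range(row_start, row_start + 16):
            ch = chr(cp)
            # General category "Cn" marks a code point that is unassigned
            # in the Unicode version bundled with this Python build.
            cells.append("·" if unicodedata.category(ch) == "Cn" else ch)
        yield f"U+{row_start >> 4:03X}x", cells


def unassigned():
    """List the unassigned code points of the block as U+XXXX strings."""
    return [
        f"U+{cp:04X}"
        for cp in range(BLOCK_START, BLOCK_END + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


if __name__ == "__main__":
    print("       " + " ".join("0123456789ABCDEF"))
    for label, cells in chart_rows():
        print(label, " ".join(cells))
    print("Unassigned:", ", ".join(unassigned()))
```

How the cells render depends on the fonts installed; the assignment data itself comes from the Unicode Character Database.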

Emoji

The Enclosed CJK Letters and Months block contains two emoji: U+3297 and U+3299. [5] [6]

The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation. [7]

Emoji variation sequences
U+                   3297             3299
base code point      ㊗               ㊙
base+VS15 (text)     U+3297 U+FE0E    U+3299 U+FE0E
base+VS16 (emoji)    U+3297 U+FE0F    U+3299 U+FE0F
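
The four sequences are simply each base character followed by one variation selector. A minimal sketch of building them, using only string concatenation (the names `BASES` and `variation_sequences` are illustrative):

```python
VS15 = "\uFE0E"  # variation selector 15: requests text presentation
VS16 = "\uFE0F"  # variation selector 16: requests emoji presentation

# The two emoji in the block: U+3297 and U+3299.
BASES = ("\u3297", "\u3299")


def variation_sequences(base):
    """Return the text-style and emoji-style sequences for one base character."""
    return {"text": base + VS15, "emoji": base + VS16}


for base in BASES:
    for style, seq in variation_sequences(base).items():
        code_points = " ".join(f"U+{ord(c):04X}" for c in seq)
        print(style, code_points, seq)
```

Whether the two styles actually look different depends on the renderer and the available fonts; the selector only expresses a preference.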

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Enclosed CJK Letters and Months block:


References

  1. "Unicode character database". The Unicode Standard. Retrieved 26 July 2023.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 26 July 2023.
  3. "Unicode 1.0.1 Addendum" (PDF). The Unicode Standard. 3 November 1992. Retrieved 9 July 2016.
  4. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.
  5. "UTR #51: Unicode Emoji". Unicode Consortium. 5 September 2023.
  6. "UCD: Emoji Data for UTR #51". Unicode Consortium. 1 February 2023.
  7. "UTS #51 Emoji Variation Sequences". The Unicode Consortium.
  8. "Notice: Unicode 1.0.1" (PDF). Unicode.