Enclosed CJK Letters and Months

Range: U+3200..U+32FF (256 code points)
Plane: BMP
Scripts: Hangul (62 char.), Katakana (47 char.), Common (146 char.)
Assigned: 255 code points
Unused: 1 reserved code point
Source standards: ARIB STD-B24
Unicode version history:
1.0.0 (1991): 191 (+191)
1.0.1 (1992): 190 (-1)
1.1 (1993): 202 (+12)
3.2 (2002): 232 (+30)
4.0 (2003): 241 (+9)
4.1 (2005): 242 (+1)
5.2 (2009): 254 (+12)
12.1 (2019): 255 (+1)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]
In Unicode 1.0.1, during the process of unifying with ISO 10646, one character from the Enclosed CJK Letters and Months block was relocated to the CJK Symbols and Punctuation block, and the encircled katakana letters were re-arranged. [3]

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares (representing speed limit signs).

Its block name in Unicode 1.0 was Enclosed CJK Letters and Ideographs. [4] As part of the process of unification with ISO 10646 for version 1.1, Unicode version 1.0.1 relocated the Japanese Industrial Standard Symbol from the code point U+32FF at the end of the block to U+3004, and re-arranged the encircled katakana letters (U+32D0–U+32FE) from iroha order to gojūon order. [3]

The Reiwa symbol (㋿) was added to Enclosed CJK Letters and Months in Unicode 12.1. It continues the series of era symbols (Meiji ㍾, Taishō ㍽, Shōwa ㍼, Heisei ㍻) encoded in the CJK Compatibility block, which was fully allocated by that point.

Block

Enclosed CJK Letters and Months [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+320x
U+321x
U+322x
U+323x
U+324x
U+325x
U+326x
U+327x
U+328x
U+329x
U+32Ax
U+32Bx
U+32Cx
U+32Dx
U+32Ex
U+32Fx
Notes
1. As of Unicode version 15.1
2. Grey area indicates non-assigned code point
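
The layout of the chart can be regenerated from the block range alone. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper names (`is_assigned`, `chart_rows`, `format_chart`, `unassigned_code_points`) are illustrative and not part of any Unicode tooling. Which cells count as assigned depends on the Unicode version bundled with the Python build, so the result may differ from the version 15.1 chart on older interpreters.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x3200, 0x32FF  # block range given in the article


def is_assigned(cp):
    """True if the bundled Unicode database assigns a character to cp."""
    return unicodedata.category(chr(cp)) != "Cn"  # "Cn" = unassigned


def chart_rows(start=BLOCK_START, end=BLOCK_END):
    """Return (label, cells) per 16-code-point row; unassigned cells are None."""
    rows = []
    for base in range(start, end + 1, 16):
        cells = [chr(cp) if is_assigned(cp) else None
                 for cp in range(base, base + 16)]
        rows.append((f"U+{base >> 4:03X}x", cells))
    return rows


def format_chart(rows):
    """Render the rows as plain text, marking unassigned cells with '--'."""
    header = "       " + " ".join(f"{col:X} " for col in range(16))
    lines = [header]
    for label, cells in rows:
        lines.append(label + " " + " ".join(c if c else "--" for c in cells))
    return "\n".join(lines)


def unassigned_code_points(start=BLOCK_START, end=BLOCK_END):
    """List the code points in the range that have no assigned character."""
    return [cp for cp in range(start, end + 1) if not is_assigned(cp)]
```

Calling `print(format_chart(chart_rows()))` prints one line per row label from U+320x to U+32Fx, and `unassigned_code_points()` lists the cells that the chart would shade grey.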

Emoji

The Enclosed CJK Letters and Months block contains two emoji: U+3297 and U+3299. [5] [6]

The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation. [7]

Emoji variation sequences
U+                  3297   3299
base code point     ㊗     ㊙
base+VS15 (text)    ㊗︎     ㊙︎
base+VS16 (emoji)   ㊗️     ㊙️
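
The four sequences are simply each base character followed by one variation selector. A short sketch in Python, with illustrative names (`EMOJI_BASES`, `variation_sequences`, `code_points`) that are assumptions of this example rather than anything defined by Unicode:

```python
VS15 = "\uFE0E"  # variation selector 15: request text presentation
VS16 = "\uFE0F"  # variation selector 16: request emoji presentation

# The two emoji in the block, as stated in the article.
EMOJI_BASES = ("\u3297", "\u3299")


def variation_sequences(base):
    """Return the text-style and emoji-style sequences for one base character."""
    return {"text": base + VS15, "emoji": base + VS16}


def code_points(s):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


# All four standardized sequences for the block, keyed by (base, style).
ALL_SEQUENCES = {
    (base, style): seq
    for base in EMOJI_BASES
    for style, seq in variation_sequences(base).items()
}
```

For instance, `code_points(variation_sequences("\u3297")["emoji"])` gives `U+3297 U+FE0F`. Whether a renderer honors the selector depends on the font and platform.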

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Enclosed CJK Letters and Months block:

Version, Final code points [note 1], Count, L2 ID, WG2 ID, Document
Version 1.0.0: U+3200..321C, 3220..3243, 3260..327B, 327F..32B0, 32D0..32FE (190 code points)
(to be determined)
L2/11-438 [note 2] [note 3] N4182 Edberg, Peter (22 December 2011), Emoji Variation Sequences (Revision of L2/11-429)
Version 1.1: U+32C0..32CB (12 code points)
(to be determined)
Version 3.2: U+3251..325F, 32B1..32BF (30 code points)
L2/99-238 Consolidated document containing 6 Japanese proposals, 15 July 1999
N2093 Addition of medical symbols and enclosed numbers, 13 September 1999
L2/00-010 N2103 Umamaheswaran, V. S. (5 January 2000), "8.8", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13--16
L2/00-296 N2256 Sato, T. K. (4 September 2000), Circled Numbers in JIS X 0213
Version 4.0: U+321D..321E, 3250, 327C..327D, 32CC..32CF (9 code points)
L2/99-353 N2056 "3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5, 29 July 1999
L2/99-380 Proposal for a New Work item (NP) to amend the Korean part in ISO/IEC 10646-1:1993, 7 December 1999
L2/99-380.3 Annex B, Special characters compatible with KPS 9566-97 (To be extended), 7 December 1999
L2/00-084 N2182 "3", Amendment of the part concerning the Korean characters in ISO/IEC 10646-1:1998 amendment 5 (Cover page and outline of proposal L2/99-380), 7 December 1999
L2/99-382 Whistler, Ken (9 December 1999), "2.3", Comments to accompany a U.S. NO vote on JTC1 N5999, SC2 N3393, New Work item proposal (NP) for an amendment of the Korean part of ISO/IEC 10646-1:1993
L2/00-066 N2170 "3", The technical justification of the proposal to amend the Korean character part of ISO/IEC 10646-1 (proposed addition of 79 symbolic characters), 10 February 2000
L2/00-073 N2167 Karlsson, Kent (2 March 2000), Comments on DPRK New Work Item proposal on Korean characters
L2/00-285 N2244 Proposal for the Addition of 82 Symbols to ISO/IEC 10646-1:2000, 10 August 2000
L2/00-291 Everson, Michael (30 August 2000), Comments to Korean proposals (L2/00-284 - 289)
N2282 Report of the meeting of the Korean script ad hoc group, 21 September 2000
L2/01-349 N2374R Proposal to add of 70 symbols to ISO/IEC 10646-1:2000, 3 September 2001
L2/01-387 N2390 Kim, Kyongsok (13 October 2001), ROK's Comments about DPRK's proposal, WG2 N 2374, to add 70 symbols to ISO/IEC 10646-1:2000
L2/01-388 N2392 Kim, Kyongsok (16 October 2001), A Report of Korean Script ad hoc group meeting on Oct. 15, 2001
L2/01-420 Whistler, Ken (30 October 2001), "f. Miscellaneous symbol additions from DPRK standard", WG2 (Singapore) Resolution Consent Docket for UTC
L2/01-458 N2407 Umamaheswaran, V. S. (16 November 2001), Request to Korean ad hoc group to generate mapping tables between ROK and DPRK national standards
L2/02-372 N2453 Umamaheswaran, V. S. (30 October 2002), "M42.14 item j", Unconfirmed minutes of WG 2 meeting 42
Version 4.1: U+327E (1 code point)
L2/04-267 N2815 Ahn, Dae Hyuk (18 June 2004), Proposal to add Postal Code Mark to BMP of UCS
N2753 "9.9", Unconfirmed minutes of WG 2 meeting 45; IBM Software Lab, Markham, Ontario, Canada; 2004-06-21/24, 26 December 2004
Version 5.2: U+3244..324F (12 code points)
N3353 Umamaheswaran, V. S. (10 October 2007), "M51.32", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27
L2/07-259 Suignard, Michel (2 August 2007), Japanese TV Symbols
L2/07-391 N3341 Suignard, Michel (18 September 2007), Japanese TV Symbols
L2/08-077R2 N3397 Suignard, Michel (11 March 2008), Japanese TV symbols
L2/08-128 Iancu, Laurențiu (22 March 2008), Names and allocation of some Japanese TV symbols from N3397
L2/08-158 Pentzlin, Karl (16 April 2008), Comments on L2/08-077R2 "Japanese TV Symbols"
L2/08-188 N3468 Sekiguchi, Masahiro (22 April 2008), Collected comments on Japanese TV Symbols (WG2 N3397)
L2/08-077R3 N3469 Suignard, Michel (23 April 2008), Japanese TV symbols
L2/08-215 Pentzlin, Karl (7 May 2008), Comments on L2/08-077R2 "Japanese TV Symbols"
L2/08-289 Pentzlin, Karl (5 August 2008), Proposal to rename and reassign some Japanese TV Symbols from L2/08-077R3
L2/08-292 Stötzner, Andreas (6 August 2008), Improvement suggestions for n3469
L2/08-307 Scherer, Markus (8 August 2008), Feedback on the Japanese TV Symbols Proposal (L2/08-077R3)
L2/08-318 N3453 Umamaheswaran, V. S. (13 August 2008), "M52.14", Unconfirmed minutes of WG 2 meeting 52
L2/08-161R2 Moore, Lisa (5 November 2008), "Consensus 115-C17", UTC #115 Minutes, Approve 186 Japanese TV symbols for encoding in a future version of the standard.
Version 12.1: U+32FF (1 code point)
N4953 "9.3.27", Unconfirmed minutes of WG 2 meeting 66, 23 March 2018
L2/17-429 Orita, Tetsuji (19 December 2017), Request to reserve the code point for square Japanese new era name (SC2 N4577)
L2/18-039 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Cook, Richard (19 January 2018), "22. CJK", Recommendations to UTC #154 January 2018 on Script Proposals
L2/18-007 Moore, Lisa (19 March 2018), "C.8", UTC #154 Minutes
L2/18-115 Moore, Lisa (9 May 2018), "C.8", UTC #155 Minutes
N4949 Update on SC2 N4577 "Request to reserve the code point for square Japanese new era name", 23 May 2018
L2/18-220 Whistler, Ken (16 July 2018), Unicode 12.1 Planning Considerations
L2/18-183 Moore, Lisa (20 November 2018), "B.13.3.1 Unicode 12.1 planning considerations", UTC #156 Minutes
N5020 Umamaheswaran, V. S. (11 January 2019), "10.3.9 Code point for Square Japanese New Era Name", Unconfirmed minutes of WG 2 meeting 67
L2/19-008 Moore, Lisa (8 February 2019), "B.13.4 Unicode V12.1", UTC #158 Minutes
L2/19-094 Orita, Tetsuji (1 April 2019), Announcement of Japanese new era name
  1. Proposed code points and characters names may differ from final code points and names
  2. See also L2/10-458, L2/11-414, L2/11-415, and L2/11-429
  3. Refer to the history section of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents

References

  1. "Unicode character database". The Unicode Standard. Retrieved 26 July 2023.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 26 July 2023.
  3. "Unicode 1.0.1 Addendum" (PDF). The Unicode Standard. 3 November 1992. Retrieved 9 July 2016.
  4. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.
  5. "UTR #51: Unicode Emoji". Unicode Consortium. 5 September 2023.
  6. "UCD: Emoji Data for UTR #51". Unicode Consortium. 1 February 2023.
  7. "UTS #51 Emoji Variation Sequences". The Unicode Consortium.
  8. "Notice: Unicode 1.0.1" (PDF). Unicode.