Hangul (obsolete Unicode block)

Hangul, Hangul Supplementary-A, and Hangul Supplementary-B were character blocks that existed in Unicode 1.0 and 1.1, and ISO/IEC 10646-1:1993. These blocks encoded precomposed modern Hangul syllables. These three Unicode 1.x blocks were deleted and superseded by the new Hangul Syllables block (U+AC00–U+D7AF) in Unicode 2.0 (July 1996) and ISO/IEC 10646-1:1993 Amd. 5 (1998), and are now occupied by CJK Unified Ideographs Extension A and Yijing Hexagram Symbols. Moving or removing existing characters has been prohibited by the Unicode Stability Policy for all versions following Unicode 2.0, so the Hangul Syllables block introduced in Unicode 2.0 is immutable.

Documentation

The Unicode 1.0.0 code chart is still available online, including the Korean Hangul Syllables block, but not the supplements added in Unicode 1.1. [1] Full code charts for Unicode 1.1 were "never created", since Unicode 1.1 was published only as a report amending Unicode 1.0 due to the urgency of releasing it; [2] however, full code charts for ISO/IEC 10646-1:1993 were available, covering all three blocks. [3]

Data for mapping between Unicode 1.1, Unicode 2.0 and other hangul encodings has been supplied by the Unicode Consortium. [4] This data is archived as historic, but contains errors; an errata document is also supplied which corrects the mappings with reference to decompositions from the Unicode Character Database for Unicode 1.1.5, [5] which is itself also available. [6] However, the Unicode 1.1.5 data itself contains some errors; corrected data with reference to the ISO/IEC 10646-1:1993 code charts and the source standards is documented in the Unicode Technical Committee document UTC L2/17-080. [3]

Korean Hangul Syllables block

Korean Hangul Syllables
Range U+3400..U+3D2D
(2,350 code points)
Plane BMP
Scripts Hangul
Status Deleted prior to the release of Unicode 2.0
Now occupied by CJK Unified Ideographs Extension A
Source standards KS C 5601-1987
Unicode version history
1.0.0 (1991) 2,350 (+2,350)
2.0 (1996) 0 (-2,350)
Chart
Code chart
Note: Block deleted in Unicode 2.0, with characters moved to Hangul Syllables block.

Hangul (U+3400–U+3D2D), [7] also called Korean Hangul Syllables, [1] consisted of 2,350 syllables from KS C 5601-1987 (now KS X 1001). This block was encoded from Unicode 1.0.0 and included in the main code chart (without character names) [1] but not in the block charts (which included character names). [8]
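As a rough illustration of how the 2,350 code points relate to the source standard, the sketch below enumerates the KS X 1001 hangul syllables through Python's `euc_kr` codec and pairs them with old code points starting at U+3400. It assumes that the block listed the syllables in KS order with no gaps, which the text above does not state; the helper names are hypothetical, and the published mapping data and its errata remain the reference for any real conversion.

```python
# Sketch only: assumes the Unicode 1.x block followed KS X 1001 order from U+3400.
KS_FIRST_LEAD = 0xB0   # EUC-KR lead byte of the first hangul row in KS X 1001
KS_FIRST_TRAIL = 0xA1  # EUC-KR trail byte of the first cell in a row
KS_ROWS = 25           # hangul rows in KS X 1001
KS_COLS = 94           # cells per row
OLD_BASE = 0x3400      # first code point of the Unicode 1.x block


def ks_syllable(index: int) -> str:
    """Return the KS X 1001 hangul syllable at a 0-based index (0..2349)."""
    if not 0 <= index < KS_ROWS * KS_COLS:
        raise ValueError("index outside the 2,350 KS X 1001 hangul syllables")
    row, col = divmod(index, KS_COLS)
    return bytes([KS_FIRST_LEAD + row, KS_FIRST_TRAIL + col]).decode("euc_kr")


def old_code_point_to_syllable(code_point: int) -> str:
    """Map a Unicode 1.x code point (U+3400..U+3D2D) to a syllable, under the stated assumption."""
    return ks_syllable(code_point - OLD_BASE)


if __name__ == "__main__":
    for cp in (0x3400, 0x3D2D):
        syllable = old_code_point_to_syllable(cp)
        print(f"U+{cp:04X} -> {syllable} (U+{ord(syllable):04X})")
```

The row and column arithmetic gives 25 × 94 = 2,350 cells, which matches the size of the block.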

Korean Hangul Syllables [note 1] [note 2]
Unicode Consortium code chart (obsolete) (PDF) [6] [3]
[Code chart: rows U+340x to U+3D2x, columns 0 to F. The individual cells are not reproduced here.]
Notes
1. As of Unicode version 1.1. Characters in the chart are shown by means of equivalent code points in Unicode 2.0 and all subsequent versions.
2. U+3D2E and U+3D2F in the final row lie outside the block, since its boundaries (unusually) were not aligned to multiples of 16.

Hangul Supplementary-A block

Hangul Supplementary-A
Range U+3D2E..U+44B7
(1,930 code points)
Plane BMP
Scripts Hangul
Status Deleted prior to the release of Unicode 2.0
Now occupied by CJK Unified Ideographs Extension A
Source standards KS C 5657-1991
Unicode version history
1.1 (1993) 1,930 (+1,930)
2.0 (1996) 0 (-1,930)
Note: Block deleted in Unicode 2.0, with characters moved to Hangul Syllables block.

Hangul Supplementary-A (U+3D2E–U+44B7) [7] consisted of 1,930 syllables from KS C 5657-1991 (now KS X 1002).

Hangul Supplementary-A [note 1] [note 2]
References: [6] [3]
[Code chart: rows U+3D2x to U+44Bx, columns 0 to F. The individual cells are not reproduced here.]
Notes
1. As of Unicode version 1.1. Characters in the chart are shown by means of equivalent code points in Unicode 2.0 and all subsequent versions.
2. U+3D20 to U+3D2D in the first row and U+44B8 to U+44BF in the final row lie outside the block, since its boundaries (unusually) were not aligned to multiples of 16.

Hangul Supplementary-B block

Hangul Supplementary-B
Range U+44B8..U+4DFF
(2,376 code points)
Plane BMP
Scripts Hangul
Status Deleted prior to the release of Unicode 2.0
Now occupied by CJK Unified Ideographs Extension A, Yijing Hexagram Symbols
Source standards GB 12052-89 (U+44B8–U+44BD only)
Unicode version history
1.1 (1993) 2,376 (+2,376)
2.0 (1996) 0 (-2,376)
Note: Block deleted in Unicode 2.0, with characters moved to Hangul Syllables block.

Hangul Supplementary-B (U+44B8–U+4DFF) [7] consisted of six syllables from GB 12052-89 (U+44B8–U+44BD) and the first 2,370 syllables that are not in the aforementioned three sets (U+44BE–U+4DFF).
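The ranges and sizes quoted for the three blocks can be cross-checked with a little arithmetic. The sketch below confirms that each size matches its range, that the blocks are contiguous, and reports which present-day block covers a given code point. The present-day ranges (CJK Unified Ideographs Extension A at U+3400–U+4DBF and Yijing Hexagram Symbols at U+4DC0–U+4DFF) come from the current Unicode block definitions rather than from the text above, and the function names are hypothetical.

```python
# Unicode 1.x hangul blocks: (name, first, last, size), as quoted in the text.
OLD_BLOCKS = [
    ("Korean Hangul Syllables", 0x3400, 0x3D2D, 2350),
    ("Hangul Supplementary-A", 0x3D2E, 0x44B7, 1930),
    ("Hangul Supplementary-B", 0x44B8, 0x4DFF, 2376),
]

# Blocks occupying the same code points today (ranges not taken from the text).
CURRENT_BLOCKS = [
    ("CJK Unified Ideographs Extension A", 0x3400, 0x4DBF),
    ("Yijing Hexagram Symbols", 0x4DC0, 0x4DFF),
]


def check_old_blocks() -> int:
    """Verify each size against its range and that the blocks are contiguous; return the total."""
    total = 0
    expected_first = OLD_BLOCKS[0][1]
    for name, first, last, size in OLD_BLOCKS:
        assert first == expected_first, f"gap or overlap before {name}"
        assert last - first + 1 == size, f"size mismatch for {name}"
        expected_first = last + 1
        total += size
    return total


def old_block(code_point: int):
    """Return the Unicode 1.x hangul block containing code_point, or None."""
    for name, first, last, _size in OLD_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


def current_block(code_point: int):
    """Return the present-day block covering code_point, or None."""
    for name, first, last in CURRENT_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


if __name__ == "__main__":
    print("total old code points:", check_old_blocks())
    for cp in (0x3400, 0x3D2E, 0x44B8, 0x4DC0):
        print(f"U+{cp:04X}: was {old_block(cp)}, now {current_block(cp)}")
```

Together the three blocks span U+3400 to U+4DFF, which is 6,656 code points in total.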

Hangul Supplementary-B [note 1] [note 2]
References: [6] [3]
[Code chart: rows U+44Bx to U+4DFx, columns 0 to F. The individual cells are not reproduced here.]
Notes
1. As of Unicode version 1.1. Characters in the chart are shown by means of equivalent code points in Unicode 2.0 and all subsequent versions.
2. U+44B0 to U+44B7 in the first row lie outside the block, since its boundaries (unusually) were not aligned to multiples of 16.

References

  1. "3.7: Code Charts" (PDF). The Unicode Standard. Version 1.0. Unicode Consortium.
  2. "Unicode 1.1". Unicode Technical Site. Unicode Consortium.
  3. Chung, Jaemin (2017-03-29). "Informative document about three pre-Unicode-2.0 modern hangul syllables" (PDF).
  4. Chang, K. D.; Choi, In Sook; Kim, Jung Ho (1995-10-04). "Korean Hangul Encoding Conversion Table".
  5. "Notes and corrections for HANGUL.TXT". 2005-10-13.
  6. "Unicode 1.1.5 data". 1995-07-05.
  7. "Appendix E: Block Names" (PDF). The Unicode Standard. Version 1.1. Unicode Consortium.
  8. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. Version 1.0. Unicode Consortium.