Vertical Forms

Range: U+FE10..U+FE1F (16 code points)
Plane: BMP
Scripts: Common
Symbol sets: Vertical punctuation
Assigned: 10 code points
Unused: 6 reserved code points
Source standards: GB 18030
Unicode version history: 4.1 (2005): 10 (+10)
Note: [1] [2]

Vertical Forms is a Unicode block containing vertical punctuation compatibility characters for the Chinese standard GB 18030.
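What "compatibility characters" means in practice can be sketched with Python's standard `unicodedata` module. The sketch assumes the Unicode database bundled with the interpreter, in which each assigned character of the block carries a `<vertical>` compatibility decomposition to an ordinary punctuation mark. The helper name `horizontal_equivalent` is hypothetical and chosen for illustration only.

```python
import unicodedata

# Assigned range of the block, U+FE10..U+FE19 (see the History table).
VERTICAL_FORMS = [chr(cp) for cp in range(0xFE10, 0xFE1A)]


def horizontal_equivalent(ch: str) -> str:
    """Return the NFKC (compatibility-normalized) form of a character."""
    return unicodedata.normalize("NFKC", ch)


for ch in VERTICAL_FORMS:
    # decomposition() returns the raw mapping string from the Unicode
    # database, e.g. a "<vertical>" tag followed by hex code points.
    mapping = unicodedata.decomposition(ch)
    print(f"U+{ord(ch):04X}  {mapping!r}  ->  {horizontal_equivalent(ch)!r}")
```

Note that NFKC normalization discards the vertical/horizontal distinction, so it is suitable for searching or comparison but not for round-tripping text to GB 18030.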

In the Unicode specification, U+FE18 PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET has a typo in its name: "BRACKET" is spelt "BRAKCET". [3]
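The misspelt name can be observed from code. The following is a minimal sketch using Python's `unicodedata` module, assuming Python 3.3 or later, where `unicodedata.lookup` also accepts name aliases. The variable names are illustrative.

```python
import unicodedata

ch = "\uFE18"

# The formal character name, including the misspelling.
formal_name = unicodedata.name(ch)
print(formal_name)

# The correctly spelt name is registered as a name alias, so looking it
# up resolves to the same code point as the formal name does.
corrected = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"
via_alias = unicodedata.lookup(corrected)
via_formal = unicodedata.lookup(formal_name)

print(f"U+{ord(via_alias):04X}", via_alias == via_formal)
```

Code that matches characters by name may therefore want to accept both spellings rather than rely on only one of them.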

Vertical Forms [1] [2]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+FE1x  ︐ ︑ ︒ ︓ ︔ ︕ ︖ ︗ ︘ ︙
Notes
1. ^ As of Unicode version 13.0
2. ^ Grey areas indicate non-assigned code points
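The split between assigned and non-assigned code points in the chart can be checked programmatically. This is a sketch only: it assumes the Unicode database shipped with the running Python interpreter, which may be newer than version 13.0, and it treats general category `Cn` (unassigned) as "reserved".

```python
import unicodedata

BLOCK_START, BLOCK_END = 0xFE10, 0xFE1F  # block range, inclusive

assigned = []
reserved = []
for cp in range(BLOCK_START, BLOCK_END + 1):
    # General category "Cn" marks a code point with no assigned character.
    if unicodedata.category(chr(cp)) == "Cn":
        reserved.append(cp)
    else:
        assigned.append(cp)

print("assigned:", [f"U+{cp:04X}" for cp in assigned])
print("reserved:", [f"U+{cp:04X}" for cp in reserved])
```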

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Vertical Forms block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
4.1 | U+FE10..FE19 | 10 | L2/03-411 | | Goldsmith, Deborah; Muller, Eric (2003-10-31), Unencoded chars in GB 18030 & HK-SCS
 | | | L2/04-161R | N2807 | Suignard, Michel; Muller, Eric; Jenkins, John (2004-06-17), HKSCS and GB 18030 PUA characters, background document
 | | | L2/04-263 | N2808 | Suignard, Michel (2004-06-17), HKSCS and GB 18030 PUA characters, request for additional characters and related information
 | | | L2/05-137 | | Freytag, Asmus (2005-05-10), Handling "defective" names
 | | | L2/05-108R | | Moore, Lisa (2005-08-26), "Consensus 103-C7", UTC #103 Minutes, Create a "Normative Name Alias" property and file in the UCD. Populate the property with names from the sections "Typos" and "Bad or misleading names" from document L2/05-137.
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. "4. Character Properties, Types of Character Name Aliases". The Unicode Standard, Version 13.0 (PDF). Mountain View, CA: Unicode, Inc. March 2020.