Small Form Variants

Range: U+FE50..U+FE6F (32 code points)
Plane: BMP
Scripts: Common
Symbol sets: Small punctuation
Assigned: 26 code points
Unused: 6 reserved code points
Source standards: CNS 11643
Unicode version history:
  1.0.0: 26 (+26)
Note: [1] [2]

Small Form Variants is a Unicode block containing small punctuation characters for compatibility with the Chinese National Standard CNS 11643. Its block name in Unicode 1.0 was simply Small Variants. [3]
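As a rough illustration of the block's layout and its compatibility role, the following Python sketch uses the standard unicodedata module to list the assigned code points in U+FE50..U+FE6F and to fold each one to its regular-size counterpart with NFKC normalization. The constant and function names are illustrative assumptions, and the result depends on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Block boundaries as stated in the article; the names are illustrative only.
BLOCK_START = 0xFE50
BLOCK_END = 0xFE6F


def assigned_small_forms():
    """Return the characters in the block that are assigned."""
    chars = []
    for cp in range(BLOCK_START, BLOCK_END + 1):
        ch = chr(cp)
        # Unassigned (reserved) code points have general category "Cn".
        if unicodedata.category(ch) != "Cn":
            chars.append(ch)
    return chars


def fold_small_form(ch):
    """Map a small form variant to its regular-size equivalent via NFKC."""
    return unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for ch in assigned_small_forms():
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  ->  {fold_small_form(ch)!r}")
```

NFKC is used here because the small forms carry compatibility decompositions; canonical normalization (NFC or NFD) would leave them unchanged.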


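Small Form Variants [1] [2]
Official Unicode Consortium code chart (PDF)
        0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
U+FE5x  ﹐   ﹑   ﹒   n/a  ﹔   ﹕   ﹖   ﹗   ﹘   ﹙   ﹚   ﹛   ﹜   ﹝   ﹞   ﹟
U+FE6x  ﹠   ﹡   ﹢   ﹣   ﹤   ﹥   ﹦   n/a  ﹨   ﹩   ﹪   ﹫   n/a  n/a  n/a  n/a
Notes
1. ^ As of Unicode version 13.0
2. ^ Grey areas in the official chart, shown here as "n/a", indicate non-assigned code points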

Related Research Articles

Dingbat (typographic symbol)

In typography, a dingbat is an ornament, character, or spacer used in typesetting, often employed for the creation of box frames. The term continues to be used in the computer industry to describe fonts that have symbols and shapes in the positions designated for alphabetical or numeric characters.

Geometric Shapes is a Unicode block of 96 symbols in the code point range U+25A0–U+25FF.

Miscellaneous Technical is a Unicode block ranging from U+2300 to U+23FF that contains common symbols used in technical, programming language, and academic professions.

Miscellaneous Symbols and Arrows is a Unicode block containing arrows and geometric shapes with various fills.

The Basic Latin or C0 Controls and Basic Latin Unicode block is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F and contains 128 characters: the C0 controls, ASCII punctuation and symbols, ASCII digits, the uppercase and lowercase letters of the English alphabet, and the DEL control character.
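The one-byte property can be checked directly. The sketch below is only an illustration, with a hypothetical helper name: it measures the UTF-8 length of a code point and compares Basic Latin with the first code point after it.

```python
def utf8_length(code_point):
    """Return the number of bytes UTF-8 uses to encode one code point."""
    return len(chr(code_point).encode("utf-8"))


if __name__ == "__main__":
    # Every code point in U+0000..U+007F should take exactly one byte.
    print(all(utf8_length(cp) == 1 for cp in range(0x00, 0x80)))
    # The first code point of the next block already needs two bytes.
    print(utf8_length(0x80))
```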

The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1, from 80 (U+0080) to FF (U+00FF), and contains 128 characters: the non-graphic C1 controls (U+0080–U+009F), Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters, and 2 mathematical operators.

Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.

Latin Extended-B is the fourth block (U+0180–U+024F) of the Unicode Standard. It has been included since version 1.0, where it was allocated only the code points U+0180..U+01FF and contained 113 characters. During unification with ISO 10646 for version 1.1, the block was expanded, and another 35 characters were added. In version 3.0 and later, the last 60 available code points in the block were assigned. Its block name in Unicode 1.0 was Extended Latin.

IPA Extensions is a block (0250–02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.

Enclosed Alphanumerics is a Unicode block of typographical symbols of an alphanumeric within a circle, a bracket or other not-closed enclosure, or ending in a full stop.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally also used for writing Coptic, using the similar Greek letters in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate Coptic block has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is the convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the Greek Extended character block.

Arabic Presentation Forms-A is a Unicode block encoding contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. This block also encodes 32 noncharacters in Unicode.

Arabic Extended-A is a Unicode block encoding Qur'anic annotations and letter variants used for various non-Arabic languages.

CJK Compatibility Ideographs is a Unicode block created to contain Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. Such encodings include the South Korean KS X 1001:1998, Taiwanese Big5, Japanese IBM 32, South Korean KS X 1001:2004, Japanese JIS X 0213, Japanese ARIB STD-B24 and the North Korean KPS 10721-2000 source standards.

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

CJK Compatibility Forms is a Unicode block containing vertical glyph variants for East Asian compatibility. Its block name in Unicode 1.0 was CNS 11643 Compatibility, in reference to CNS 11643.

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Halfwidth and Fullwidth Forms is a Unicode block (U+FF00–U+FFEF) provided so that older encodings containing both halfwidth and fullwidth characters can be translated to and from Unicode without loss. It is the last block of the Basic Multilingual Plane except for the short Specials block at U+FFF0–U+FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.
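To make the lossless-translation point concrete, here is a minimal sketch that assumes Shift JIS as the legacy encoding (the article does not name one) and uses Python's built-in shift_jis codec. The helper name is hypothetical. Because fullwidth and halfwidth forms have their own code points, text mixing both widths survives a round trip through the legacy encoding.

```python
def round_trips(text, encoding="shift_jis"):
    """Return True if text survives encoding to a legacy charset and back."""
    return text.encode(encoding).decode(encoding) == text


if __name__ == "__main__":
    # Fullwidth A (U+FF21), ASCII A, halfwidth KA (U+FF76), regular KA (U+30AB).
    sample = "\uff21A\uff76\u30ab"
    print(round_trips(sample))
    # The two widths stay distinct in the legacy bytes as well.
    for ch in sample:
        print(f"U+{ord(ch):04X} -> {ch.encode('shift_jis').hex()}")
```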

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.