Enclosed Alphanumerics

Range: U+2460..U+24FF (160 code points)
Plane: BMP
Scripts: Common
Assigned: 160 code points
Unused: 0 reserved code points
Unicode version history:
  1.0.0 (1991): 139 (+139)
  3.2 (2002): 159 (+20)
  4.0 (2003): 160 (+1)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]

Enclosed Alphanumerics is a Unicode block of typographical symbols, each consisting of an alphanumeric character within a circle, within brackets or another not-closed enclosure, or followed by a full stop.

The block is currently fully allocated. Within the Basic Multilingual Plane, a few additional enclosed numerals are in the Dingbats and the Enclosed CJK Letters and Months blocks. As of Unicode 6.0, there is also a block with more of these characters in the Supplementary Multilingual Plane, named Enclosed Alphanumeric Supplement (U+1F100–U+1F1FF).
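As a minimal sketch of how the two ranges quoted above might be used, the following Python snippet classifies a single character by block. The function name `enclosed_block` and the constant names are illustrative assumptions, not part of any Unicode library, and the Dingbats and Enclosed CJK Letters and Months blocks are deliberately left out.

```python
from typing import Optional

# Block ranges as quoted in the text; the names are illustrative only.
ENCLOSED_ALPHANUMERICS = range(0x2460, 0x24FF + 1)
ENCLOSED_ALPHANUMERIC_SUPPLEMENT = range(0x1F100, 0x1F1FF + 1)


def enclosed_block(ch: str) -> Optional[str]:
    """Return the block name for a single character, or None if it is in neither range."""
    cp = ord(ch)
    if cp in ENCLOSED_ALPHANUMERICS:
        return "Enclosed Alphanumerics"
    if cp in ENCLOSED_ALPHANUMERIC_SUPPLEMENT:
        return "Enclosed Alphanumeric Supplement"
    return None
```

For example, `enclosed_block("\u2460")` falls in the first range, while an ordinary letter such as `"A"` falls in neither and yields `None`.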

Purpose

Many of these characters were originally intended for use as bullets for lists. [3] The parenthesized forms are historically based on typewriter approximations of the circled versions. [3] Although these roles have been supplanted by styles and other markup in "rich text" contexts, the characters are included in the Unicode standard "for interoperability with the legacy East Asian character sets and for the occasional text context where such symbols otherwise occur." [3] The Unicode Standard considers these characters to be distinct from characters which are similar in form but specialized in purpose, such as the circled C, P or R characters which are defined as copyright and trademark symbols or the circled a used for an at sign. [3]
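The list-bullet use described above can be illustrated with a short Python sketch. It assumes the three numeral series for 1 through 20 start at U+2460 (circled), U+2474 (parenthesized), and U+2488 (full stop), as laid out in the code chart; the helper names `enclosed_number` and `bullet_list` are hypothetical.

```python
# Starting code points of the three 1-20 numeral series in the block.
SERIES_START = {
    "circled": 0x2460,        # CIRCLED DIGIT ONE
    "parenthesized": 0x2474,  # PARENTHESIZED DIGIT ONE
    "full_stop": 0x2488,      # DIGIT ONE FULL STOP
}


def enclosed_number(n: int, style: str = "circled") -> str:
    """Return the enclosed form of n (1 through 20) in the given style."""
    if not 1 <= n <= 20:
        raise ValueError("only 1 through 20 are encoded in these series")
    return chr(SERIES_START[style] + n - 1)


def bullet_list(items, style: str = "circled"):
    """Prefix each item with its enclosed number, starting from 1."""
    return [f"{enclosed_number(i, style)} {item}"
            for i, item in enumerate(items, start=1)]
```

Because each series stops at 20, longer lists would need a different numbering scheme, which is one reason styled markup has largely replaced this use.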

A circled s (Ⓢ) was used as a ditto mark in Malayalam-language documents printed circa 1900 by German missionaries, especially the Basel Mission. [4]

Block

Enclosed Alphanumerics [1]
Official Unicode Consortium code chart (PDF)
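         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
U+246x   ①  ②  ③  ④  ⑤  ⑥  ⑦  ⑧  ⑨  ⑩  ⑪  ⑫  ⑬  ⑭  ⑮  ⑯
U+247x   ⑰  ⑱  ⑲  ⑳  ⑴  ⑵  ⑶  ⑷  ⑸  ⑹  ⑺  ⑻  ⑼  ⑽  ⑾  ⑿
U+248x   ⒀  ⒁  ⒂  ⒃  ⒄  ⒅  ⒆  ⒇  ⒈  ⒉  ⒊  ⒋  ⒌  ⒍  ⒎  ⒏
U+249x   ⒐  ⒑  ⒒  ⒓  ⒔  ⒕  ⒖  ⒗  ⒘  ⒙  ⒚  ⒛  ⒜  ⒝  ⒞  ⒟
U+24Ax   ⒠  ⒡  ⒢  ⒣  ⒤  ⒥  ⒦  ⒧  ⒨  ⒩  ⒪  ⒫  ⒬  ⒭  ⒮  ⒯
U+24Bx   ⒰  ⒱  ⒲  ⒳  ⒴  ⒵  Ⓐ  Ⓑ  Ⓒ  Ⓓ  Ⓔ  Ⓕ  Ⓖ  Ⓗ  Ⓘ  Ⓙ
U+24Cx   Ⓚ  Ⓛ  Ⓜ  Ⓝ  Ⓞ  Ⓟ  Ⓠ  Ⓡ  Ⓢ  Ⓣ  Ⓤ  Ⓥ  Ⓦ  Ⓧ  Ⓨ  Ⓩ
U+24Dx   ⓐ  ⓑ  ⓒ  ⓓ  ⓔ  ⓕ  ⓖ  ⓗ  ⓘ  ⓙ  ⓚ  ⓛ  ⓜ  ⓝ  ⓞ  ⓟ
U+24Ex   ⓠ  ⓡ  ⓢ  ⓣ  ⓤ  ⓥ  ⓦ  ⓧ  ⓨ  ⓩ  ⓪  ⓫  ⓬  ⓭  ⓮  ⓯
U+24Fx   ⓰  ⓱  ⓲  ⓳  ⓴  ⓵  ⓶  ⓷  ⓸  ⓹  ⓺  ⓻  ⓼  ⓽  ⓾  ⓿
Notes
1. As of Unicode version 15.1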
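The 16-column layout of the code chart follows directly from the block's range, so the grid can be regenerated programmatically. The sketch below is one way to do that in Python; the function name `block_rows` is an assumption made for illustration, and how the characters render depends on the fonts available.

```python
def block_rows(start: int = 0x2460, end: int = 0x24FF):
    """Return (row label, 16 characters) pairs covering start..end inclusive."""
    rows = []
    for base in range(start, end + 1, 16):
        # The label drops the last hex digit, e.g. 0x2460 -> "U+246x".
        label = f"U+{base >> 4:03X}x"
        cells = [chr(cp) for cp in range(base, base + 16)]
        rows.append((label, cells))
    return rows


if __name__ == "__main__":
    print("        " + " ".join("0123456789ABCDEF"))
    for label, cells in block_rows():
        print(f"{label}  " + " ".join(cells))
```

With the default arguments this produces ten rows, from U+246x through U+24Fx, matching the block's 160 code points.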

Emoji

The Enclosed Alphanumerics block contains one emoji: U+24C2, the enclosed M used as a symbol for mask works. [5] [6]

It defaults to text presentation and has two standardized variation sequences, one specifying text presentation (U+FE0E VS15) and one specifying emoji presentation (U+FE0F VS16). [7]

Emoji variation sequences for U+24C2
base code point: U+24C2 (Ⓜ)
base+VS15 (text): U+24C2 U+FE0E
base+VS16 (emoji): U+24C2 U+FE0F
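The variation sequences listed above are simply the base character followed by a selector, which can be sketched in Python as below. The helper names `with_presentation` and `code_points` are illustrative assumptions; whether a renderer honors the selector depends on the platform and fonts and is outside the scope of this sketch.

```python
VS15 = "\uFE0E"       # VARIATION SELECTOR-15, requests text presentation
VS16 = "\uFE0F"       # VARIATION SELECTOR-16, requests emoji presentation
CIRCLED_M = "\u24C2"  # CIRCLED LATIN CAPITAL LETTER M


def with_presentation(base: str, emoji: bool) -> str:
    """Append VS16 for emoji presentation or VS15 for text presentation."""
    return base + (VS16 if emoji else VS15)


def code_points(s: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


if __name__ == "__main__":
    print(code_points(with_presentation(CIRCLED_M, emoji=False)))
    print(code_points(with_presentation(CIRCLED_M, emoji=True)))
```

Each resulting string is two code points long even though it is intended to display as a single symbol, which matters when counting characters or truncating text.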

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Enclosed Alphanumerics block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
1.0.0 | U+2460..24EA | 139 | | | (to be determined)
 | | | L2/11-438 [b] [c] | N4182 | Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429)
3.2 | U+24EB..24FE | 20 | L2/99-238 | | Consolidated document containing 6 Japanese proposals, 1999-07-15
 | | | | N2093 | Addition of medical symbols and enclosed numbers, 1999-09-13
 | | | L2/00-010 | N2103 | Umamaheswaran, V. S. (2000-01-05), "8.8", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13–16
 | | | L2/00-296 | N2256 | Sato, T. K. (2000-09-04), Circled Numbers in JIS X 0213
4.0 | U+24FF | 1 | L2/01-480 | | Muller, Eric (2001-12-14), Proposal to add NEGATIVE CIRCLED DIGIT ZERO
 | | | L2/02-193 | | Muller, Eric (2001-12-14), Proposal to add Negative Circled Digit Zero
 | | | L2/02-070 | | Moore, Lisa (2002-08-26), "NEGATIVE CIRCLED DIGIT ZERO", Minutes for UTC #90, Consensus: Accept the character NEGATIVE CIRCLED DIGIT ZERO at U+24FF.
  a. Proposed code points and character names may differ from final code points and names
  b. See also L2/10-458, L2/11-414, L2/11-415, and L2/11-429
  c. Refer to the history section of the Miscellaneous Symbols and Pictographs block for additional emoji-related documents

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. The Unicode Standard, 6.0.1
  4. Joseph Muliyil; M Krishnan (1904). "Contents". The New Malayalam Reader (in Malayalam). Mangalore: Basel Mission Book and Tract Repository. p. vii.
  5. "UTR #51: Unicode Emoji". Unicode Consortium. 2023-09-05.
  6. "UCD: Emoji Data for UTR #51". Unicode Consortium. 2023-02-01.
  7. "UTS #51 Emoji Variation Sequences". The Unicode Consortium.