Unicode symbols


In computing, a Unicode symbol is a Unicode character which is not part of a script used to write a natural language, but is nonetheless available for use as part of a text.


Many of the symbols are drawn from existing character sets or from ISO/IEC or other national and international standards. The Unicode Standard states that "The universe of symbols is rich and open-ended," but that in order to be considered, a symbol must have a "demonstrated need or strong desire to exchange in plain text." [1] This makes the question of which symbols to encode, and how to encode them, more complicated than the corresponding questions for writing systems. Unicode focuses on symbols that make sense in a one-dimensional plain-text context. For example, the typically two-dimensional arrangement of electronic diagram symbols justifies their exclusion. [2] (Box-drawing characters are a partial exception, for legacy purposes, and a number of electronic diagram symbols are indeed encoded in Unicode's Miscellaneous Technical block.) For adequate treatment in plain text, symbols must also be displayable in a monochromatic setting. Even with these limitations (monochromatic, one-dimensional and standards-based), the domain of potential Unicode symbols is extensive. However, emoji, the ideograms and graphic symbols that were admitted into Unicode, allow colors, although the colors are not standardized.
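As a rough illustration of how software can tell symbols from script characters, the sketch below uses the Unicode general category exposed by Python's standard `unicodedata` module. Treating the four "S" categories (Sm, Sc, Sk, So) as "symbols" is an assumption made for this example; it only approximates the definition given above, and the helper name `is_symbol` is hypothetical.

```python
import unicodedata

# General categories whose major class is "Symbol":
# Sm = math, Sc = currency, Sk = modifier, So = other.
SYMBOL_CATEGORIES = {"Sm", "Sc", "Sk", "So"}

def is_symbol(ch: str) -> bool:
    """Return True if the single character ch has a Symbol general category."""
    return unicodedata.category(ch) in SYMBOL_CATEGORIES

if __name__ == "__main__":
    for ch in ["A", "+", "$", "^", "\u2600", ","]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_symbol(ch))
```

Punctuation such as the comma falls in a separate "P" class, so it is not reported as a symbol by this check.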

Symbol block list

As of Unicode 15.0, there are 149,186 characters, [3] [4] including the following symbol blocks:

See also

Related Research Articles

<span class="mw-page-title-main">Emoji</span> Symbols often used as emotional cues in text

An emoji is a pictogram, logogram, ideogram or smiley embedded in text and used in electronic messages and web pages. The primary function of emoji is to fill in emotional cues otherwise missing from typed conversation. Examples of emoji are 😂, 😃, 🧘🏻‍♂️, 🌍, 🌦️, 🥖, 🚗, 📱, 🎉, ❤️, ✅, and 🏁. Emoji exist in various genres, including facial expressions, common objects, places and types of weather, and animals. They are much like emoticons, except emoji are pictures rather than typographic approximations; the term "emoji" in the strict sense refers to such pictures which can be represented as encoded characters, but it is sometimes applied to messaging stickers by extension. Originally meaning pictograph, the word emoji comes from Japanese e + moji; the resemblance to the English words emotion and emoticon is purely coincidental. The ISO 15924 script code for emoji is Zsye.
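Several of the examples above are not single characters but sequences of code points. The following minimal sketch lists the code points of a string; the escape sequence used for the meditating-person emoji (base character, skin-tone modifier, zero-width joiner, male sign, variation selector) is an assumption about how that example is encoded, and `code_points` is a hypothetical helper name.

```python
def code_points(text: str) -> list[str]:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Assumed encoding of the meditating-person example from the text.
meditating = "\U0001F9D8\U0001F3FB\u200D\u2642\uFE0F"
# Assumed encoding of the red-heart example: base character plus a variation selector.
red_heart = "\u2764\uFE0F"

if __name__ == "__main__":
    print(code_points(meditating))
    print(code_points(red_heart))
```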

Apple Symbols is a font introduced in Mac OS X 10.3 "Panther." It is a TrueType font intended to provide coverage for characters defined as symbols in the Unicode Standard. It continues to ship with Mac OS X as part of the default installation. Prior to Mac OS X 10.5, its path was /Library/Fonts/Apple Symbols.ttf. From Mac OS X 10.5 onward, it is found at /System/Library/Fonts/Apple Symbols.ttf, meaning it is now considered an essential part of the system software, not to be deleted by users.

Miscellaneous Symbols is a Unicode block (U+2600–U+26FF) containing glyphs representing concepts from a variety of categories: astrological, astronomical, chess, dice, musical notation, political symbols, recycling, religious symbols, trigrams, warning signs, and weather, among others.

Geometric Shapes is a Unicode block of 96 symbols at code point range U+25A0–U+25FF.

New Gulim (새굴림/SaeGulRim) is a sans-serif type Unicode font designed especially for the Korean-language script, designed by HanYang System Co., Limited. It is an expanded version of Hanyang Gulrim.

<span class="mw-page-title-main">Mathematical operators and symbols in Unicode</span>

The Unicode Standard encodes almost all standard characters used in mathematics. Unicode Technical Report #25 provides comprehensive information about the character repertoire, their properties, and guidelines for implementation. Mathematical operators and symbols are in multiple Unicode blocks. Some of these blocks are dedicated to, or primarily contain, mathematical characters while others are a mix of mathematical and non-mathematical characters. This article covers all Unicode characters with a derived property of "Math".

In the Unicode standard, a plane is a contiguous group of 65,536 (2^16) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 15.0, five of the planes have assigned code points (characters), and seven are named.
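Because each plane holds 2^16 code points, the plane number is the code point divided by 65,536, which is the same as the first two digits of the six-position hexadecimal form. A minimal sketch, with hypothetical helper names `plane_of` and `six_digit_label`:

```python
def plane_of(ch: str) -> int:
    """Return the plane number (0..16) of a single character."""
    return ord(ch) >> 16  # integer division by 2**16

def six_digit_label(ch: str) -> str:
    """Format the code point in the six-position form U+hhhhhh."""
    return f"U+{ord(ch):06X}"

if __name__ == "__main__":
    for ch in ["A", "\u2600", "\U0001F602", "\U0010FFFF"]:
        label = six_digit_label(ch)
        # The first two hex digits of the label equal the plane number.
        print(label, plane_of(ch), int(label[2:4], 16))
```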

Enclosed Alphanumerics is a Unicode block of typographical symbols of an alphanumeric within a circle, a bracket or other not-closed enclosure, or ending in a full stop.

Unicode contains a number of characters that represent various cultural, political, and religious symbols. Most, but not all, of these symbols are in the Miscellaneous Symbols block.

CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Miscellaneous Symbols and Pictographs is a Unicode block containing meteorological and astronomical symbols, emoji characters largely for compatibility with Japanese telephone carriers' implementations of Shift JIS, and characters originally from the Wingdings and Webdings fonts found in Microsoft Windows.

A variant form is a different glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

Dingbats is a Unicode block containing dingbats. Most of its characters were taken from Zapf Dingbats; it was the first Unicode block to have imported characters from a specific typeface. Unicode later adopted a policy that excluded symbols with "no demonstrated need or strong desire to exchange in plain text," and thus no further dingbat typefaces were encoded until Webdings and Wingdings were encoded in Version 7.0. Some ornaments are also emoji, having optional presentation variants.

Emoticons is a Unicode block containing emoticons or emoji. Most of them are intended as representations of faces, although some of them include hand gestures or non-human characters.

Variation Selectors is the block name of a Unicode code point block containing 16 variation selectors used to specify a glyph variant for a preceding character. They are currently used to specify standardized variation sequences for mathematical symbols, emoji symbols, 'Phags-pa letters, and CJK unified ideographs corresponding to CJK compatibility ideographs. At present only standardized variation sequences with VS1, VS2, VS3, VS15 and VS16 have been defined; VS15 and VS16 are reserved to request that a character should be displayed as text or as an emoji respectively.

Supplemental Symbols and Pictographs is a Unicode block containing emoji characters. It extends the set of symbols included in the Miscellaneous Symbols and Pictographs block. It also includes Typikon symbols.

Quivira is a serif Unicode typeface by Alexander Lange.

Symbols and Pictographs Extended-A is a Unicode block containing emoji characters. It extends the set of symbols included in the Supplemental Symbols and Pictographs block.

References

  1. "Section 22: Symbols" (PDF). The Unicode Standard. The Unicode Consortium. September 2022.
  2. "Section 22: Miscellaneous Technical" (PDF). The Unicode Standard. The Unicode Consortium. September 2022.
  3. "Unicode character database". The Unicode Standard. Retrieved 2020-03-15.
  4. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2020-03-15.