Unicode symbol

Last updated

In computing, a Unicode symbol is a Unicode character which is not part of a script used to write a natural language, but is nonetheless available for use as part of a text.

Contents

Many of the symbols are drawn from existing character sets or ISO/IEC or other national and international standards. The Unicode Standard states that "The universe of symbols is rich and open-ended," but that in order to be considered, a symbol must have a "demonstrated need or strong desire to exchange in plain text." [1] This makes the issue of what symbols to encode and how symbols should be encoded more complicated than the issues surrounding writing systems. Unicode focuses on symbols that make sense in a one-dimensional plain-text context. For example, the typical two-dimensional arrangement of electronic diagram symbols justifies their exclusion. [2] (Box-drawing characters are a partial exception, for legacy purposes, and a number of electronic diagram symbols are indeed encoded in Unicode's Miscellaneous Technical block.) For adequate treatment in plain text, symbols must also be displayable in a monochromatic setting. Even with these limitations  monochromatic, one-dimensional and standards-based  the domain of potential Unicode symbols is extensive. (However, emojis   ideograms, graphic symbols   that were admitted into Unicode, allow colors although the colors are not standardized.)

Symbol block list

There are 149,813 characters, with Unicode 15.1, [3] [4] including the following symbol blocks:

See also

Related Research Articles

An emoji is a pictogram, logogram, ideogram, or smiley embedded in text and used in electronic messages and web pages. The primary function of modern emoji is to fill in emotional cues otherwise missing from typed conversation as well as to replace words as part of a logographic system. Emoji exist in various genres, including facial expressions, expressions, activity, food and drinks, celebrations, flags, objects, symbols, places, types of weather, animals and nature.

<span class="mw-page-title-main">Bitstream Cyberbit</span> Unicode serif typeface

Bitstream Cyberbit is a commercial serif Unicode font designed by Bitstream Inc. It is freeware for non-commercial uses. It was one of the first widely available fonts to support a large portion of the Unicode repertoire.

Apple Symbols is a font introduced in Mac OS X 10.3 “Panther”. This is a TrueType font intended to provide coverage for characters defined as symbols in the Unicode Standard. It continues to ship with Mac OS X as part of the default installation. Prior to Mac OS X 10.5, its path was /Library/Fonts/Apple Symbols.ttf. From Mac OS X 10.5 onward, it is to be found at /System/Library/Fonts/Apple Symbols.ttf, meaning it is now considered an essential part of the system software, not to be deleted by users.

Miscellaneous Symbols is a Unicode block (U+2600–U+26FF) containing glyphs representing concepts from a variety of categories: astrological, astronomical, chess, dice, musical notation, political symbols, recycling, religious symbols, trigrams, warning signs, and weather, among others.

New Gulim (새굴림/SaeGulRim) is a sans-serif type Unicode font designed especially for the Korean-language script, designed by HanYang System Co., Limited. It is an expanded version of Hanyang Gulrim.

<span class="mw-page-title-main">Mathematical operators and symbols in Unicode</span> Blackboard at the Laurent Schwartz Center for Mathematics, École Polytechnique

The Unicode Standard encodes almost all standard characters used in mathematics. Unicode Technical Report #25 provides comprehensive information about the character repertoire, their properties, and guidelines for implementation. Mathematical operators and symbols are in multiple Unicode blocks. Some of these blocks are dedicated to, or primarily contain, mathematical characters while others are a mix of mathematical and non-mathematical characters. This article covers all Unicode characters with a derived property of "Math".

In the Unicode standard, a plane is a contiguous group of 65,536 (216) code points. There are 17 planes, identified by the numbers 0 to 16, which corresponds with the possible values 00–1016 of the first two positions in six position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 15.1, five of the planes have assigned code points (characters), and seven are named.

Enclosed Alphanumerics is a Unicode block of typographical symbols of an alphanumeric within a circle, a bracket or other not-closed enclosure, or ending in a full stop.

Unicode contains a number of characters that represent various cultural, political, and religious symbols. Most, but not all, of these symbols are in the Miscellaneous Symbols block.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Miscellaneous Symbols and Pictographs is a Unicode block containing meteorological and astronomical symbols, emoji characters largely for compatibility with Japanese telephone carriers' implementations of Shift JIS, and characters originally from the Wingdings and Webdings fonts found in Microsoft Windows.

A variant form is an alternate glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.

Dingbats is a Unicode block containing dingbats. Most of its characters were taken from Zapf Dingbats; it was the Unicode block to have imported characters from a specific typeface; Unicode later adopted a policy that excluded symbols with "no demonstrated need or strong desire to exchange in plain text", and thus no further dingbat typefaces were encoded until Webdings and Wingdings were encoded in Version 7.0. Some ornaments are also an emoji, having optional presentation variants.

Emoticons is a Unicode block containing emoticons or emoji. Most of them are intended as representations of faces, although some of them include hand gestures or non-human characters.

Transport and Map Symbols is a Unicode block containing transportation and map icons, largely for compatibility with Japanese telephone carriers' emoji implementations of Shift JIS, and to encode characters in the Wingdings and Wingdings 2 character sets.

Geometric Shapes Extended is a Unicode block containing Webdings/Wingdings symbols, mostly different weights of squares, crosses, and saltires, and different weights of variously spoked asterisks, stars, and various color squares and circles for emoji.

Supplemental Symbols and Pictographs is a Unicode block containing emoji characters. It extends the set of symbols included in the Miscellaneous Symbols and Pictographs block. It also includes Typikon symbols.

Symbols and Pictographs Extended-A is a Unicode block containing emoji characters. It extends the set of symbols included in the Supplemental Symbols and Pictographs block.

References

  1. "Section 22: Symbols" (PDF). The Unicode Standard. The Unicode Consortium. September 2022.
  2. "Section 22: Miscellaneous Technical" (PDF). The Unicode Standard. The Unicode Consortium. September 2022.
  3. "Unicode character database". The Unicode Standard. Retrieved 2020-03-15.
  4. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2020-03-15.