Supplemental Punctuation

Range: U+2E00..U+2E7F (128 code points)
Plane: BMP
Scripts: Common
Assigned: 94 code points
Unused: 34 reserved code points

Unicode version history (version, year: cumulative assigned code points, change)
4.1 (2005): 26 (+26)
5.1 (2008): 49 (+23)
5.2 (2009): 50 (+1)
6.1 (2012): 60 (+10)
7.0 (2014): 67 (+7)
9.0 (2016): 69 (+2)
10.0 (2017): 74 (+5)
11.0 (2018): 79 (+5)
12.0 (2019): 80 (+1)
13.0 (2020): 83 (+3)
14.0 (2021): 94 (+11)

Unicode documentation: Code chart, Web page
Note: [1] [2]

Supplemental Punctuation is a Unicode block containing historic and specialized punctuation characters, including biblical editorial symbols, ancient Greek punctuation, and German dictionary marks.
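
Because the block is a single contiguous range, membership can be tested with a plain code point comparison. The following is a minimal sketch, assuming only the range U+2E00..U+2E7F stated for this block; the function name `in_supplemental_punctuation` and the constants are illustrative and do not come from any library.

```python
# Block boundaries as given for Supplemental Punctuation.
BLOCK_START = 0x2E00
BLOCK_END = 0x2E7F


def in_supplemental_punctuation(ch: str) -> bool:
    """Return True if the single character ch lies in U+2E00..U+2E7F."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    return BLOCK_START <= ord(ch) <= BLOCK_END


# U+2E2E REVERSED QUESTION MARK is inside the block; "?" (U+003F) is not.
print(in_supplemental_punctuation("\u2E2E"))
print(in_supplemental_punctuation("?"))

# Size of the range, counting both endpoints.
print(BLOCK_END - BLOCK_START + 1)
```

The check says nothing about whether a code point is assigned; it only reports whether the code point falls inside the block's range.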


Additional punctuation characters are in the General Punctuation block and scattered across dozens of other Unicode blocks.

Block

Supplemental Punctuation [1] [2]
Official Unicode Consortium code chart (PDF)
         0 1 2 3 4 5 6 7 8 9 A B C D E F
U+2E0x
U+2E1x
U+2E2x
U+2E3x   2M (U+2E3A), 3M (U+2E3B), ⸿ (U+2E3F)
U+2E4x
U+2E5x
U+2E6x
U+2E7x
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points
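
The split between assigned and non-assigned code points can be inspected programmatically. The sketch below assumes Python's standard `unicodedata` module, which reports the general category `Cn` for unassigned code points; the helper name `split_block` is hypothetical. The counts it prints depend on the Unicode version bundled with the interpreter, so they may differ from the figures quoted for version 15.1.

```python
import unicodedata


def split_block(start: int = 0x2E00, end: int = 0x2E7F):
    """Partition a code point range into assigned and reserved lists.

    A code point is treated as reserved when its general category is
    "Cn" (unassigned) in the interpreter's Unicode database.
    """
    assigned, reserved = [], []
    for cp in range(start, end + 1):
        if unicodedata.category(chr(cp)) == "Cn":
            reserved.append(cp)
        else:
            assigned.append(cp)
    return assigned, reserved


assigned, reserved = split_block()

# Report the database version alongside the counts, since the
# result is only meaningful relative to that version.
print("Unicode database:", unicodedata.unidata_version)
print("assigned:", len(assigned), "reserved:", len(reserved))
```

Printing `unicodedata.unidata_version` next to the counts makes it clear which edition of the standard the numbers reflect.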

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Supplemental Punctuation block:

See also


References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.