Combining Diacritical Marks for Symbols

Range: U+20D0..U+20FF (48 code points)
Plane: BMP
Scripts: Inherited
Assigned: 33 code points
Unused: 15 reserved code points
Unicode version history:
1.0.0 (1991): 18 (+18)
3.0 (1999): 20 (+2)
3.2 (2002): 27 (+7)
4.1 (2005): 28 (+1)
5.0 (2006): 32 (+4)
5.1 (2008): 33 (+1)
Note: [1] [2]

Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.

Its block name in Unicode 1.0 was simply Diacritical Marks for Symbols. [3]
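
As a minimal sketch of how these marks modify a symbol, the Python snippet below appends a mark from this block to a base character. The helper name `apply_mark` and the base characters "v" and "A" are illustrative assumptions, not part of the Unicode Standard; only the standard-library `unicodedata` module is used.

```python
import unicodedata

# Two marks from this block; the base characters used below are
# arbitrary illustrative choices.
RIGHT_ARROW_ABOVE = "\u20D7"  # COMBINING RIGHT ARROW ABOVE
ENCLOSING_CIRCLE = "\u20DD"   # COMBINING ENCLOSING CIRCLE


def apply_mark(base: str, mark: str) -> str:
    """Hypothetical helper: return the base followed by a combining mark.

    In Unicode text a combining mark follows the character it modifies.
    """
    if unicodedata.category(mark) not in ("Mn", "Me"):
        raise ValueError(f"U+{ord(mark):04X} is not a nonspacing or enclosing mark")
    return base + mark


vector_v = apply_mark("v", RIGHT_ARROW_ABOVE)
circled_a = apply_mark("A", ENCLOSING_CIRCLE)

# Each result is still two code points; the combined appearance is
# produced by the renderer and font, not by the encoding.
print(vector_v, [f"U+{ord(c):04X}" for c in vector_v])
print(circled_a, [f"U+{ord(c):04X}" for c in circled_a])

# These sequences have no precomposed equivalent, so NFC normalization
# leaves them as base + mark.
print(unicodedata.normalize("NFC", vector_v) == vector_v)
```

How well the combined glyph is drawn depends on font and renderer support, so the printed output may look different from one environment to another.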

Block

Combining Diacritical Marks for Symbols [1] [2]
Official Unicode Consortium code chart (PDF)
       0 1 2 3 4 5 6 7 8 9 A B C D E F
U+20Dx ◌⃐ ◌⃑ ◌⃒ ◌⃓ ◌⃔ ◌⃕ ◌⃖ ◌⃗ ◌⃘ ◌⃙ ◌⃚ ◌⃛ ◌⃜ ◌⃝ ◌⃞ ◌⃟
U+20Ex ◌⃠ ◌⃡ ◌⃢ ◌⃣ ◌⃤ ◌⃥ ◌⃦ ◌⃧ ◌⃨ ◌⃩ ◌⃪ ◌⃫ ◌⃬ ◌⃭ ◌⃮ ◌⃯
U+20Fx ◌⃰
Notes
1. As of Unicode version 15.1
2. Grey areas in the original chart indicate non-assigned code points (here, U+20F1..U+20FF)
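
The counts in the chart can be checked against a local Unicode database. The following is a rough sketch using Python's standard `unicodedata` module; the function name `assigned_in_block` is a hypothetical helper. It treats a code point as assigned when it has a character name, which is a reasonable proxy for this block, and the result reflects whichever Unicode version the interpreter bundles.

```python
import unicodedata

# Block range as given in this article.
BLOCK_START, BLOCK_END = 0x20D0, 0x20FF


def assigned_in_block(start: int, end: int):
    """Hypothetical helper: list (code point, name, category) for named characters."""
    found = []
    for cp in range(start, end + 1):
        # unicodedata.name returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            found.append((cp, name, unicodedata.category(chr(cp))))
    return found


assigned = assigned_in_block(BLOCK_START, BLOCK_END)
total = BLOCK_END - BLOCK_START + 1

print(f"Unicode data version: {unicodedata.unidata_version}")
print(f"{len(assigned)} assigned, {total - len(assigned)} unassigned, {total} total")
for cp, name, category in assigned:
    print(f"U+{cp:04X}  {category}  {name}")
```

The category column distinguishes nonspacing marks (Mn) from enclosing marks (Me), such as U+20DD COMBINING ENCLOSING CIRCLE.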

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Diacritical Marks for Symbols block:

Documents are grouped by version, final code points [a], and count. Each entry gives the L2 ID, the WG2 ID where one exists, and the document.

Version 1.0.0, U+20D0..20E1 (18 code points):
  (to be determined)
  L2/06-181: Anderson, Deborah (2006-05-08), Responses to the UTC regarding L2/06-042, Proposal for Additional Cyrillic Characters
  L2/06-108: Moore, Lisa (2006-05-25), "Action item 107-A94", UTC #107 Minutes, Add an annotation to the names list for U+20DD COMBINING ENCLOSING CIRCLE.

Version 3.0, U+20E2..20E3 (2 code points):
  L2/97-206, N1668: Proposal to encode two symbols, 1997-08-05
  L2/98-007, N1668R (pdf): Joint proposal to encode two symbols, 1998-02-13
  L2/98-039: Aliprand, Joan; Winkler, Arnold (1998-02-24), "3.C.3. Cartouche proposal for keyboard symbols", Preliminary Minutes - UTC #74 & L2 #171, Mountain View, CA - December 5, 1997
  L2/98-082, N1668R (doc): Joint proposal to encode enclosing screen and keycap, 1998-03-23
  L2/98-286, N1703: Umamaheswaran, V. S.; Ksar, Mike (1998-07-02), "8.4", Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20
  L2/98-321, N1905: Revised text of 10646-1/FPDAM 23, AMENDMENT 23: Bopomofo Extended and other characters, 1998-10-22
  L2/99-010, N1903 (pdf, html, doc): Umamaheswaran, V. S. (1998-12-30), "6.7.6", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25
  L2/17-086: Burge, Jeremy; et al. (2017-03-27), Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component
  L2/17-103: Moore, Lisa (2017-05-18), "E.1.7 Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component", UTC #151 Minutes

Version 3.2, U+20E4 (1 code point):
  L2/98-056: McGowan, Rick; Sampson, Geoffrey (1998-02-23), Triangular Overlay Character
  L2/98-070: Aliprand, Joan; Winkler, Arnold, "4.C.1.", Minutes of the joint UTC and L2 meeting from the meeting in Cupertino, February 25-27, 1998
  L2/99-021, N1941: McGowan, Rick (1998-12-07), Request for Addition of Triangular Overlay Character
  L2/99-077.1, N1975: Irish Comments on SC 2 N 3210, 1999-01-20
  L2/98-419 (pdf, doc): Aliprand, Joan (1999-02-05), "Enclosing Triangle", Approved Minutes -- UTC #78 & NCITS Subgroup L2 # 175 Joint Meeting, San Jose, CA -- December 1-4, 1998
  L2/99-176R: Moore, Lisa (1999-11-04), "Motion 80-M20", Minutes from the joint UTC/L2 meeting in Seattle, June 8-10, 1999
  L2/99-232, N2003: Umamaheswaran, V. S. (1999-08-03), "7.2.1.2", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15

Version 3.2, U+20E5..20E8 (4 code points):
  L2/00-119 [b], N2191R: Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode
  L2/00-234, N2203 (rtf, txt): Umamaheswaran, V. S. (2000-07-21), "8.18", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24
  L2/00-115R2: Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83

Version 3.2, U+20E9..20EA (2 code points):
  L2/99-010, N1903 (pdf, html, doc): Umamaheswaran, V. S. (1998-12-30), "6.7.6", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25
  L2/01-142 [b], N2336: Beeton, Barbara; Freytag, Asmus; Ion, Patrick (2001-04-02), Additional Mathematical Symbols
  L2/01-156, N2356: Freytag, Asmus (2001-04-03), Additional Mathematical Characters (Draft 10)
  L2/01-344, N2353 (pdf, doc): Umamaheswaran, V. S. (2001-09-09), "7.7 Mathematical Symbols", Minutes from SC2/WG2 meeting #40 -- Mountain View, April 2001

Version 4.1, U+20EB (1 code point):
  L2/03-194, N2590: Freytag, Asmus (2003-06-09), Additional Mathematical and Letterlike Characters
  L2/04-196, N2653 (pdf, doc): Umamaheswaran, V. S. (2004-06-04), "RESOLUTION M44.5 (Additions of individual characters), item g", Unconfirmed minutes of WG 2 meeting 44

Version 5.0, U+20EC..20EF (4 code points):
  L2/04-406: Freytag, Asmus; Sargent, Murray; Beeton, Barbara; Carlisle, David (2004-11-15), Progress report on Mathematical Symbols
  L2/04-410: Freytag, Asmus (2004-11-18), Twenty six mathematical characters

Version 5.1, U+20F0 (1 code point):
  L2/07-011R, N3198R: Freytag, Asmus; Beeton, Barbara; Ion, Patrick; Sargent, Murray; Carlisle, David; Pournader, Roozbeh (2007-01-15), 29 Additional Mathematical and Symbol Characters
  L2/07-015: Moore, Lisa (2007-02-08), "Mathematical Characters and Symbols (C.4)", UTC #110 Minutes
  L2/07-268, N3253 (pdf, doc): Umamaheswaran, V. S. (2007-07-26), "M50.16", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27

  a. Proposed code points and character names may differ from final code points and names
  b. Refer to the history section of the Miscellaneous Mathematical Symbols-B block for additional math-related documents

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. version 1.0. Unicode Consortium.