Combining Diacritical Marks for Symbols

Range	U+20D0..U+20FF (48 code points)
Plane	BMP
Scripts	Inherited
Assigned	33 code points
Unused	15 reserved code points
Unicode version history
1.0.0 (1991)	18 (+18)
3.0 (1999)	20 (+2)
3.2 (2002)	27 (+7)
4.1 (2005)	28 (+1)
5.0 (2006)	32 (+4)
5.1 (2008)	33 (+1)

Last updated June 13, 2024

Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.
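As a minimal sketch of how such a mark modifies a symbol, the Python snippet below appends three code points from this block to ordinary base characters and inspects them with the standard `unicodedata` module. The variable names are illustrative only, and how each sequence is drawn depends on font and renderer support.

```python
import unicodedata

# Three marks from the block (names as reported by the Unicode database).
ENCLOSING_CIRCLE = "\u20dd"   # COMBINING ENCLOSING CIRCLE
RIGHT_ARROW_ABOVE = "\u20d7"  # COMBINING RIGHT ARROW ABOVE
ENCLOSING_KEYCAP = "\u20e3"   # COMBINING ENCLOSING KEYCAP

# A combining mark is placed after the base character it modifies.
circled_a = "A" + ENCLOSING_CIRCLE
vector_v = "v" + RIGHT_ARROW_ABOVE
keycap_1 = "1" + ENCLOSING_KEYCAP

for text in (circled_a, vector_v, keycap_1):
    mark = text[1]
    print(
        text,
        f"U+{ord(mark):04X}",
        unicodedata.name(mark),
        unicodedata.category(mark),  # Mn = nonspacing mark, Me = enclosing mark
        len(text),                   # the string still holds two code points
    )
```

Each result remains a two-code-point sequence; the mark is not merged into the base character in the string itself.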

Block

Combining Diacritical Marks for Symbols^[1]^[2]
Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+20Dx	◌⃐	◌⃑	◌⃒	◌⃓	◌⃔	◌⃕	◌⃖	◌⃗	◌⃘	◌⃙	◌⃚	◌⃛	◌⃜	◌⃝	◌⃞	◌⃟
U+20Ex	◌⃠	◌⃡	◌⃢	◌⃣	◌⃤	◌⃥	◌⃦	◌⃧	◌⃨	◌⃩	◌⃪	◌⃫	◌⃬	◌⃭	◌⃮	◌⃯
U+20Fx	◌⃰
Notes
1. As of Unicode version 15.1
2. Cells left blank (U+20F1..U+20FF, shown grey in the original chart) are unassigned code points
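The assigned and reserved counts can be cross-checked programmatically. The sketch below, assuming Python's bundled `unicodedata` database reflects Unicode 5.1 or later, walks the block's range and separates named code points from unnamed ones. The constant and list names are illustrative, and the result depends on the Unicode version shipped with the interpreter.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x20D0, 0x20FF  # range given for this block

assigned, reserved = [], []
for cp in range(BLOCK_START, BLOCK_END + 1):
    # unicodedata.name() returns the supplied default (None) for code points
    # that have no name; within this block those are the reserved ones.
    name = unicodedata.name(chr(cp), None)
    (assigned if name else reserved).append(cp)

print(f"{len(assigned)} assigned, {len(reserved)} reserved")
for cp in assigned:
    print(f"U+{cp:04X}  {unicodedata.name(chr(cp))}")
```

With a database at Unicode 15.1, this should agree with the figures above: 33 assigned and 15 reserved code points.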

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Diacritical Marks for Symbols block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
1.0.0	U+20D0..20E1	18			(to be determined)
			L2/06-181		Anderson, Deborah (2006-05-08), Responses to the UTC regarding L2/06-042, Proposal for Additional Cyrillic Characters
			L2/06-108		Moore, Lisa (2006-05-25), "Action item 107-A94", UTC #107 Minutes, Add an annotation to the names list for U+20DD COMBINING ENCLOSING CIRCLE.
3.0	U+20E2..20E3	2	L2/97-206	N1668	Proposal to encode two symbols, 1997-08-05
			L2/98-007	N1668R (pdf)	Joint proposal to encode two symbols, 1998-02-13
			L2/98-039		Aliprand, Joan; Winkler, Arnold (1998-02-24), "3.C.3. Cartouche proposal for keyboard symbols", Preliminary Minutes - UTC #74 & L2 #171, Mountain View, CA - December 5, 1997
			L2/98-082	N1668R (doc)	Joint proposal to encode enclosing screen and keycap, 1998-03-23
			L2/98-286	N1703	Umamaheswaran, V. S.; Ksar, Mike (1998-07-02), "8.4", Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20
			L2/98-321	N1905	Revised text of 10646-1/FPDAM 23, AMENDMENT 23: Bopomofo Extended and other characters, 1998-10-22
			L2/99-010	N1903 (pdf, html, doc)	Umamaheswaran, V. S. (1998-12-30), "6.7.6", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25
			L2/17-086		Burge, Jeremy; et al. (2017-03-27), Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component
			L2/17-103		Moore, Lisa (2017-05-18), "E.1.7 Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component", UTC #151 Minutes
3.2	U+20E4	1	L2/98-056		McGowan, Rick; Sampson, Geoffrey (1998-02-23), Triangular Overlay Character
			L2/98-070		Aliprand, Joan; Winkler, Arnold, "4.C.1.", Minutes of the joint UTC and L2 meeting from the meeting in Cupertino, February 25-27, 1998
			L2/99-021	N1941	McGowan, Rick (1998-12-07), Request for Addition of Triangular Overlay Character
			L2/99-077.1	N1975	Irish Comments on SC 2 N 3210, 1999-01-20
			L2/98-419 (pdf, doc)		Aliprand, Joan (1999-02-05), "Enclosing Triangle", Approved Minutes -- UTC #78 & NCITS Subgroup L2 # 175 Joint Meeting, San Jose, CA -- December 1-4, 1998
			L2/99-176R		Moore, Lisa (1999-11-04), "Motion 80-M20", Minutes from the joint UTC/L2 meeting in Seattle, June 8-10, 1999
			L2/99-232	N2003	Umamaheswaran, V. S. (1999-08-03), "7.2.1.2", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15
	U+20E5..20E8	4	L2/00-119^[b]	N2191R	Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode
			L2/00-234	N2203 (rtf, txt)	Umamaheswaran, V. S. (2000-07-21), "8.18", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24
			L2/00-115R2		Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83
	U+20E9..20EA	2	L2/99-010	N1903 (pdf, html, doc)	Umamaheswaran, V. S. (1998-12-30), "6.7.6", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25
			L2/01-142^[b]	N2336	Beeton, Barbara; Freytag, Asmus; Ion, Patrick (2001-04-02), Additional Mathematical Symbols
			L2/01-156	N2356	Freytag, Asmus (2001-04-03), Additional Mathematical Characters (Draft 10)
			L2/01-344	N2353 (pdf, doc)	Umamaheswaran, V. S. (2001-09-09), "7.7 Mathematical Symbols", Minutes from SC2/WG2 meeting #40 -- Mountain View, April 2001
4.1	U+20EB	1	L2/03-194	N2590	Freytag, Asmus (2003-06-09), Additional Mathematical and Letterlike Characters
			L2/04-196	N2653 (pdf, doc)	Umamaheswaran, V. S. (2004-06-04), "RESOLUTION M44.5 (Additions of individual characters), item g", Unconfirmed minutes of WG 2 meeting 44
5.0	U+20EC..20EF	4	L2/04-406		Freytag, Asmus; Sargent, Murray; Beeton, Barbara; Carlisle, David (2004-11-15), Progress report on Mathematical Symbols
			L2/04-410		Freytag, Asmus (2004-11-18), Twenty six mathematical characters
5.1	U+20F0	1	L2/07-011R	N3198R	Freytag, Asmus; Beeton, Barbara; Ion, Patrick; Sargent, Murray; Carlisle, David; Pournader, Roozbeh (2007-01-15), 29 Additional Mathematical and Symbol Characters
			L2/07-015		Moore, Lisa (2007-02-08), "Mathematical Characters and Symbols (C.4)", UTC #110 Minutes
			L2/07-268	N3253 (pdf, doc)	Umamaheswaran, V. S. (2007-07-26), "M50.16", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27
[a] Proposed code points and character names may differ from final code points and names.
[b] Refer to the history section of the Miscellaneous Mathematical Symbols-B block for additional math-related documents.
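The per-version counts in the history table can be reconciled with the cumulative totals in the version history summary. The following sketch transcribes the code point ranges from the table above and tallies them; `ADDITIONS` and `cumulative_totals` are hypothetical names chosen for illustration.

```python
# (version, first, last) ranges transcribed from the history table above
ADDITIONS = [
    ("1.0.0", 0x20D0, 0x20E1),
    ("3.0", 0x20E2, 0x20E3),
    ("3.2", 0x20E4, 0x20E4),
    ("3.2", 0x20E5, 0x20E8),
    ("3.2", 0x20E9, 0x20EA),
    ("4.1", 0x20EB, 0x20EB),
    ("5.0", 0x20EC, 0x20EF),
    ("5.1", 0x20F0, 0x20F0),
]


def cumulative_totals(additions):
    """Return (version, added, running_total) tuples in table order."""
    per_version = {}
    for version, first, last in additions:
        # Ranges are inclusive, so a range holds last - first + 1 code points.
        per_version[version] = per_version.get(version, 0) + (last - first + 1)
    totals, running = [], 0
    for version, added in per_version.items():  # dicts keep insertion order
        running += added
        totals.append((version, added, running))
    return totals


for version, added, total in cumulative_totals(ADDITIONS):
    print(f"{version}: {total} (+{added})")
```

The three ranges listed under version 3.2 sum to 7 additions, which accounts for the step from 20 to 27 code points in that release.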

References

[1] "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
[2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
[3] "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. Version 1.0. Unicode Consortium.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.