Combining Diacritical Marks Supplement

Range	U+1DC0..U+1DFF (64 code points)
Plane	BMP
Scripts	Inherited
Major alphabets	UPA
Symbol sets	Medieval letter diacritics
Assigned	63 code points
Unused	1 reserved code point
Unicode version history
4.1 (2005)	4 (+4)
5.0 (2006)	13 (+9)
5.1 (2008)	41 (+28)
5.2 (2009)	42 (+1)
6.0 (2010)	43 (+1)
7.0 (2014)	58 (+15)
9.0 (2016)	59 (+1)
10.0 (2017)	63 (+4)
Note: ^[1]^[2]

Last updated August 27, 2021

Combining Diacritical Marks Supplement is a Unicode block containing combining characters for the Uralic Phonetic Alphabet, Medievalist notations, and German dialectology (Teuthonista).^[3] It is an extension of the diacritic characters found in the Combining Diacritical Marks block.

Block

Combining Diacritical Marks Supplement ^[1]^[2] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+1DCx	◌᷀	◌᷁	◌᷂	◌᷃	◌᷄	◌᷅	◌᷆	◌᷇	◌᷈	◌᷉	◌᷊	◌᷋	◌᷌	◌᷍	◌᷎	◌᷏
U+1DDx	◌᷐	◌᷑	◌᷒	◌ᷓ	◌ᷔ	◌ᷕ	◌ᷖ	◌ᷗ	◌ᷘ	◌ᷙ	◌ᷚ	◌ᷛ	◌ᷜ	◌ᷝ	◌ᷞ	◌ᷟ
U+1DEx	◌ᷠ	◌ᷡ	◌ᷢ	◌ᷣ	◌ᷤ	◌ᷥ	◌ᷦ	◌ᷧ	◌ᷨ	◌ᷩ	◌ᷪ	◌ᷫ	◌ᷬ	◌ᷭ	◌ᷮ	◌ᷯ
U+1DFx	◌ᷰ	◌ᷱ	◌ᷲ	◌ᷳ	◌ᷴ	◌᷵	◌᷶	◌᷷	◌᷸	◌᷹		◌᷻	◌᷼	◌᷽	◌᷾	◌᷿
Notes
1. As of Unicode version 13.0
2. The empty cell at U+1DFA (a grey area in the original chart) indicates a non-assigned code point
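As a rough illustration of how the block's range and the combining behaviour of its characters might be checked in code, the sketch below uses Python's standard `unicodedata` module. The helper name `in_cdm_supplement` and the constants are hypothetical, chosen only for this example. Per-character properties depend on the Unicode version bundled with the interpreter, so the assignment status of individual code points may differ from the chart above.

```python
import unicodedata

# Block boundaries as given in the infobox: U+1DC0..U+1DFF (64 code points).
BLOCK_START = 0x1DC0
BLOCK_END = 0x1DFF


def in_cdm_supplement(ch: str) -> bool:
    """Return True if the single character ch lies in the block's range.

    Hypothetical helper for illustration; it checks the range only,
    not whether the code point is actually assigned.
    """
    return BLOCK_START <= ord(ch) <= BLOCK_END


# A combining mark is stored after the base character it modifies.
base = "a"
mark = "\u1DC4"  # a code point taken from the first row of the chart
combined = base + mark

# The string holds two code points even though it displays as one letter.
print(len(combined))

# A non-zero canonical combining class identifies a combining character.
print(unicodedata.combining(mark) != 0)

# Range membership for the mark and for the plain base letter.
print(in_cdm_supplement(mark), in_cdm_supplement(base))
```

The range test is deliberately separate from the `unicodedata` lookup: the block boundaries are fixed, whereas the set of assigned characters inside the block has grown from version to version.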

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Diacritical Marks Supplement block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
4.1	U+1DC0..1DC1	2	L2/02-031		Anderson, Deborah (2002-01-21), TLG Miscellanea Proposal
			L2/02-033		Anderson, Deborah (2002-01-21), TLG Unicode Proposal (draft)
			L2/02-053		Anderson, Deborah (2002-02-04), Description of TLG Documents
			L2/02-273		Pantelia, Maria (2002-07-31), TLG Unicode Proposal
			L2/02-287		Pantelia, Maria (2002-08-09), Proposal Summary Form accompanying TLG Unicode Proposal (L2/02-273)
			L2/02-312R		Pantelia, Maria (2002-11-07), Proposal to encode additional Greek editorial and punctuation characters in the UCS
			L2/03-324	N2642	Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS
			L2/04-132	N2740	Constable, Peter (2004-04-19), Proposal to add additional phonetic characters to the UCS
	U+1DC2	1	L2/03-190R		Constable, Peter (2003-06-08), Proposal to Encode Additional Phonetic Symbols in the UCS
			L2/04-047		Constable, Peter (2004-02-01), Revised Proposal to Encode Additional Phonetic Symbols in the UCS
			L2/04-132	N2740	Constable, Peter (2004-04-19), Proposal to add additional phonetic characters to the UCS
			L2/04-003R		Moore, Lisa (2004-05-17), "Additional Phonetic Symbols (B.14.13)", UTC #98 Minutes
	U+1DC3	1	L2/04-051		Anderson, Deborah (2004-01-29), Comments on 2619R Final Glagolitic proposal
			L2/04-171	N2763	Everson, Michael (2004-05-29), Proposal to add COMBINING GLAGOLITIC SUSPENSION MARK to the BMP of the UCS
5.0	U+1DC4..1DCA	7	L2/04-246R		Priest, Lorna (2004-07-26), Revised Proposal for Additional Latin Phonetic and Orthographic Characters
			L2/04-316		Moore, Lisa (2004-08-19), "C.6", UTC #100 Minutes
			L2/04-348	N2906	Priest, Lorna (2004-08-23), Revised Proposal for Additional Latin Phonetic and Orthographic Characters
	U+1DFE..1DFF	2	L2/05-189	N2958	Lehtiranta, Juhani; Ruppel, Klaas; Suutari, Toni; Trosterud, Trond (2005-07-22), Report on progress in implementing the Uralic Phonetic Alphabet with indication of the need for additional characters and symbols
			L2/05-261	N2989	Ruppel, Klaas; Kolehmainen, Erkki I.; Everson, Michael; Freytag, Asmus; Whistler, Ken (2005-09-13), Proposal to add six additional Uralicist characters to the UCS
			L2/05-270		Whistler, Ken (2005-09-21), "A. Uralicist character additions", WG2 Consent Docket (Sophia Antipolis)
			L2/05-279		Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes
				N2953 (pdf, doc)	Umamaheswaran, V. S. (2006-02-16), "7.4.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
5.1	U+1DCB..1DCC	2	L2/06-214	N3048	Proposal to encode two combining characters in the UCS, 2006-03-02
			L2/06-108		Moore, Lisa (2006-05-25), "Consensus 107-C35", UTC #107 Minutes
				N3103 (pdf, doc)	Umamaheswaran, V. S. (2006-08-25), "M48.17", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
	U+1DCD..1DE6	26	L2/05-183	N2957	Everson, Michael; Haugen, Odd Einar; Emiliano, António; Pedro, Susana; Grammel, Florian; Baker, Peter; Stötzner, Andreas; Dohnicht, Marcus; Luft, Diana (2005-08-02), Preliminary proposal to add medievalist characters to the UCS
			L2/06-027	N3027	Everson, Michael; Baker, Peter; Emiliano, António; Grammel, Florian; Haugen, Odd Einar; Luft, Diana; Pedro, Susana; Schumacher, Gerd; Stötzner, Andreas (2006-01-30), Proposal to add Medievalist characters to the UCS
			L2/06-049		Pedro, Susana (2006-01-31), Letter of support for Medievalist letters (L2/06-027)
			L2/06-048		Emiliano, Antonio (2006-02-02), Letter of support for Medievalist letters (L2/06-027)
			L2/06-008R2		Moore, Lisa (2006-02-13), "C.14", UTC #106 Minutes
				N2953 (pdf, doc)	Umamaheswaran, V. S. (2006-02-16), "7.4.6", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
			L2/06-074R	N3039R	Feedback on N3027 Proposal to add Medievalist Characters, 2006-03-16
			L2/06-101	N3060	Feedback on N3027 "Proposal to add medievalist characters to the UCS", 2006-03-27
			L2/06-116	N3077	Everson, Michael; Baker, Peter; Emiliano, António; Grammel, Florian; Haugen, Odd Einar; Luft, Diana; Pedro, Susana; Schumacher, Gerd; Stötzner, Andreas (2006-03-31), Response to UTC/US contribution N3037R, "Feedback on N3027 Proposal to add medievalist characters"
			L2/06-108		Moore, Lisa (2006-05-25), "Consensus 107-C36", UTC #107 Minutes
				N3103 (pdf, doc)	Umamaheswaran, V. S. (2006-08-25), "M48.14", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
			L2/06-318	N3160	Response to Project Editor's contribution N3146, "Draft disposition of comments on SC2 N3875 (PDAM text for Amendment 3.2 to ISO/IEC 10646:2003)", 2006-09-21
5.2	U+1DFD	1	L2/07-334R2	N3447	Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters
			L2/07-345		Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes
			L2/08-318	N3453 (pdf, doc)	Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52
6.0	U+1DFC	1	L2/09-028	N3571	Ruppel, Klaas; Aalto, Tero; Everson, Michael (2009-01-27), Proposal to encode additional characters for the Uralic Phonetic Alphabet
			L2/09-234	N3603 (pdf, doc)	Umamaheswaran, V. S. (2009-07-08), "M54.13g", Unconfirmed minutes of WG 2 meeting 54
			L2/09-104		Moore, Lisa (2009-05-20), "Consensus 119-C27", UTC #119 / L2 #216 Minutes
7.0	U+1DE7..1DF4	14	L2/08-428	N3555	Everson, Michael (2008-11-27), Exploratory proposal to encode Germanicist, Nordicist, and other phonetic characters in the UCS
			L2/10-346	N3907	Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2010-09-23), Preliminary proposal to encode "Teuthonista" phonetic characters in the UCS
			L2/11-137	N4031	Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2011-05-09), Proposal to encode "Teuthonista" phonetic characters in the UCS
			L2/11-203	N4082	Everson, Michael; et al. (2011-05-27), Support for "Teuthonista" encoding proposal
			L2/11-202	N4081	Everson, Michael; Dicklberger, Alois; Pentzlin, Karl; Wandl-Vogt, Eveline (2011-06-02), Revised proposal to encode "Teuthonista" phonetic characters in the UCS
			L2/11-240	N4106	Everson, Michael; Pentzlin, Karl (2011-06-09), Report on the ad hoc re "Teuthonista" (SC2/WG2 N4081) held during the SC2/WG2 meeting at Helsinki
			L2/11-261R2		Moore, Lisa (2011-08-16), "Consensus 128-C38", UTC #128 / L2 #225 Minutes, Approve 85 characters for German dialectology...
				N4103	"11.16 Teuthonista phonetic characters", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
			L2/12-269	N4296	Request to change the names of three Teuthonista characters under ballot, 2012-07-26
	U+1DF5	1	L2/12-209R	N4279R	Everson, Michael; Starner, David (2012-07-31), Proposal to add COMBINING UP TACK ABOVE to the UCS
			L2/12-239		Moore, Lisa (2012-08-14), "C.5", UTC #132 Minutes
9.0	U+1DFB	1	L2/12-349		Manandhar, Dev Dass; Karmacharya, Samir; Chitrakar, Bishnu (2012-10-29), Proposal for the Nepaalalipi script in the UCS
			L2/12-390		Anderson, Deborah (2012-11-08), Comparison between Newar and Nepaalalipi proposals (L2/12-003 and L2/12-349)
			L2/14-253		Anderson, Deborah (2014-10-06), Recommendations to UTC from Script Meeting in Nepal
			L2/14-250		Moore, Lisa (2014-11-10), "Consensus 141-C25", UTC #141 Minutes
			L2/14-285R3	N4660	Whistler, Ken (2014-12-04), Towards a Consensus Encoding of Newa
10.0	U+1DF6..1DF9	4	L2/15-173		Andreev, Aleksandr; Shardt, Yuri; Simmons, Nikita (2015-07-29), Proposal to Encode some Additional Symbols used in Church Slavonic Text
			L2/15-187		Moore, Lisa (2015-08-11), "E.2", UTC #144 Minutes
				N4739	"M64.06", Unconfirmed minutes of WG 2 meeting 64, 2016-08-31
↑ Proposed code points and character names may differ from final code points and names
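The counts in the history table can be cross-checked against the per-version totals in the infobox. The following is a minimal sketch under the assumption that the ranges in the "Final code points" column are complete; the names `RANGES` and `cumulative_totals` are hypothetical and used only for this illustration.

```python
# (version, first, last) for each range in the "Final code points" column.
RANGES = [
    ("4.1", 0x1DC0, 0x1DC1),
    ("4.1", 0x1DC2, 0x1DC2),
    ("4.1", 0x1DC3, 0x1DC3),
    ("5.0", 0x1DC4, 0x1DCA),
    ("5.0", 0x1DFE, 0x1DFF),
    ("5.1", 0x1DCB, 0x1DCC),
    ("5.1", 0x1DCD, 0x1DE6),
    ("5.2", 0x1DFD, 0x1DFD),
    ("6.0", 0x1DFC, 0x1DFC),
    ("7.0", 0x1DE7, 0x1DF4),
    ("7.0", 0x1DF5, 0x1DF5),
    ("9.0", 0x1DFB, 0x1DFB),
    ("10.0", 0x1DF6, 0x1DF9),
]


def cumulative_totals(ranges):
    """Return {version: running total of code points assigned so far}."""
    totals = {}
    running = 0
    for version, first, last in ranges:
        running += last - first + 1  # ranges are inclusive at both ends
        totals[version] = running    # later rows of a version overwrite earlier ones
    return totals


totals = cumulative_totals(RANGES)

# Every code point covered by some range, and whatever the block has left over.
assigned = {cp for _, first, last in RANGES for cp in range(first, last + 1)}
unassigned = [cp for cp in range(0x1DC0, 0x1DFF + 1) if cp not in assigned]

for version, total in totals.items():
    print(version, total)
print([f"U+{cp:04X}" for cp in unassigned])
```

Tallied this way, the ranges give the same running totals as the version history in the infobox and leave a single code point of the block uncovered.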

Related Research Articles

In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks.

Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.

As of Unicode version 13.0, Cyrillic script is encoded across several blocks, all in the BMP.

Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the character "Combining Grapheme Joiner", which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context. Its block name in Unicode 1.0 was Generic Diacritical Marks.

Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista).

Unicode supports several phonetic scripts and notations through the existing writing systems and the addition of extra blocks with phonetic characters. These phonetic extras are derived from an existing script, usually Latin, Greek or Cyrillic. In Unicode there is no "IPA script". Apart from IPA, extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.

Phonetic Extensions is a Unicode block containing phonetic characters used in the Uralic Phonetic Alphabet, Old Irish phonetic notation, the Oxford English dictionary and American dictionaries, and Americanist and Russianist phonetic notations. Its character set is continued in the following Unicode block, Phonetic Extensions Supplement.

GNU FreeFont is a family of free OpenType, TrueType and WOFF vector fonts, implementing as much of the Universal Character Set (UCS) as possible, aside from the very large CJK Asian character set. The project was initiated in 2002 by Primož Peterlin and is now maintained by Steve White.

Phonetic Extensions Supplement is a Unicode block containing characters for specialized and deprecated forms of the International Phonetic Alphabet.

Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.

Macron below, U+0331 ◌̱ COMBINING MACRON BELOW, is a combining diacritical mark that is used in various orthographies.

In the Unicode standard, a plane is a contiguous group of 65,536 (2¹⁶) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10₁₆ of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 13.0, seven of the planes have assigned code points (characters), and five are named.
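Because each plane spans 2¹⁶ code points, the plane number is simply the code point value with its low 16 bits discarded. The sketch below illustrates that arithmetic; the function name `plane_of` is hypothetical and chosen only for this example.

```python
PLANE_SIZE = 1 << 16          # 65,536 code points per plane
LAST_CODE_POINT = 0x10FFFF    # last code point of plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains code_point."""
    if not 0 <= code_point <= LAST_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point!r}")
    # Dropping the low 16 bits is the same as integer division by 65,536.
    return code_point >> 16


def six_digit(code_point: int) -> str:
    """Format a code point in the six-position form U+hhhhhh."""
    return f"U+{code_point:06X}"


for cp in (0x1DC0, 0xFFFF, 0x10000, LAST_CODE_POINT):
    # The first two hex digits of the six-position form equal the plane number.
    print(six_digit(cp), plane_of(cp))
```

For example, U+1DC0, the first code point of the Combining Diacritical Marks Supplement block, falls in plane 0, the BMP.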

Combining Half Marks is a Unicode block containing diacritic mark parts for spanning multiple characters.

The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). Controls C1 (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.

IPA Extensions is a block (0250–02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.

Teuthonista is a phonetic transcription system used predominantly for the transcription of (High) German dialects. It is very similar to other Central European transcription systems from the early 20th century. The base characters are mostly based on the Latin alphabet, which can be modified by various diacritics.

Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally also used for writing Coptic, using the similar Greek letters in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate Coptic block has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is the convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the Greek Extended character block.

Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.

Combining Diacritical Marks Extended is a Unicode block containing diacritical marks used in German dialectology (Teuthonista).

References

1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
3. Everson, Michael; Dicklberger, Alois; Pentzlin, Karl; Wandl-Vogt, Eveline (2011-06-02). "Revised proposal to encode "Teuthonista" phonetic characters in the UCS" (PDF).

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
