Combining Diacritical Marks Supplement | |
---|---|
Range | U+1DC0..U+1DFF (64 code points) |
Plane | BMP |
Scripts | Inherited |
Major alphabets | UPA |
Symbol sets | Medieval letter diacritics |
Assigned | 63 code points |
Unused | 1 reserved code point |
Unicode version history | |
4.1 (2005) | 4 (+4) |
5.0 (2006) | 13 (+9) |
5.1 (2008) | 41 (+28) |
5.2 (2009) | 42 (+1) |
6.0 (2010) | 43 (+1) |
7.0 (2014) | 58 (+15) |
9.0 (2016) | 59 (+1) |
10.0 (2017) | 63 (+4) |
Note: [1] [2] |
Combining Diacritical Marks Supplement is a Unicode block containing combining characters for the Uralic Phonetic Alphabet, Medievalist notations, and German dialectology (Teuthonista). [3] It is an extension of the diacritic characters found in the Combining Diacritical Marks block.
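As a rough illustration, membership in the block reduces to a range check on the code point. The sketch below assumes Python's standard `unicodedata` module; the helper name `in_cdm_supplement` is hypothetical, and the number of marks reported depends on the Unicode version bundled with the interpreter, so it may not match the figures given here.

```python
import unicodedata

# Block boundaries as stated in the infobox: U+1DC0..U+1DFF.
BLOCK_START = 0x1DC0
BLOCK_END = 0x1DFF


def in_cdm_supplement(ch: str) -> bool:
    """Return True if the single character lies inside the block's range.

    Hypothetical helper for illustration; not part of any library.
    """
    return BLOCK_START <= ord(ch) <= BLOCK_END


# The block spans 64 code points in total.
block_size = BLOCK_END - BLOCK_START + 1

# Count the code points that the bundled Unicode database classifies as
# nonspacing marks (general category "Mn"). The result varies with the
# interpreter's Unicode version.
assigned_marks = sum(
    1
    for cp in range(BLOCK_START, BLOCK_END + 1)
    if unicodedata.category(chr(cp)) == "Mn"
)

print(block_size, assigned_marks, unicodedata.unidata_version)
```

Printing `unicodedata.unidata_version` alongside the count makes it clear which edition of the standard the result reflects.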
Combining Diacritical Marks Supplement [1] [2] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
U+1DCx | ◌᷀ | ◌᷁ | ◌᷂ | ◌᷃ | ◌᷄ | ◌᷅ | ◌᷆ | ◌᷇ | ◌᷈ | ◌᷉ | ◌᷊ | ◌᷋ | ◌᷌ | ◌᷍ | ◌᷎ | ◌᷏ |
U+1DDx | ◌᷐ | ◌᷑ | ◌᷒ | ◌ᷓ | ◌ᷔ | ◌ᷕ | ◌ᷖ | ◌ᷗ | ◌ᷘ | ◌ᷙ | ◌ᷚ | ◌ᷛ | ◌ᷜ | ◌ᷝ | ◌ᷞ | ◌ᷟ |
U+1DEx | ◌ᷠ | ◌ᷡ | ◌ᷢ | ◌ᷣ | ◌ᷤ | ◌ᷥ | ◌ᷦ | ◌ᷧ | ◌ᷨ | ◌ᷩ | ◌ᷪ | ◌ᷫ | ◌ᷬ | ◌ᷭ | ◌ᷮ | ◌ᷯ |
U+1DFx | ◌ᷰ | ◌ᷱ | ◌ᷲ | ◌ᷳ | ◌ᷴ | ◌᷵ | ◌᷶ | ◌᷷ | ◌᷸ | ◌᷹ | | ◌᷻ | ◌᷼ | ◌᷽ | ◌᷾ | ◌᷿ |
Notes |
The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Diacritical Marks Supplement block:
Version | Final code points [a] | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
4.1 | U+1DC0..1DC1 | 2 | L2/02-031 | Anderson, Deborah (2002-01-21), TLG Miscellanea Proposal | |
L2/02-033 | Anderson, Deborah (2002-01-21), TLG Unicode Proposal (draft) | ||||
L2/02-053 | Anderson, Deborah (2002-02-04), Description of TLG Documents | ||||
L2/02-273 | Pantelia, Maria (2002-07-31), TLG Unicode Proposal | ||||
L2/02-287 | Pantelia, Maria (2002-08-09), Proposal Summary Form accompanying TLG Unicode Proposal (L2/02-273) | ||||
L2/02-312R | Pantelia, Maria (2002-11-07), Proposal to encode additional Greek editorial and punctuation characters in the UCS | ||||
L2/03-324 | N2642 | Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS | |||
L2/04-132 | N2740 | Constable, Peter (2004-04-19), Proposal to add additional phonetic characters to the UCS | |||
U+1DC2 | 1 | L2/03-190R | Constable, Peter (2003-06-08), Proposal to Encode Additional Phonetic Symbols in the UCS | ||
L2/04-047 | Constable, Peter (2004-02-01), Revised Proposal to Encode Additional Phonetic Symbols in the UCS | ||||
L2/04-132 | N2740 | Constable, Peter (2004-04-19), Proposal to add additional phonetic characters to the UCS | |||
L2/04-003R | Moore, Lisa (2004-05-17), "Additional Phonetic Symbols (B.14.13)", UTC #98 Minutes | ||||
U+1DC3 | 1 | L2/04-051 | Anderson, Deborah (2004-01-29), Comments on 2619R Final Glagolitic proposal | ||
L2/04-171 | N2763 | Everson, Michael (2004-05-29), Proposal to add COMBINING GLAGOLITIC SUSPENSION MARK to the BMP of the UCS | |||
5.0 | U+1DC4..1DCA | 7 | L2/04-246R | Priest, Lorna (2004-07-26), Revised Proposal for Additional Latin Phonetic and Orthographic Characters | |
L2/04-316 | Moore, Lisa (2004-08-19), "C.6", UTC #100 Minutes | ||||
L2/04-348 | N2906 | Priest, Lorna (2004-08-23), Revised Proposal for Additional Latin Phonetic and Orthographic Characters | |||
U+1DFE..1DFF | 2 | L2/05-189 | N2958 | Lehtiranta, Juhani; Ruppel, Klaas; Suutari, Toni; Trosterud, Trond (2005-07-22), Report on progress in implementing the Uralic Phonetic Alphabet with indication of the need for additional characters and symbols | |
L2/05-261 | N2989 | Ruppel, Klaas; Kolehmainen, Erkki I.; Everson, Michael; Freytag, Asmus; Whistler, Ken (2005-09-13), Proposal to add six additional Uralicist characters to the UCS | |||
L2/05-270 | Whistler, Ken (2005-09-21), "A. Uralicist character additions", WG2 Consent Docket (Sophia Antipolis) | ||||
L2/05-279 | Moore, Lisa (2005-11-10), "Consensus 105-C29", UTC #105 Minutes | ||||
N2953 (pdf, doc) | Umamaheswaran, V. S. (2006-02-16), "7.4.7", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 | ||||
5.1 | U+1DCB..1DCC | 2 | L2/06-214 | N3048 | Proposal to encode two combining characters in the UCS, 2006-03-02 |
L2/06-108 | Moore, Lisa (2006-05-25), "Consensus 107-C35", UTC #107 Minutes | ||||
N3103 (pdf, doc) | Umamaheswaran, V. S. (2006-08-25), "M48.17", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 | ||||
U+1DCD..1DE6 | 26 | L2/05-183 | N2957 | Everson, Michael; Haugen, Odd Einar; Emiliano, António; Pedro, Susana; Grammel, Florian; Baker, Peter; Stötzner, Andreas; Dohnicht, Marcus; Luft, Diana (2005-08-02), Preliminary proposal to add medievalist characters to the UCS | |
L2/06-027 | N3027 | Everson, Michael; Baker, Peter; Emiliano, António; Grammel, Florian; Haugen, Odd Einar; Luft, Diana; Pedro, Susana; Schumacher, Gerd; Stötzner, Andreas (2006-01-30), Proposal to add Medievalist characters to the UCS | |||
L2/06-049 | Pedro, Susana (2006-01-31), Letter of support for Medievalist letters (L2/06-027) | ||||
L2/06-048 | Emiliano, Antonio (2006-02-02), Letter of support for Medievalist letters (L2/06-027) | ||||
L2/06-008R2 | Moore, Lisa (2006-02-13), "C.14", UTC #106 Minutes | ||||
N2953 (pdf, doc) | Umamaheswaran, V. S. (2006-02-16), "7.4.6", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15 | ||||
L2/06-074R | N3039R | Feedback on N3027 Proposal to add Medievalist Characters, 2006-03-16 | |||
L2/06-101 | N3060 | Feedback on N3027 "Proposal to add medievalist characters to the UCS", 2006-03-27 | |||
L2/06-116 | N3077 | Everson, Michael; Baker, Peter; Emiliano, António; Grammel, Florian; Haugen, Odd Einar; Luft, Diana; Pedro, Susana; Schumacher, Gerd; Stötzner, Andreas (2006-03-31), Response to UTC/US contribution N3037R, "Feedback on N3027 Proposal to add medievalist characters" | |||
L2/06-108 | Moore, Lisa (2006-05-25), "Consensus 107-C36", UTC #107 Minutes | ||||
N3103 (pdf, doc) | Umamaheswaran, V. S. (2006-08-25), "M48.14", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27 | ||||
L2/06-318 | N3160 | Response to Project Editor's contribution N3146, "Draft disposition of comments on SC2 N3875 (PDAM text for Amendment 3.2 to ISO/IEC 10646:2003)", 2006-09-21 | |||
5.2 | U+1DFD | 1 | L2/07-334R2 | N3447 | Priest, Lorna (2007-10-15), Proposal to encode two phonetic characters and two Shona characters |
L2/07-345 | Moore, Lisa (2007-10-25), "C.4", UTC #113 Minutes | ||||
L2/08-318 | N3453 (pdf, doc) | Umamaheswaran, V. S. (2008-08-13), "M52.20f", Unconfirmed minutes of WG 2 meeting 52 | |||
6.0 | U+1DFC | 1 | L2/09-028 | N3571 | Ruppel, Klaas; Aalto, Tero; Everson, Michael (2009-01-27), Proposal to encode additional characters for the Uralic Phonetic Alphabet |
L2/09-234 | N3603 (pdf, doc) | Umamaheswaran, V. S. (2009-07-08), "M54.13g", Unconfirmed minutes of WG 2 meeting 54 | |||
L2/09-104 | Moore, Lisa (2009-05-20), "Consensus 119-C27", UTC #119 / L2 #216 Minutes | ||||
7.0 | U+1DE7..1DF4 | 14 | L2/08-428 | N3555 | Everson, Michael (2008-11-27), Exploratory proposal to encode Germanicist, Nordicist, and other phonetic characters in the UCS |
L2/10-346 | N3907 | Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2010-09-23), Preliminary proposal to encode "Teuthonista" phonetic characters in the UCS | |||
L2/11-137 | N4031 | Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2011-05-09), Proposal to encode "Teuthonista" phonetic characters in the UCS | |||
L2/11-203 | N4082 | Everson, Michael; et al. (2011-05-27), Support for "Teuthonista" encoding proposal | |||
L2/11-202 | N4081 | Everson, Michael; Dicklberger, Alois; Pentzlin, Karl; Wandl-Vogt, Eveline (2011-06-02), Revised proposal to encode "Teuthonista" phonetic characters in the UCS | |||
L2/11-240 | N4106 | Everson, Michael; Pentzlin, Karl (2011-06-09), Report on the ad hoc re "Teuthonista" (SC2/WG2 N4081) held during the SC2/WG2 meeting at Helsinki | |||
L2/11-261R2 | Moore, Lisa (2011-08-16), "Consensus 128-C38", UTC #128 / L2 #225 Minutes, Approve 85 characters for German dialectology... | ||||
N4103 | "11.16 Teuthonista phonetic characters", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03 | ||||
L2/12-269 | N4296 | Request to change the names of three Teuthonista characters under ballot, 2012-07-26 | |||
U+1DF5 | 1 | L2/12-209R | N4279R | Everson, Michael; Starner, David (2012-07-31), Proposal to add COMBINING UP TACK ABOVE to the UCS | |
L2/12-239 | Moore, Lisa (2012-08-14), "C.5", UTC #132 Minutes | ||||
9.0 | U+1DFB | 1 | L2/12-349 | Manandhar, Dev Dass; Karmacharya, Samir; Chitrakar, Bishnu (2012-10-29), Proposal for the Nepaalalipi script in the UCS | |
L2/12-390 | Anderson, Deborah (2012-11-08), Comparison between Newar and Nepaalalipi proposals (L2/12-003 and L2/12-349) | ||||
L2/14-253 | Anderson, Deborah (2014-10-06), Recommendations to UTC from Script Meeting in Nepal | ||||
L2/14-250 | Moore, Lisa (2014-11-10), "Consensus 141-C25", UTC #141 Minutes | ||||
L2/14-285R3 | N4660 | Whistler, Ken (2014-12-04), Towards a Consensus Encoding of Newa | |||
10.0 | U+1DF6..1DF9 | 4 | L2/15-173 | Andreev, Aleksandr; Shardt, Yuri; Simmons, Nikita (2015-07-29), Proposal to Encode some Additional Symbols used in Church Slavonic Text | |
L2/15-187 | Moore, Lisa (2015-08-11), "E.2", UTC #144 Minutes | ||||
N4739 | "M64.06", Unconfirmed minutes of WG 2 meeting 64, 2016-08-31 | ||||
|
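The per-version counts in the history table can be cross-checked against the cumulative totals in the version history with a short calculation. This is only a sketch: the variable names are illustrative, and the figures are copied from the tables in this article rather than read from the Unicode Character Database.

```python
from itertools import accumulate

# (Unicode version, code points added), taken from the Count column of the
# history table, summed per version.
additions = [
    ("4.1", 2 + 1 + 1),
    ("5.0", 7 + 2),
    ("5.1", 2 + 26),
    ("5.2", 1),
    ("6.0", 1),
    ("7.0", 14 + 1),
    ("9.0", 1),
    ("10.0", 4),
]

# Running total of assigned code points after each version.
running_totals = list(accumulate(count for _, count in additions))

# The block holds 64 code points, so the remainder is reserved.
block_size = 0x1DFF - 0x1DC0 + 1
reserved = block_size - running_totals[-1]

for (version, count), total in zip(additions, running_totals):
    print(f"{version}: {total} (+{count})")
print("reserved:", reserved)
```

The running totals reproduce the version history figures (4, 13, 41, 42, 43, 58, 59, 63), leaving one reserved code point.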
In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks.
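A small sketch of how a combining character attaches to a base letter in a plain string, assuming Python's standard `unicodedata` module. The choice of U+1DC4 (one of the marks in the U+1DC4..1DCA range listed above) and of the base letter "a" is arbitrary and purely illustrative.

```python
import unicodedata

base = "a"
mark = "\u1dc4"  # a combining mark from the U+1DC4..1DCA range

# The mark follows the base letter in the string; a renderer draws the two
# as one visual unit, but the string still holds two code points.
sequence = base + mark
code_point_count = len(sequence)

# A nonzero canonical combining class indicates a combining mark.
mark_is_combining = unicodedata.combining(mark) != 0

# No precomposed character exists for this pair, so NFC normalization
# leaves the sequence unchanged.
nfc_unchanged = unicodedata.normalize("NFC", sequence) == sequence

print(code_point_count, mark_is_combining, nfc_unchanged)
```

Because the mark is stored separately, string length and visual length differ, which matters for operations such as truncation or cursor movement.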
Unicode has subscripted and superscripted versions of a number of characters, including a full set of Arabic numerals. These characters allow polynomials, chemical formulas, and certain other equations to be represented in plain text without markup such as HTML or TeX.
As of Unicode version 13.0, the Cyrillic script is encoded across several blocks, all in the BMP.
Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the character "Combining Grapheme Joiner", which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context. Its block name in Unicode 1.0 was Generic Diacritical Marks.
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista).
Unicode supports several phonetic scripts and notations through the existing writing systems and the addition of extra blocks with phonetic characters. These phonetic extras are derived from an existing script, usually Latin, Greek or Cyrillic. In Unicode there is no "IPA script". Apart from IPA, extensions to the IPA, and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.
Phonetic Extensions is a Unicode block containing phonetic characters used in the Uralic Phonetic Alphabet, Old Irish phonetic notation, the Oxford English dictionary and American dictionaries, and Americanist and Russianist phonetic notations. Its character set is continued in the following Unicode block, Phonetic Extensions Supplement.
GNU FreeFont is a family of free OpenType, TrueType and WOFF vector fonts, implementing as much of the Universal Character Set (UCS) as possible, aside from the very large CJK Asian character set. The project was initiated in 2002 by Primož Peterlin and is now maintained by Steve White.
Phonetic Extensions Supplement is a Unicode block containing characters for specialized and deprecated forms of the International Phonetic Alphabet.
Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.
Macron below, U+0331 ◌̱ COMBINING MACRON BELOW, is a combining diacritical mark that is used in various orthographies.
In the Unicode standard, a plane is a continuous group of 65,536 (2¹⁶) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains the most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 13.0, seven of the planes have assigned code points (characters), and five are named.
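Since each plane holds 65,536 code points, the plane number can be obtained by discarding the low 16 bits of the code point. The following is a minimal sketch; the function name `plane_of` is hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point, at the end of plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) for a Unicode code point.

    Hypothetical helper: each plane spans 0x10000 code points, so a right
    shift by 16 bits yields the plane index.
    """
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16


print(plane_of(0x1DC0))          # start of this block, in the BMP
print(plane_of(0x10000))         # first code point of plane 1
print(plane_of(MAX_CODE_POINT))  # last code point of plane 16
```

Applied to U+1DC0..U+1DFF, the function returns 0 throughout, consistent with the block lying in the BMP.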
Combining Half Marks is a Unicode block containing diacritic mark parts for spanning multiple characters.
The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). Controls C1 (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.
IPA Extensions is a block (0250–02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.
Teuthonista is a phonetic transcription system used predominantly for the transcription of (High) German dialects. It is very similar to other Central European transcription systems from the early 20th century. The base characters are mostly based on the Latin alphabet, which can be modified by various diacritics.
Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally also used for writing Coptic, using the similar Greek letters in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate Coptic block has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is the convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the Greek Extended character block.
Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.
Combining Diacritical Marks Extended is a Unicode block containing diacritical marks used in German dialectology (Teuthonista).