Combining Diacritical Marks Extended | |
---|---|
Range | U+1AB0..U+1AFF (80 code points) |
Plane | BMP |
Scripts | Inherited |
Assigned | 17 code points |
Unused | 63 reserved code points |
Unicode version history | |
7.0 (2014) | 15 (+15) |
13.0 (2020) | 17 (+2) |
Combining Diacritical Marks Extended is a Unicode block containing diacritical marks used in German dialectology (Teuthonista). [3]
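As a minimal sketch, the range stated for this block can be inspected with Python's standard `unicodedata` module. The helper name `assigned_in_block` is hypothetical, and the number of named characters it reports depends on the Unicode version bundled with the interpreter, so it may differ from the count given here.

```python
import unicodedata

# Range stated for the Combining Diacritical Marks Extended block.
BLOCK_START, BLOCK_END = 0x1AB0, 0x1AFF


def assigned_in_block(start=BLOCK_START, end=BLOCK_END):
    """Return (code point, name) pairs known to the local Unicode database."""
    found = []
    for cp in range(start, end + 1):
        # unicodedata.name returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            found.append((cp, name))
    return found


if __name__ == "__main__":
    print("block size:", BLOCK_END - BLOCK_START + 1)
    for cp, name in assigned_in_block():
        print(f"U+{cp:04X} {name}")
```

The block size is simple arithmetic on the range endpoints; only the list of assigned characters varies with the database version.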
Combining Diacritical Marks Extended [1] [2] Official Unicode Consortium code chart (PDF)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+1ABx | ◌᪰ | ◌᪱ | ◌᪲ | ◌᪳ | ◌᪴ | ◌᪵ | ◌᪶ | ◌᪷ | ◌᪸ | ◌᪹ | ◌᪺ | ◌᪻ | ◌᪼ | ◌᪽ | ◌᪾ | ◌ᪿ |
U+1ACx | ◌ᫀ | |||||||||||||||
U+1ADx | ||||||||||||||||
U+1AEx | ||||||||||||||||
U+1AFx | ||||||||||||||||
The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Diacritical Marks Extended block:
Version | Final code points | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
7.0 | U+1AB0..1ABE | 15 | L2/08-428 | N3555 | Everson, Michael (2008-11-27), Exploratory proposal to encode Germanicist, Nordicist, and other phonetic characters in the UCS |
 | | | L2/10-346 | N3907 | Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2010-09-23), Preliminary proposal to encode "Teuthonista" phonetic characters in the UCS |
 | | | L2/11-137 | N4031 | Everson, Michael; Wandl-Vogt, Eveline; Dicklberger, Alois (2011-05-09), Proposal to encode "Teuthonista" phonetic characters in the UCS |
 | | | L2/11-203 | N4082 | Everson, Michael; et al. (2011-05-27), Support for "Teuthonista" encoding proposal |
 | | | L2/11-202 | N4081 | Everson, Michael; Dicklberger, Alois; Pentzlin, Karl; Wandl-Vogt, Eveline (2011-06-02), Revised proposal to encode "Teuthonista" phonetic characters in the UCS |
 | | | L2/11-240 | N4106 | Everson, Michael; Pentzlin, Karl (2011-06-09), Report on the ad hoc re "Teuthonista" (SC2/WG2 N4081) held during the SC2/WG2 meeting at Helsinki |
 | | | L2/11-261R2 | | Moore, Lisa (2011-08-16), "Consensus 128-C38", UTC #128 / L2 #225 Minutes, Approve 85 characters for German dialectology... |
 | | | | N4103 | "11.16 Teuthonista phonetic characters", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03 |
 | | | L2/12-269 | N4296 | Request to change the names of three Teuthonista characters under ballot, 2012-07-26 |
13.0 | U+1ABF..1AC0 | 2 | L2/19-075R | N5036R | Everson, Michael (2019-05-05), Proposal to add six phonetic characters for Scots to the UCS |
 | | | L2/19-173 | | Anderson, Deborah; et al. (2019-04-29), "Phonetic characters for Scots", Recommendations to UTC #159 April-May 2019 on Script Proposals |
 | | | L2/19-122 | | Moore, Lisa (2019-05-08), "C.6", UTC #159 Minutes |
 | | | | N5122 | "M68.05", Unconfirmed minutes of WG 2 meeting 68, 2019-12-31 |
 | | | L2/20-052 | | Pournader, Roozbeh (2020-01-15), Changes to Identifier_Type of some Unicode 13.0 characters |
 | | | L2/20-015 | | Moore, Lisa (2020-01-23), "B.13.4 Changes to Identifier_Type of some Unicode 13.0 characters", Draft Minutes of UTC Meeting 162 |
In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks.
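The way a combining mark modifies its base character can be sketched with Python's standard `unicodedata` module. The helper `describe` is a hypothetical name introduced for illustration. The example assumes only standard normalization behaviour: a base letter followed by U+0301 COMBINING ACUTE ACCENT composes to a single precomposed letter under NFC, whereas a base letter followed by U+1AB0 from the Combining Diacritical Marks Extended block has no precomposed form and stays as two code points.

```python
import unicodedata


def describe(text):
    """List each code point with its canonical combining class (0 = not reordered)."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


# "e" followed by U+0301 COMBINING ACUTE ACCENT composes under NFC.
composed = unicodedata.normalize("NFC", "e\u0301")

# "a" followed by U+1AB0 has no precomposed equivalent, so NFC leaves it alone.
teuthonista = unicodedata.normalize("NFC", "a\u1ab0")

if __name__ == "__main__":
    print(describe("e\u0301"))
    print(len(composed), len(teuthonista))
```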
Windows-1258 is a code page used in Microsoft Windows to represent Vietnamese texts. It makes use of combining diacritical marks.
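A small sketch of what this means in practice, using Python's built-in `cp1258` codec. The helper `encodable` is a hypothetical name. The example assumes the codec follows the published Windows-1258 mapping, in which the letter ê and the combining dot below are separate bytes while the precomposed letter ệ (U+1EC7) has no byte of its own.

```python
def encodable(text, codec="cp1258"):
    """Return True if every character of text has a byte in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Decomposed form: U+00EA (e with circumflex) + U+0323 COMBINING DOT BELOW.
decomposed = "\u00ea\u0323"
# Precomposed form of the same letter: U+1EC7.
precomposed = "\u1ec7"

encoded = decomposed.encode("cp1258")

if __name__ == "__main__":
    print(encodable(decomposed), encodable(precomposed))
    print(len(encoded), "bytes")
```

Text in precomposed form therefore has to be split into base letter plus combining mark before it can be stored in this code page.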
Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.
As of Unicode version 13.0, the Cyrillic script is encoded across several blocks, all in the BMP.
Monospace is a monospaced Unicode font, developed by George Williams. It is based on the typeface Courier. This font contains 2860 glyphs. It includes characters in the following Unicode ranges: Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B, IPA Extensions, Spacing Modifier Letters, Combining Diacritical Marks, Greek, Cyrillic, Hebrew, Latin Extended Additional, Greek Extended, General Punctuation, Superscripts and Subscripts, Currency Symbols, Combining Diacritical Marks for Symbols, Letterlike Symbols, Number Forms, Arrows, Mathematical Operators, Miscellaneous Technical, Control Pictures, Enclosed Alphanumerics, Box Drawing, Block Elements, Geometric Shapes, Miscellaneous Symbols, Alphabetic Presentation Forms, Halfwidth and Fullwidth Forms.
An arrow is a graphical symbol, such as ← or →, or a pictogram, used to point or indicate direction. In its simplest form, an arrow is a triangle, chevron, or concave kite, usually affixed to a line segment or rectangle, and in more complex forms a representation of an actual arrow. The direction indicated by an arrow is the one along the length of the line or rectangle towards the single pointed end.
Combining Diacritical Marks Supplement is a Unicode block containing combining characters for the Uralic Phonetic Alphabet, Medievalist notations, and German dialectology (Teuthonista). It is an extension of the diacritic characters found in the Combining Diacritical Marks block.
Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the character "Combining Grapheme Joiner", which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context. Its block name in Unicode 1.0 was Generic Diacritical Marks.
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista).
Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.
Macron below, U+0331 ◌̱ COMBINING MACRON BELOW, is a combining diacritical mark that is used in various orthographies.
In the Unicode standard, a plane is a continuous group of 65,536 (2^16) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 13.0, seven of the planes have assigned code points (characters), and five are named.
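Because each plane holds 2^16 code points, the plane number is simply the code point shifted right by 16 bits. The following is a minimal sketch of that arithmetic; the function name `plane_of` is hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point, at the end of plane 16


def plane_of(code_point):
    """Return the plane number (0 to 16) that contains the given code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point!r}")
    # Each plane spans 65,536 (2**16) code points, so drop the low 16 bits.
    return code_point >> 16


if __name__ == "__main__":
    for cp in (0x1AB0, 0x1F600, 0x10FFFF):
        print(f"U+{cp:04X} is in plane {plane_of(cp)}")
```

For example, U+1AB0, the first code point of the Combining Diacritical Marks Extended block, falls in plane 0, the BMP.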
Combining Half Marks is a Unicode block containing diacritic mark parts for spanning multiple characters.
IPA Extensions is a block (0250–02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.
Latin Extended Additional is a Unicode block.
Teuthonista is a phonetic transcription system used predominantly for the transcription of (High) German dialects. It is very similar to other Central European transcription systems from the early 20th century. The base characters are mostly based on the Latin alphabet, which can be modified by various diacritics.
Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally also used for writing Coptic, using the similar Greek letters in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate Coptic block has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is the convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the Greek Extended character block.
Greek Extended is a Unicode block containing the accented vowels necessary for writing polytonic Greek. The regular, unaccented Greek characters as well as the characters with tonos and diaeresis can be found in the Greek and Coptic block. Greek Extended was encoded in version 1.1 of the Unicode Standard. As an alternative to Greek Extended, combining characters can be used to represent the tones and breath marks of polytonic Greek.
Devanagari Extended is a Unicode block containing cantillation marks for writing the Samaveda, and nasalization marks for the Devanagari script.