Combining Half Marks

Range: U+FE20..U+FE2F (16 code points)
Plane: BMP
Scripts: Cyrillic (2 char.), Inherited (14 char.)
Symbol sets: Half diacritics
Assigned: 16 code points
Unused: 0 reserved code points
Unicode version history:
1.1: 4 (+4)
5.1: 7 (+3)
7.0: 14 (+7)
8.0: 16 (+2)
Note: [1] [2]

Combining Half Marks is a Unicode block containing parts of diacritical marks (left, right, and connecting pieces) that are combined to span multiple characters.
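As a minimal sketch of how the half marks are meant to be used, the snippet below attaches a left half to the first base letter and a right half to the second, so that a renderer with suitable font support can draw one mark across both letters. The helper name `span_pair` is hypothetical (it is not part of any library), and the example assumes a Python whose `unicodedata` tables cover these characters.

```python
import unicodedata

LIGATURE_LEFT_HALF = "\uFE20"   # COMBINING LIGATURE LEFT HALF
LIGATURE_RIGHT_HALF = "\uFE21"  # COMBINING LIGATURE RIGHT HALF

def span_pair(first: str, second: str) -> str:
    """Hypothetical helper: put the left half on `first`, the right half on `second`."""
    return first + LIGATURE_LEFT_HALF + second + LIGATURE_RIGHT_HALF

tied = span_pair("t", "s")

# List the code points that make up the resulting four-character sequence.
for ch in tied:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Each half is an ordinary combining mark that follows its own base letter; whether the two halves visually join into one continuous mark depends on the font and rendering engine.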

A Unicode block is one of several contiguous ranges of numeric character codes of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
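Because a block is a contiguous range of code points, membership can be tested with a simple range comparison. The following is a rough sketch using the U+FE20..U+FE2F range given for this block; the function name `in_combining_half_marks` is an illustrative assumption, and the names printed depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Range of the Combining Half Marks block, as stated in this article.
BLOCK_START, BLOCK_END = 0xFE20, 0xFE2F

def in_combining_half_marks(ch: str) -> bool:
    """Illustrative helper: True if `ch` lies inside the block's range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END

block = [chr(cp) for cp in range(BLOCK_START, BLOCK_END + 1)]

# Print code point, character name, and general category for each member.
for ch in block:
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X} {name} ({unicodedata.category(ch)})")
```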

Block

Combining Half Marks [1]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+FE2x
Notes
1. As of Unicode version 12.0

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Combining Half Marks block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
1.1 | U+FE20..FE23 | 4 | | | (to be determined)
5.1 | U+FE24..FE26 | 3 | L2/07-085R | N3222R | Everson, Michael; Emmel, Stephen; Marjanen, Antti; Dunderberg, Ismo; Baines, John; Pedro, Susana; Emiliano, António (2007-03-15), Proposal to add additional characters for Coptic and Latin in the UCS
 | | | L2/07-150 | | Whistler, Ken (2007-05-10), "G", WG2 Consent Docket
 | | | L2/07-118R2 | | Moore, Lisa (2007-05-23), "111-C17", UTC #111 Minutes
 | | | L2/07-268 | N3253 (pdf, doc) | Umamaheswaran, V. S. (2007-07-26), "M50.22", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27
7.0 | U+FE27..FE2D | 7 | L2/08-392 | | Pentzlin, Karl (2008-10-25), Proposal to encode a combining diacritical mark for Low German dialect writing
 | | | L2/09-028 | N3571 | Ruppel, Klaas; Aalto, Tero; Everson, Michael (2009-01-27), Proposal to encode additional characters for the Uralic Phonetic Alphabet
 | | | L2/09-281 | | Anderson, Deborah (2009-08-06), COMBINING TRIPLE INVERTED BREVE and other triple-length combining marks
 | | | L2/09-225R | | Moore, Lisa (2009-08-17), "C.14", UTC #120 / L2 #217 Minutes
 | | | L2/10-353 | N3915 | Pentzlin, Karl (2010-09-23), Preliminary Proposal to enable the use of Combining Triple Diacritics in Plain Text
 | | | L2/10-416R | | Moore, Lisa (2010-11-09), "C.4", UTC #125 / L2 #222 Minutes
 | | | | N3903 (pdf, doc) | "10.25", Unconfirmed minutes of WG2 meeting 57, 2011-03-31
 | | | L2/11-224 | N4078 | Proposal to enable the use of Combining Triple Diacritics in Plain Text, 2011-05-22
 | | | L2/11-261R2 | | Moore, Lisa (2011-08-16), "Consensus 128-C34", UTC #128 / L2 #225 Minutes
 | | | L2/11-296R | N4131 | Everson, Michael (2011-10-28), Proposal for encoding the Caucasian Albanian script in the SMP of the UCS
 | | | | N4103 | "11.15 Combining Triple Diacritics in plain text", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
 | | | | N4243 | Everson, Michael; Gippert, Jost (2012-02-14), Documentation for Two Characters FE2B and FE2C for Caucasian Albanian (N4131R)
 | | | L2/12-112 | | Moore, Lisa (2012-05-17), "131-C22", UTC #131 / L2 #228 Minutes
 | | | | N4253 (pdf, doc) | "M59.01e", Unconfirmed minutes of WG 2 meeting 59, 2012-09-12
8.0 | U+FE2E..FE2F | 2 | L2/13-164 | | Cleminson, Ralph; Birnbaum, David (2013-07-25), Feedback from Experts on Cyrillic proposals
 | | | L2/13-165 | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh (2013-07-25), "4", Recommendations to UTC on Script Proposals
 | | | L2/13-132 | | Moore, Lisa (2013-07-29), "Consensus 136-C23", UTC #136 Minutes
 | | | L2/13-139 | N4475 | Andreev, Aleksandr; Shardt, Yuri; Simmons, Nikita (2013-08-07), Proposal to Encode Combining Half Marks Used for Cyrillic Supralineation in Unicode
 | | | | N4553 (pdf, doc) | Umamaheswaran, V. S. (2014-09-16), "M62.04a", Minutes of WG 2 meeting 62 Adobe, San Jose, CA, USA
  a. Proposed code points and character names may differ from final code points and names.

See also

Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the character "Combining Grapheme Joiner", which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context.
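The reordering behaviour mentioned above can be illustrated with a small sketch. It assumes Python's standard `unicodedata.normalize` and uses two ordinary combining marks chosen only for illustration: an acute accent (combining class 230) and a dot below (combining class 220). Canonical ordering sorts adjacent marks by combining class, and the Combining Grapheme Joiner (class 0) interrupts that sort.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220
CGJ = "\u034F"        # COMBINING GRAPHEME JOINER, combining class 0

plain = "a" + ACUTE + DOT_BELOW          # marks stored in "wrong" class order
guarded = "a" + ACUTE + CGJ + DOT_BELOW  # same marks, separated by CGJ

# NFD applies canonical ordering: the class-220 mark moves before the class-230 mark.
reordered = unicodedata.normalize("NFD", plain)

# With CGJ between them the two marks are no longer adjacent, so their order is kept.
preserved = unicodedata.normalize("NFD", guarded)

print([f"U+{ord(c):04X}" for c in reordered])
print([f"U+{ord(c):04X}" for c in preserved])
```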

In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks.

As of Unicode version 12.0, Cyrillic script is encoded across several blocks, all in the BMP.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.