Latin Extended-D

Last updated
Latin Extended-D
RangeU+A720..U+A7FF
(224 code points)
Plane BMP
Scripts Latin (188 char.)
Common (5 char.)
Major alphabets IPA
UPA
Medieval characters
Mayanist transcription
African languages
Assigned193 code points
Unused31 reserved code points
Source standards MUFI
Unicode version history
5.0 (2006)2 (+2)
5.1 (2008)114 (+112)
6.0 (2010)129 (+15)
6.1 (2012)134 (+5)
7.0 (2014)152 (+18)
8.0 (2015)159 (+7)
9.0 (2016)160 (+1)
11.0 (2018)163 (+3)
12.0 (2019)174 (+11)
13.0 (2020)180 (+6)
14.0 (2021)193 (+13)
Unicode documentation
Code chart ∣ Web page
Note: [1] [2]

Latin Extended-D is a Unicode block containing Latin characters for phonetic, Mayanist, and Medieval transcription and notation systems. 89 of the characters in this block are for medieval characters proposed by the Medieval Unicode Font Initiative, many of which are representative of scribal abbreviations used in Medieval manuscript texts.

Contents

Block

Latin Extended-D [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+A72x
U+A73x
U+A74x
U+A75x
U+A76x
U+A77x
U+A78x
U+A79x
U+A7Ax
U+A7Bx
U+A7Cx
U+A7Dx
U+A7Ex
U+A7Fx
Notes
1. ^ As of Unicode version 15.0
2. ^ Grey areas indicate non-assigned code points

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Latin Extended-D block:

Related Research Articles

Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.

In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium. Three private use areas are defined: one in the Basic Multilingual Plane, and one each in, and nearly covering, planes 15 and 16. The code points in these areas cannot be considered as standardized characters in Unicode itself. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. Under the Unicode Stability Policy, the Private Use Areas will remain allocated for that purpose in all future Unicode versions.

Number Forms is a Unicode block containing Unicode compatibility characters that have specific meaning as numbers, but are constructed from other characters. They consist primarily of vulgar fractions and Roman numerals. In addition to the characters in the Number Forms block, three fractions were inherited from ISO-8859-1, which was incorporated whole as the Latin-1 Supplement block.

Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.

Latin Extended-C is a Unicode block containing Latin characters for Uighur New Script, the Uralic Phonetic Alphabet, Shona, Claudian Latin and the Swedish Dialect Alphabet.

The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase of the English alphabet and a control character.

The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). C1 Controls (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.

Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.

Latin Extended-B is the fourth block (0180-024F) of the Unicode Standard. It has been included since version 1.0, where it was only allocated to the code points 0180-01FF and contained 113 characters. During unification with ISO 10646 for version 1.1, the block range was extended by 80 code points and another 35 characters were assigned. In version 3.0 and later, the last 60 available code points in the block were assigned. Its block name in Unicode 1.0 was Extended Latin.

Latin Extended Additional is a Unicode block.

Hiragana is a Unicode block containing hiragana characters for the Japanese language.

Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.

Kana Supplement is a Unicode block containing one archaic katakana character and 255 hentaigana characters. Additional hentaigana characters are encoded in the Kana Extended-A block.

Latin Extended-E is a Unicode block containing Latin script characters used in German dialectology (Teuthonista), Anthropos alphabet, Sakha and Americanist usage.

Kana Extended-A is a Unicode block containing hentaigana and historic kana characters. Additional hentaigana characters are encoded in the Kana Supplement block.

Kana Extended-B is a Unicode block containing kana originally created by Japanese linguists to write Taiwanese Hokkien known as Taiwanese kana.

Latin Extended-F is a Unicode block containing modifier letters, nearly all IPA and extIPA, for phonetic transcription. The Latin Extended-F and -G blocks contain the first Latin characters defined outside of the Basic Multilingual Plane (BMP). They were added to the free Gentium Plus and Andika fonts with version 6.2 in February 2023.

Latin Extended-G is a Unicode block containing additional characters for phonetic transcription. The Latin Extended-F and -G blocks contain the first Latin characters defined outside of the Basic Multilingual Plane (BMP).

Cyrillic Extended-D is a Unicode block containing superscript and subscript Cyrillic characters used in Cyrillic-based phonetic transcription. The block contains the first Cyrillic characters defined outside of the Basic Multilingual Plane (BMP).

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.