Mongolian (Unicode block)

Mongolian
Range: U+1800..U+18AF (176 code points)
Plane: BMP
Scripts: Mongolian (155 char.), Common (3 char.)
Major alphabets: Mongolian, Manchu
Assigned: 158 code points
Unused: 18 reserved code points
Unicode version history:
3.0 (1999): 155 (+155)
5.1 (2008): 156 (+1)
11.0 (2018): 157 (+1)
14.0 (2021): 158 (+1)
Unicode documentation: Code chart, Web page
Note: [1] [2]

Mongolian is a Unicode block containing characters for dialects of the Mongolian, Manchu, and Sibe languages. The script is traditionally written in vertical lines, top to bottom, with successive lines running left to right across the page. The Unicode code charts nevertheless show the characters rotated to a horizontal orientation, as this is the orientation of glyphs in a font that supports vertical layout.
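The range and counts quoted in the infobox can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the Unicode data files; the names `MONGOLIAN_START`, `MONGOLIAN_END`, and `in_mongolian_block` are hypothetical and chosen only for illustration.

```python
# Block bounds as given in the infobox: U+1800..U+18AF.
MONGOLIAN_START = 0x1800
MONGOLIAN_END = 0x18AF


def in_mongolian_block(ch: str) -> bool:
    """Return True if the single character ch lies in U+1800..U+18AF."""
    return MONGOLIAN_START <= ord(ch) <= MONGOLIAN_END


# Size of the block, counting both endpoints.
block_size = MONGOLIAN_END - MONGOLIAN_START + 1

# Characters added per version, from the version history above:
# 3.0 (+155), 5.1 (+1), 11.0 (+1), 14.0 (+1).
assigned = 155 + 1 + 1 + 1
reserved = block_size - assigned

print(block_size, assigned, reserved)  # 176 158 18
print(in_mongolian_block("\u1820"))    # True
```

Note that membership in the range does not imply that a code point is assigned; the 18 reserved positions also fall inside it.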

The block has dozens of variation sequences defined for standardized variants. [3]

Block

Mongolian [1] [2]
Official Unicode Consortium code chart (PDF)
Row     0  1  2  3  4  5  6  7  8  9  A  B     C     D     E    F
U+180x                                   FVS1  FVS2  FVS3  MVS  FVS4
U+181x
U+182x
U+183x
U+184x
U+185x
U+186x
U+187x
U+188x
U+189x
U+18Ax
Notes
1. As of Unicode version 15.1
2. Grey areas indicate non-assigned code points
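Each cell of the chart is addressed by a row label such as U+182x and a column digit from 0 to F. A minimal sketch of that lookup follows; the function name `chart_cell` is hypothetical and exists only for this illustration.

```python
def chart_cell(row_label: str, column: str) -> int:
    """Map a row label such as 'U+182x' and a column digit such as 'A'
    to the code point of that chart cell."""
    if not (row_label.startswith("U+") and row_label.endswith("x")):
        raise ValueError(f"unexpected row label: {row_label!r}")
    if len(column) != 1:
        raise ValueError(f"column must be one hex digit: {column!r}")
    # The row label holds every hex digit except the last one.
    row = int(row_label[2:-1], 16)
    return row * 16 + int(column, 16)


print(f"U+{chart_cell('U+180x', 'E'):04X}")  # U+180E
print(f"U+{chart_cell('U+182x', '0'):04X}")  # U+1820
```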

Presentation forms

Vowels
Letter           Subset  Unicode (code point; variation sequences)
(Isolate, initial, medial, and final glyph columns not reproduced.)
A                Basic   1820; 1820 180B; 1820 180C
E                Basic   1821; 1821 180B
E                Todo    1844; 1844 180B
E                Sibe    185D; 185D 180B
EE               Basic   1827
I                Basic   1822; 1822 180B
I                Todo    1845; 1845 180B
I                Sibe    185E; 185E 180B; 185E 180C
I                Manchu  1873; 1873 180B; 1873 180C; 1873 180D
IY               Sibe    185F
O                Basic   1823; 1823 180B
O                Todo    1846; 1846 180B
OE               Basic   1825; 1825 180B; 1825 180C
OE               Todo    1848; 1848 180B
U                Basic   1824; 1824 180B
U                Todo    1847; 1847 180B; 1847 180C
U                Sibe    1861
UE               Basic   1826; 1826 180B; 1826 180C
UE               Todo    1849; 1849 180B
UE               Sibe    1860; 1860 180B
Long vowel sign  Todo    1843
Consonants
Letter  Subset  Unicode (code point; variation sequences)
(Isolate, initial, medial, and final glyph columns not reproduced.)
NA      Basic   1828; 1828 180B; 1828 180C; 1828 180D
ANG     Basic   1829
ANG     Todo    184A 180B
ANG     Sibe    1862 180B
BA      Basic   182A; 182A 180B
BA      Todo    184B
PA      Basic   182B
PA      Todo    184C
PA      Sibe    1866
QA      Basic   182C; 182C 180B; 182C 180C; 182C 180D
QA      Todo    184D; 184D 180B
GA      Basic   182D; 182D 180B; 182D 180C; 182D 180D
GA      Todo    184E; 184E 180B
GA      Sibe    1864
MA      Basic   182E
MA      Todo    184F
LA      Basic   182F
SA      Basic   1830; 1830 180B; 1830 180C
SHA     Basic   1831; 1878 [a]
SHA     Sibe    1867
TA      Basic   1832; 1832 180B
TA      Todo    1850
TA      Sibe    1868; 1868 180B; 1868 180C
DA      Basic   1833; 1833 180B
DA      Todo    1851
DA      Sibe    1869; 1869 180B
CHA     Basic   1834
CHA     Todo    1852
CHA     Sibe    1871
JA      Basic   1835; 1835 180B
JA      Todo    1853
JA      Sibe    186A
YA      Basic   1836; 1836 180B; 1836 180C
YA      Todo    1855
RA      Basic   1837
RA      Manchu  1875
RAA     Sibe    1870
WA      Basic   1838; 1838 180B
WA      Todo    1856
FA      Basic   1839
FA      Sibe    186B
FA      Manchu  1876; 1876 180B
KA      Basic   183A
KA      Todo    1857
KA      Sibe    1863; 1863 180B
KA      Manchu  1874; 1874 180B; 1874 180C; 1874 180D
KHA     Basic   183B
GAA     Todo    1858
GAA     Sibe    186C
TSA     Basic   183C
TSA     Todo    1854
TSA     Sibe    186E
ZA      Basic   183D
ZA      Sibe    186F; 186F 180B
HAA     Basic   183E
HAA     Todo    1859
HAA     Sibe    186D
HA      Sibe    1865
ZRA     Basic   183F
ZHA     Sibe    1872
ZHA     Manchu  1877
LHA     Basic   1840
ZHI     Basic   1841
CHI     Basic   1842
JIA     Todo    185A
NIA     Todo    185B
DZA     Todo    185C

Notes

[a] U+1878 was used historically for Buryat. [4]
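The Unicode column in the tables above writes each entry as space-separated hexadecimal code points, for example "1820 180B" for a letter followed by a variation selector. The sketch below turns such an entry into a string and labels the selectors. The helper names `parse_sequence` and `describe` are hypothetical, and the FVS labels are read off the code chart positions (columns B, C, D, and F of row U+180x).

```python
# Selector labels, read from the code chart row U+180x.
FVS_LABELS = {0x180B: "FVS1", 0x180C: "FVS2", 0x180D: "FVS3", 0x180F: "FVS4"}


def parse_sequence(entry: str) -> str:
    """Turn an entry such as '1820 180B' into the corresponding string."""
    return "".join(chr(int(part, 16)) for part in entry.split())


def describe(entry: str) -> str:
    """Describe an entry as its base code point plus any selector labels."""
    base, *selectors = parse_sequence(entry)
    labels = [FVS_LABELS.get(ord(s), f"U+{ord(s):04X}") for s in selectors]
    return f"U+{ord(base):04X}" + "".join(f" + {label}" for label in labels)


print(describe("1820"))       # U+1820
print(describe("1820 180B"))  # U+1820 + FVS1
print(describe("1873 180D"))  # U+1873 + FVS3
```

Whether a given sequence actually selects a distinct glyph depends on the font and rendering engine; the code only decodes the notation.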

Extensions for Sanskrit and Tibetan

Vowels
Ali Gali letter  Subset  Unicode (code point; variation sequences)
(Isolate, initial, medial, and final glyph columns not reproduced.)
A                Basic   1887; 1887 180B; 1887 180C; 1887 180D
I                Basic   1888; 1888 180B
AH               Basic   1897
Half U           Basic   18A6
Half YA          Basic   18A7
Signs and marks
Ali Gali sign     Subset  Unicode (code point; variation sequences)
(Isolate, initial, medial, and final glyph columns not reproduced.)
Anusvara One      Basic   1880; 1880 180B
Visarga One       Basic   1881; 1881 180B
Damaru            Basic   1882
Ubadama           Basic   1883
Inverted Ubadama  Basic   1884
Baluda            Basic   1885
Three Baluda      Basic   1886
Consonants
Ali Gali letter  Subset  Unicode (code point; variation sequences)
(Isolate, initial, medial, and final glyph columns not reproduced.)
KA               Basic   1889
NGA              Basic   188A; 188A 180B
NGA              Manchu  189B
CA               Basic   188B
CA               Manchu  189C
TTA              Basic   188C
TTA              Manchu  189E
TTHA             Basic   188D
DDA              Basic   188E
NNA              Basic   188F
TA               Basic   1890
TA               Todo    1898
TA               Manchu  18A0
DA               Basic   1891
PA               Basic   1892
PHA              Basic   1893
SSA              Basic   1894
SSA              Manchu  18A2
ZHA              Basic   1895
ZHA              Todo    1899
ZHA              Manchu  18A4
ZA               Basic   1896
ZA               Manchu  18A5
GHA              Manchu  189A
JHA              Manchu  189D
DDHA             Manchu  189F
DHA              Manchu  18A1
CYA              Manchu  18A3
BHA              Manchu  18A8
LHA              Manchu  18AA

Variations and vowel separation

The Mongolian Unicode block contains its own variation selectors (listed as format controls) for use with the traditional Mongolian alphabet: the free variation selectors FVS1 (U+180B), FVS2 (U+180C), FVS3 (U+180D), and FVS4 (U+180F). [5]

Additional variations may also be available for traditional Mongolian script characters, either according to the context of the character or by using a zero-width joiner (ZWJ, U+200D) and/or a zero-width non-joiner (ZWNJ, U+200C) to select a specific form. The block also contains a format control named "Mongolian vowel separator" (MVS, U+180E).
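Because these characters affect only how neighbouring letters are shaped, software that compares Mongolian strings loosely may want to set them aside first. The following is a sketch under that assumption, not a recommended normalization: the names `FORMAT_CONTROLS` and `strip_format_controls` are hypothetical, and removing MVS, ZWJ, or ZWNJ changes how the text renders, so the result suits comparison rather than display.

```python
# Shaping-related characters named in the text above.
FORMAT_CONTROLS = {
    "\u180B",  # FVS1
    "\u180C",  # FVS2
    "\u180D",  # FVS3
    "\u180F",  # FVS4
    "\u180E",  # MVS (Mongolian vowel separator)
    "\u200C",  # ZWNJ
    "\u200D",  # ZWJ
}


def strip_format_controls(text: str) -> str:
    """Return text with variation selectors, MVS, ZWJ, and ZWNJ removed."""
    return "".join(ch for ch in text if ch not in FORMAT_CONTROLS)


def count_format_controls(text: str) -> int:
    """Count how many of the listed format controls occur in text."""
    return sum(1 for ch in text if ch in FORMAT_CONTROLS)


sample = "\u182D\u180B\u1820"  # GA + FVS1 + A
print(len(sample), len(strip_format_controls(sample)))  # 3 2
print(count_format_controls(sample))                    # 1
```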

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Mongolian block:

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
  4. West, Andrew; Zhamsoev, Amgalan; Zaytsev, Viacheslav (2017-01-13). "L2/17-007: Proposal to encode one historical Mongolian letter for Buryat Mongolian" (PDF).
  5. "Free Variation Selectors" (PDF). www.unicode.org.