Phags-pa (Unicode block)

Phags-pa
Range: U+A840..U+A87F (64 code points)
Plane: BMP
Scripts: Phags Pa
Major alphabets: Mongolian, Chinese
Assigned: 56 code points
Unused: 8 reserved code points
Unicode version history
5.0 (2006): 56 (+56)
Unicode documentation
Code chart ∣ Web page
Note: [1] [2]

Phags-pa is a Unicode block containing characters from the 'Phags-pa script, which Kublai Khan, the founder of the Yuan dynasty, promulgated as a national script. The script was used primarily to write Mongolian and Chinese, although it was intended for all of the written languages of the Mongol Empire.

Block

Phags-pa [1] [2]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+A84x
U+A85x
U+A86x
U+A87x
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points
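
As a rough check of the range and assignment counts given for the block, the sketch below counts assigned code points using Python's standard `unicodedata` module. It assumes the local Python build ships a Unicode database of version 5.0 or later; `block_summary` is a hypothetical helper name chosen for illustration, not part of any library.

```python
import unicodedata

# Range of the Phags-pa block as stated in this article.
BLOCK_START, BLOCK_END = 0xA840, 0xA87F


def block_summary(start=BLOCK_START, end=BLOCK_END):
    """Return (total, assigned, unassigned) code point counts for a range.

    Hypothetical helper: a code point is treated as unassigned when the
    local Unicode database reports general category "Cn".
    """
    total = end - start + 1
    assigned = sum(
        1
        for cp in range(start, end + 1)
        if unicodedata.category(chr(cp)) != "Cn"
    )
    return total, assigned, total - assigned


print(block_summary())  # (64, 56, 8)
```

The result depends on the Unicode version bundled with the interpreter, so a later version that added characters to the block would change the counts.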

Six standardized variation sequences are defined for characters in the block. [3] Each consists of a base character followed by U+FE00 VARIATION SELECTOR-1 (VS01) and selects the reversed form of the letter:

Variation sequences for reversed shaping
U+     Character                         Base code point   Base + VS01
A856   Phags-Pa Letter Small A           ꡖ                 ꡖ︀
A85C   Phags-Pa Letter Ha                ꡜ                 ꡜ︀
A85E   Phags-Pa Letter I                 ꡞ                 ꡞ︀
A85F   Phags-Pa Letter U                 ꡟ                 ꡟ︀
A860   Phags-Pa Letter E                 ꡠ                 ꡠ︀
A868   Phags-Pa Subjoined Letter Ya      ꡨ                 ꡨ︀
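
The sequences in the table above can be assembled programmatically by appending U+FE00 to each base character. The following is a minimal sketch under that assumption; `REVERSED_BASES` and `reversed_form` are hypothetical names introduced here, and whether the reversed glyph is actually displayed depends on font and renderer support.

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1

# Base code points copied from the variation sequence table above.
REVERSED_BASES = [0xA856, 0xA85C, 0xA85E, 0xA85F, 0xA860, 0xA868]


def reversed_form(base_cp):
    """Return the base character followed by VS1 (hypothetical helper).

    Raises ValueError for code points that are not in the table, since
    appending VS1 to them would not form a listed sequence.
    """
    if base_cp not in REVERSED_BASES:
        raise ValueError(f"U+{base_cp:04X} is not a listed base character")
    return chr(base_cp) + VS1


for cp in REVERSED_BASES:
    seq = reversed_form(cp)
    # Show the sequence as code points, followed by the base character's name.
    codes = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    print(f"{codes}  {unicodedata.name(chr(cp))}")
```

Each sequence is two code points long, so string length and code-point-based indexing will count the variation selector as a separate element even though it has no visible glyph of its own.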

Note that four vowel letters have positional variants:

Positional forms of I, U, E, and O
Code point   Character           Orientation   Isolate   Initial   Medial   Final
U+A85E       Phags-Pa Letter I   regular
                                 reversed      ꡞ︀         ꡞ︀         ꡞ︀        ꡞ︀
U+A85F       Phags-Pa Letter U   regular
                                 reversed      ꡟ︀         ꡟ︀         ꡟ︀        ꡟ︀
U+A860       Phags-Pa Letter E   regular
                                 reversed      ꡠ︀         ꡠ︀         ꡠ︀        ꡠ︀
U+A861       Phags-Pa Letter O   regular

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Phags-pa block:

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.