Tangut Supplement

Last updated
Tangut Supplement
RangeU+18D00..U+18D7F
(128 code points)
Plane SMP
Scripts Tangut
Assigned9 code points
Unused119 reserved code points
Unicode version history
13.0 (2020)9 (+9)
Unicode documentation
Code chart ∣ Web page
Note: [1] [2]
16 unused code points were trimmed from the end of the block in Unicode 14.0. [3]

Tangut Supplement is a Unicode block containing characters from the Tangut script, which was used for writing the Tangut language spoken by the Tangut people in the Western Xia Empire, and in China during the Yuan dynasty and early Ming dynasty. This block is a supplement to the main Tangut block.

Contents

The Tangut Supplement block size was changed in Unicode version 14.0 to correct the erroneous block end point (version 13: 18D8F → version 14.0: 18D7F). [3]

Block

Tangut Supplement [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+18D0x 𘴀 𘴁 𘴂 𘴃 𘴄 𘴅 𘴆 𘴇 𘴈
U+18D1x
U+18D2x
U+18D3x
U+18D4x
U+18D5x
U+18D6x
U+18D7x
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Tangut Supplement block:

See also

Related Research Articles

<span class="mw-page-title-main">Tangut script</span> Chinese-based script for Tangut language

The Tangut script was a logographic writing system, used for writing the extinct Tangut language of the Western Xia dynasty. According to the latest count, 5863 Tangut characters are known, excluding variants. The Tangut characters are similar in appearance to Chinese characters, with the same type of strokes, but the methods of forming characters in the Tangut writing system are significantly different from those of forming Chinese characters. As in Chinese calligraphy, regular, running, cursive and seal scripts were used in Tangut writing.

Mathematical Operators is a Unicode block containing characters for mathematical, logical, and set notation.

In the Unicode standard, a plane is a contiguous group of 65,536 (216) code points. There are 17 planes, identified by the numbers 0 to 16, which corresponds with the possible values 00–1016 of the first two positions in six position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 15.1, five of the planes have assigned code points (characters), and seven are named.

The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase of the English alphabet and a control character.

The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). C1 Controls (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.

Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.

IPA Extensions is a block (U+0250–U+02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.

Enclosed Alphanumerics is a Unicode block of typographical symbols of an alphanumeric within a circle, a bracket or other not-closed enclosure, or ending in a full stop.

Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.

Miscellaneous Symbols and Pictographs is a Unicode block containing meteorological and astronomical symbols, emoji characters largely for compatibility with Japanese telephone carriers' implementations of Shift JIS, and characters originally from the Wingdings and Webdings fonts found in Microsoft Windows.

Dingbats is a Unicode block containing dingbats. Most of its characters were taken from Zapf Dingbats; it was the Unicode block to have imported characters from a specific typeface; Unicode later adopted a policy that excluded symbols with "no demonstrated need or strong desire to exchange in plain text", and thus no further dingbat typefaces were encoded until Webdings and Wingdings were encoded in Version 7.0. Some ornaments are also an emoji, having optional presentation variants.

<span class="mw-page-title-main">Enclosed Ideographic Supplement</span> Unicode character block

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Emoticons is a Unicode block containing emoticons or emoji. Most of them are intended as representations of faces, although some of them include hand gestures or non-human characters.

Cherokee Supplement is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee Supplement block contains lowercase letters only, whereas the Cherokee block contains all the uppercase letters, together with six lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.

Supplemental Symbols and Pictographs is a Unicode block containing emoji characters. It extends the set of symbols included in the Miscellaneous Symbols and Pictographs block. It also includes Typikon symbols.

Ideographic Symbols and Punctuation is a Unicode block containing symbols and punctuation marks used by ideographic scripts such as Tangut and Nüshu.

Tangut is a Unicode block containing characters from the Tangut script, which was used for writing the Tangut language spoken by the Tangut people in the Western Xia Empire, and in China during the Yuan dynasty and early Ming dynasty.

Tangut Components is a Unicode block containing components and radicals used in the modern study of the Tangut script.

Symbols for Legacy Computing is a Unicode block containing graphic characters that were used for various home computers from the 1970s and 1980s and in Teletext broadcasting standards. It includes characters from the Amstrad CPC, MSX, Mattel Aquarius, RISC OS, MouseText, Atari ST, TRS-80 Color Computer, Oric, Texas Instruments TI-99/4A, TRS-80, Minitel, Teletext, ATASCII, PETSCII, ZX80, and ZX81 character sets. Semigraphics characters are also included in the form of new block-shaped characters, line-drawing characters, and 60 "sextant" characters.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. 1 2 "Errate fixed in version 14.0.0" . Retrieved 2023-07-26.