Nag Mundari | |
---|---|
Range | U+1E4D0..U+1E4FF (48 code points) |
Plane | SMP |
Scripts | Nag Mundari |
Assigned | 42 code points |
Unused | 6 reserved code points |
Unicode version history | |
15.0 (2022) | 42 (+42) |
Nag Mundari is a Unicode block containing the letters for writing the Mundari language. [3] Nag Mundari is encoded as a unicameral alphabet. [4] The Nag Mundari block contains 27 letters plus five diacritics and ten digits.
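The counts quoted above follow directly from the block boundaries. As a minimal sketch (the constant and function names are illustrative and not part of any Unicode API), the arithmetic can be checked in Python:

```python
# Block boundaries and last assigned code point, taken from the figures above.
BLOCK_START = 0x1E4D0
BLOCK_END = 0x1E4FF
LAST_ASSIGNED = 0x1E4F9  # final code point assigned in Unicode 15.0


def plane_of(code_point: int) -> int:
    """Return the Unicode plane number (each plane spans 0x10000 code points)."""
    return code_point >> 16


block_size = BLOCK_END - BLOCK_START + 1    # 48 code points in the block
assigned = LAST_ASSIGNED - BLOCK_START + 1  # 42 assigned code points
reserved = block_size - assigned            # 6 reserved code points

# 27 letters + 5 diacritics + 10 digits account for every assigned code point.
assert 27 + 5 + 10 == assigned

# Plane 1 is the Supplementary Multilingual Plane (SMP).
print(block_size, assigned, reserved, plane_of(BLOCK_START))  # 48 42 6 1
```

The sketch assumes the assigned code points form one contiguous run from the start of the block, which matches the range U+1E4D0..1E4F9 recorded for version 15.0.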
Nag Mundari [1] [2] Official Unicode Consortium code chart (PDF)

 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+1E4Dx | 𞓐 | 𞓑 | 𞓒 | 𞓓 | 𞓔 | 𞓕 | 𞓖 | 𞓗 | 𞓘 | 𞓙 | 𞓚 | 𞓛 | 𞓜 | 𞓝 | 𞓞 | 𞓟 |
U+1E4Ex | 𞓠 | 𞓡 | 𞓢 | 𞓣 | 𞓤 | 𞓥 | 𞓦 | 𞓧 | 𞓨 | 𞓩 | 𞓪 | 𞓫 | 𞓬 | 𞓭 | 𞓮 | 𞓯 |
U+1E4Fx | 𞓰 | 𞓱 | 𞓲 | 𞓳 | 𞓴 | 𞓵 | 𞓶 | 𞓷 | 𞓸 | 𞓹 | | | | | | |
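The chart layout is mechanical: each row label names the code point with its last hexadecimal digit replaced by `x`, and the column header supplies that digit. A rough sketch of how such a grid could be generated follows; the helper `chart_rows` and its signature are hypothetical, and it assumes the block starts on a multiple of 16.

```python
def chart_rows(start: int, end: int, last_assigned: int):
    """Yield (row_label, cells) for a 16-column code chart.

    Cells hold the character for assigned code points and None for
    reserved ones. Assumes `start` is a multiple of 16.
    """
    for row_start in range(start, end + 1, 16):
        # Drop the last hex digit to form a label such as "U+1E4Dx".
        label = f"U+{row_start >> 4:X}x"
        cells = [
            chr(cp) if cp <= last_assigned else None
            for cp in range(row_start, row_start + 16)
        ]
        yield label, cells


rows = list(chart_rows(0x1E4D0, 0x1E4FF, 0x1E4F9))
for label, cells in rows:
    # Print each row label with its number of assigned cells.
    print(label, sum(c is not None for c in cells))
```

The last row reports ten assigned cells, which corresponds to the six empty cells at the end of the U+1E4Fx row in the chart.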
The following Unicode-related documents record the purpose and process of defining specific characters in the Nag Mundari block:
Version | Final code points | Count | L2 ID | Document |
---|---|---|---|---|
15.0 | U+1E4D0..1E4F9 | 42 | L2/21-031R | Wolf-Sonkin, Lawrence; Mandal, Biswajit (2021-04-23), Proposal to Encode the Mundari Bani Script |
 | | | L2/21-016R | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2021-01-14), "13. Mundari Bani", Recommendations to UTC #166 January 2021 on Script Proposals |
 | | | L2/21-073 | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2021-04-23), "7. Nag Mundari", Recommendations to UTC #167 April 2021 on Script Proposals |
 | | | L2/21-066 | Moore, Lisa (2021-05-05), "Consensus 167-C6", UTC #167 Minutes |
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.
Specials is a short Unicode block of characters allocated at the very end of the Basic Multilingual Plane, at U+FFF0–FFFF. Of these 16 code points, five have been assigned since Unicode 3.0.
The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block whose characters are encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters, and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase letters of the English alphabet, and the DEL control character (U+007F).
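The one-byte property can be checked directly with Python's built-in UTF-8 encoder. The following is a small sketch; `utf8_length` is an illustrative helper name, and U+1E4D0 is used only as an example of a code point outside the Basic Multilingual Plane.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes Python's UTF-8 encoder emits for one code point."""
    return len(chr(code_point).encode("utf-8"))


# Every Basic Latin code point (U+0000..U+007F) fits in a single byte.
assert all(utf8_length(cp) == 1 for cp in range(0x00, 0x80))

# The very next code point, the first of Latin-1 Supplement, needs two bytes.
assert utf8_length(0x80) == 2

# A Supplementary Multilingual Plane code point such as U+1E4D0 needs four.
assert utf8_length(0x1E4D0) == 4
print(chr(0x1E4D0).encode("utf-8").hex())  # f09e9390
```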
The Latin-1 Supplement is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). C1 Controls (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.
Latin Extended-A is a Unicode block and is the third block of the Unicode standard. It encodes Latin letters from the Latin ISO character sets other than Latin-1 and also legacy characters from the ISO 6937 standard.
IPA Extensions is a block (U+0250–U+02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former and proposed IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks. Its block name in Unicode 1.0 was Standard Phonetic.
The ISO basic Latin alphabet is an international standard for a Latin-script alphabet that consists of two sets of 26 letters, codified in various national and international standards and used widely in international communication. They are the same letters that comprise the current English alphabet, and since medieval times they have also been the letters of the modern Latin alphabet. The order of the letters is also significant, since it is used for sorting words alphabetically.
Tibetan is a Unicode block containing characters for the Tibetan, Dzongkha, and other languages of China, Bhutan, Nepal, Mongolia, northern India, eastern Pakistan and Russia.
Myanmar is a Unicode block containing characters for the Burmese, Mon, Shan, Palaung, and the Karen languages of Myanmar, as well as the Aiton and Phake languages of Northeast India. It is also used to write Pali and Sanskrit in Myanmar.
Cherokee is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee block contains all the uppercase letters plus six lowercase letters. The Cherokee Supplement block, added in version 8.0, contains the rest of the lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
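The uppercase-folding behaviour can be observed with Python's `str.casefold`, which applies Unicode case folding. This is a sketch that assumes the interpreter's Unicode database is at version 8.0 or later, so that the Cherokee small letters are known to it.

```python
upper_a = "\u13A0"  # CHEROKEE LETTER A (uppercase, Cherokee block)
lower_a = "\uAB70"  # CHEROKEE SMALL LETTER A (Cherokee Supplement block)

# Ordinary lowercasing maps the uppercase letter to its small form.
assert upper_a.lower() == lower_a

# Case folding goes the other way for Cherokee: both forms fold to uppercase.
assert upper_a.casefold() == upper_a
assert lower_a.casefold() == upper_a

# For comparison, Latin letters fold to lowercase.
assert "A".casefold() == "a"
```

Folding to uppercase means that text stored before version 8.0, when only the uppercase code points existed, is already in its folded form.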
CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 1998 and 2000, plus seven gongche characters for kunqu added in Unicode 13.0, and two characters for the Macao Supplementary Character Set added in Unicode 14.0.
CJK Compatibility is a Unicode block containing square symbols encoded for compatibility with East Asian character sets. In Unicode 1.0, it was divided into two blocks, named CJK Squared Words (U+3300–U+337F) and CJK Squared Abbreviations (U+3380–U+33FF). The square forms can have different presentations when they are used in horizontal or vertical text. For example, the characters U+333E ㌾ SQUARE BORUTO and U+3327 ㌧ SQUARE TON should look different in horizontal text and in vertical right-to-left text: ㌧㌾
Deseret is a Unicode block containing characters in the Deseret alphabet, which was invented by the Church of Jesus Christ of Latter-day Saints to write English. The Deseret block was derived from an earlier private use encoding in the ConScript Unicode Registry, like the Shavian and Phaistos Disc encodings. The block was added in version 3.1 of the Unicode Standard; the letters Oi and Ew, both uppercase and lowercase, were added in version 4.0.
Optical Character Recognition is a Unicode block containing signal characters for OCR and MICR standards.
Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.
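Because the width variants have their own code points, the distinction survives a round trip through Unicode unless it is deliberately removed. One common way to remove it, shown here only as an illustration and not as something the block description prescribes, is compatibility normalization (NFKC) via Python's standard `unicodedata` module:

```python
import unicodedata

fullwidth = "\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"  # fullwidth "ABC123"
halfwidth_ka = "\uFF76"  # HALFWIDTH KATAKANA LETTER KA

# NFKC replaces compatibility characters with their ordinary counterparts.
assert unicodedata.normalize("NFKC", fullwidth) == "ABC123"
assert unicodedata.normalize("NFKC", halfwidth_ka) == "\u30AB"  # KATAKANA LETTER KA

# NFC leaves them untouched, so the width distinction is preserved.
assert unicodedata.normalize("NFC", fullwidth) == fullwidth
```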
Phoenician is a Unicode block containing characters used across the Mediterranean world from the 12th century BCE to the 3rd century CE. The Phoenician alphabet was added to the Unicode Standard in July 2006 with the release of version 5.0. An alternative proposal to handle it as a font variation of Hebrew was turned down.
Coptic Epact Numbers is a Unicode block containing Old Coptic number forms.
Cherokee Supplement is a Unicode block containing the syllabic characters for writing the Cherokee language. When Cherokee was first added to Unicode in version 3.0 it was treated as a unicameral alphabet, but in version 8.0 it was redefined as a bicameral script. The Cherokee Supplement block contains lowercase letters only, whereas the Cherokee block contains all the uppercase letters, together with six lowercase letters. For backwards compatibility, the Unicode case folding algorithm—which usually converts a string to lowercase characters—maps Cherokee characters to uppercase.
Glagolitic Supplement is a Unicode block containing supplementary characters used in the Glagolitic script. It currently contains 38 combining letters.
Mundari Bani is the writing system created for the Mundari language, spoken in eastern India. Mundari is an Austroasiatic language. Mundari Bani has 27 letters and five diacritics, the forms of which are intended to evoke natural shapes. The script is written from left to right.