Hanifi Rohingya (Unicode block)

Hanifi Rohingya
Range: U+10D00..U+10D3F (64 code points)
Plane: SMP
Scripts: Hanifi Rohingya
Assigned: 50 code points
Unused: 14 reserved code points
Unicode version history
11.0: 50 (+50)
Note: [1] [2]

Hanifi Rohingya is a Unicode block containing characters of the Hanifi Rohingya script, which is used for writing the Rohingya language in Myanmar and Bangladesh. [3]
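
The block range and plane can be checked with a little arithmetic. The following is a minimal Python sketch, not taken from the Unicode Standard: the names `BLOCK_START`, `BLOCK_END`, `in_block`, `plane_of` and `utf16_code_units` are hypothetical helpers introduced here for illustration. It assumes the range U+10D00..U+10D3F and shows that a code point in the block lies in plane 1 (the SMP), so it is stored as a surrogate pair in UTF-16.

```python
from typing import List

# Assumed block boundaries (U+10D00..U+10D3F).
BLOCK_START = 0x10D00
BLOCK_END = 0x10D3F


def in_block(ch: str) -> bool:
    """Return True if the single character ch lies inside the block range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def plane_of(ch: str) -> int:
    """Each plane holds 0x10000 code points, so the plane is the code point shifted right by 16 bits."""
    return ord(ch) >> 16


def utf16_code_units(ch: str) -> List[int]:
    """Return the UTF-16 code units of ch (two units for supplementary-plane characters)."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    first = chr(BLOCK_START)
    print(in_block(first), plane_of(first))
    print([format(unit, "04X") for unit in utf16_code_units(first)])
```

For U+10D00 this yields plane 1 and the surrogate pair D803 DD00. A character outside the range, such as the Latin letter "A", is rejected by `in_block`.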


Block

Hanifi Rohingya [1] [2]
Official Unicode Consortium code chart (PDF)
         0 1 2 3 4 5 6 7 8 9 A B C D E F
U+10D0x  𐴀 𐴁 𐴂 𐴃 𐴄 𐴅 𐴆 𐴇 𐴈 𐴉 𐴊 𐴋 𐴌 𐴍 𐴎 𐴏
U+10D1x  𐴐 𐴑 𐴒 𐴓 𐴔 𐴕 𐴖 𐴗 𐴘 𐴙 𐴚 𐴛 𐴜 𐴝 𐴞 𐴟
U+10D2x  𐴠 𐴡 𐴢 𐴣 𐴤 𐴥 𐴦 𐴧
U+10D3x  𐴰 𐴱 𐴲 𐴳 𐴴 𐴵 𐴶 𐴷 𐴸 𐴹
Notes
1. ^ As of Unicode version 13.0
2. ^ Grey areas indicate non-assigned code points
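
The counts of assigned and reserved code points follow directly from the chart. As a rough sketch in Python, assuming the assigned ranges U+10D00..U+10D27 and U+10D30..U+10D39 as of Unicode version 13.0 (the constant and function names below are illustrative and do not come from any library):

```python
# Block boundaries and assigned ranges, assumed as of Unicode version 13.0.
BLOCK_RANGE = (0x10D00, 0x10D3F)
ASSIGNED_RANGES = [(0x10D00, 0x10D27), (0x10D30, 0x10D39)]


def assigned_code_points():
    """List every code point covered by the assigned ranges."""
    points = []
    for start, end in ASSIGNED_RANGES:
        points.extend(range(start, end + 1))  # ranges are inclusive
    return points


def unassigned_code_points():
    """List the code points in the block that fall outside the assigned ranges."""
    assigned = set(assigned_code_points())
    start, end = BLOCK_RANGE
    return [cp for cp in range(start, end + 1) if cp not in assigned]


if __name__ == "__main__":
    block_size = BLOCK_RANGE[1] - BLOCK_RANGE[0] + 1
    print(block_size, len(assigned_code_points()), len(unassigned_code_points()))
    print(["U+%04X" % cp for cp in unassigned_code_points()])
```

The sketch reproduces the figures given for the block: 64 code points in total, 50 assigned and 14 reserved (U+10D28..U+10D2F and U+10D3A..U+10D3F). The hard-coded ranges would need updating if a later Unicode version assigns more of the block.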

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Hanifi Rohingya block:

Version  Final code points [a]  Count  L2 ID  WG2 ID  Document
11.0  U+10D00..10D27, 10D30..10D39  50  L2/12-214  N4283  Pandey, Anshuman (2012-06-20), Preliminary Proposal to Encode the Rohingya Script
L2/12-267  Anderson, Deborah; McGowan, Rick; Whistler, Ken (2012-07-21), "IX. ROHINGYA", Review of Indic-related documents and Recommendations to the UTC
L2/13-028  Anderson, Deborah; McGowan, Rick; Whistler, Ken; Pournader, Roozbeh (2013-01-28), "24", Recommendations to UTC on Script Proposals
L2/15-278R  Pandey, Anshuman (2015-12-31), Proposal to encode the Hanifi Rohingya script in Unicode
L2/16-037  Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu (2016-01-22), "4. Hanifi Rohingya", Recommendations to UTC #146 January 2016 on Script Proposals
L2/16-311R  N4813  Pandey, Anshuman (2016-12-31), Revised proposal to encode Hanifi Rohingya
L2/17-037  Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa; Liang, Hai; Ishida, Richard; Misra, Karan; McGowan, Rick (2017-01-21), "7. Hanifi Rohingya", Recommendations to UTC #150 January 2017 on Script Proposals
L2/17-016  Moore, Lisa (2017-02-08), "D.10", UTC #150 Minutes
L2/18-115  Moore, Lisa (2018-05-09), "Consensus 154-C14", UTC #155 Minutes
N5020  Umamaheswaran, V. S. (2019-01-11), "M67.01", Unconfirmed minutes of WG 2 meeting 67
  a. Proposed code points and character names may differ from the final code points and names.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2018-06-06.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2018-06-06.
  3. "Chapter 16: Southeast Asia". The Unicode Standard, Version 11.0 (PDF). Mountain View, CA: Unicode, Inc. June 2018. ISBN 978-1-936213-19-1.