Pau Cin Hau (Unicode block)

Pau Cin Hau
Range: U+11AC0..U+11AFF (64 code points)
Plane: SMP
Scripts: Pau Cin Hau
Major alphabets: Pau Cin Hau
Assigned: 57 code points
Unused: 7 reserved code points
Unicode version history
7.0 (2014): 57 (+57)
Code chart
Note: [1] [2]

Pau Cin Hau is a Unicode block containing characters for the Pau Cin Hau alphabet, which was created by Pau Cin Hau, founder of the Laipian religion, to represent his religious teachings. [3] It was used primarily in the 1930s to write Tedim, which is spoken in Chin State, Myanmar.

Block

Pau Cin Hau [1] [2]
Official Unicode Consortium code chart (PDF)
          0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
U+11ACx   𑫀  𑫁  𑫂  𑫃  𑫄  𑫅  𑫆  𑫇  𑫈  𑫉  𑫊  𑫋  𑫌  𑫍  𑫎  𑫏
U+11ADx   𑫐  𑫑  𑫒  𑫓  𑫔  𑫕  𑫖  𑫗  𑫘  𑫙  𑫚  𑫛  𑫜  𑫝  𑫞  𑫟
U+11AEx   𑫠  𑫡  𑫢  𑫣  𑫤  𑫥  𑫦  𑫧  𑫨  𑫩  𑫪  𑫫  𑫬  𑫭  𑫮  𑫯
U+11AFx   𑫰  𑫱  𑫲  𑫳  𑫴  𑫵  𑫶  𑫷  𑫸
Notes
1. ^ As of Unicode version 14.0
2. ^ Grey areas indicate non-assigned code points
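
The figures quoted for this block (64 code points, 57 assigned, 7 reserved) follow directly from the range boundaries. The following is a minimal Python sketch of that arithmetic, together with a simple membership test. The constant and function names are illustrative only and are not part of any Unicode API, and the sketch assumes that the assigned characters form one contiguous run from U+11AC0 to U+11AF8, as the history table indicates.

```python
# Block boundaries as given for this block (U+11AC0..U+11AFF).
BLOCK_START = 0x11AC0
BLOCK_END = 0x11AFF
# Last assigned code point (U+11AF8); assumes the assigned
# characters are contiguous from the start of the block.
LAST_ASSIGNED = 0x11AF8


def in_block(ch: str) -> bool:
    """Return True if the single character ch lies in the Pau Cin Hau block."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def is_assigned(ch: str) -> bool:
    """Return True if ch is one of the assigned code points (under the assumption above)."""
    return BLOCK_START <= ord(ch) <= LAST_ASSIGNED


block_size = BLOCK_END - BLOCK_START + 1    # total code points in the block
assigned = LAST_ASSIGNED - BLOCK_START + 1  # assigned code points
reserved = block_size - assigned            # unassigned (reserved) code points
plane = BLOCK_START >> 16                   # plane 1 is the SMP

# Characters outside the BMP need 4 bytes in UTF-8 and a surrogate pair in UTF-16.
first = chr(BLOCK_START)
utf8_len = len(first.encode("utf-8"))
utf16_units = len(first.encode("utf-16-le")) // 2

print(block_size, assigned, reserved)  # 64 57 7
print(plane, utf8_len, utf16_units)    # 1 4 2
```

Because the block lies in the Supplementary Multilingual Plane, software that assumes one UTF-16 code unit per character will mishandle this text, which is why the sketch also reports the encoded lengths.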

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Pau Cin Hau block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
7.0 | U+11AC0..11AF8 | 57 | L2/10-080 | N3781 | Pandey, Anshuman (2010-02-28), Preliminary Proposal to Encode the Pau Cin Hau Script in ISO/IEC 10646
 | | | L2/10-092R | N3784R | Pandey, Anshuman (2010-05-22), Defining Properties for Tone Marks of the Pau Cin Hau Script
 | | | L2/10-073R1 | N3865R | Pandey, Anshuman (2010-07-26), Allocating the Pau Cin Hau Scripts in the Unicode Roadmap
 | | | L2/10-437 | N3960 | Pandey, Anshuman (2010-10-27), Preliminary Proposal to Encode the Pau Cin Hau Alphabet in ISO/IEC 10646
 | | | L2/11-104R | | Pandey, Anshuman (2011-04-27), Proposal to Encode the Pau Cin Hau Alphabet in ISO/IEC 10646
 | | | | N4017 | Pandey, Anshuman (2011-04-27), Proposal to Encode the Pau Cin Hau Alphabet in ISO/IEC 10646
 | | | L2/11-116 | | Moore, Lisa (2011-05-17), "D.3", UTC #127 / L2 #224 Minutes
 | | | L2/11-287 | N4129 | Pandey, Anshuman (2011-07-25), Proposal to Change the Names for Some Pau Cin Hau Characters
 | | | L2/11-298 | | Anderson, Deborah; McGowan, Rick; Whistler, Ken (2011-07-27), "6. Pau Cin Hau", South Asian subcommittee report
 | | | L2/11-261R2 | | Moore, Lisa (2011-08-16), "D.9", UTC #128 / L2 #225 Minutes
 | | | | N4103 | "11.4 Pau Cin Hau alphabet", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
 | | | L2/13-086 | | Anderson, Deborah; McGowan, Rick; Whistler, Ken; Pournader, Roozbeh (2013-04-26), "7", Recommendations to UTC on Script Proposals
 | | | L2/13-067 | N4412 | Pandey, Anshuman (2013-04-27), Preliminary Code Chart for the Pau Cin Hau Syllabary
 | | | | N4403 (pdf, doc) | Umamaheswaran, V. S. (2014-01-28), "10.2.3 Pau Cin Hau syllabary - chart", Unconfirmed minutes of WG 2 meeting 61, Holiday Inn, Vilnius, Lithuania; 2013-06-10/14
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. Pandey, Anshuman (2011-04-27). "N4017: Proposal to Encode the Pau Cin Hau Alphabet in ISO/IEC 10646" (PDF). Working Group Document, ISO/IEC JTC1/SC2/WG2.