Javanese (Unicode block)

Javanese
Range U+A980..U+A9DF (96 code points)
Plane BMP
Scripts Javanese (90 char.), Common (1 char.)
Major alphabets Aksara Jawa
Assigned 91 code points
Unused 5 reserved code points
Unicode version history
5.2 91 (+91)
Note: [1] [2]

Javanese is a Unicode block containing aksara Jawa characters traditionally used for writing the Javanese language. The Javanese script was added to the Unicode Standard in October 2009 with the release of version 5.2.

Block

The Unicode block for Javanese is U+A980–U+A9DF. There are 91 code points for Javanese script: 53 letters, 19 punctuation marks, 10 numbers, and 9 vowels:

Javanese [1] [2]
Official Unicode Consortium code chart (PDF)
[Code chart: rows U+A98x through U+A9Dx, columns 0 through F; the character cells are not preserved in this copy]
Notes
1. As of Unicode version 13.0
2. Grey areas indicate non-assigned code points (U+A9CE and U+A9DA..U+A9DD)
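
The chart layout described above can be regenerated from a Unicode character database. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper names `is_assigned` and `chart_rows` are illustrative and not part of any standard. The counts it reports depend on the Unicode version bundled with the interpreter, so they may differ from the figures in this article if the block is ever extended.

```python
import unicodedata

# Block range as stated in this article.
BLOCK_START, BLOCK_END = 0xA980, 0xA9DF


def is_assigned(cp: int) -> bool:
    """Return True if the interpreter's Unicode database assigns a character at cp."""
    # General category "Cn" means "unassigned" in the Unicode Character Database.
    return unicodedata.category(chr(cp)) != "Cn"


def chart_rows():
    """Yield (row_label, cells) pairs, 16 cells per row; None marks an unassigned cell."""
    for base in range(BLOCK_START, BLOCK_END + 1, 16):
        cells = [chr(cp) if is_assigned(cp) else None
                 for cp in range(base, base + 16)]
        yield f"U+{base >> 4:03X}x", cells


assigned = [cp for cp in range(BLOCK_START, BLOCK_END + 1) if is_assigned(cp)]
unassigned = [cp for cp in range(BLOCK_START, BLOCK_END + 1) if not is_assigned(cp)]

if __name__ == "__main__":
    print("       " + " ".join("0123456789ABCDEF"))
    for label, cells in chart_rows():
        # A middle dot stands in for the grey (unassigned) cells of the chart.
        print(label, " ".join(c if c else "·" for c in cells))
    print(len(assigned), "assigned,", len(unassigned), "unassigned")
```

Combining marks such as the vowel signs may render attached to the neighbouring space when printed this way, and correct display depends on an installed font that covers the Javanese script.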

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Javanese block:

Version Final code points [a] Count L2 ID WG2 ID Document
5.2 U+A980..A9CD, A9CF..A9D9, A9DE..A9DF 91 L2/98-041 Hellingman, Jeroen (1997-05-19), Javanese Proposal
L2/98-070 Aliprand, Joan; Winkler, Arnold, "3.D.1", Minutes of the joint UTC and L2 meeting from the meeting in Cupertino, February 25-27, 1998
L2/06-080 Sayoga, Teguh Budi (2006-03-13), Proposal for encoding the Javanese Script in the UCS (A900-A97F)
L2/07-232R2 N3292R2 Everson, Michael (2007-07-31), Preliminary proposal for encoding the Javanese script in the UCS
L2/07-237 Pentzlin, Karl (2007-07-31), Comment on L2/07-232 - Suggestion to encode Javanese in the SMP
L2/07-248 Anderson, Deborah (2007-08-01), Re L2/07-237 Comment on L2/07-232 Suggestion to encode Javanese in the SMP
L2/07-295 N3319R Everson, Michael (2007-09-11), Proposal for encoding the Javanese script in the UCS
L2/07-298 N3329 Everson, Michael (2007-09-11), Javanese government support for encoding the Javanese script
L2/08-003 Moore, Lisa (2008-02-14), "Javanese", UTC #114 Minutes
L2/08-015R N3319R3 Everson, Michael (2008-03-06), Proposal for encoding the Javanese script in the UCS
L2/08-318 N3453 (pdf, doc) Umamaheswaran, V. S. (2008-08-13), "M52.7", Unconfirmed minutes of WG 2 meeting 52
L2/09-122 N3613 Anderson, Deborah (2009-04-10), Name correction for FPDAM 6: Javanese A9C0 JAVANESE PANGKON
L2/09-234 N3603 (pdf, doc) Umamaheswaran, V. S. (2009-07-08), "M54.03a", Unconfirmed minutes of WG 2 meeting 54
L2/09-104 Moore, Lisa (2009-05-20), "Consensus 119-C19", UTC #119 / L2 #216 Minutes, Approve the name changes in section A, B, and C of document L2/09-177... [U+A9C0]
L2/09-350 Whistler, Ken (2009-10-21), Property Correction for U+A9B3 Javanese Sign Cecak Telu
L2/16-327 McGowan, Rick (2016-11-07), "Incorrect Indic positional category for Javanese consonant sign cakra", Comments on Public Review Issues (July 27 - Nov 7, 2016)
L2/17-037 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa; Liang, Hai; Ishida, Richard; Misra, Karan; McGowan, Rick (2017-01-21), "11. Javanese", Recommendations to UTC #150 January 2017 on Script Proposals
L2/17-038 Lindenberg, Norbert (2017-01-21), Indic positional category for Javanese cakra
L2/17-016 Moore, Lisa (2017-02-08), "B.14.5 Indic positional category for Javanese cakra", UTC #150 Minutes
L2/17-163 Pournader, Roozbeh (2017-05-09), Indic Syllabic Category of Javanese Cakra
L2/17-103 Moore, Lisa (2017-05-18), "B.14.7 Indic positional category of Javanese Cakra", UTC #151 Minutes
L2/19-003 Liang, Hai; Perdana, Aditya Bayu (2019-01-04), Suspicious identity of U+A9B5 JAVANESE VOWEL SIGN TOLONG
L2/19-004 Liang, Hai; Perdana, Aditya Bayu (2019-01-04), Properties of U+A9BD JAVANESE CONSONANT SIGN KERET
L2/19-047 Anderson, Deborah; et al. (2019-01-13), "16.a. Javanese Consonant sign Keret, 16.b. Javanese Vowel sign Tolong", Recommendations to UTC #158 January 2019 on Script Proposals
L2/19-008 Moore, Lisa (2019-02-08), "B.14.1 Properties of U+A9BD JAVANESE CONSONANT SIGN KERET, C.4 Suspicious identity of U+A9B5 JAVANESE VOWEL SIGN TOLONG", UTC #158 Minutes
L2/19-083 Lindenberg, Norbert; Perdana, Aditya Bayu (2019-03-22), Positional category of Javanese pengkal
L2/19-173 Anderson, Deborah; et al. (2019-04-29), "17. Javanese", Recommendations to UTC #159 April-May 2019 on Script Proposals
L2/19-122 Moore, Lisa (2019-05-08), "B.13.1 Positional category of Javanese pengkal", UTC #159 Minutes
  a. Proposed code points and character names may differ from the final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.