Common Indic Number Forms

Range: U+A830..U+A83F (16 code points)
Plane: BMP
Scripts: Common
Symbol sets: Indic numbers
Assigned: 10 code points
Unused: 6 reserved code points
Unicode version history: 5.2: 10 (+10)
Note: [1] [2]

Common Indic Number Forms is a Unicode block containing characters for representing fractions in north India, Pakistan, and Nepal.
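
As a rough illustration of how these fraction characters carry numeric values, the sketch below reads them from Python's standard `unicodedata` module. It assumes that the first six code points of the block (U+A830..U+A835) are the fraction signs, and it relies on whatever Unicode database ships with the local Python build. The helper name `fraction_value` is hypothetical.

```python
import unicodedata
from fractions import Fraction

# Assumption: U+A830..U+A835 are the fraction signs of the block.
FRACTION_SIGNS = [chr(cp) for cp in range(0xA830, 0xA836)]


def fraction_value(ch):
    """Return the numeric value that the local Unicode database assigns to ch."""
    # unicodedata.numeric returns a float; these values are exact binary
    # fractions, so converting to Fraction loses nothing.
    return Fraction(unicodedata.numeric(ch))


if __name__ == "__main__":
    for ch in FRACTION_SIGNS:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} = {fraction_value(ch)}")
```

Summing `fraction_value` over a string of these signs gives the total it represents, although real accounting notation may combine the signs with digits in ways this sketch does not handle.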

Common Indic Number Forms [1] [2]
Official Unicode Consortium code chart (PDF)
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
U+A83x   ꠰  ꠱  ꠲  ꠳  ꠴  ꠵  ꠶  ꠷  ꠸  ꠹
Notes
1. ^ As of Unicode version 13.0
2. ^ Blank cells (U+A83A..U+A83F) indicate non-assigned code points
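
The split between assigned and reserved code points can be checked with a short script. This is a minimal sketch that treats a code point as assigned when Python's `unicodedata` module knows a name for it. The result therefore reflects the Unicode version bundled with the interpreter, not necessarily 13.0. The function name `split_block` is hypothetical.

```python
import unicodedata

# Block boundaries as given in the article: U+A830..U+A83F.
BLOCK_START, BLOCK_END = 0xA830, 0xA83F


def split_block(start=BLOCK_START, end=BLOCK_END):
    """Return (assigned, unassigned) lists of code points in the range."""
    assigned, unassigned = [], []
    for cp in range(start, end + 1):
        # name() returns the default (None) for code points without a name,
        # which is used here as a proxy for "not assigned".
        name = unicodedata.name(chr(cp), None)
        (assigned if name else unassigned).append(cp)
    return assigned, unassigned


if __name__ == "__main__":
    assigned, unassigned = split_block()
    print("assigned:  ", [f"U+{cp:04X}" for cp in assigned])
    print("unassigned:", [f"U+{cp:04X}" for cp in unassigned])
```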

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Common Indic Number Forms block:

Version | Final code points [a] | Count | L2 ID | WG2 ID | Document
5.2 | U+A830..A839 | 10 | L2/03-102 | | Vikas, Om (2003-03-04), Unicode Standard for Indic Scripts
 | | | L2/03-101.3 | | Proposed Changes in Indic Scripts [Gujarati document], 2003-03-04
 | | | L2/04-358 | | Jain, Manoj (2004-09-29), Encoding of Gujarati Signs Pao, Addho & Pono in Gujarati code block
 | | | L2/04-402 | | Muller, Eric (2004-11-14), Clarifications on L2/04-358, Gujarati fractions
 | | | L2/04-418 | | Muller, Eric (2004-11-18), "Gujarati fractions", Report of the Indic ad-hoc
 | | | L2/05-063 | | Vikas, Om (2005-02-07), "Awaiting Updates-Gujarati", Issues in Representation of Indic Scripts in Unicode
 | | | L2/05-070 | | McGowan, Rick (2005-02-09), Indic ad hoc report
 | | | L2/05-026 | | Moore, Lisa (2005-05-16), "Scripts - Indic (C.12)", UTC #102 Minutes
 | | | | N3353 (pdf, doc) | Umamaheswaran, V. S. (2007-10-10), "M51.17", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27
 | | | L2/07-139 | N3312 | Pandey, Anshuman (2007-05-04), Proposal to Encode North Indian Accounting Signs in Plane 1 of ISO/IEC 10646
 | | | L2/07-118R2 | | Moore, Lisa (2007-05-23), "Consensus 111-C18", UTC #111 Minutes
 | | | L2/07-238 | N3334 | Pandey, Anshuman (2007-07-31), Towards an Encoding for North Indic Number Forms in the UCS
 | | | L2/07-272 | | Muller, Eric (2007-08-10), "7", Report of the South Asia subcommittee
 | | | L2/07-225 | | Moore, Lisa (2007-08-21), "North Indic Number Forms", UTC #112 Minutes
 | | | L2/07-354 | N3367 | Pandey, Anshuman (2007-10-07), Proposal to Encode North Indic Number Forms
 | | | L2/07-390 | | Anderson, Deborah (2007-10-14), Changes in L2/07-354 North Indic Number Forms (vs. L2/07-139)
 | | | L2/17-340 | | Johny, Cibu (2017-09-22), Request to Annotate North Indian Quarter Signs for Malayalam Usage
 | | | L2/17-424 | | A, Srinidhi; A, Sridatta (2017-12-08), Changes to ScriptExtensions.txt for Indic characters for Unicode 11.0
 | | | L2/18-039 | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Cook, Richard (2018-01-19), "North Indian Quarter Signs, ScriptExtensions.txt changes for Indic", Recommendations to UTC #154 January 2018 on Script Proposals
 | | | L2/18-007 | | Moore, Lisa (2018-03-19), "Action item 154-A120", UTC #154 Minutes, Make script extension changes in version 11.0 as documented in section 6B, pages 6-9 of L2/18-039.
 | | | L2/18-115 | | Moore, Lisa (2018-05-09), "Action item 154-A118", UTC #155 Minutes, Update U+A830..U+A832 in ScriptExtensions.txt with script code Mlym for Unicode 11.0.
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.