Oriya (Unicode block)

Oriya
Range: U+0B00..U+0B7F (128 code points)
Plane: BMP
Scripts: Oriya
Major alphabets: Oriya, Khondi, Santali
Assigned: 91 code points
Unused: 37 reserved code points
Source standards: ISCII
Unicode version history:
  1.0.0 (1991): 78 (+78)
  1.1 (1993): 79 (+1)
  4.0 (2003): 81 (+2)
  5.1 (2008): 84 (+3)
  6.0 (2010): 90 (+6)
  13.0 (2020): 91 (+1)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2] [3]

Oriya is a Unicode block containing characters for the Odia, Khondi and Santali languages of the state of Odisha in India. In its original incarnation, the code points U+0B01..U+0B4D were a direct copy of the Odia characters A1-ED from the 1988 ISCII standard. The Devanagari, Bengali, Gurmukhi, Gujarati, Tamil, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.
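
The "direct copy" described above amounts to a fixed offset between ISCII byte values and Unicode code points. The following Python sketch only illustrates that arithmetic. The function name `iscii_to_oriya` is hypothetical, and the code assumes the simple position-for-position correspondence stated in the text; it does not model ISCII control bytes or characters added to the block after Unicode 1.0.0.

```python
# Range endpoints taken from the text: ISCII A1..ED <-> U+0B01..U+0B4D.
ISCII_FIRST, ISCII_LAST = 0xA1, 0xED
ORIYA_FIRST = 0x0B01
OFFSET = ORIYA_FIRST - ISCII_FIRST  # 0x0A60


def iscii_to_oriya(byte: int) -> int:
    """Map an ISCII byte in A1..ED to a code point in the Oriya block.

    Assumed model: a constant offset, as implied by the "direct copy" wording.
    """
    if not ISCII_FIRST <= byte <= ISCII_LAST:
        raise ValueError(f"byte 0x{byte:02X} is outside the ISCII range A1..ED")
    return byte + OFFSET


if __name__ == "__main__":
    for b in (0xA1, 0xED):
        print(f"ISCII 0x{b:02X} -> U+{iscii_to_oriya(b):04X}")
```

The two endpoints map to U+0B01 and U+0B4D, matching the range given above. Positions inside the range are not all assigned characters, so a result from this sketch is a position in the block rather than a guarantee that a character exists there.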


Odia script combines symbols into hundreds of consonant ligatures.

Block

Oriya [1] [2]
Official Unicode Consortium code chart (PDF)
[Code chart grid: columns 0 to F, rows U+0B0x to U+0B7x; cell contents not recoverable]
Notes
1. As of Unicode version 15.0
2. Grey areas indicate non-assigned code points
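
The 8 × 16 layout of the block can be regenerated locally. The sketch below is one possible way to do so with Python's standard `unicodedata` module; the function name `oriya_chart` and the placeholder character are illustrative choices, not part of any Unicode tooling. Which cells count as assigned depends on the Unicode version bundled with the Python interpreter, so the result may differ from the official chart.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x0B00, 0x0B7F  # range of the Oriya block


def oriya_chart(placeholder: str = "·") -> list:
    """Return one text row per 16 code points of the block."""
    rows = []
    for row_start in range(BLOCK_START, BLOCK_END + 1, 16):
        cells = []
        for cp in range(row_start, row_start + 16):
            ch = chr(cp)
            if unicodedata.name(ch, None) is None:
                # No name in this interpreter's database: treat as unassigned.
                cells.append(placeholder)
            elif unicodedata.category(ch).startswith("M"):
                # Show combining marks on a dotted circle (U+25CC).
                cells.append("\u25CC" + ch)
            else:
                cells.append(ch)
        rows.append(f"U+{row_start >> 4:03X}x " + " ".join(cells))
    return rows


if __name__ == "__main__":
    print("\n".join(oriya_chart()))
```

Correct display of the output also depends on the terminal having a font that covers the Odia script.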

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Oriya block:

Each entry gives the version, its final code points [a] and their count, followed by the related documents with their UTC, L2 or WG2 IDs.

Version 1.0.0: U+0B01..0B03, 0B05..0B0C, 0B0F..0B10, 0B13..0B28, 0B2A..0B30, 0B32..0B33, 0B36..0B39, 0B3C..0B43, 0B47..0B48, 0B4B..0B4D, 0B57, 0B5C..0B5D, 0B5F..0B61, 0B66..0B70 (78 code points)
  UTC/1991-056: Whistler, Ken, Indic Charts: Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam
  UTC/1991-057: Whistler, Ken, Indic names list
  UTC/1991-048B: Whistler, Ken (1991-03-27), "III. L. Walk In proposals", Draft Minutes from the UTC meeting #46 day 2, 3/27 at Apple
  L2/01-303: Vikas, Om (2001-07-26), Letter from the Government from India on "Draft for Unicode Standard for Indian Scripts"
  L2/01-304: Feedback on Unicode Standard 3.0, 2001-08-02
  L2/01-305: McGowan, Rick (2001-08-08), Draft UTC Response to L2/01-304, "Feedback on Unicode Standard 3.0"
  L2/01-430R: McGowan, Rick (2001-11-20), UTC Response to L2/01-304, "Feedback on Unicode Standard 3.0"
  L2/20-055: Pournader, Roozbeh (2020-01-16), Proposed sequences for composition exclusions
  L2/20-015: Moore, Lisa (2020-01-23), "B.13.1.1 Proposed sequences for composition exclusions", Draft Minutes of UTC Meeting 162

Version 1.1: U+0B56 (1 code point)
  (to be determined)

Version 4.0: U+0B35, 0B71 (2 code points)
  L2/01-431R [b]: McGowan, Rick (2001-11-08), Actions for UTC and Editorial Committee in response to L2/01-430R
  L2/01-405R: Moore, Lisa (2001-12-12), "Consensus 89-C19", Minutes from the UTC/L2 meeting in Mountain View, November 6-9, 2001, Accept the twelve Indic characters with names and coding positions as documented in L2/01-431R
  L2/02-117, N2425: McGowan, Rick (2002-03-21), Additional Characters for Indic Scripts
  L2/02-425: Everson, Michael; Stone, Anthony (2002-11-20), On Oriya VA and WA
  N2525: Everson, Michael; Stone, Anthony (2002-11-21), On Oriya VA and WA, and a proposal to encode one Oriya letter in the UCS
  L2/03-102: Vikas, Om (2003-03-04), Unicode Standard for Indic Scripts
  L2/03-101.7: Proposed Changes in Indic Scripts [Oriya document], 2003-03-04

Version 5.1: U+0B44, 0B62..0B63 (3 code points)
  L2/03-102: Vikas, Om (2003-03-04), Unicode Standard for Indic Scripts
  L2/03-101.7: Proposed Changes in Indic Scripts [Oriya document], 2003-03-04
  L2/05-063: Vikas, Om (2005-02-07), "Awaiting Updates-Bengali & Oriya", Issues in Representation of Indic Scripts in Unicode
  L2/05-070: McGowan, Rick (2005-02-09), Indic ad hoc report
  L2/05-026: Moore, Lisa (2005-05-16), "Scripts - Indic (C.12)", UTC #102 Minutes
  L2/07-095R, N3235R: Everson, Michael; Scharf, Peter; Angot, Michel; Chandrashekar, R.; Hyman, Malcolm; Rosenfield, Susan; Sastry, B. V. Venkatakrishna; Witzel, Michael (2007-04-13), Proposal to encode characters for Vedic Sanskrit in the BMP of the UCS
  L2/07-118R2: Moore, Lisa (2007-05-23), "111-C17", UTC #111 Minutes
  L2/07-196, N3272: Everson, Michael (2007-05-25), Proposal to encode four characters for Oriya and Malayalam
  L2/07-268, N3253: Umamaheswaran, V. S. (2007-07-26), "M50.23", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27

Version 6.0: U+0B72..0B77 (6 code points)
  L2/07-413: Pandey, Anshuman (2007-12-04), Proposal to Encode Oriya Fraction Signs
  L2/08-199, N3471: Pandey, Anshuman (2008-05-05), Proposal to Encode Oriya Fraction Signs in ISO/IEC 10646
  L2/08-199R: Pandey, Anshuman (2008-05-05), Proposal to Encode Oriya Fraction Signs in ISO/IEC 10646
  L2/08-161R2: Moore, Lisa (2008-11-05), "Oriya Fraction Signs", UTC #115 Minutes
  L2/08-412, N3553: Umamaheswaran, V. S. (2008-11-05), "M53.24d", Unconfirmed minutes of WG 2 meeting 53

Version 13.0: U+0B55 (1 code point)
  L2/19-005R2, N5023: Evans, Lorna (2019-01-01), Proposal to encode ORIYA SIGN OVERLINE
  L2/19-047: Anderson, Deborah; et al. (2019-01-13), "11. Oriya", Recommendations to UTC #158 January 2019 on Script Proposals
  L2/19-008: Moore, Lisa (2019-02-08), "D.4", UTC #158 Minutes
  L2/19-286: Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai (2019-07-22), "9. Oriya", Recommendations to UTC #160 July 2019 on Script Proposals
  L2/19-270: Moore, Lisa (2019-08-02), "D.9", UTC #160 Minutes

  a. Proposed code points and character names may differ from final code points and names
  b. See also L2/01-303, L2/01-304, L2/01-305, and L2/01-430R
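
The per-version counts follow directly from the code point ranges listed in the history table. As a cross-check, this Python sketch expands the ranges and accumulates the totals. The names `ADDITIONS`, `parse_ranges` and `cumulative_counts` are illustrative, and the range strings are transcribed by hand from the table, so any transcription slip would carry through to the result.

```python
# Ranges per Unicode version, transcribed from the history table (hex, no "U+").
ADDITIONS = {
    "1.0.0": ("0B01..0B03, 0B05..0B0C, 0B0F..0B10, 0B13..0B28, 0B2A..0B30, "
              "0B32..0B33, 0B36..0B39, 0B3C..0B43, 0B47..0B48, 0B4B..0B4D, "
              "0B57, 0B5C..0B5D, 0B5F..0B61, 0B66..0B70"),
    "1.1": "0B56",
    "4.0": "0B35, 0B71",
    "5.1": "0B44, 0B62..0B63",
    "6.0": "0B72..0B77",
    "13.0": "0B55",
}

BLOCK_SIZE = 0x0B7F - 0x0B00 + 1  # 128 code points in the block


def parse_ranges(spec: str) -> set:
    """Expand a string such as '0B44, 0B62..0B63' into a set of code points."""
    points = set()
    for part in spec.split(","):
        lo, _, hi = part.strip().partition("..")
        points.update(range(int(lo, 16), int(hi or lo, 16) + 1))
    return points


def cumulative_counts(additions: dict) -> dict:
    """Return the running total of assigned code points after each version."""
    seen, totals = set(), {}
    for version, spec in additions.items():
        seen |= parse_ranges(spec)
        totals[version] = len(seen)
    return totals


if __name__ == "__main__":
    totals = cumulative_counts(ADDITIONS)
    for version, total in totals.items():
        print(f"{version}: {total}")
    print("unused:", BLOCK_SIZE - totals["13.0"])
```

The totals come out as 78, 79, 81, 84, 90 and 91, with 37 code points left unused, which agrees with the figures in the version history summary.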

References

  1. "Unicode 1.0.1 Addendum" (PDF). The Unicode Standard. 1992-11-03. Retrieved 2016-07-09.
  2. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  3. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.