Elbasan (Unicode block)

Elbasan
Range: U+10500..U+1052F (48 code points)
Plane: SMP
Scripts: Elbasan
Major alphabets: Elbasan
Assigned: 40 code points
Unused: 8 reserved code points
Unicode version history:
7.0: 40 (+40)
Note: [1] [2]

Elbasan is a Unicode block containing the historic Elbasan characters for writing the Albanian language.

A Unicode block is one of several contiguous ranges of numeric character codes of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
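Because a block is a contiguous range, testing whether a character belongs to it reduces to a bounds check on its code point. The following is a minimal Python sketch, assuming the range U+10500..U+1052F stated for this block; the names `ELBASAN_FIRST`, `ELBASAN_LAST`, `in_elbasan_block`, and `assigned_code_points` are illustrative and not part of any standard API. The assigned count depends on the Unicode version bundled with the Python interpreter (Unicode 7.0 or later is assumed).

```python
import unicodedata

# Block boundaries as given for the Elbasan block (assumed constants).
ELBASAN_FIRST = 0x10500
ELBASAN_LAST = 0x1052F


def in_elbasan_block(ch):
    """Return True if the single character ch lies in the Elbasan block."""
    return ELBASAN_FIRST <= ord(ch) <= ELBASAN_LAST


def assigned_code_points():
    """List the code points in the block that are assigned.

    General category "Cn" marks unassigned (reserved) code points in the
    Unicode data shipped with this Python build.
    """
    return [
        cp
        for cp in range(ELBASAN_FIRST, ELBASAN_LAST + 1)
        if unicodedata.category(chr(cp)) != "Cn"
    ]


if __name__ == "__main__":
    block_size = ELBASAN_LAST - ELBASAN_FIRST + 1
    assigned = assigned_code_points()
    print("block size:", block_size)
    print("assigned:", len(assigned))
    print("reserved:", block_size - len(assigned))
```

Note that block membership and assignment are separate questions: a reserved code point such as U+1052F is still inside the block even though no character is assigned to it.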

The Elbasan script is a mid-18th-century alphabetic script used for the Albanian language. Named after the city of Elbasan, where it was invented, it was used mainly in the area of Elbasan and Berat and is the oldest original script used to write Albanian. It was created for the "Elbasan Gospel Manuscript", also known as the Anonimi i Elbasanit, which is the primary document associated with it. The document was created at St. Jovan Vladimir's Church in central Albania and is preserved today at the National Archives of Albania in Tirana. Its 59 pages contain Biblical content written in an alphabet of 40 letters, of which 35 recur frequently and 5 are rare. The name "Papa Totasi" is written on the verso of the cover, so the script is sometimes attributed to him.


Albanian is an Indo-European language spoken by the Albanians in the Balkans and the Albanian diaspora in the Americas, Europe and Oceania. With about 7.5 million speakers, it comprises an independent branch within the Indo-European languages and is not closely related to any other language in Europe.

Elbasan [1] [2]
Official Unicode Consortium code chart (PDF)
         0 1 2 3 4 5 6 7 8 9 A B C D E F
U+1050x  𐔀 𐔁 𐔂 𐔃 𐔄 𐔅 𐔆 𐔇 𐔈 𐔉 𐔊 𐔋 𐔌 𐔍 𐔎 𐔏
U+1051x  𐔐 𐔑 𐔒 𐔓 𐔔 𐔕 𐔖 𐔗 𐔘 𐔙 𐔚 𐔛 𐔜 𐔝 𐔞 𐔟
U+1052x  𐔠 𐔡 𐔢 𐔣 𐔤 𐔥 𐔦 𐔧
Notes
1. ^ As of Unicode version 12.0
2. ^ Grey areas indicate non-assigned code points
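
A grid like the one above can be regenerated from the Unicode data bundled with a programming language. The following Python sketch is one way to do it under stated assumptions: the function name `chart_rows` and the use of "-" as a placeholder for non-assigned code points are choices made for this illustration, and the result reflects whichever Unicode version the interpreter ships with.

```python
import unicodedata


def chart_rows(first=0x10500, last=0x1052F):
    """Build one text row per 16 code points, in the style of a code chart.

    Assigned code points are shown as the character itself; non-assigned
    ones (general category "Cn") are shown as "-".
    """
    rows = []
    for base in range(first, last + 1, 16):
        cells = []
        for cp in range(base, base + 16):
            assigned = unicodedata.category(chr(cp)) != "Cn"
            cells.append(chr(cp) if assigned else "-")
        # Dropping the last hex digit of the base gives the row label,
        # e.g. 0x10500 -> "U+1050x".
        rows.append("U+{:04X}x ".format(base >> 4) + " ".join(cells))
    return rows


if __name__ == "__main__":
    print("        " + " ".join("0123456789ABCDEF"))
    for row in chart_rows():
        print(row)
```

Whether the letters display correctly depends on a font with Elbasan coverage being installed; the code points themselves are unaffected.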

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Elbasan block:

Version  Final code points [a]  Count  L2 ID      WG2 ID  Document
7.0      U+10500..10527         40     L2/09-328          Anderson, Deborah; Glavy, Jason (2009-11-30), Old Albanian Scripts
                                       L2/10-216  N3856   Everson, Michael; Elsie, Robert (2010-06-23), Preliminary proposal for encoding the Elbasan script in the SMP of the UCS
                                       L2/11-050  N3985   Everson, Michael (2011-02-03), Proposal for encoding the Elbasan script in the SMP of the UCS
                                       L2/11-016          Moore, Lisa (2011-02-15), "C.7", UTC #126 / L2 #223 Minutes
                                                  N4103   "11.2.9 Elbasan script", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.