Carian (Unicode block)

Carian
Range: U+102A0..U+102DF (64 code points)
Plane: SMP
Scripts: Carian
Major alphabets: Carian
Assigned: 49 code points
Unused: 15 reserved code points
Unicode version history
5.1: 49 (+49)
Note: [1] [2]

Carian is a Unicode block containing the Masson set and four additional characters for writing the ancient Carian language in Caria and Egypt, where the Carians served as mercenaries.

A Unicode block is one of several contiguous ranges of numeric character codes of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
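
Because a block is just a contiguous range of code points, membership can be checked with a simple comparison. The following is a minimal sketch using the Carian range given above; the names `CARIAN_START`, `CARIAN_END`, `in_carian_block`, and `plane_of` are illustrative and not part of any Unicode library.

```python
# Block boundaries as listed for the Carian block (U+102A0..U+102DF).
CARIAN_START = 0x102A0
CARIAN_END = 0x102DF  # inclusive upper bound


def in_carian_block(ch: str) -> bool:
    """Return True if the single character ch lies in the Carian block."""
    return CARIAN_START <= ord(ch) <= CARIAN_END


def plane_of(ch: str) -> int:
    """Return the Unicode plane number (0 = BMP, 1 = SMP, and so on)."""
    return ord(ch) >> 16


# Size of the block: both endpoints are included.
block_size = CARIAN_END - CARIAN_START + 1

print(block_size)                     # 64
print(in_carian_block("\U000102A0"))  # True
print(in_carian_block("A"))           # False
print(plane_of("\U000102A0"))         # 1
```

The plane number comes out as 1, which matches the block's placement in the Supplementary Multilingual Plane (SMP).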

The Carian language is an extinct language of the Luwian subgroup of the Anatolian branch of the Indo-European language family. It was spoken in Caria, a region of western Anatolia between the ancient regions of Lycia and Lydia, by the Carians, a name possibly first mentioned in Hittite sources. Carian is closely related to Lycian and Milyan, and both are closely related to, though not direct descendants of, Luwian. Whether the correspondences between Luwian, Carian, and Lycian are due to direct descent or to dialect geography is disputed.

Carian [1] [2]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+102Ax 𐊠 𐊡 𐊢 𐊣 𐊤 𐊥 𐊦 𐊧 𐊨 𐊩 𐊪 𐊫 𐊬 𐊭 𐊮 𐊯
U+102Bx 𐊰 𐊱 𐊲 𐊳 𐊴 𐊵 𐊶 𐊷 𐊸 𐊹 𐊺 𐊻 𐊼 𐊽 𐊾 𐊿
U+102Cx 𐋀 𐋁 𐋂 𐋃 𐋄 𐋅 𐋆 𐋇 𐋈 𐋉 𐋊 𐋋 𐋌 𐋍 𐋎 𐋏
U+102Dx 𐋐
Notes
1. As of Unicode version 12.0
2. Blank positions indicate unassigned code points
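
The assigned and unassigned counts can be cross-checked against the Unicode Character Database bundled with Python's standard `unicodedata` module. This is a sketch under the assumption that the interpreter ships a database at version 5.1 or later (the version in which the block was added); the helper name `assigned_carian` is illustrative.

```python
import unicodedata

CARIAN_START, CARIAN_END = 0x102A0, 0x102DF  # inclusive block range


def assigned_carian():
    """Yield (code point, name) for each named code point in the block."""
    for cp in range(CARIAN_START, CARIAN_END + 1):
        # unicodedata.name returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            yield cp, name


assigned = list(assigned_carian())
total = CARIAN_END - CARIAN_START + 1

print("assigned:", len(assigned))
print("unassigned:", total - len(assigned))
print(f"first: U+{assigned[0][0]:05X} {assigned[0][1]}")
print(f"last:  U+{assigned[-1][0]:05X} {assigned[-1][1]}")
```

With such a database, the counts should agree with the 49 assigned and 15 reserved code points listed for the block, with the assigned range running from U+102A0 to U+102D0.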

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Carian block:

Version 5.1: final code points U+102A0..102D0 [a], count 49. Documents, listed by L2 ID and WG2 ID:
  L2/00-128: Bunz, Carl-Martin (2000-03-01), Scripts from the Past in Future Versions of Unicode
  L2/00-153: Bunz, Carl-Martin (2000-04-26), Further comments on historic scripts
  L2/05-100, N2938: Everson, Michael (2005-04-27), Proposal for encoding the Carian script in the UCS
  L2/05-241: Everson, Michael (2005-08-31), Old Anatolian scripts
  L2/05-386, N3020R: Everson, Michael (2006-01-12), Proposal to encode the Carian script in the SMP of the UCS
  L2/06-008R2: Moore, Lisa (2006-02-13), "C.2", UTC #106 Minutes
  N2953: Umamaheswaran, V. S. (2006-02-16), "7.4.2", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
  N3103: Umamaheswaran, V. S. (2006-08-25), "M48.8", Unconfirmed minutes of WG 2 meeting 48, Mountain View, CA, USA; 2006-04-24/27
  a. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.