Phoenician (Unicode block)

Last updated
Phoenician
RangeU+10900..U+1091F
(32 code points)
Plane SMP
Scripts Phoenician
Assigned29 code points
Unused3 reserved code points
Unicode version history
5.0 (2006)27 (+27)
5.2 (2009)29 (+2)
Unicode documentation
Code chart ∣ Web page
Note: [1] [2]

Phoenician is a Unicode block containing characters used across the Mediterranean world from the 12th century BCE to the 3rd century CE. The Phoenician alphabet was added to the Unicode Standard in July 2006 with the release of version 5.0. An alternative proposal to handle it as a font variation of Hebrew was turned down. (See PDF [ dead link ] summary.)

Contents

The Unicode block for Phoenician is U+10900–U+1091F. It is intended for the representation of text in Paleo-Hebrew, Archaic Phoenician, Phoenician, Early Aramaic, Late Phoenician cursive, Phoenician papyri, Siloam Hebrew, Hebrew seals, Ammonite, Moabite and Punic. [3]

The letters are encoded U+10900 𐤀aleph through to U+10915 𐤕taw, U+10916 𐤖, U+10917 𐤗, U+10918 𐤘 and U+10919 𐤙 encode the numerals 1, 10, 20, and 100, respectively, and U+1091F 𐤟 is the word separator.

Characters

Phoenician [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+1090x𐤀𐤁𐤂𐤃𐤄𐤅𐤆𐤇𐤈𐤉𐤊𐤋𐤌𐤍𐤎𐤏
U+1091x𐤐𐤑𐤒𐤓𐤔𐤕𐤖𐤗𐤘𐤙𐤚𐤛𐤟
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Phoenician block:

Related Research Articles

<span class="mw-page-title-main">Unicode</span> Character encoding standard

Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 15.1 of the standard defines 149813 characters and 161 scripts used in various ordinary, literary, academic, and technical contexts.

Koppa or qoppa is a letter that was used in early forms of the Greek alphabet, derived from Phoenician qoph (𐤒). It was originally used to denote the sound, but dropped out of use as an alphabetic character and replaced by Kappa (Κ). It has remained in use as a numeral symbol (90) in the system of Greek numerals, although with a modified shape. Koppa is the source of Latin Q, as well as the Cyrillic numeral sign of the same name (Koppa).

In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium. Three private use areas are defined: one in the Basic Multilingual Plane, and one each in, and nearly covering, planes 15 and 16. The code points in these areas cannot be considered as standardized characters in Unicode itself. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. Under the Unicode Stability Policy, the Private Use Areas will remain allocated for that purpose in all future Unicode versions.

As of Unicode version 15.1, Cyrillic script is encoded across several blocks:

In Unicode, the Sumero-Akkadian Cuneiform script is covered in three blocks in the Supplementary Multilingual Plane (SMP):

Specials is a short Unicode block of characters allocated at the very end of the Basic Multilingual Plane, at U+FFF0–FFFF. Of these 16 code points, five have been assigned since Unicode 3.0:

In the Unicode standard, a plane is a contiguous group of 65,536 (216) code points. There are 17 planes, identified by the numbers 0 to 16, which corresponds with the possible values 00–1016 of the first two positions in six position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 15.1, five of the planes have assigned code points (characters), and seven are named.

The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase of the English alphabet and a control character.

The ISO basic Latin alphabet is an international standard for a Latin-script alphabet that consists of two sets of 26 letters, codified in various national and international standards and used widely in international communication. They are the same letters that comprise the current English alphabet. Since medieval times, they are also the same letters of the modern Latin alphabet. The order is also important for sorting words into alphabetical order.

The Unicode Standard assigns various properties to each Unicode character and code point.

Hangul Syllables is a Unicode block containing precomposed Hangul syllable blocks for modern Korean. The syllables can be directly mapped by algorithm to sequences of two or three characters in the Hangul Jamo Unicode block:

Gurmukhi is a Unicode block containing characters for the Punjabi language, in the Gurmukhi script. In its original incarnation, the code points U+0A02..U+0A4C were a direct copy of the Gurmukhi characters A2-EC from the 1988 ISCII standard. The Devanagari, Bengali, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.

Gujarati is a Unicode block containing characters for writing the Gujarati language. In its original incarnation, the code points U+0A81..U+0AD0 were a direct copy of the Gujarati characters A1-F0 from the 1988 ISCII standard. The Devanagari, Bengali, Gurmukhi, Oriya, Tamil, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.

Oriya is a Unicode block containing characters for the Odia, Khondi and Santali languages of the state of Odisha in India. In its original incarnation, the code points U+0B01..U+0B4D were a direct copy of the Odia characters A1-ED from the 1988 ISCII standard. The Devanagari, Bengali, Gurmukhi, Gujarati, Tamil, Telugu, Kannada, and Malayalam blocks were similarly all based on their ISCII encodings.

Myanmar is a Unicode block containing characters for the Burmese, Mon, Shan, Palaung, and the Karen languages of Myanmar, as well as the Aiton and Phake languages of Northeast India. It is also used to write Pali and Sanskrit in Myanmar.

Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Old South Arabian is a Unicode block containing characters for writing the Minean, Sabaean, Qatabanian, Hadramite, and Himyaritic languages of Yemen from the 8th century BCE to the 6th century CE.

Manichaean is a Unicode block containing characters historically used for writing Sogdian, Parthian, and the dialects of Fars.

<span class="mw-page-title-main">Dogra (Unicode block)</span> Unicode character block

Dogra is a Unicode block for the Dogri script, for writing the Dogri language in Jammu and Kashmir in the northern part of the Indian subcontinent. The Takri script version of Jammu is known as Dogra Akkhar.

Indic Siyaq Numbers is a Unicode block containing a specialized subset of the Arabic script that was used for accounting in India under the Mughals by the 17th century through the middle of the 20th century.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. "Middle-East scripts II: Ancient Scripts" (PDF). The Unicode Standard: Version 13.0 – Core Specification. The Unicode Consortium. 2020. Retrieved 2021-01-28.