Phoenician | |
---|---|
Range | U+10900..U+1091F (32 code points) |
Plane | SMP |
Scripts | Phoenician |
Assigned | 29 code points |
Unused | 3 reserved code points |
Unicode version history | |
5.0 (2006) | 27 (+27) |
5.2 (2009) | 29 (+2) |
Phoenician is a Unicode block containing characters used across the Mediterranean world from the 12th century BCE to the 3rd century CE. The Phoenician alphabet was added to the Unicode Standard in July 2006 with the release of version 5.0. An alternative proposal to treat it as a font variation of Hebrew was rejected.
The Unicode block for Phoenician is U+10900–U+1091F. It is intended for the representation of text in Paleo-Hebrew, Archaic Phoenician, Phoenician, Early Aramaic, Late Phoenician cursive, Phoenician papyri, Siloam Hebrew, Hebrew seals, Ammonite, Moabite and Punic. [3]
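The block's size and its number of assigned code points can be checked with a short script. The following is a minimal sketch, assuming a Python 3 interpreter whose bundled `unicodedata` database is at Unicode 5.2 or later; the helper name `assigned_code_points` is illustrative and not part of any standard API.

```python
import unicodedata

# Block boundaries as stated in the text: U+10900..U+1091F.
BLOCK_START, BLOCK_END = 0x10900, 0x1091F


def assigned_code_points():
    """Return the code points in the block that carry a character name.

    Unassigned (reserved) code points have no name, so the default
    value "" is returned for them and they are filtered out.
    """
    return [
        cp
        for cp in range(BLOCK_START, BLOCK_END + 1)
        if unicodedata.name(chr(cp), "")
    ]


total = BLOCK_END - BLOCK_START + 1
assigned = assigned_code_points()
# Prints: total code points, assigned, reserved
print(total, len(assigned), total - len(assigned))
```

With a sufficiently recent Unicode database this reports 32 code points in total, 29 assigned and 3 reserved, matching the figures given for the block.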
The letters are encoded from U+10900 𐤀 aleph through U+10915 𐤕 taw. U+10916 𐤖, U+10917 𐤗, U+10918 𐤘 and U+10919 𐤙 encode the numerals 1, 10, 20, and 100, respectively; U+1091A 𐤚 and U+1091B 𐤛, added in version 5.2, encode the numerals 2 and 3. U+1091F 𐤟 is the word separator.
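The roles of these code points can be read back from the Unicode character properties. The sketch below assumes Python 3 and its standard `unicodedata` module; the constants and helper names are illustrative. The number range also covers U+1091A and U+1091B, the two numbers added in version 5.2.

```python
import unicodedata

LETTERS = range(0x10900, 0x10916)   # U+10900..U+10915, the 22 letters
NUMBERS = range(0x10916, 0x1091C)   # U+10916..U+1091B, the number signs
WORD_SEPARATOR = 0x1091F


def categories(code_points):
    """Return the set of general categories found among the code points."""
    return {unicodedata.category(chr(cp)) for cp in code_points}


def numeric_values():
    """Map each number sign to the numeric value recorded for it."""
    return {
        f"U+{cp:04X}": int(unicodedata.numeric(chr(cp)))
        for cp in NUMBERS
    }


print(categories(LETTERS))      # general category shared by the letters
print(numeric_values())         # numeric value of each number sign
print(unicodedata.category(chr(WORD_SEPARATOR)))
```

The letters share the category `Lo` (other letter), the number signs carry their numeric values as a property, and the word separator is classified as punctuation (`Po`).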
Phoenician: Official Unicode Consortium code chart (PDF)
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
U+1090x | 𐤀 | 𐤁 | 𐤂 | 𐤃 | 𐤄 | 𐤅 | 𐤆 | 𐤇 | 𐤈 | 𐤉 | 𐤊 | 𐤋 | 𐤌 | 𐤍 | 𐤎 | 𐤏 |
U+1091x | 𐤐 | 𐤑 | 𐤒 | 𐤓 | 𐤔 | 𐤕 | 𐤖 | 𐤗 | 𐤘 | 𐤙 | 𐤚 | 𐤛 | | | | 𐤟 |
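Because the block lies in the Supplementary Multilingual Plane, every code point in the chart is above U+FFFF and therefore needs four bytes in UTF-8 and a surrogate pair in UTF-16. The following sketch, assuming Python 3, shows the encoded forms of a single character; `encoded_forms` is an illustrative helper name.

```python
def encoded_forms(ch):
    """Return the UTF-8 bytes (as hex) and the UTF-16 code units of ch."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte order mark
    # Split the UTF-16 byte string into 16-bit code units.
    units = [
        int.from_bytes(utf16[i:i + 2], "big")
        for i in range(0, len(utf16), 2)
    ]
    return utf8.hex(), [f"{unit:04X}" for unit in units]


first = chr(0x10900)  # first code point of the block
print(encoded_forms(first))  # ('f090a480', ['D802', 'DD00'])
```

Software that assumes one 16-bit unit per character will split such characters, so string lengths and indices should be computed over code points rather than code units.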
The following Unicode-related documents record the purpose and process of defining specific characters in the Phoenician block:
Version | Final code points | Count | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|
5.0 | U+10900..10919, 1091F | 27 | | N1579 | Everson, Michael (1997-05-27), Proposal for encoding the Phoenician script |
5.0 | U+10900..10919, 1091F | 27 | L2/97-288 | N1603 | Umamaheswaran, V. S. (1997-10-24), "8.24.1", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June – 4 July 1997 |
5.0 | U+10900..10919, 1091F | 27 | L2/99-013 | N1932 | Everson, Michael (1998-11-23), Revised proposal for encoding the Phoenician script in the UCS |
5.0 | U+10900..10919, 1091F | 27 | L2/99-224 | N2097, N2025-2 | Röllig, W. (1999-07-23), Comments on proposals for the Universal Multiple-Octed Coded Character Set |
5.0 | U+10900..10919, 1091F | 27 | | N2133 | Response to comments on the question of encoding Old Semitic scripts in the UCS (N2097), 1999-10-04 |
5.0 | U+10900..10919, 1091F | 27 | L2/00-010 | N2103 | Umamaheswaran, V. S. (2000-01-05), "10.4", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13—16 |
5.0 | U+10900..10919, 1091F | 27 | L2/04-149 | | Kass, James; Anderson, Deborah W.; Snyder, Dean; Lehmann, Reinhard G.; Cowie, Paul James; Kirk, Peter; Cowan, John; Khalaf, S. George; Richmond, Bob (2004-05-25), Miscellaneous Input on Phoenician Encoding Proposal |
5.0 | U+10900..10919, 1091F | 27 | L2/04-141R2 | N2746R2 | Everson, Michael (2004-05-29), Final proposal for encoding the Phoenician script in the UCS |
5.0 | U+10900..10919, 1091F | 27 | L2/04-177 | | Anderson, Deborah (2004-05-31), Expert Feedback on Phoenician |
5.0 | U+10900..10919, 1091F | 27 | L2/04-178 | N2772 | Anderson, Deborah (2004-06-04), Additional Support for Phoenician |
5.0 | U+10900..10919, 1091F | 27 | L2/04-181 | | Keown, Elaine (2004-06-04), REBUTTAL to "Final proposal for encoding the Phoenician script in the UCS" |
5.0 | U+10900..10919, 1091F | 27 | L2/04-190 | N2787 | Everson, Michael (2004-06-06), Additional examples of the Phoenician script in use |
5.0 | U+10900..10919, 1091F | 27 | L2/04-187 | | McGowan, Rick (2004-06-07), Phoenician Recommendation |
5.0 | U+10900..10919, 1091F | 27 | L2/04-206 | N2793 | Kirk, Peter (2004-06-07), Response to the revised "Final proposal for encoding the Phoenician script" (L2/04-141R2) |
5.0 | U+10900..10919, 1091F | 27 | L2/04-213 | | Rosenne, Jony (2004-06-07), Responses to Several Hebrew Related Items |
5.0 | U+10900..10919, 1091F | 27 | L2/04-217R | | Keown, Elaine (2004-06-07), Proposal to add Archaic Mediterranean Script block to ISO 10646 |
5.0 | U+10900..10919, 1091F | 27 | L2/04-226 | | Durusau, Patrick (2004-06-07), Statement of the Society of Biblical Literature on WG2 N2746R2 |
5.0 | U+10900..10919, 1091F | 27 | L2/04-218 | N2792 | Snyder, Dean (2004-06-08), Response to the Proposal to Encode Phoenician in Unicode |
5.0 | U+10900..10919, 1091F | 27 | L2/05-009 | N2909 | Anderson, Deborah (2005-01-19), Letters in support of Phoenician |
5.2 | U+1091A..1091B | 2 | | N3353 | Umamaheswaran, V. S. (2007-10-10), "M51.14", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27 |
5.2 | U+1091A..1091B | 2 | L2/07-206 | N3284 | Everson, Michael (2007-07-25), Proposal to add two numbers for the Phoenician script |
5.2 | U+1091A..1091B | 2 | L2/07-225 | | Moore, Lisa (2007-08-21), "Phoenician", UTC #112 Minutes |