Cuneiform Numbers and Punctuation

Range: U+12400..U+1247F (128 code points)
Plane: SMP
Scripts: Cuneiform
Symbol sets: Numeric signs, Fractions, Punctuation
Assigned: 116 code points
Unused: 12 reserved code points
Unicode version history:
5.0 (2006): 103 (+103)
7.0 (2014): 116 (+13)
Unicode documentation: Code chart ∣ Web page
Note: [1] [2]

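In Unicode, the Sumero-Akkadian Cuneiform script is covered in three blocks in the Supplementary Multilingual Plane (SMP): Cuneiform (U+12000–U+123FF), Cuneiform Numbers and Punctuation (U+12400–U+1247F), and Early Dynastic Cuneiform (U+12480–U+1254F).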

The sample glyphs in the chart file published by the Unicode Consortium [3] show the characters in their Classical Sumerian form (Early Dynastic period, mid 3rd millennium BCE). The characters as written during the 2nd and 1st millennia BCE, the era during which the vast majority of cuneiform texts were written, are considered font variants of the same characters.

Organization

The final proposal for Unicode encoding of the script was submitted in June 2004 by two cuneiform scholars working with an experienced Unicode proposal writer. [4] The base character inventory is derived from the list of Ur III signs compiled by the Cuneiform Digital Library Initiative of UCLA, based on the inventories of Miguel Civil, Rykle Borger (2003), and Robert Englund. Rather than ordering the signs directly by glyph shape and complexity, following the numbering of an existing catalogue, the Unicode order of glyphs was based on the Latin alphabetic order of their 'main' Sumerian transliteration as a practical approximation.

Block

Cuneiform Numbers and Punctuation [1] [2]
Official Unicode Consortium code chart (PDF)
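         0 1 2 3 4 5 6 7 8 9 A B C D E F
U+1240x  𒐀 𒐁 𒐂 𒐃 𒐄 𒐅 𒐆 𒐇 𒐈 𒐉 𒐊 𒐋 𒐌 𒐍 𒐎 𒐏
U+1241x  𒐐 𒐑 𒐒 𒐓 𒐔 𒐕 𒐖 𒐗 𒐘 𒐙 𒐚 𒐛 𒐜 𒐝 𒐞 𒐟
U+1242x  𒐠 𒐡 𒐢 𒐣 𒐤 𒐥 𒐦 𒐧 𒐨 𒐩 𒐪 𒐫 𒐬 𒐭 𒐮 𒐯
U+1243x  𒐰 𒐱 𒐲 𒐳 𒐴 𒐵 𒐶 𒐷 𒐸 𒐹 𒐺 𒐻 𒐼 𒐽 𒐾 𒐿
U+1244x  𒑀 𒑁 𒑂 𒑃 𒑄 𒑅 𒑆 𒑇 𒑈 𒑉 𒑊 𒑋 𒑌 𒑍 𒑎 𒑏
U+1245x  𒑐 𒑑 𒑒 𒑓 𒑔 𒑕 𒑖 𒑗 𒑘 𒑙 𒑚 𒑛 𒑜 𒑝 𒑞 𒑟
U+1246x  𒑠 𒑡 𒑢 𒑣 𒑤 𒑥 𒑦 𒑧 𒑨 𒑩 𒑪 𒑫 𒑬 𒑭 𒑮
U+1247x  𒑰 𒑱 𒑲 𒑳 𒑴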
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points
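
The range and the assigned/reserved split of the block can be checked programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the helper names `in_block` and `assigned_code_points` are illustrative and not part of any library, and the count reported depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Block boundaries as given in the article: U+12400..U+1247F (128 code points).
BLOCK_START = 0x12400
BLOCK_END = 0x1247F


def in_block(ch):
    """Return True if the single character ch lies in the block's range."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def assigned_code_points():
    """List the code points in the block that carry a character name.

    unicodedata.name() raises ValueError for unassigned code points unless
    a default is supplied, so a None default is used to detect them.
    """
    return [
        cp
        for cp in range(BLOCK_START, BLOCK_END + 1)
        if unicodedata.name(chr(cp), None) is not None
    ]


if __name__ == "__main__":
    assigned = assigned_code_points()
    print(f"block size: {BLOCK_END - BLOCK_START + 1}")
    print(f"assigned in Unicode {unicodedata.unidata_version}: {len(assigned)}")
```

With an interpreter whose data is at Unicode 7.0 or later, the assigned count should correspond to the figure given in the block summary.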

Signs

See also list of cuneiform signs.

The following table allows matching of Borger's 1981 and 2003 numbering with Unicode characters. [5] The "primary" transliteration column gives the glyphs' Sumerian values as given by the official glyph name, slightly modified here for legibility by using traditional Assyriological symbols such as "x" rather than "TIMES". The exact Unicode names can be unambiguously recovered by prefixing "CUNEIFORM [NUMERIC] SIGN", replacing "x" with "TIMES", "+" with "PLUS", "/" with "OVER", "*" with "ASTERISK", "Ḫ" with "H", and "Š" with "SH", and switching to uppercase.
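
The name-recovery rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive converter: the function name `to_unicode_name` is hypothetical, the multiplication sign is assumed to appear as a lowercase "x" set off by spaces, and table entries that use other notation (for example "." or "tenû") are outside the stated rule and are not handled.

```python
import unicodedata


def to_unicode_name(translit, numeric=True):
    """Apply the substitutions from the text to a transliteration (sketch)."""
    s = unicodedata.normalize("NFC", translit)
    # Replace operators before uppercasing, while "x" is still lowercase.
    s = s.replace(" x ", " TIMES ")
    for symbol, word in (("+", "PLUS"), ("/", "OVER"), ("*", "ASTERISK")):
        s = s.replace(symbol, f" {word} ")
    # U+1E2A is H with breve below, U+0160 is S with caron.
    s = s.upper().replace("\u1E2A", "H").replace("\u0160", "SH")
    prefix = "CUNEIFORM NUMERIC SIGN" if numeric else "CUNEIFORM SIGN"
    # Collapse any doubled spaces introduced by the replacements.
    return " ".join(f"{prefix} {s}".split())


if __name__ == "__main__":
    name = to_unicode_name("two A\u0160")
    print(name)  # CUNEIFORM NUMERIC SIGN TWO ASH
    # unicodedata.lookup() maps a character name back to the character.
    print(f"U+{ord(unicodedata.lookup(name)):05X}")  # U+12400
```

The round trip through `unicodedata.lookup` is a convenient way to confirm that a recovered name really is the official one; it raises `KeyError` when the name does not exist.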

Sign  Code point  Name  Borger (2003)  Borger (1981)  Comments
𒀸U+12038one AŠ001 1, from the general Cuneiform block, not this block
𒐀U+12400two AŠ00222, = U+1212C
𒐁U+12401three AŠ0043, EŠ6
𒐂U+12402four AŠ215124,424, LIMMU2, LIMM2, TAB.TAB
𒐃U+12403five AŠ2165, IA7, TAB.TAB.AŠ
𒐄U+12404six AŠ2176, AŠ4, TAB.TAB.TAB
𒐅U+12405seven AŠ2187, IMIN2, TAB.TAB.TAB.AŠ
𒐆U+12406eight AŠ2198, USSU2, TAB.TAB.TAB.TAB
𒐇U+12407nine AŠ2209, ILIMMU2, TAB.TAB.TAB.TAB.AŠ
𒐈U+12408three DIŠ8345933, 180, EŠ5
𒐉U+12409four DIŠ851; 852; 8533164, 240, ZA, LIMMU5, NIGIDALIMMU, = U+1235D
𒐊U+1240Afive DIŠ861598a5, 300, IA2
𒐋U+1240Bsix DIŠ862598b6, 360, AŠ3
𒐌U+1240Cseven DIŠ863598c7, 420
𒐍U+1240Deight DIŠ864598d8, 480
𒐎U+1240Enine DIŠ9, 540
𒐏U+1240Ffour U 71347440, NIMIN
𒐐U+12410five U71447550, NINNU
𒐑U+12411six U71547660
𒐒U+12412seven U71647770
𒐓U+12413eight U71747880
𒐔U+12414nine U71847990
𒐕U+12415one GEŠ2
𒐖U+12416two GEŠ2
𒐗U+12417three GEŠ2
𒐘U+12418four GEŠ2
𒐙U+12419five GEŠ2
𒐚U+1241Asix GEŠ2
𒐛U+1241Bseven GEŠ2
𒐜U+1241Ceight GEŠ2
𒐝U+1241Dnine GEŠ2
𒐞U+1241Eone GEŠU824534GEŠ2.U; 600 or 70
𒐟U+1241Ftwo GEŠU1200 or 80
𒐠U+12420three GEŠU1800 or 90
𒐡U+12421four GEŠU2400 or 100
𒐢U+12422five GEŠU3000 or 110
𒐣U+12423two ŠAR2
𒐤U+12424three ŠAR2
𒐥U+12425three ŠAR2 variant form
𒐦U+12426four ŠAR2
𒐧U+12427five ŠAR2
𒐨U+12428six ŠAR2
𒐩U+12429seven ŠAR2
𒐪U+1242Aeight ŠAR2
𒐫U+1242Bnine ŠAR2
𒐬U+1242Cone ŠARU65340936,000
𒐭U+1242Dtwo ŠARU72,000
𒐮U+1242Ethree ŠARU108,000
𒐯U+1242Fthree ŠARU variant form108,000
𒐰U+12430four ŠARU144,000
𒐱U+12431five ŠARU180,000
𒐲U+12432ŠAR2 x GAL.DIŠ651408216,000
𒐳U+12433ŠAR2 x GAL.MIN652408432,000
𒐴U+12434one BURU662350,8U gunû
𒐵U+12435two BURU
𒐶U+12436three BURU
𒐷U+12437three BURU variant form
𒐸U+12438four BURU
𒐹U+12439five BURU
𒐺U+1243A165053, = U+1203C
𒐻U+1243B212103
𒐼U+1243CLIMMU859; 8604, NIG2, GAR, NINDA
𒐽U+1243DLIMMU45064
𒐾U+1243E
𒐿U+1243F
𒑀U+1244095366, EŠ16.EŠ16
𒑁U+12441IMIN35377, UMUN9
𒑂U+12442IMIN8637
𒑃U+12443IMIN variant form8667
𒑄U+12444USSU8678
𒑅U+12445USSU35388
𒑆U+12446ILIMMU8689
𒑇U+12447ILIMMU35399, EŠ16.EŠ16.EŠ16
𒑈U+12448ILIMMU45779
𒑉U+12449DIŠ / DIŠ / DIŠ865v9
𒑊U+1244Atwo AŠ tenû593
𒑋U+1244Bthree AŠ tenû629
𒑌U+1244Cfour AŠ tenû854379; 380ZA tenû, ERIM tenû
𒑍U+1244Dfive AŠ tenû
𒑎U+1244Esix AŠ tenû
𒑏U+1244Fone BAN2122= U+12047
𒑐U+12450two BAN2
𒑑U+12451three BAN2
𒑒U+12452four BAN2
𒑓U+12453four BAN2 variant form
𒑔U+12454five BAN2
𒑕U+12455five BAN2 variant form
𒑖U+12456NIGIDAMIN847, 848
𒑗U+12457NIGIDAEŠ850
𒑘U+12458one EŠE3= U+12041, U+12300
𒑙U+12459two EŠE3= U+12049
𒑚U+1245Aone third826571ŠUŠANA
𒑛U+1245Btwo thirds832572
𒑜U+1245Cfive sixths838573KINGUSILA
𒑝U+1245Done third variant form
𒑞U+1245Etwo thirds variant form
𒑟U+1245Fone eighth
𒑠U+12460one quarter
𒑡U+12461Old Assyrian one sixth630Kültepe only
𒑢U+12462Old Assyrian one quarter
𒑰U+12470Old Assyrian word divider
𒑱U+12471vertical colon592 Glossenkeil
𒑲U+12472diagonal colon592 Glossenkeil
𒑳U+12473diagonal tricolon


History

The following Unicode-related documents record the purpose and process of defining specific characters in the Cuneiform Numbers and Punctuation block:

Version  Final code points [note 1]  Count  L2 ID  WG2 ID  Document
5.0  U+12400..12462, 12470..12473  103
L2/00-128 Bunz, Carl-Martin (2000-03-01), Scripts from the Past in Future Versions of Unicode
L2/00-153 Bunz, Carl-Martin (2000-04-26), Further comments on historic scripts
L2/00-398 Snyder, Dean (2000-11-07), Cuneiform: From Clay Tablet to Computer
L2/00-419 N2297 Everson, Michael (2000-11-20), Legacy cuneiform font implementations and the ICE project
L2/03-162 N2585 Everson, Michael; Feuerherm, Karljürgen (2003-05-25), Basic principles for the encoding of Sumero-Akkadian Cuneiform
L2/03-415 Snyder, Dean (2003-11-01), Proposal to Encode the Sumero-Akkadian Cuneiform Script in the UCS
L2/03-393R N2664R Everson, Michael; Feuerherm, Karljürgen; Tinney, Steve (2003-11-03), Preliminary proposal to encode Cuneiform script in the SMP of the UCS
L2/03-416 Anderson, Lloyd (2003-11-03), The Cuneiform Encoding Proposal -- a View of its Current Status
L2/04-080 Tinney, Steve (2004-01-24), Rationale for changes to N2664R
L2/04-036 N2698 Everson, Michael; Feuerherm, Karljürgen; Tinney, Steve (2004-01-29), Revised proposal to encode Cuneiform script in the SMP of the UCS
L2/04-041 Anderson, Lloyd (2004-01-29), Fitting Cuneiform Encoding to Cuneiform Script
L2/04-059 Feuerherm, Karljürgen (2004-01-30), Short Response to L2/04-041 "Fitting Cuneiform Encoding to Cuneiform Script"
L2/04-063 Gewecke, Tom (2004-01-30), Re: Cuneiform at UTC
L2/04-056 Veldhuis, Niek (2004-01-31), Letter re "Cuneiform Unicode"
L2/04-057 Jones, Charles E. (2004-02-01), Letter re "Cuneiform"
L2/04-058 Michalowski, Piotr (2004-02-01), Letter re "cuneiform unicode"
L2/04-064 Cooper, Jerry (2004-02-01), Letter re "unicode proposal"
L2/04-066 Durusau, Patrick (2004-02-02), Letter re "Proposal N2698"
L2/04-081 Black, Jeremy (2004-02-02), Letter re "cuneiform Unicode proposal"
L2/04-086 Anderson, Lloyd (2004-02-03), Notes for verbal presentation to UTC meeting, 3 February 2004
L2/04-099 Anderson, Lloyd (2004-02-09), Unification of cuneiform numbers
L2/04-225 Anderson, Lloyd (2004-06-07), Proposed modifications to introductory text of N2798 = L204-189 Proposal for Cuneiform Encoding
L2/04-189 N2786 Everson, Michael; Feuerherm, Karljürgen; Tinney, Steve (2004-06-08), Final proposal to encode Cuneiform script
L2/04-223R Anderson, Lloyd (2004-06-11), Proposed modifications to delete and add signs to N2798 = L204-189 Proposal for Cuneiform Encoding
L2/04-354 McGowan, Rick (2004-09-20), Cuneiform Properties
L2/05-135 Tinney, Steve (2005-05-10), Corrections to N2786
L2/05-174 Everson, Michael (2005-07-28), Irish comments on Cuneiform
L2/05-108R Moore, Lisa (2005-08-26), "Cuneiform (C.17)", UTC #103 Minutes
N2953 Umamaheswaran, V. S. (2006-02-16), "M47.12", Unconfirmed minutes of WG 2 meeting 47, Sophia Antipolis, France; 2005-09-12/15
L2/12-112 Moore, Lisa (2012-05-17), "Consensus 131-C30", UTC #131 / L2 #228 Minutes, Change the numeric values for 1240F..12414 to 40..90, for Unicode 6.2.
L2/12-240 Davis, Mark (2012-07-20), Property Issues for U6.2
L2/12-239 Moore, Lisa (2012-08-14), "Consensus 132-C19", UTC #132 Minutes, Give U+12432 and U+12433 the numeric type "numeric" and the numeric values 216,000, and 432,000 respectively. Make U+12456 and 12457 have the numeric type "numeric" and value "-1".
L2/12-328 Anderson, Deborah (2012-10-16), Numeric value fixes for two cuneiform characters
L2/12-343R2 Moore, Lisa (2012-12-04), "Consensus 133-C30", UTC #133 Minutes, Change the numeric value of U+12456 to 2 and U+12457 to 3, for Unicode 6.3.
7.0  U+12463..1246E, 12474  13
L2/12-002 N4178R Everson, Michael; Tinney, Steve (2012-01-16), Proposal for additions and corrections to Sumero-Akkadian Cuneiform
L2/12-207R N4277R Everson, Michael; Tinney, Steve (2012-07-31), Proposal for additions and corrections to Sumero-Akkadian Cuneiform
L2/12-239 Moore, Lisa (2012-08-14), "C.3", UTC #132 Minutes
  1. Proposed code points and character names may differ from final code points and names

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. Cuneiform Unicode.org chart (PDF)
  4. Unicode cuneiform
  5. After Anderson's sign list. Archived 2010-04-08 at the Wayback Machine.

Font packages