Hebrew (Unicode block)

Hebrew
Range: U+0590..U+05FF (112 code points)
Plane: BMP
Scripts: Hebrew
Major alphabets: Hebrew, Yiddish
Assigned: 88 code points
Unused: 24 reserved code points
Source standards: ISO 8859-8
Unicode version history:
1.0.0 (1991): 52 (+52)
1.0.1 (1992): 51 (-1)
2.0 (1996): 82 (+31)
4.1 (2005): 86 (+4)
5.0 (2006): 87 (+1)
11.0 (2018): 88 (+1)
Note: One character was moved from the Hebrew block to the Alphabetic Presentation Forms block in version 1.0.1 during the process of unifying with ISO 10646. [1] [2] [3]

Hebrew is a Unicode block containing characters for writing Hebrew, Yiddish, Ladino, and other Jewish diaspora languages.
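As a minimal sketch of what the block range means in practice, the following Python snippet tests whether a character falls inside U+0590..U+05FF and derives the block size from the range. The names `HEBREW_BLOCK_START`, `HEBREW_BLOCK_END`, `in_hebrew_block`, and `hebrew_block_size` are illustrative and are not part of any standard API.

```python
# Inclusive bounds of the Hebrew block, taken from the range stated above.
HEBREW_BLOCK_START = 0x0590
HEBREW_BLOCK_END = 0x05FF


def in_hebrew_block(ch: str) -> bool:
    """Return True if the single character ch lies in U+0590..U+05FF."""
    return HEBREW_BLOCK_START <= ord(ch) <= HEBREW_BLOCK_END


def hebrew_block_size() -> int:
    """Number of code points in the block, assigned or not."""
    return HEBREW_BLOCK_END - HEBREW_BLOCK_START + 1


print(hebrew_block_size())        # 112
print(in_hebrew_block("\u05D0"))  # True (HEBREW LETTER ALEF)
print(in_hebrew_block("A"))       # False
```

Block membership is a pure range test, so it also returns True for the reserved code points in the block; it says nothing about whether a code point is assigned.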

Block

Hebrew [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+059x֑ ֒ ֓ ֔ ֕ ֖ ֗ ֘ ֙ ֚ ֛ ֜ ֝ ֞ ֟ 
U+05Ax֠ ֡ ֢ ֣ ֤ ֥ ֦ ֧ ֨ ֩ ֪ ֫ ֬ ֭ ֮ ֯ 
U+05Bxְ ֱ ֲ ֳ ִ ֵ ֶ ַ ָ ֹ ֺ ֻ ּ ֽ ־ֿ 
U+05Cx׀ׁ ׂ ׃ׄ ׅ ׆ׇ 
U+05Dxאבגדהוזחטיךכלםמן
U+05Exנסעףפץצקרשתׯ
U+05Fxװױײ׳״
Notes
1. ^ As of Unicode version 14.0
2. ^ Grey areas indicate non-assigned code points
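
The non-assigned code points that the chart marks in grey can also be listed programmatically. The sketch below uses Python's standard `unicodedata` module and treats general category `Cn` as "not assigned". The function name `split_hebrew_block` is hypothetical, and the counts depend on the Unicode version bundled with the interpreter (reported by `unicodedata.unidata_version`); with a database at version 11.0 or later they should match the 88 assigned and 24 reserved code points given above.

```python
import unicodedata


def split_hebrew_block():
    """Partition U+0590..U+05FF into assigned and unassigned code points."""
    assigned, unassigned = [], []
    for cp in range(0x0590, 0x0600):  # upper bound is exclusive
        # Category "Cn" marks code points with no assigned character.
        if unicodedata.category(chr(cp)) == "Cn":
            unassigned.append(cp)
        else:
            assigned.append(cp)
    return assigned, unassigned


assigned, unassigned = split_hebrew_block()
print("Unicode database version:", unicodedata.unidata_version)
print("assigned:", len(assigned), "unassigned:", len(unassigned))
print(" ".join(f"U+{cp:04X}" for cp in unassigned))
```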

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Hebrew block:

Version  Final code points [note 1]  Count  UTC ID  L2 ID  WG2 ID  Document
1.0.0 U+05B0..05B9, 05BB..05C3, 05D0..05EA, 05F0..05F4 51 UTC/1991-053 Rosenne, Jony (1991-03-26), Hebrew
UTC/1991-048B Whistler, Ken (1991-03-27), "5) Hebrew", Draft Minutes from the UTC meeting #46 day 2, 3/27 at Apple
N1026 Liaison Report - Encoding Newsletter, April 1994
X3L2/94-098 N1033 (pdf, doc) Umamaheswaran, V. S.; Ksar, Mike (1994-06-01), "8.1.14", Unconfirmed Minutes of ISO/IEC JTC 1/SC 2/WG 2 Meeting 25, Falez Hotel, Antalya, Turkey, 1994-04-18--22
L2/03-234 Hudson, John (2003-08-05), More on Meteg and CGJ [1]
L2/03-235 Whistler, Ken (2003-08-05), More on Meteg and CGJ [2]
L2/03-236 Whistler, Ken (2003-08-05), More on Meteg and CGJ [3]
L2/03-261 Keown, Elaine (2003-08-05), E-mail to ANSI regarding Hebrew encoding
L2/03-297 Rosenne, Jony (2003-08-24), Hebrew Issues
L2/04-194 Kirk, Peter (2004-06-05), On the Hebrew mark METEG
L2/04-213 Rosenne, Jony (2004-06-07), Responses to Several Hebrew Related Items
L2/06-104 Konstantinov, Ilya (2006-01-18), Feedback for Unicode 5.0.0: HEBREW PUNCTUATION MAQAF is a Dash-character
L2/06-108 Moore, Lisa (2006-05-25), "B.14.5, B.11.8", UTC #107 Minutes
2.0 U+0591..05A1, 05A3..05AF, 05C4 31 N1079R Hebrew cantillation marks in ISO/IEC 10646-1
N1117 Umamaheswaran, V. S.; Ksar, Mike (1994-10-31), "7.2.2 item f", Unconfirmed Minutes of ISO/IEC JTC 1/SC 2/WG 2 Meeting 26, Tuscan Inn - Fisherman's Wharf, San Francisco, CA, USA; 1994-10-10 through 14
N1079RA Summary Proposal Form and Examples
N1079R2 Hebrew cantillation marks in ISO/IEC 10646-1
N1195 Hebrew Cantillation marks
N1203 Umamaheswaran, V. S.; Ksar, Mike (1995-05-03), "6.1.8", Unconfirmed minutes of SC2/WG2 Meeting 27, Geneva
N1217 Further clarifications regarding WG2 N1195, 1995-05-21
X3L2/95-090 N1253 (doc, txt) Umamaheswaran, V. S.; Ksar, Mike (1995-09-09), "6.4.3", Unconfirmed Minutes of WG 2 Meeting # 28 in Helsinki, Finland; 1995-06-26--27
N1315 Updated Table of replies and national body feedback on pDAM7 - Additional characters (SC2 N2656), 1996-01-09
N1539 Table of Replies and Feedback on Amendment 7 – Hebrew etc., 1997-01-29
L2/97-127 N1563 Paterson, Bruce (1997-05-27), Draft Report on JTC1 letter ballot on DAM No. 7 to ISO/IEC 10646-1 (33 additional characters)
N1572 Paterson, Bruce (1997-06-23), Almost Final Text – DAM 7 – 33 additional characters
L2/97-288 N1603 Umamaheswaran, V. S. (1997-10-24), "5.3.3", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June - 4 July 1997
4.1 U+05A2 1 L2/03-443 N2692 Shoulson, Mark; Kirk, Peter; Everson, Michael (2003-12-11), Proposal to add ATNAH HAFUKH to the BMP of the UCS
L2/04-156R2 Moore, Lisa (2004-08-13), "Atnah Hafukh (A.17.3)", UTC #99 Minutes
U+05C5..05C6 2 L2/03-297 Rosenne, Jony (2003-08-24), Hebrew Issues
L2/03-299 Kirk, Peter (2003-08-25), Issues in the Representation of Pointed Hebrew in Unicode
L2/04-089R N2714 Shoulson, Mark; Kirk, Peter; Hudson, John; Everson, Michael; Constable, Peter (2004-03-04), Proposal to add two Masoretic punctuation marks to the BMP of the UCS
L2/04-156R2 Moore, Lisa (2004-08-13), "Two Hebrew punctuation marks (A.17.4)", UTC #99 Minutes
U+05C7 1 L2/03-297 Rosenne, Jony (2003-08-24), Hebrew Issues
L2/04-150 N2755 Everson, Michael; Shoulson, Mark (2004-05-03), Proposal to add QAMATS QATAN to the BMP of the UCS
L2/04-213 Rosenne, Jony (2004-06-07), Responses to Several Hebrew Related Items
N2821 Everson, Michael; Shoulson, Mark (2004-06-21), Clarification on the name QAMATS QATAN
L2/04-346 Kirk, Peter (2004-08-12), Proposal to change the provisional code point allocations for proposed characters HEBREW POINT HOLAM HASER FOR VAV and HEBREW POINT QAMATS QATAN
L2/04-156R2 Moore, Lisa (2004-08-13), "QAMATS QATAN (A.17.5)", UTC #99 Minutes
L2/18-274 McGowan, Rick (2018-09-14), "Identifier_Type of U+05C7 HEBREW POINT QAMATS QATAN", Comments on Public Review Issues (July 24 - Sept 14, 2018)
L2/18-272 Moore, Lisa (2018-10-29), "157-C17 Consensus", UTC #157 Minutes, Change the Identifier_Type of U+05C7 HEBREW POINT QAMATS QATAN from "Obsolete" to "Uncommon_Use Technical" for Unicode version 12.0.
5.0 U+05BA 1 L2/03-297 Rosenne, Jony (2003-08-24), Hebrew Issues
L2/04-193 Kirk, Peter (2004-06-05), On the Hebrew vowel HOLAM
L2/04-213 Rosenne, Jony (2004-06-07), Responses to Several Hebrew Related Items
L2/04-306 Kirk, Peter (2004-07-29), Background material for the proposal on the Hebrew vowel HOLAM
L2/04-307 Kirk, Peter; Shmidman, Avi; Cowan, John; Hopp, Ted; Peterson, Trevor; Lowery, Kirk; Keown, Elaine; Robertson, Stuart (2004-07-29), New proposal on the Hebrew vowel HOLAM
L2/04-310 N2840 Everson, Michael; Shoulson, Mark (2004-07-29), Proposal to add HEBREW POINT HOLAM HASER FOR VAV to the BMP of the UCS
L2/04-313 Kirk, Peter (2004-08-02), Response to "Proposal to add HEBREW POINT HOLAM HASER FOR VAV"
L2/04-326 Rosenne, Jony (2004-08-02), UTC - Holam proposals
L2/04-327 Hudson, John (2004-08-03), Distinction of Vav Haluma and Holam Male
L2/04-346 Kirk, Peter (2004-08-12), Proposal to change the provisional code point allocations for proposed characters HEBREW POINT HOLAM HASER FOR VAV and HEBREW POINT QAMATS QATAN
L2/04-344 Everson, Michael; Shoulson, Mark (2004-08-18), Disunification costs regarding HOLAM and VAV in Hebrew
11.0 U+05EF 1 N1740 (html, doc) Shoulson, Mark; Everson, Michael (1998-05-09), Proposal to add the Hebrew Tetragrammaton to ISO/IEC 10646
N1807 (pdf, doc, txt) Rosenne, Jonathan (1998-07-07), Israeli Response to the Tetragrammaton Proposal N1740
L2/15-092 Shoulson, Mark (2015-03-10), Typographic Concerns and the Hebrew Nomina Sacra
L2/15-149 Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Pandey, Anshuman; Glass, Andrew (2015-05-03), "24 Hebrew Nomina Sacra", Recommendations to UTC #143 May 2015 on Script Proposals
L2/15-204 Anderson, Deborah; et al. (2015-07-25), "14. Hebrew Nomina Sacra", Recommendations to UTC #144 July 2015 on Script Proposals
L2/16-305 N4807 Shoulson, Mark (2016-10-28), Proposal to add HEBREW YOD TRIANGLE
L2/17-037 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa; Liang, Hai; Ishida, Richard; Misra, Karan; McGowan, Rick (2017-01-21), "16. Hebrew", Recommendations to UTC #150 January 2017 on Script Proposals
L2/17-016 Moore, Lisa (2017-02-08), "C.12", UTC #150 Minutes
  1. ^ Proposed code points and character names may differ from the final code points and names.

References

  1. "Unicode 1.0.1 Addendum" (PDF). The Unicode Standard. 1992-11-03. Retrieved 2016-07-09.
  2. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  3. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.