Unicode and HTML for the Hebrew alphabet

The Unicode code points and HTML references for the Hebrew alphabet are listed in the following tables. The Unicode Hebrew block extends from U+0590 to U+05FF; further Hebrew characters occupy U+FB1D to U+FB4F in the Alphabetic Presentation Forms block. Together these ranges include letters, ligatures, combining diacritical marks (niqqud and cantillation marks) and punctuation. Numeric character references are included for HTML. They can be used in many markup languages, and are often used on web pages to produce Hebrew glyphs that most web browsers can display.
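As a rough illustration of the two code point ranges quoted above, the following Python sketch tests whether a character falls inside either of them. The names HEBREW_BLOCK, HEBREW_PRESENTATION and in_hebrew_ranges are hypothetical and chosen only for this example; note that membership in a range does not mean the code point is assigned.

```python
# Code point ranges quoted in the text (end values are exclusive in range()).
HEBREW_BLOCK = range(0x0590, 0x0600)         # U+0590..U+05FF
HEBREW_PRESENTATION = range(0xFB1D, 0xFB50)  # U+FB1D..U+FB4F


def in_hebrew_ranges(ch: str) -> bool:
    """Return True if the single character ch lies in either range.

    This checks the range only; some code points inside the ranges
    (for example U+0590) are unassigned.
    """
    cp = ord(ch)
    return cp in HEBREW_BLOCK or cp in HEBREW_PRESENTATION


def count_hebrew(text: str) -> int:
    """Count the characters of text that fall in the Hebrew ranges."""
    return sum(1 for ch in text if in_hebrew_ranges(ch))
```

For instance, in_hebrew_ranges("\u05d0") is true for alef, while in_hebrew_ranges("a") is false.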

Unicode

Character table

Code | Result | Description
U+05BE | ־ | Hebrew Punctuation Maqaf
U+05C0 | ׀ | Hebrew Punctuation Paseq
U+05C3 | ׃ | Hebrew Punctuation Sof Pasuq
U+05C6 | ׆ | Hebrew Punctuation Nun Hafukha
U+05D0 | א | Hebrew Letter Alef
U+05D1 | ב | Hebrew Letter Bet
U+05D2 | ג | Hebrew Letter Gimel
U+05D3 | ד | Hebrew Letter Dalet
U+05D4 | ה | Hebrew Letter He
U+05D5 | ו | Hebrew Letter Vav
U+05D6 | ז | Hebrew Letter Zayin
U+05D7 | ח | Hebrew Letter Het
U+05D8 | ט | Hebrew Letter Tet
U+05D9 | י | Hebrew Letter Yod
U+05DA | ך | Hebrew Letter Final Kaf
U+05DB | כ | Hebrew Letter Kaf
U+05DC | ל | Hebrew Letter Lamed
U+05DD | ם | Hebrew Letter Final Mem
U+05DE | מ | Hebrew Letter Mem
U+05DF | ן | Hebrew Letter Final Nun
U+05E0 | נ | Hebrew Letter Nun
U+05E1 | ס | Hebrew Letter Samekh
U+05E2 | ע | Hebrew Letter Ayin
U+05E3 | ף | Hebrew Letter Final Pe
U+05E4 | פ | Hebrew Letter Pe
U+05E5 | ץ | Hebrew Letter Final Tsadi
U+05E6 | צ | Hebrew Letter Tsadi
U+05E7 | ק | Hebrew Letter Qof
U+05E8 | ר | Hebrew Letter Resh
U+05E9 | ש | Hebrew Letter Shin
U+05EA | ת | Hebrew Letter Tav
U+05F0 | װ | Hebrew Ligature Yiddish Double Vav
U+05F1 | ױ | Hebrew Ligature Yiddish Vav Yod
U+05F2 | ײ | Hebrew Ligature Yiddish Double Yod
U+05F3 | ׳ | Hebrew Punctuation Geresh
U+05F4 | ״ | Hebrew Punctuation Gershayim
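
Rows of this kind can be regenerated from the Unicode Character Database bundled with Python. The sketch below is illustrative only: the function name hebrew_rows and its default bounds are assumptions chosen to match the rows shown, combining marks are skipped because the table lists none, and the exact output depends on the Unicode version shipped with the interpreter (newer versions also report U+05EF, which the table above omits).

```python
import unicodedata


def hebrew_rows(start=0x05BE, end=0x05F4):
    """Yield (code, character, description) for non-combining characters.

    Unassigned code points have no name and are skipped; combining marks
    (points and accents) are skipped via their combining class.
    """
    for cp in range(start, end + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, None)
        if name is None:               # unassigned code point
            continue
        if unicodedata.combining(ch):  # non-zero class: a combining mark
            continue
        yield f"U+{cp:04X}", ch, name.title()


if __name__ == "__main__":
    for code, ch, description in hebrew_rows():
        print(f"{code} | {ch} | {description}")
```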

Compact table

Hebrew [1] [2]
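Official Unicode Consortium code chart (PDF)
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+059x |  | ֑ | ֒ | ֓ | ֔ | ֕ | ֖ | ֗ | ֘ | ֙ | ֚ | ֛ | ֜ | ֝ | ֞ | ֟
U+05Ax | ֠ | ֡ | ֢ | ֣ | ֤ | ֥ | ֦ | ֧ | ֨ | ֩ | ֪ | ֫ | ֬ | ֭ | ֮ | ֯
U+05Bx | ְ | ֱ | ֲ | ֳ | ִ | ֵ | ֶ | ַ | ָ | ֹ | ֺ | ֻ | ּ | ֽ | ־ | ֿ
U+05Cx | ׀ | ׁ | ׂ | ׃ | ׄ | ׅ | ׆ | ׇ |  |  |  |  |  |  |  | 
U+05Dx | א | ב | ג | ד | ה | ו | ז | ח | ט | י | ך | כ | ל | ם | מ | ן
U+05Ex | נ | ס | ע | ף | פ | ץ | צ | ק | ר | ש | ת |  |  |  |  | ׯ
U+05Fx | װ | ױ | ײ | ׳ | ״ |  |  |  |  |  |  |  |  |  |  | 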
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points

Note I: The ligatures װ ױ ײ are intended for Yiddish. They are not used in Hebrew.
Note II: The symbol ״ is called gershayim and is a punctuation mark used in the Hebrew language to denote acronyms. It is written before the last letter in the acronym. Gershayim is also the name of a note of cantillation in the reading of the Torah, printed above the accented letter.
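
The placement rule for gershayim in acronyms (before the last letter) can be sketched in a few lines of Python. The function name mark_acronym is hypothetical, and the sketch assumes the input is a plain string of letters without niqqud or other combining marks.

```python
GERSHAYIM = "\u05F4"  # HEBREW PUNCTUATION GERSHAYIM


def mark_acronym(letters: str) -> str:
    """Insert gershayim before the last letter of an acronym.

    Assumes letters contains only base letters (no combining marks),
    so the last character of the string is the last letter.
    """
    if len(letters) < 2:
        raise ValueError("an acronym needs at least two letters")
    return letters[:-1] + GERSHAYIM + letters[-1]
```

For example, applying mark_acronym to the three letters tsadi, he, lamed places the gershayim between he and lamed.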

The remaining characters are in the Alphabetic Presentation Forms block:

Hebrew subset of Alphabetic Presentation Forms [1] [2]
Official Unicode Consortium code chart (PDF)
 0123456789ABCDEF
U+FB1x (U+FB00–U+FB1C omitted)
U+FB2x
U+FB3x
U+FB4x
Notes
1. ^ As of Unicode version 15.1
2. ^ Grey areas indicate non-assigned code points

Note: In Yiddish orthography only, the precomposed glyph yud-ḥiriq (U+FB1D), pronounced /i/, can optionally be used instead of typing yud followed by ḥiriq (יִ). In Hebrew spelling, that combination would be pronounced /ji/; the vowel /i/ is instead written as a ḥiriq under the previous letter followed by yud ( ִ י).
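
The relationship between the precomposed yud-ḥiriq and the two-character sequence can be inspected with Unicode normalization. The following Python sketch uses only the standard unicodedata module; the helper name code_points is an assumption made for this example.

```python
import unicodedata

YOD_HIRIQ = "\uFB1D"       # HEBREW LETTER YOD WITH HIRIQ (presentation form)
SEQUENCE = "\u05D9\u05B4"  # yod followed by the combining point hiriq


def code_points(text: str) -> list:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Canonical decomposition splits the presentation form into yod + hiriq.
decomposed = unicodedata.normalize("NFD", YOD_HIRIQ)

# Composition does not rebuild the presentation form, because U+FB1D is
# excluded from canonical composition; the sequence stays as two characters.
recomposed = unicodedata.normalize("NFC", SEQUENCE)

if __name__ == "__main__":
    print(unicodedata.name(YOD_HIRIQ))
    print(code_points(decomposed))
    print(code_points(recomposed))
```

One consequence is that normalized text will generally contain the two-character sequence rather than the single presentation form, so searches and comparisons are usually more reliable after normalization.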

HTML code tables

Note: HTML numeric character references can be in decimal format (&#DDDD;) or hexadecimal format (&#xHHHH;). For example, &#1490; and &#x05D2; (where "05D2" in hexadecimal is the same as "1490" in decimal) both represent the Hebrew letter gimel (ג).
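
The decimal and hexadecimal formats described in this note can be produced and decoded programmatically. The Python sketch below is a minimal illustration: to_ncr is a hypothetical helper written for this example, while html.unescape is the standard library function for decoding references. ASCII characters are passed through unchanged, so markup-significant characters such as & and < are not escaped here.

```python
import html


def to_ncr(text: str, hexadecimal: bool = False) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            parts.append(ch)               # ASCII is left as-is
        elif hexadecimal:
            parts.append(f"&#x{cp:04X};")  # hexadecimal form, e.g. &#x05D2;
        else:
            parts.append(f"&#{cp};")       # decimal form, e.g. &#1490;
    return "".join(parts)


if __name__ == "__main__":
    gimel = "\u05D2"
    print(to_ncr(gimel))                    # decimal reference
    print(to_ncr(gimel, hexadecimal=True))  # hexadecimal reference
    # Both forms decode back to the same character.
    print(html.unescape("&#1490;") == html.unescape("&#x05D2;") == gimel)
```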

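Hebrew alphabet
Numeric character reference | Result
&#1488; | א
&#1489;&#1468; | בּ
&#1489; | ב
&#1490; | ג
&#1491; | ד
&#1492; | ה
&#1493; | ו
&#1494; | ז
&#1495; | ח
&#1496; | ט
&#1497; | י
&#1499;&#1468; | כּ
&#1499; | כ
&#1498; | ך
&#1500; | ל
&#1502; | מ
&#1501; | ם
&#1504; | נ
&#1503; | ן
&#1505; | ס
&#1506; | ע
&#1508;&#1468; | פּ
&#1508; | פ
&#1507; | ף
&#1510; | צ
&#1509; | ץ
&#1511; | ק
&#1512; | ר
&#1513;&#1473; | שׁ
&#1513;&#1474; | שׂ
&#1513; | ש
&#1514;&#1468; | תּ
&#1514; | ת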
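Vowels and unique characters
Name | Numeric character reference | Result
Patach | &#1463; | ַ
Kamatz | &#1464; | ָ
Tzere | &#1461; | ֵ
Segol | &#1462; | ֶ
Holam male | &#1493;&#1465; | וֹ
Holam haser | &#1465; | ֹ
Shuruk | &#1493;&#1468; | וּ
Kubutz | &#1467; | ֻ
Hiriq haser | &#1460; | ִ
Hataf patach | &#1458; | ֲ
Hataf kamatz | &#1459; | ֳ
Hataf segol | &#1457; | ֱ
Shva (nach and na) | &#1456; | ְ
Dagesh | &#1468; | ּ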
