ISO 15924, Codes for the representation of names of scripts, is an international standard defining codes for writing systems or scripts (a "set of graphic characters used for the written form of one or more languages"). Each script is given both a four-letter code and a numeric code. [1]
Where possible, the codes are derived from ISO 639-2 when the name of a script and the name of a language using that script are identical (for example, Gujarātī: ISO 639 guj, ISO 15924 Gujr). Preference is given to the ISO 639-2 bibliographic (B) codes, in contrast to the terminological (T) codes that are otherwise often preferred. [1]
4-letter ISO 15924 codes are incorporated into the IANA Language Subtag Registry for IETF language tags and so can be used in file formats that make use of such language tags. For example, they can be used in HTML and XML to help Web browsers determine which typeface to use for foreign text. This way one could differentiate, for example, between Serbian written in the Cyrillic (sr-Cyrl) or Latin (sr-Latn) script, or mark romanized or transliterated text as such.
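As a rough illustration of how a script subtag such as Cyrl or Latn can be read out of a language tag, the following Python sketch picks the first four-letter alphabetic subtag that follows the primary language subtag. The function name `script_subtag` is hypothetical, and the logic is a deliberate simplification: it does not validate the tag against the IANA registry and does not handle grandfathered tags.

```python
from typing import Optional


def script_subtag(tag: str) -> Optional[str]:
    """Return the script subtag of a language tag, or None if absent.

    Simplified sketch: the first 4-letter ASCII alphabetic subtag after the
    primary language subtag is treated as the script. Scanning stops at a
    one-character subtag, which starts an extension or private-use section.
    """
    subtags = tag.split("-")
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            # Singleton: everything after it is extension or private use.
            break
        if len(subtag) == 4 and subtag.isascii() and subtag.isalpha():
            # Script codes are conventionally written in title case.
            return subtag.capitalize()
    return None


if __name__ == "__main__":
    for example in ("sr-Cyrl", "sr-Latn", "en-US"):
        print(example, "->", script_subtag(example))
```

Tags without a script subtag, such as en-US, yield `None`; callers would then have to fall back on a default script for the language, which this sketch does not attempt.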
ISO appointed the Unicode Consortium as the Registration Authority (RA) for the standard. The RA is responsible for appointing a registrar who works with a Joint Advisory Committee (JAC) in developing and implementing the standard. The registrar from 2004 to 2018 was Michael Everson, and from January 2019 the registrar has been Markus Scherer, a technical director of the Unicode Consortium. [2] [3] The JAC consists of six members: one representative of the RA (Markus Scherer), one representative of ISO 639-2 (Randall K. Barry of the Library of Congress), one representative of ISO TC37 (Christian Galinski), one representative of ISO TC46 (Peeter Päll), and two representatives of ISO/IEC JTC 1/SC 2 (Rick McGowan and Ken Whistler, both also officers of the Unicode Consortium). [4]
Two four-letter codes are reserved at the request of the Common Locale Data Repository (CLDR) project: [8]
This list of codes is from the ISO 15924 standard. [8]
Code | ISO number | ISO formal name | Directionality | Unicode alias [e] [f] | Unicode version | Unicode characters | Notes | Description |
---|---|---|---|---|---|---|---|---|
Adlm | 166 | Adlam | right-to-left | Adlam | 9.0 | 88 | | Ch 19.9 |
Afak | 439 | Afaka | varies | | | | Not in Unicode, proposal is explored [i] | |
Aghb | 239 | Caucasian Albanian | left-to-right | Caucasian Albanian | 7.0 | 53 | Ancient/historic | Ch 8.11 |
Ahom | 338 | Ahom, Tai Ahom | left-to-right | Ahom | 8.0 | 65 | Ancient/historic | Ch 15.16 |
Arab | 160 | Arabic | right-to-left | Arabic | 1.0 | 1,373 | | Ch 9.2 |
Aran | 161 | Arabic (Nastaliq variant) | mixed | | | | Typographic variant of Arabic (see § Arab) | |
Armi | 124 | Imperial Aramaic | right-to-left | Imperial Aramaic | 5.2 | 31 | Ancient/historic | Ch 10.4 |
Armn | 230 | Armenian | left-to-right | Armenian | 1.0 | 96 | | Ch 7.6 |
Avst | 134 | Avestan | right-to-left | Avestan | 5.2 | 61 | Ancient/historic | Ch 10.7 |
Bali | 360 | Balinese | left-to-right | Balinese | 5.0 | 127 | | Ch 17.3 |
Bamu | 435 | Bamum | left-to-right | Bamum | 5.2 | 657 | | Ch 19.6 |
Bass | 259 | Bassa Vah | left-to-right | Bassa Vah | 7.0 | 36 | Ancient/historic | Ch 19.7 |
Batk | 365 | Batak | left-to-right | Batak | 6.0 | 56 | | Ch 17.6 |
Beng | 325 | Bengali (Bangla) | left-to-right | Bengali | 1.0 | 96 | | Ch 12.2 |
Bhks | 334 | Bhaiksuki | left-to-right | Bhaiksuki | 9.0 | 97 | Ancient/historic | Ch 14.3 |
Blis | 550 | Blissymbols | varies | | | | Not in Unicode, proposal is explored [i] | |
Bopo | 285 | Bopomofo | left-to-right, right-to-left | Bopomofo | 1.0 | 77 | | Ch 18.3 |
Brah | 300 | Brahmi | left-to-right | Brahmi | 6.0 | 115 | Ancient/historic | Ch 14.1 |
Brai | 570 | Braille | left-to-right | Braille | 3.0 | 256 | | Ch 21.1 |
Bugi | 367 | Buginese | left-to-right | Buginese | 4.1 | 30 | | Ch 17.2 |
Buhd | 372 | Buhid | left-to-right | Buhid | 3.2 | 20 | | Ch 17.1 |
Cakm | 349 | Chakma | left-to-right | Chakma | 6.1 | 71 | | Ch 13.11 |
Cans | 440 | Unified Canadian Aboriginal Syllabics | left-to-right | Canadian Aboriginal | 3.0 | 726 | | Ch 20.2 |
Cari | 201 | Carian | left-to-right, right-to-left | Carian | 5.1 | 49 | Ancient/historic | Ch 8.5 |
Cham | 358 | Cham | left-to-right | Cham | 5.1 | 83 | | Ch 16.10 |
Cher | 445 | Cherokee | left-to-right | Cherokee | 3.0 | 172 | | Ch 20.1 |
Chis | 298 | Chisoi | left-to-right | | | | Not in Unicode, proposal is mature [ii] | |
Chrs | 109 | Chorasmian | right-to-left, top-to-bottom | Chorasmian | 13.0 | 28 | Ancient/historic | Ch 10.8 |
Cirt | 291 | Cirth | varies | | | | Not in Unicode | |
Copt | 204 | Coptic | left-to-right | Coptic | 1.0 | 137 | Ancient/historic, disunified from Greek in 4.1 | Ch 7.3 |
Cpmn | 402 | Cypro-Minoan | left-to-right | Cypro Minoan | 14.0 | 99 | Ancient/historic | Ch 8.4 |
Cprt | 403 | Cypriot syllabary | right-to-left | Cypriot | 4.0 | 55 | Ancient/historic | Ch 8.3 |
Cyrl | 220 | Cyrillic | left-to-right | Cyrillic | 1.0 | 508 | Includes typographic variant Old Church Slavonic (see § Cyrs) | Ch 7.4 |
Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | varies | | | | Typographic variant of Cyrillic (see § Cyrl); Ancient/historic | |
Deva | 315 | Devanagari (Nagari) | left-to-right | Devanagari | 1.0 | 164 | | Ch 12.1 |
Diak | 342 | Dives Akuru | left-to-right | Dives Akuru | 13.0 | 72 | Ancient/historic | Ch 15.15 |
Dogr | 328 | Dogra | left-to-right | Dogra | 11.0 | 60 | Ancient/historic | Ch 15.18 |
Dsrt | 250 | Deseret (Mormon) | left-to-right | Deseret | 3.1 | 80 | | Ch 20.4 |
Dupl | 755 | Duployan shorthand, Duployan stenography | left-to-right | Duployan | 7.0 | 143 | | Ch 21.6 |
Egyd | 070 | Egyptian demotic | mixed | | | | Not in Unicode | |
Egyh | 060 | Egyptian hieratic | mixed | | | | Not in Unicode | |
Egyp | 050 | Egyptian hieroglyphs | right-to-left, left-to-right | Egyptian Hieroglyphs | 5.2 | 5,105 | Ancient/historic | Ch 11.4 |
Elba | 226 | Elbasan | left-to-right | Elbasan | 7.0 | 40 | Ancient/historic | Ch 8.10 |
Elym | 128 | Elymaic | right-to-left | Elymaic | 12.0 | 23 | Ancient/historic | Ch 10.9 |
Ethi | 430 | Ethiopic (Geʻez) | left-to-right | Ethiopic | 3.0 | 523 | | Ch 19.1 |
Gara | 164 | Garay | right-to-left | Garay | 16.0 | 69 | | |
Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | left-to-right | Georgian | | | Unicode groups Khutsuri, Asomtavruli and Nuskhuri into 'Georgian' (see § Geok). Similarly, Mkhedruli and Mtavruli are 'Georgian' (see § Geor) | Ch 7.7 |
Geor | 240 | Georgian (Mkhedruli and Mtavruli) | left-to-right | Georgian | 1.0 | 173 | In Unicode this also includes Nuskhuri (Geok) | Ch 7.7 |
Glag | 225 | Glagolitic | left-to-right | Glagolitic | 4.1 | 134 | Ancient/historic | Ch 7.5 |
Gong | 312 | Gunjala Gondi | left-to-right | Gunjala Gondi | 11.0 | 63 | | Ch 13.15 |
Gonm | 313 | Masaram Gondi | left-to-right | Masaram Gondi | 10.0 | 75 | | Ch 13.14 |
Goth | 206 | Gothic | left-to-right | Gothic | 3.1 | 27 | Ancient/historic | Ch 8.9 |
Gran | 343 | Grantha | left-to-right | Grantha | 7.0 | 85 | Ancient/historic | Ch 15.14 |
Grek | 200 | Greek | left-to-right | Greek | 1.0 | 518 | Directionality sometimes as boustrophedon | Ch 7.2 |
Gujr | 320 | Gujarati | left-to-right | Gujarati | 1.0 | 91 | | Ch 12.4 |
Gukh | 397 | Gurung Khema | left-to-right | Gurung Khema | 16.0 | 58 | | |
Guru | 310 | Gurmukhi | left-to-right | Gurmukhi | 1.0 | 80 | | Ch 12.3 |
Hanb | 503 | Han with Bopomofo (alias for Han + Bopomofo) | mixed | | | | See § Hani, § Bopo | |
Hang | 286 | Hangul (Hangŭl, Hangeul) | left-to-right, vertical right-to-left | Hangul | 1.0 | 11,739 | Hangul syllables relocated in 2.0 | Ch 18.6 |
Hani | 500 | Han (Hanzi, Kanji, Hanja) | top-to-bottom, columns right-to-left (historically) | Han | 1.0 | 99,030 | | Ch 18.1 |
Hano | 371 | Hanunoo (Hanunóo) | left-to-right, bottom-to-top | Hanunoo | 3.2 | 21 | | Ch 17.1 |
Hans | 501 | Han (Simplified variant) | varies | | | | Subset of Han (Hanzi, Kanji, Hanja) (see § Hani) | |
Hant | 502 | Han (Traditional variant) | varies | | | | Subset of § Hani | |
Hatr | 127 | Hatran | right-to-left | Hatran | 8.0 | 26 | Ancient/historic | Ch 10.12 |
Hebr | 125 | Hebrew | right-to-left | Hebrew | 1.0 | 134 | | Ch 9.1 |
Hira | 410 | Hiragana | vertical right-to-left, left-to-right | Hiragana | 1.0 | 381 | | Ch 18.4 |
Hluw | 080 | Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) | left-to-right | Anatolian Hieroglyphs | 8.0 | 583 | Ancient/historic | Ch 11.6 |
Hmng | 450 | Pahawh Hmong | left-to-right | Pahawh Hmong | 7.0 | 127 | | Ch 16.11 |
Hmnp | 451 | Nyiakeng Puachue Hmong | left-to-right | Nyiakeng Puachue Hmong | 12.0 | 71 | | Ch 16.12 |
Hrkt | 412 | Japanese syllabaries (alias for Hiragana + Katakana) | vertical right-to-left, left-to-right | Katakana or Hiragana | | | See § Hira, § Kana | Ch 18.4 |
Hung | 176 | Old Hungarian (Hungarian Runic) | right-to-left | Old Hungarian | 8.0 | 108 | Ancient/historic | Ch 8.8 |
Inds | 610 | Indus (Harappan) | mixed | | | | Not in Unicode, proposal is explored [i] | |
Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | right-to-left, left-to-right | Old Italic | 3.1 | 39 | Ancient/historic | Ch 8.6 |
Jamo | 284 | Jamo (alias for Jamo subset of Hangul) | varies | | | | Subset of § Hang | |
Java | 361 | Javanese | left-to-right | Javanese | 5.2 | 90 | | Ch 17.4 |
Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | varies | | | | See § Hani, § Hira and § Kana | |
Jurc | 510 | Jurchen | left-to-right | | | | Not in Unicode | |
Kali | 357 | Kayah Li | left-to-right | Kayah Li | 5.1 | 47 | | Ch 16.9 |
Kana | 411 | Katakana | vertical right-to-left, left-to-right | Katakana | 1.0 | 321 | | Ch 18.4 |
Kawi | 368 | Kawi | left-to-right | Kawi | 15.0 | 87 | Ancient/historic | Ch 17.9 |
Khar | 305 | Kharoshthi | right-to-left | Kharoshthi | 4.1 | 68 | Ancient/historic | Ch 14.2 |
Khmr | 355 | Khmer | left-to-right | Khmer | 3.0 | 146 | | Ch 16.4 |
Khoj | 322 | Khojki | left-to-right | Khojki | 7.0 | 65 | Ancient/historic | Ch 15.7 |
Kitl | 505 | Khitan large script | left-to-right | | | | Not in Unicode | |
Kits | 288 | Khitan small script | vertical right-to-left | Khitan Small Script | 13.0 | 472 | Ancient/historic | Ch 18.12 |
Knda | 345 | Kannada | left-to-right | Kannada | 1.0 | 91 | | Ch 12.8 |
Kore | 287 | Korean (alias for Hangul + Han) | left-to-right | | | | See § Hani, § Hang | |
Kpel | 436 | Kpelle | left-to-right | | | | Not in Unicode, proposal is explored [i] | |
Krai | 396 | Kirat Rai | left-to-right | Kirat Rai | 16.0 | 58 | | |
Kthi | 317 | Kaithi | left-to-right | Kaithi | 5.2 | 68 | Ancient/historic | Ch 15.2 |
Lana | 351 | Tai Tham (Lanna) | left-to-right | Tai Tham | 5.2 | 127 | | Ch 16.7 |
Laoo | 356 | Lao | left-to-right | Lao | 1.0 | 83 | | Ch 16.2 |
Latf | 217 | Latin (Fraktur variant) | varies | | | | Typographic variant of Latin (see § Latn) | |
Latg | 216 | Latin (Gaelic variant) | left-to-right | | | | Typographic variant of Latin (see § Latn) | |
Latn | 215 | Latin | left-to-right | Latin | 1.0 | 1,487 | See also: Latin script in Unicode | Ch 7.1 |
Leke | 364 | Leke | left-to-right | | | | Not in Unicode | |
Lepc | 335 | Lepcha (Róng) | left-to-right | Lepcha | 5.1 | 74 | | Ch 13.12 |
Limb | 336 | Limbu | left-to-right | Limbu | 4.0 | 68 | | Ch 13.6 |
Lina | 400 | Linear A | left-to-right | Linear A | 7.0 | 341 | Ancient/historic | Ch 8.1 |
Linb | 401 | Linear B | left-to-right | Linear B | 4.0 | 211 | Ancient/historic | Ch 8.2 |
Lisu | 399 | Lisu (Fraser) | left-to-right | Lisu | 5.2 | 49 | | Ch 18.9 |
Loma | 437 | Loma | left-to-right | | | | Not in Unicode, proposal is explored [i] | |
Lyci | 202 | Lycian | left-to-right | Lycian | 5.1 | 29 | Ancient/historic | Ch 8.5 |
Lydi | 116 | Lydian | right-to-left | Lydian | 5.1 | 27 | Ancient/historic | Ch 8.5 |
Mahj | 314 | Mahajani | left-to-right | Mahajani | 7.0 | 39 | Ancient/historic | Ch 15.6 |
Maka | 366 | Makasar | left-to-right | Makasar | 11.0 | 25 | Ancient/historic | Ch 17.8 |
Mand | 140 | Mandaic, Mandaean | right-to-left | Mandaic | 6.0 | 29 | | Ch 9.5 |
Mani | 139 | Manichaean | right-to-left | Manichaean | 7.0 | 51 | Ancient/historic | Ch 10.5 |
Marc | 332 | Marchen | left-to-right | Marchen | 9.0 | 68 | Ancient/historic | Ch 14.5 |
Maya | 090 | Mayan hieroglyphs | mixed | | | | Not in Unicode | |
Medf | 265 | Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ) | left-to-right | Medefaidrin | 11.0 | 91 | | Ch 19.10 |
Mend | 438 | Mende Kikakui | right-to-left | Mende Kikakui | 7.0 | 213 | | Ch 19.8 |
Merc | 101 | Meroitic Cursive | right-to-left | Meroitic Cursive | 6.1 | 90 | Ancient/historic | Ch 11.5 |
Mero | 100 | Meroitic Hieroglyphs | right-to-left | Meroitic Hieroglyphs | 6.1 | 32 | Ancient/historic | Ch 11.5 |
Mlym | 347 | Malayalam | left-to-right | Malayalam | 1.0 | 118 | | Ch 12.9 |
Modi | 324 | Modi, Moḍī | left-to-right | Modi | 7.0 | 79 | Ancient/historic | Ch 15.12 |
Mong | 145 | Mongolian | vertical left-to-right, left-to-right | Mongolian | 3.0 | 168 | Mong includes Clear and Manchu scripts | Ch 13.5 |
Moon | 218 | Moon (Moon code, Moon script, Moon type) | mixed | | | | Not in Unicode, proposal is explored [i] | |
Mroo | 264 | Mro, Mru | left-to-right | Mro | 7.0 | 43 | | Ch 13.8 |
Mtei | 337 | Meitei Mayek (Meithei, Meetei) | left-to-right | Meetei Mayek | 5.2 | 79 | | Ch 13.7 |
Mult | 323 | Multani | left-to-right | Multani | 8.0 | 38 | Ancient/historic | Ch 15.10 |
Mymr | 350 | Myanmar (Burmese) | left-to-right | Myanmar | 3.0 | 243 | | Ch 16.3 |
Nagm | 295 | Nag Mundari | left-to-right | Nag Mundari | 15.0 | 42 | | |
Nand | 311 | Nandinagari | left-to-right | Nandinagari | 12.0 | 65 | Ancient/historic | Ch 15.13 |
Narb | 106 | Old North Arabian (Ancient North Arabian) | right-to-left | Old North Arabian | 7.0 | 32 | Ancient/historic | Ch 10.1 |
Nbat | 159 | Nabataean | right-to-left | Nabataean | 7.0 | 40 | Ancient/historic | Ch 10.10 |
Newa | 333 | Newa, Newar, Newari, Nepāla lipi | left-to-right | Newa | 9.0 | 97 | | Ch 13.3 |
Nkdb | 085 | Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba) | left-to-right | | | | Not in Unicode | |
Nkgb | 420 | Naxi Geba (na²¹ɕi³³ gʌ²¹ba²¹, 'Na-'Khi ²Ggŏ-¹baw, Nakhi Geba) | left-to-right | | | | Not in Unicode, proposal is explored [i] | |
Nkoo | 165 | N’Ko | right-to-left | NKo | 5.0 | 62 | | Ch 19.4 |
Nshu | 499 | Nüshu | vertical right-to-left | Nushu | 10.0 | 397 | | Ch 18.8 |
Ogam | 212 | Ogham | bottom-to-top, left-to-right | Ogham | 3.0 | 29 | Ancient/historic | Ch 8.14 |
Olck | 261 | Ol Chiki (Ol Cemet’, Ol, Santali) | left-to-right | Ol Chiki | 5.1 | 48 | | Ch 13.10 |
Onao | 296 | Ol Onal | left-to-right | Ol Onal | 16.0 | 44 | | |
Orkh | 175 | Old Turkic, Orkhon Runic | right-to-left | Old Turkic | 5.2 | 73 | Ancient/historic | Ch 14.8 |
Orya | 327 | Oriya (Odia) | left-to-right | Oriya | 1.0 | 91 | | Ch 12.5 |
Osge | 219 | Osage | left-to-right | Osage | 9.0 | 72 | | Ch 20.3 |
Osma | 260 | Osmanya | left-to-right | Osmanya | 4.0 | 40 | | Ch 19.2 |
Ougr | 143 | Old Uyghur | mixed | Old Uyghur | 14.0 | 26 | Ancient/historic | Ch 14.11 |
Palm | 126 | Palmyrene | right-to-left | Palmyrene | 7.0 | 32 | Ancient/historic | Ch 10.11 |
Pauc | 263 | Pau Cin Hau | left-to-right | Pau Cin Hau | 7.0 | 57 | | Ch 16.13 |
Pcun | 015 | Proto-Cuneiform | left-to-right | | | | Not in Unicode | |
Pelm | 016 | Proto-Elamite | left-to-right | | | | Not in Unicode | |
Perm | 227 | Old Permic | left-to-right | Old Permic | 7.0 | 43 | Ancient/historic | Ch 8.13 |
Phag | 331 | Phags-pa | vertical left-to-right | Phags-pa | 5.0 | 56 | Ancient/historic | Ch 14.4 |
Phli | 131 | Inscriptional Pahlavi | right-to-left | Inscriptional Pahlavi | 5.2 | 27 | Ancient/historic | Ch 10.6 |
Phlp | 132 | Psalter Pahlavi | right-to-left | Psalter Pahlavi | 7.0 | 29 | Ancient/historic | Ch 10.6 |
Phlv | 133 | Book Pahlavi | mixed | | | | Not in Unicode | |
Phnx | 115 | Phoenician | right-to-left | Phoenician | 5.0 | 29 | Ancient/historic [g] | Ch 10.3 |
Piqd | 293 | Klingon (KLI pIqaD) | left-to-right | | | | Rejected for inclusion in Unicode [iii] [iv] | |
Plrd | 282 | Miao (Pollard) | left-to-right | Miao | 6.1 | 149 | | Ch 18.10 |
Prti | 130 | Inscriptional Parthian | right-to-left | Inscriptional Parthian | 5.2 | 30 | Ancient/historic | Ch 10.6 |
Psin | 103 | Proto-Sinaitic | mixed | | | | Not in Unicode | |
Qaaa-Qabx | 900-949 | Reserved for private use (range) | | | | | Not in Unicode | |
Ranj | 303 | Ranjana | left-to-right | | | | Not in Unicode | |
Rjng | 363 | Rejang (Redjang, Kaganga) | left-to-right | Rejang | 5.1 | 37 | | Ch 17.5 |
Rohg | 167 | Hanifi Rohingya | right-to-left | Hanifi Rohingya | 11.0 | 50 | | Ch 16.14 |
Roro | 620 | Rongorongo | mixed | | | | Not in Unicode, proposal is explored [i] | |
Runr | 211 | Runic | left-to-right, boustrophedon | Runic | 3.0 | 86 | Ancient/historic | Ch 8.7 |
Samr | 123 | Samaritan | right-to-left, top-to-bottom | Samaritan | 5.2 | 61 | | Ch 9.4 |
Sara | 292 | Sarati | mixed | | | | Not in Unicode | |
Sarb | 105 | Old South Arabian | right-to-left | Old South Arabian | 5.2 | 32 | Ancient/historic | Ch 10.2 |
Saur | 344 | Saurashtra | left-to-right | Saurashtra | 5.1 | 82 | | Ch 13.13 |
Sgnw | 095 | SignWriting | vertical left-to-right | SignWriting | 8.0 | 672 | | Ch 21.7 |
Shaw | 281 | Shavian (Shaw) | left-to-right | Shavian | 4.0 | 48 | | Ch 8.15 |
Shrd | 319 | Sharada, Śāradā | left-to-right | Sharada | 6.1 | 96 | | Ch 15.3 |
Shui | 530 | Shuishu | left-to-right | | | | Not in Unicode | |
Sidd | 302 | Siddham, Siddhaṃ, Siddhamātṛkā | left-to-right | Siddham | 7.0 | 92 | Ancient/historic | Ch 15.5 |
Sidt | 180 | Sidetic | right-to-left | | | | Not in Unicode, proposal is mature [ii] | |
Sind | 318 | Khudawadi, Sindhi | left-to-right | Khudawadi | 7.0 | 69 | | Ch 15.9 |
Sinh | 348 | Sinhala | left-to-right | Sinhala | 3.0 | 111 | | Ch 13.2 |
Sogd | 141 | Sogdian | horizontal and vertical writing in East Asian scripts, top-to-bottom | Sogdian | 11.0 | 42 | Ancient/historic | Ch 14.10 |
Sogo | 142 | Old Sogdian | right-to-left | Old Sogdian | 11.0 | 40 | Ancient/historic | Ch 14.9 |
Sora | 398 | Sora Sompeng | left-to-right | Sora Sompeng | 6.1 | 35 | | Ch 15.17 |
Soyo | 329 | Soyombo | left-to-right | Soyombo | 10.0 | 83 | Ancient/historic | Ch 14.7 |
Sund | 362 | Sundanese | left-to-right | Sundanese | 5.1 | 72 | | Ch 17.7 |
Sunu | 274 | Sunuwar | left-to-right | Sunuwar | 16.0 | 44 | | |
Sylo | 316 | Syloti Nagri | left-to-right | Syloti Nagri | 4.1 | 45 | Ancient/historic | Ch 15.1 |
Syrc | 135 | Syriac | right-to-left | Syriac | 3.0 | 88 | Includes typographic variants Estrangelo (see § Syre), Western (§ Syrj), and Eastern (§ Syrn) | Ch 9.3 |
Syre | 138 | Syriac (Estrangelo variant) | mixed | | | | Typographic variant of Syriac (see § Syrc) | |
Syrj | 137 | Syriac (Western variant) | mixed | | | | Typographic variant of Syriac (see § Syrc) | |
Syrn | 136 | Syriac (Eastern variant) | mixed | | | | Typographic variant of Syriac (see § Syrc) | |
Tagb | 373 | Tagbanwa | left-to-right | Tagbanwa | 3.2 | 18 | | Ch 17.1 |
Takr | 321 | Takri, Ṭākrī, Ṭāṅkrī | left-to-right | Takri | 6.1 | 68 | | Ch 15.4 |
Tale | 353 | Tai Le | left-to-right | Tai Le | 4.0 | 35 | | Ch 16.5 |
Talu | 354 | New Tai Lue | left-to-right | New Tai Lue | 4.1 | 83 | | Ch 16.6 |
Taml | 346 | Tamil | left-to-right | Tamil | 1.0 | 123 | | Ch 12.6 |
Tang | 520 | Tangut | vertical right-to-left, left-to-right | Tangut | 9.0 | 6,914 | Ancient/historic | Ch 18.11 |
Tavt | 359 | Tai Viet | left-to-right | Tai Viet | 5.2 | 72 | | Ch 16.8 |
Tayo | 380 | Tai Yo | top-to-bottom, columns right-to-left | | | | Not in Unicode, proposal is mature [ii] | |
Telu | 340 | Telugu | left-to-right | Telugu | 1.0 | 100 | | Ch 12.7 |
Teng | 290 | Tengwar | left-to-right | | | | Not in Unicode | |
Tfng | 120 | Tifinagh (Berber) | left-to-right, right-to-left, top-to-bottom, bottom-to-top | Tifinagh | 4.1 | 59 | | Ch 19.3 |
Tglg | 370 | Tagalog (Baybayin, Alibata) | left-to-right | Tagalog | 3.2 | 23 | | Ch 17.1 |
Thaa | 170 | Thaana | right-to-left | Thaana | 3.0 | 50 | | Ch 13.1 |
Thai | 352 | Thai | left-to-right | Thai | 1.0 | 86 | | Ch 16.1 |
Tibt | 330 | Tibetan | left-to-right | Tibetan | 2.0 | 207 | Added in 1.0, removed in 1.1 and reintroduced in 2.0 | Ch 13.4 |
Tirh | 326 | Tirhuta | left-to-right | Tirhuta | 7.0 | 82 | | Ch 15.11 |
Tnsa | 275 | Tangsa | left-to-right | Tangsa | 14.0 | 89 | | Ch 13.18 |
Todr | 229 | Todhri | right-to-left | Todhri | 16.0 | 52 | Ancient/historic | |
Tols | 299 | Tolong Siki | left-to-right | | | | Not in Unicode, proposal is mature [ii] | |
Toto | 294 | Toto | left-to-right | Toto | 14.0 | 31 | | Ch 13.17 |
Tutg | 341 | Tulu-Tigalari | left-to-right | Tulu Tigalari | 16.0 | 80 | | |
Ugar | 040 | Ugaritic | left-to-right | Ugaritic | 4.0 | 31 | Ancient/historic | Ch 11.2 |
Vaii | 470 | Vai | left-to-right | Vai | 5.1 | 300 | | Ch 19.5 |
Visp | 280 | Visible Speech | left-to-right | | | | Not in Unicode | |
Vith | 228 | Vithkuqi | left-to-right | Vithkuqi | 14.0 | 70 | Ancient/historic | Ch 8.12 |
Wara | 262 | Warang Citi (Varang Kshiti) | left-to-right | Warang Citi | 7.0 | 84 | | Ch 13.9 |
Wcho | 283 | Wancho | left-to-right | Wancho | 12.0 | 59 | | Ch 13.16 |
Wole | 480 | Woleai | mixed | | | | Not in Unicode, proposal is explored [i] | |
Xpeo | 030 | Old Persian | left-to-right | Old Persian | 4.1 | 50 | Ancient/historic | Ch 11.3 |
Xsux | 020 | Cuneiform, Sumero-Akkadian | left-to-right | Cuneiform | 5.0 | 1,234 | Ancient/historic | Ch 11.1 |
Yezi | 192 | Yezidi | right-to-left | Yezidi | 13.0 | 47 | Ancient/historic | Ch 9.6 |
Yiii | 460 | Yi | left-to-right | Yi | 3.0 | 1,220 | | Ch 18.7 |
Zanb | 339 | Zanabazar Square (Zanabazarin Dörböljin Useg, Xewtee Dörböljin Bicig, Horizontal Square Script) | left-to-right | Zanabazar Square | 10.0 | 72 | Ancient/historic | Ch 14.6 |
Zinh | 994 | Code for inherited script | | Inherited | | 657 | | |
Zmth | 995 | Mathematical notation | | | | | Not a 'script' in Unicode | |
Zsym | 996 | Symbols | | | | | Not a 'script' in Unicode | |
Zsye | 993 | Symbols (emoji variant) | | | | | Not a 'script' in Unicode | |
Zxxx | 997 | Code for unwritten documents | | | | | Not a 'script' in Unicode | |
Zyyy | 998 | Code for undetermined script | | Common | | 9,053 | | |
Zzzz | 999 | Code for uncoded script | | Unknown | | 959,049 | In Unicode: All other code points | |
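The private-use row gives only the endpoints of a range (Qaaa to Qabx, numbers 900 to 949). The following Python sketch enumerates the alphabetic range and checks that it has the same size as the numeric one. The helper names `private_use_codes` and `is_private_use` are hypothetical, and the sketch assumes the range runs in plain alphabetical order; it does not claim which numeric value belongs to which individual code.

```python
from string import ascii_lowercase


def private_use_codes() -> list:
    """Enumerate Qaaa..Qabx in alphabetical order (assumed ordering)."""
    codes = []
    for third in "ab":
        for fourth in ascii_lowercase:
            codes.append(f"Qa{third}{fourth}")
    # The range ends at Qabx, so Qaby and Qabz are dropped.
    return codes[: codes.index("Qabx") + 1]


# Numeric counterpart of the range as given in the table: 900 to 949 inclusive.
PRIVATE_USE_NUMBERS = range(900, 950)

_PRIVATE_USE = frozenset(private_use_codes())


def is_private_use(code: str) -> bool:
    """True if the four-letter code falls in the private-use range."""
    return code in _PRIVATE_USE


if __name__ == "__main__":
    codes = private_use_codes()
    # Both counts should agree if the two ranges describe the same set.
    print(len(codes), len(PRIVATE_USE_NUMBERS))
    print(is_private_use("Qaaa"), is_private_use("Latn"))
```

Qaaa to Qaaz contributes 26 codes and Qaba to Qabx another 24, which matches the 50 numbers from 900 to 949.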
The following standards are referred to as indispensable by ISO 15924.
For definition of font and glyph the standard refers to
Around 160 scripts are defined in Unicode. Through a mechanism called the "Property Value Alias", Unicode establishes a 1:1 correspondence between each script it defines and that script's ISO 15924 code. See Script (Unicode).
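To make the 1:1 correspondence concrete, here is a small Python sketch using a handful of alias and code pairs copied from the table above. The names `ALIAS_TO_CODE`, `CODE_TO_ALIAS` and `is_one_to_one` are hypothetical, and the dictionary is only an excerpt, not the full set of Unicode property value aliases.

```python
# Excerpt of Unicode script aliases and their ISO 15924 codes,
# taken from rows of the table above (not the complete mapping).
ALIAS_TO_CODE = {
    "Latin": "Latn",
    "Cyrillic": "Cyrl",
    "Canadian Aboriginal": "Cans",
    "Inherited": "Zinh",
    "Common": "Zyyy",
    "Unknown": "Zzzz",
}

# Because the correspondence is 1:1, the mapping can be inverted safely.
CODE_TO_ALIAS = {code: alias for alias, code in ALIAS_TO_CODE.items()}


def is_one_to_one(mapping: dict) -> bool:
    """True if no two keys share a value, i.e. the mapping is invertible."""
    return len(set(mapping.values())) == len(mapping)


if __name__ == "__main__":
    print(is_one_to_one(ALIAS_TO_CODE))
    print(CODE_TO_ALIAS["Zinh"])
```

If two aliases ever pointed at the same code, the inverted dictionary would silently lose an entry, which is why the sketch checks the one-to-one property separately.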
Inherited is the Unicode script property value of "characters that may be used with multiple scripts, and that inherit their script from a preceding base character. These include nonspacing combining marks and enclosing combining marks, as well as U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER." The value Inherited is described as matching the ISO 15924 code Zinh.

Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 of the standard defines 154998 characters and 168 scripts used in various ordinary, literary, academic, and technical contexts.