Extended Latin-8

Extended Latin-8 is an extension of ISO 8859-14 used by Windows CeltScript fonts. [1] It superseded CER-GS, which was deprecated when the encoding was updated in August 1998. [2]

Character set

Extended Latin-8
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x  SP   ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x   ° ƒ ± ɼ ſ Œ    
9x  œ    ı
Ax NBSP £ Ċ ċ § © SHY ® Ÿ
Bx Ġ ġ
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ŵ Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Ŷ ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ŵ ñ ò ó ô õ ö ø ù ú û ü ý ŷ ÿ
  Differences from CER-GS
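
The chart can be read as a byte-to-character lookup table. The following is a minimal sketch of a partial decoder, not a complete or official mapping: it assumes rows 2x to 7x match printable ASCII as shown, and it includes only the two upper rows (Cx and Ex) whose sixteen cells are all legible in the chart above. The names `ROWS`, `build_table` and `decode_partial` are hypothetical and chosen for illustration; every other byte is mapped to U+FFFD rather than guessed.

```python
# Partial lookup table for Extended Latin-8 (illustrative sketch only).
# Only rows with all 16 cells legible in the chart are listed here.
ROWS = {
    0xC0: "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ",  # row Cx
    0xE0: "àáâãäåæçèéêëìíîï",  # row Ex
}


def build_table():
    """Map byte values to characters for the recoverable part of the chart."""
    # Rows 2x-7x: printable ASCII, 0x20 (SP) through 0x7E (~).
    table = {b: chr(b) for b in range(0x20, 0x7F)}
    for base, chars in ROWS.items():
        if len(chars) != 16:
            raise ValueError(f"row {base:#04x} must have exactly 16 cells")
        for offset, ch in enumerate(chars):
            table[base + offset] = ch
    return table


TABLE = build_table()


def decode_partial(data: bytes, unknown: str = "\ufffd") -> str:
    """Decode bytes, substituting `unknown` for bytes outside the sketch."""
    return "".join(TABLE.get(b, unknown) for b in data)
```

For example, `decode_partial(b"caf\xe9")` returns `"café"`, while a byte from an incomplete row such as `0x80` comes back as the replacement character.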

CER-GS is a character encoding for Windows CeltScript fonts. It has been deprecated in favor of Extended Latin-8, which adds the euro sign and disunifies the ampersands.

References

  1. "Latin 8 Extended". Retrieved 27 February 2017.
  2. "CER-GS". Retrieved 27 February 2017.