The Atari ST character set [1] is the character set of the Atari ST family of personal computers, including the Atari STE, TT and Falcon. It is based on code page 437, the original character set of the IBM PC.
Like code page 437, it matches ASCII at code points 32–126 and adds further code points for letters with diacritics and other symbols. It differs from code page 437 in three ways: it uses different dingbats at code points 0–31; it replaces the box-drawing characters at 176–223 with the Hebrew alphabet and other symbols; and it replaces the characters at code points 158, 236, 254 and 255 with sharp S (ß), line integral (∮), cubed (³) and macron (¯), respectively.
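The four individually replaced code points can be compared against code page 437 with a short Python sketch. It relies on Python's built-in `cp437` codec for the IBM PC side; the Atari ST side is a hand-written dictionary taken from the description above, and the names `ATARI_ST_OVERRIDES` and `compare_with_cp437` are illustrative only.

```python
# Atari ST assignments at the four code points named in the text.
# This dictionary is an illustrative transcription, not an official mapping.
ATARI_ST_OVERRIDES = {
    158: "ß",  # sharp S
    236: "∮",  # line integral
    254: "³",  # cubed (superscript three)
    255: "¯",  # macron
}


def compare_with_cp437(code: int) -> tuple[str, str]:
    """Return (code page 437 character, Atari ST character) for one code."""
    cp437_char = bytes([code]).decode("cp437")
    return cp437_char, ATARI_ST_OVERRIDES[code]


if __name__ == "__main__":
    for code in sorted(ATARI_ST_OVERRIDES):
        cp437_char, atari_char = compare_with_cp437(code)
        print(f"{code:3d} (0x{code:02X}): cp437 {cp437_char!r}, Atari ST {atari_char!r}")
```

In code page 437 these positions hold the peseta sign, the infinity sign, a filled square and a no-break space, which is why text containing them needs explicit handling when converted between the two sets.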
The Atari ST family of computers stored this font in ROM in three sizes: an 8×16 pixels-per-character font used in the high-resolution graphics modes, an 8×8 pixels-per-character font used in the low- and medium-resolution graphics modes, and a 6×6 pixels-per-character font used for icon labels in any graphics mode. [1]
All 256 codes were assigned a graphical character in ROM, including the codes from 0 to 31 that in ASCII were reserved for non-graphical control characters.
The following table shows the Atari ST character set. Each character is shown with a potential Unicode equivalent if available. Differences from code page 437 are shown boxed.
Although the ROM provides a glyph for each of the 256 possible 8-bit codes, some APIs do not print all of them. In particular, they interpret the codes 0–31 and 127 as control characters instead of printing their glyphs.
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
0x | NUL [lower-alpha 1] | ⇧ | ⇩ | ⇨ | ⇦ | 🮽 | 🮾 | 🮿 | ✓ | 🕒︎ | 🔔︎ | ♪ | ␌ | ␍ | [lower-alpha 2] | [lower-alpha 2] |
1x | 🯰 | 🯱 | 🯲 | 🯳 | 🯴 | 🯵 | 🯶 | 🯷 | 🯸 | 🯹 | ə | ␛ | [lower-alpha 3] | [lower-alpha 3] | [lower-alpha 3] | [lower-alpha 3] |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | ⌂ |
8x | Ç | ü | é | â | ä | à | å | ç | ê | ë | è | ï | î | ì | Ä | Å |
9x | É | æ | Æ | ô | ö | ò | û | ù | ÿ | Ö | Ü | ¢ | £ | ¥ | ß [lower-alpha 4] | ƒ |
Ax | á | í | ó | ú | ñ | Ñ | ª | º | ¿ | ⌐ | ¬ | ½ | ¼ | ¡ | « | » |
Bx | ã | õ | Ø | ø | œ | Œ | À | Ã | Õ | ¨ | ´ | † | ¶ | © | ® | ™ |
Cx | ij | IJ | א | ב | ג | ד | ה | ו | ז | ח | ט | י | כ | ל | מ | נ |
Dx | ס | ע | פ | צ | ק | ר | ש | ת | ן | ך | ם | ף | ץ | § | ∧ | ∞ |
Ex | α | β [lower-alpha 4] | Γ | π [lower-alpha 5] | Σ [lower-alpha 6] | σ | µ [lower-alpha 7] | τ | Φ | Θ | Ω [lower-alpha 8] | δ [lower-alpha 9] | ∮ [lower-alpha 10] | ϕ [lower-alpha 11] | ∈ [lower-alpha 12] | ∩ |
Fx | ≡ | ± | ≥ | ≤ | ⌠ | ⌡ | ÷ | ≈ | ° | • | · | √ | ⁿ | ² | ³ | ¯ [lower-alpha 13] |
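As a rough illustration of how the table can be used, the following Python sketch decodes Atari ST bytes to Unicode. It is a minimal sketch, not a complete codec: the upper half (0x80–0xFF) is transcribed from the table, 0x7F maps to the house glyph, and the codes 0x00–0x1F are simply passed through as control codes because several of their glyphs have no Unicode equivalent. The names `HIGH_HALF` and `decode_atari_st` are hypothetical.

```python
# Upper half (0x80-0xFF), transcribed row by row from the table.
HIGH_HALF = (
    "ÇüéâäàåçêëèïîìÄÅ"  # 8x
    "ÉæÆôöòûùÿÖÜ¢£¥ßƒ"  # 9x
    "áíóúñÑªº¿⌐¬½¼¡«»"  # Ax
    "ãõØøœŒÀÃÕ¨´†¶©®™"  # Bx
    "\u0133\u0132"  # C0-C1: ij and IJ ligatures
    # C2-CF: Hebrew alef to nun (non-final forms)
    "\u05D0\u05D1\u05D2\u05D3\u05D4\u05D5\u05D6\u05D7\u05D8\u05D9"
    "\u05DB\u05DC\u05DE\u05E0"
    # D0-D7: Hebrew samekh to tav (non-final forms)
    "\u05E1\u05E2\u05E4\u05E6\u05E7\u05E8\u05E9\u05EA"
    # D8-DC: final nun, final kaf, final mem, final pe, final tsadi
    "\u05DF\u05DA\u05DD\u05E3\u05E5"
    "§∧∞"  # DD-DF
    "αβΓπΣσµτΦΘΩδ∮ϕ∈∩"  # Ex
    "≡±≥≤⌠⌡÷≈°•·√ⁿ²³¯"  # Fx
)


def decode_atari_st(data: bytes) -> str:
    """Decode Atari ST bytes to a Unicode string (simplified sketch)."""
    out = []
    for b in data:
        if b >= 0x80:
            out.append(HIGH_HALF[b - 0x80])
        elif b == 0x7F:
            out.append("\u2302")  # house glyph
        else:
            # 0x20-0x7E match ASCII; 0x00-0x1F are passed through
            # unchanged, which is a simplification of this sketch.
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    print(decode_atari_st(b"Stra\x9ee"))
```

A lookup string indexed by `byte - 0x80` keeps the mapping readable next to the table; a production codec would more likely register a proper `codecs` search function and define the 0x00–0x1F glyphs explicitly.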
Using Alt codes, users can enter a character by holding down the Alt key and typing its three-digit decimal code point on the numeric keypad. This provides a way to enter special characters that have no key of their own on the keyboard. [2]
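The relationship between an Alt code and the byte it produces is plain decimal arithmetic, as the following Python sketch shows. The function names are hypothetical, and the zero-padding of short values (for example `065`) is an assumption drawn from the "three-digit" wording rather than from documented keyboard behaviour.

```python
def alt_code_to_byte(digits: str) -> bytes:
    """Convert a three-digit decimal Alt code such as "158" to one byte."""
    if len(digits) != 3 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly three decimal digits")
    code = int(digits)
    if code > 255:
        raise ValueError("code point must be in the range 0-255")
    return bytes([code])


def byte_to_alt_code(code: int) -> str:
    """Format a code point as the three-digit sequence to type (assumed zero-padded)."""
    if not 0 <= code <= 255:
        raise ValueError("code point must be in the range 0-255")
    return f"{code:03d}"


if __name__ == "__main__":
    print(alt_code_to_byte("158"))  # the byte for sharp S in this character set
    print(byte_to_alt_code(0xEE))   # digits to type for code point 0xEE
```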
The Atari ST character set long predates the introduction of the euro currency and thus does not provide a code point for the euro sign (U+20AC, €). However, some software (such as Calamus) uses code point 238 (0xEE) for this purpose. [12] [13] In the Atari ST character set this code point is normally assigned to the mathematical element-of sign (U+2208, ∈); in code page 437 it is assigned to the Greek lowercase epsilon (U+03B5, ε). Alternatively, the rarely used logical conjunction sign (U+2227, ∧) at code point 222 (0xDE) could be replaced by the euro sign. [14]
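A converter can model these euro conventions as small override tables layered over the standard assignments. The sketch below is illustrative only: the table names and `decode_symbol` are hypothetical, and it covers just the two code points discussed here rather than the full character set.

```python
# Standard assignments for the two code points discussed in the text.
STANDARD = {
    0xDE: "\u2227",  # logical conjunction sign
    0xEE: "\u2208",  # element-of sign
}

# Variant used by some software: euro sign at 0xEE (238).
EURO_AT_EE = {**STANDARD, 0xEE: "\u20AC"}

# Alternative mentioned in the text: euro sign at 0xDE (222).
EURO_AT_DE = {**STANDARD, 0xDE: "\u20AC"}


def decode_symbol(code: int, table: dict[int, str] = STANDARD) -> str:
    """Look up one of the two affected code points in the chosen table."""
    return table[code]


if __name__ == "__main__":
    for name, table in [("standard", STANDARD),
                        ("euro at 0xEE", EURO_AT_EE),
                        ("euro at 0xDE", EURO_AT_DE)]:
        print(name, decode_symbol(0xDE, table), decode_symbol(0xEE, table))
```

Keeping the variants as separate tables makes the choice explicit at the call site, which matters because a byte stream alone does not indicate which convention its author used.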