MIME / IANA | IBM850 |
---|---|
Alias(es) | cp850, 850, csPC850Multilingual, [1] DOS Latin 1, OEM 850 |
Language(s) | English, various others |
Classification | Extended ASCII, OEM code page |
Extends | US-ASCII |
Based on | OEM-US |
Transforms / Encodes | ISO/IEC 8859-1 (reordered) |
Other related encoding(s) | Code page 858 (PC DOS 2000's "modified code page 850"), code page 437 |
Code page 850 (CCSID 850), also known as CP 850, IBM 00850, [2] OEM 850, [3] and DOS Latin 1, [4] is a code page used under DOS operating systems [a] in Western Europe. [5] Depending on the country setting and system configuration, code page 850 is the primary code page and default OEM code page in many countries, including various English-speaking locales (e.g. the United Kingdom, Ireland, and Canada), whilst other English-speaking locales (such as the United States) default to the hardware code page 437. [6]
Code page 850 differs from code page 437 in that many of the box-drawing characters, Greek letters, and various symbols were replaced with additional Latin letters with diacritics, thus greatly improving support for Western European languages (all characters from ISO 8859-1 are included). At the same time, the changes frequently caused display glitches with programs that made use of the box-drawing characters to display a GUI-like surface in text mode.
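The effect on box-drawing output can be illustrated with a short Python sketch. It assumes the `cp437` and `cp850` codecs from Python's standard library as stand-ins for the two code pages; the byte sequence is a made-up example of what a text-mode program might emit, not taken from any particular application.

```python
# Bytes a DOS program might emit to draw the top edge of a box using
# code page 437's mixed single/double-line corners (illustrative only).
box_top = bytes([0xD5, 0xCD, 0xCD, 0xB8])

as_cp437 = box_top.decode("cp437")  # what the program intended
as_cp850 = box_top.decode("cp850")  # what a code page 850 display shows

print(as_cp437)
print(as_cp850)

# Check the claim that every ISO 8859-1 graphic character
# (U+00A0 to U+00FF) can be encoded in code page 850.
missing = []
for code_point in range(0xA0, 0x100):
    try:
        chr(code_point).encode("cp850")
    except UnicodeEncodeError:
        missing.append(hex(code_point))

print(missing)
```

Under code page 850 the two corner bytes come out as letters and symbols rather than line segments, which is the kind of glitch described above, while the list of missing ISO 8859-1 characters should be empty.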
After the DOS era, successor operating systems largely replaced code page 850 with Windows-1252, [b] later UCS-2 and UTF-16, [c] and finally UTF-8. However, legacy applications, especially command-line programs, may still depend on support for older code pages.
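When output from such a legacy program is read on a modern system, the bytes have to be decoded with the right code page before being re-encoded. The following is a minimal sketch using Python's standard `cp850`, `cp1252`, and `utf-8` codecs; the sample string is an arbitrary choice for illustration.

```python
text = "café"

# Bytes as a DOS program using code page 850 would have written them.
dos_bytes = text.encode("cp850")

# Misreading the same bytes as Windows-1252 garbles the accented letter.
misread = dos_bytes.decode("cp1252")

# Decoding with the correct code page first gives a clean UTF-8 result.
utf8_bytes = dos_bytes.decode("cp850").encode("utf-8")

print(dos_bytes)
print(misread)
print(utf8_bytes)
```

Only the non-ASCII byte is affected: the ASCII letters survive the wrong decoding because both encodings extend US-ASCII.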
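Each non-ASCII character is shown with its equivalent Unicode code point in hexadecimal. Row labels give the high hexadecimal digit followed by the decimal value of the row's first code point; column headers give the low hexadecimal digit. Differences from code page 437 are limited to the second half of the table, the first half being the same.
Row (hex, decimal) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|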
0x 0 | NUL | ☺︎ 263A | ☻ 263B | ♥︎ 2665 | ♦︎ 2666 | ♣︎ 2663 | ♠︎ 2660 | • 2022 | ◘ 25D8 | ○ 25CB | ◙ 25D9 | ♂︎ 2642 | ♀︎ 2640 | ♪ 266A | ♫ 266B | ☼ 263C |
1x 16 | ► 25BA | ◄ 25C4 | ↕︎ 2195 | ‼︎ 203C | ¶ 00B6 | § 00A7 | ▬ 25AC | ↨ 21A8 | ↑ 2191 | ↓ 2193 | → 2192 | ← 2190 | ∟ 221F | ↔︎ 2194 | ▲ 25B2 | ▼ 25BC |
2x 32 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x 48 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x 64 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x 80 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x 96 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x 112 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | ⌂ 2302 |
8x 128 | Ç 00C7 | ü 00FC | é 00E9 | â 00E2 | ä 00E4 | à 00E0 | å 00E5 | ç 00E7 | ê 00EA | ë 00EB | è 00E8 | ï 00EF | î 00EE | ì 00EC | Ä 00C4 | Å 00C5 |
9x 144 | É 00C9 | æ 00E6 | Æ 00C6 | ô 00F4 | ö 00F6 | ò 00F2 | û 00FB | ù 00F9 | ÿ 00FF | Ö 00D6 | Ü 00DC | ø 00F8 | £ 00A3 | Ø 00D8 | × 00D7 | ƒ 0192 |
Ax 160 | á 00E1 | í 00ED | ó 00F3 | ú 00FA | ñ 00F1 | Ñ 00D1 | ª 00AA | º 00BA | ¿ 00BF | ® 00AE | ¬ 00AC | ½ 00BD | ¼ 00BC | ¡ 00A1 | « 00AB | » 00BB |
Bx 176 | ░ 2591 | ▒ 2592 | ▓ 2593 | │ 2502 | ┤ 2524 | Á 00C1 | Â 00C2 | À 00C0 | © 00A9 | ╣ 2563 | ║ 2551 | ╗ 2557 | ╝ 255D | ¢ 00A2 | ¥ 00A5 | ┐ 2510 |
Cx 192 | └ 2514 | ┴ 2534 | ┬ 252C | ├ 251C | ─ 2500 | ┼ 253C | ã 00E3 | Ã 00C3 | ╚ 255A | ╔ 2554 | ╩ 2569 | ╦ 2566 | ╠ 2560 | ═ 2550 | ╬ 256C | ¤ 00A4 |
Dx 208 | ð 00F0 | Ð 00D0 | Ê 00CA | Ë 00CB | È 00C8 | ı 0131 | Í 00CD | Î 00CE | Ï 00CF | ┘ 2518 | ┌ 250C | █ 2588 | ▄ 2584 | ¦ 00A6 | Ì 00CC | ▀ 2580 |
Ex 224 | Ó 00D3 | ß 00DF | Ô 00D4 | Ò 00D2 | õ 00F5 | Õ 00D5 | µ 00B5 | þ 00FE | Þ 00DE | Ú 00DA | Û 00DB | Ù 00D9 | ý 00FD | Ý 00DD | ¯ 00AF | ´ 00B4 |
Fx 240 | SHY 00AD | ± 00B1 | ‗ 2017 | ¾ 00BE | ¶ 00B6 | § 00A7 | ÷ 00F7 | ¸ 00B8 | ° 00B0 | ¨ 00A8 | · 00B7 | ¹ 00B9 | ³ 00B3 | ² 00B2 | ■ 25A0 | NBSP 00A0 |
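The statement that code pages 437 and 850 differ only in the upper half can be checked with a short Python sketch. It assumes the standard-library `cp437` and `cp850` codecs; note that these codecs map bytes 00 to 1F and 7F to control characters rather than to the graphic symbols shown in the table, which does not affect the comparison. The helper name `decode_table` is made up for this example.

```python
def decode_table(codec):
    """Return the 256 characters a single-byte codec assigns to bytes 0..255."""
    return [bytes([b]).decode(codec) for b in range(256)]

cp437 = decode_table("cp437")
cp850 = decode_table("cp850")

# Byte values whose character differs between the two code pages.
differing = [b for b in range(256) if cp437[b] != cp850[b]]

print(len(differing), "bytes differ; lowest is", hex(min(differing)))
for b in differing[:5]:
    print(f"0x{b:02X}: U+{ord(cp437[b]):04X} -> U+{ord(cp850[b]):04X}")
```

Every differing byte value should be 0x80 or above, matching the rows 8x to Fx of the table.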
MIME / IANA | IBM00858 |
---|---|
Alias(es) | CCSID00858, CP00858, PC-Multilingual-850+euro [1] |
Transforms / Encodes | ISO 8859-1 |
Preceded by | Code page 850 |
In 1998, code page 858 (CCSID 858), [11] also known as CP 858, IBM 00858, and OEM 858, [3] was derived from this code page by changing code point 213 (D5 hex) from a dotless i ⟨ı⟩ (U+0131) to the euro sign ⟨€⟩ (U+20AC). [12] [13] [14] Unlike in most code pages modified to support the euro sign, the generic currency sign at CF hex was not the character chosen for replacement (compare ISO-8859-15 (from ISO-8859-1), code pages 808 (from 866), 848 (from 1125), 849 (from 1131) and 872 (from 855), ISO-IR-205 (from ISO-8859-4), ISO-IR-206 (from ISO-8859-13), and the changes to MacRoman and MacCyrillic).
Instead of adding support for the new code page 858, IBM's PC DOS 2000, also released in 1998, redefined the existing code page 850 to include the euro sign at code point 213; IBM called the result "modified code page 850". [15] [16] [17] [18] [19] This may have been due to restrictions in MS-DOS/PC DOS, which limited .CPI files to 64 KB in size, or about six code pages at most. Adding support for code page 858 might have required dropping another code page (e.g. code page 850) at the same time, which might not have been viable then, given that some applications were hard-wired to use code page 850. More recent IBM and Microsoft products implement code page 858 under its own ID.
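The single-byte difference between code pages 850 and 858 can be confirmed with a brief Python sketch. It assumes the standard-library `cp850` and `cp858` codecs; Python's `cp850` follows the original definition, not the "modified code page 850" of PC DOS 2000.

```python
# Byte values that decode differently under the two code pages.
differing_858 = [
    b for b in range(256)
    if bytes([b]).decode("cp850") != bytes([b]).decode("cp858")
]
print([hex(b) for b in differing_858])

# The euro sign is encodable in code page 858 ...
euro = "€".encode("cp858")
print(euro)

# ... but not in the original code page 850.
try:
    "€".encode("cp850")
    euro_in_850 = True
except UnicodeEncodeError:
    euro_in_850 = False
print(euro_in_850)
```

Text that never uses the dotless i or the euro sign is therefore byte-for-byte identical in both code pages.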
The new official ID for the Multilingual "codepage 850 with EURO SIGN" is 858, not 850. IBM will switch to use 858 instead of their 850 variant with future issues of their products. […] I can only guess why they didn't add 858 to their EGAx.CPI, COUNTRY.SYS, and KEYBOARD.SYS files in PC DOS 2000. Many third-party applications are designed to work with 850 and didn't know about 858 at the time PC DOS 2000 was released, so it's easier for everyone, but unfortunately it's not compatible. […] As explained above, COUNTRY.SYS and KEYBOARD.SYS contain only two codepage entries for a given country in Western issues of DOS. (In Arabic and Hebrew issues there can be up to 8 codepages for one country, in theory there is no limit below the range of allowed codepages 1..65534). […] The problem is that removing support for 850 might have caused compatibility problems with applications which are hard-wired to use 850. Adding 858 as a third choice to all the files would have increased the file and table sizes significantly. The COUNTRY.SYS file parser in MS-DOS/PC DOS IO.SYS/IBMBIO.COM sets aside a 6 Kb (for DOS 6) scratchpad to load all the info. This allows a maximum of 438 entries in a COUNTRY.SYS file to be accepted, otherwise you will get the message "COUNTRY.SYS too large.". The NLSFUNC parser does not have this limitation, and the file parsers in DR-DOS (kernel and NLSFUNC) also do not know of such a restriction. Older issues of MS-DOS/PC DOS even had a 2 Kb buffer for a maximum of 146 entries.
[…] one could also create custom .CPI files in the traditional FONT style without difficulties, but you could only store up to […] six codepages in such a file if it should be useable by MS-DOS/PC DOS (some OEM issues and NT can handle files larger than 64 Kb, but MS-DOS/PC DOS can not). (NB. Based on fd-dev post.)