MIME / IANA | IBM1047 |
---|---|
Alias(es) | ibm-1047, cp1047 [1] |
Classification | EBCDIC |
Transforms / Encodes | ISO 8859-1 |
Other related encoding(s) | EBCDIC 037-2 |
MIME / IANA | IBM00924 |
---|---|
Alias(es) | CCSID00924, CP00924, ebcdic-Latin9-euro [1] |
Classification | EBCDIC |
Transforms / Encodes | ISO 8859-15 |
Code page 1047 (CCSID 1047) [2] is an EBCDIC code page with the full Latin-1 character set. [3] It is closely related to both EBCDIC 037-2 (with only two points differing) and EBCDIC 037 (with six points differing), both of which also encode Latin-1.
Code page 924 (CCSID 924) is an update of code page 1047 (CCSID 1047) that adds the euro sign and the other characters introduced by Latin-9 (Š, š, Ž, ž, Œ, œ, Ÿ). [4] [5] It is an EBCDIC version of Latin-9 (ISO 8859-15).
Characters are shown with their Unicode equivalents. The six code points that differ from EBCDIC 037 are 5F, AD, B0, BA, BB, and BD.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0_ | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | SEL | HT 0009 | RNL | DEL 007F | GE | SPS | RPT | VT 000B | FF 000C | CR 000D | SO 000E | SI 000F |
1_ | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | res/enp | NL 0085 | BS 0008 | POC | CAN 0018 | EM 0019 | UBS | CU1 | IFS 001C | IGS 001D | IRS 001E | ius/itb 001F |
2_ | DS | SOS | FS | WUS | byp/imp | LF 000A | ETB 0017 | ESC 001B | SA | SFE | sm/sw | CSP | MFA | ENQ 0005 | ACK 0006 | BEL 0007 |
3_ | | | SYN 0016 | IR | PP | TRN | NBS | EOT 0004 | SBS | IT | RFF | CU3 | DC4 0014 | NAK 0015 | | SUB 001A |
4_ | SP 0020 | NBSP 00A0 | â 00E2 | ä 00E4 | à 00E0 | á 00E1 | ã 00E3 | å 00E5 | ç 00E7 | ñ 00F1 | ¢ 00A2 | . 002E | < 003C | ( 0028 | + 002B | | 007C |
5_ | & 0026 | é 00E9 | ê 00EA | ë 00EB | è 00E8 | í 00ED | î 00EE | ï 00EF | ì 00EC | ß 00DF | ! 0021 | $ 0024 | * 002A | ) 0029 | ; 003B | ^ 005E |
6_ | - 002D | / 002F | Â 00C2 | Ä 00C4 | À 00C0 | Á 00C1 | Ã 00C3 | Å 00C5 | Ç 00C7 | Ñ 00D1 | ¦ 00A6 | , 002C | % 0025 | _ 005F | > 003E | ? 003F |
7_ | ø 00F8 | É 00C9 | Ê 00CA | Ë 00CB | È 00C8 | Í 00CD | Î 00CE | Ï 00CF | Ì 00CC | ` 0060 | : 003A | # 0023 | @ 0040 | ' 0027 | = 003D | " 0022 |
8_ | Ø 00D8 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | « 00AB | » 00BB | ð 00F0 | ý 00FD | þ 00FE | ± 00B1 |
9_ | ° 00B0 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F | p 0070 | q 0071 | r 0072 | ª 00AA | º 00BA | æ 00E6 | ¸ 00B8 | Æ 00C6 | ¤ 00A4 |
A_ | µ 00B5 | ~ 007E | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | ¡ 00A1 | ¿ 00BF | Ð 00D0 | [ 005B | Þ 00DE | ® 00AE |
B_ | ¬ 00AC | £ 00A3 | ¥ 00A5 | · 00B7 | © 00A9 | § 00A7 | ¶ 00B6 | ¼ 00BC | ½ 00BD | ¾ 00BE | Ý 00DD | ¨ 00A8 | ¯ 00AF | ] 005D | ´ 00B4 | × 00D7 |
C_ | { 007B | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | SHY 00AD | ô 00F4 | ö 00F6 | ò 00F2 | ó 00F3 | õ 00F5 |
D_ | } 007D | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F | P 0050 | Q 0051 | R 0052 | ¹ 00B9 | û 00FB | ü 00FC | ù 00F9 | ú 00FA | ÿ 00FF |
E_ | \ 005C | ÷ 00F7 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | ² 00B2 | Ô 00D4 | Ö 00D6 | Ò 00D2 | Ó 00D3 | Õ 00D5 |
F_ | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | ³ 00B3 | Û 00DB | Ü 00DC | Ù 00D9 | Ú 00DA | EO |
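The six differences from EBCDIC 037 amount to three pairwise exchanges of characters, so a CP 1047 decoding table can be sketched from an existing code page 37 table. The following Python sketch assumes that the standard library's `cp037` codec corresponds to EBCDIC 037 as described here; the names `SWAPS` and `cp1047_table` are illustrative and not part of any library.

```python
# Sketch: derive a CP 1047 decoding table from Python's built-in cp037 codec.
# Assumption: the stdlib "cp037" codec corresponds to EBCDIC 037.

# Pairs of code points whose characters are exchanged between the two pages.
SWAPS = ((0x5F, 0xB0), (0xAD, 0xBA), (0xBB, 0xBD))


def cp1047_table() -> str:
    """Return a 256-character string: index = CP 1047 byte, value = character."""
    table = list(bytes(range(256)).decode("cp037"))
    for a, b in SWAPS:
        table[a], table[b] = table[b], table[a]
    return "".join(table)


table_037 = bytes(range(256)).decode("cp037")
table_1047 = cp1047_table()

# Code points at which the two pages disagree.
differing = [i for i in range(256) if table_037[i] != table_1047[i]]
print([f"{i:02X}" for i in differing])
# prints ['5F', 'AD', 'B0', 'BA', 'BB', 'BD']

for i in differing:
    print(f"{i:02X}: 037 U+{ord(table_037[i]):04X} -> 1047 U+{ord(table_1047[i]):04X}")
```

Because the change is a permutation of existing characters, the derived table still contains each Latin-1 character exactly once.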
Since CP 1047 contains all of the standard Latin-1 characters, its character codes can be translated to ISO 8859-1 character codes in such a way that translating back to CP 1047 is an exact, value-preserving round trip. Likewise, half of the control character codes can be translated into their exact ASCII equivalents. If the remaining EBCDIC-only control characters are also translated (arbitrarily) into the otherwise unused C1 control code points of ISO 8859-1 (hex 80 to 9F), the resulting translation covers all 256 code points. Such a conversion table (for translating from CP 1047 to ISO 8859-1) is shown below:
CP 1047 → ISO 8859-1
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0_ | 00 | 01 | 02 | 03 | 9C | 09 | 86 | 7F | 97 | 8D | 8E | 0B | 0C | 0D | 0E | 0F |
1_ | 10 | 11 | 12 | 13 | 9D | 85 | 08 | 87 | 18 | 19 | 92 | 8F | 1C | 1D | 1E | 1F |
2_ | 80 | 81 | 82 | 83 | 84 | 0A | 17 | 1B | 88 | 89 | 8A | 8B | 8C | 05 | 06 | 07 |
3_ | 90 | 91 | 16 | 93 | 94 | 95 | 96 | 04 | 98 | 99 | 9A | 9B | 14 | 15 | 9E | 1A |
4_ | 20 | A0 | E2 | E4 | E0 | E1 | E3 | E5 | E7 | F1 | A2 | 2E | 3C | 28 | 2B | 7C |
5_ | 26 | E9 | EA | EB | E8 | ED | EE | EF | EC | DF | 21 | 24 | 2A | 29 | 3B | 5E |
6_ | 2D | 2F | C2 | C4 | C0 | C1 | C3 | C5 | C7 | D1 | A6 | 2C | 25 | 5F | 3E | 3F |
7_ | F8 | C9 | CA | CB | C8 | CD | CE | CF | CC | 60 | 3A | 23 | 40 | 27 | 3D | 22 |
8_ | D8 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | AB | BB | F0 | FD | FE | B1 |
9_ | B0 | 6A | 6B | 6C | 6D | 6E | 6F | 70 | 71 | 72 | AA | BA | E6 | B8 | C6 | A4 |
A_ | B5 | 7E | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 7A | A1 | BF | D0 | 5B | DE | AE |
B_ | AC | A3 | A5 | B7 | A9 | A7 | B6 | BC | BD | BE | DD | A8 | AF | 5D | B4 | D7 |
C_ | 7B | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | AD | F4 | F6 | F2 | F3 | F5 |
D_ | 7D | 4A | 4B | 4C | 4D | 4E | 4F | 50 | 51 | 52 | B9 | FB | FC | F9 | FA | FF |
E_ | 5C | F7 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 5A | B2 | D4 | D6 | D2 | D3 | D5 |
F_ | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | B3 | DB | DC | D9 | DA | 9F |
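The round-trip property of the conversion table can be checked mechanically. The following Python sketch copies the 256 values from the table above, builds the inverse mapping, and translates byte strings in both directions with `bytes.translate`. The names `ROWS`, `TO_LATIN1`, `TO_CP1047`, and the two helper functions are illustrative only.

```python
# Sketch: CP 1047 <-> ISO 8859-1 translation using the conversion table above.
# All names here are illustrative; they are not part of any library.

ROWS = """
00 01 02 03 9C 09 86 7F 97 8D 8E 0B 0C 0D 0E 0F
10 11 12 13 9D 85 08 87 18 19 92 8F 1C 1D 1E 1F
80 81 82 83 84 0A 17 1B 88 89 8A 8B 8C 05 06 07
90 91 16 93 94 95 96 04 98 99 9A 9B 14 15 9E 1A
20 A0 E2 E4 E0 E1 E3 E5 E7 F1 A2 2E 3C 28 2B 7C
26 E9 EA EB E8 ED EE EF EC DF 21 24 2A 29 3B 5E
2D 2F C2 C4 C0 C1 C3 C5 C7 D1 A6 2C 25 5F 3E 3F
F8 C9 CA CB C8 CD CE CF CC 60 3A 23 40 27 3D 22
D8 61 62 63 64 65 66 67 68 69 AB BB F0 FD FE B1
B0 6A 6B 6C 6D 6E 6F 70 71 72 AA BA E6 B8 C6 A4
B5 7E 73 74 75 76 77 78 79 7A A1 BF D0 5B DE AE
AC A3 A5 B7 A9 A7 B6 BC BD BE DD A8 AF 5D B4 D7
7B 41 42 43 44 45 46 47 48 49 AD F4 F6 F2 F3 F5
7D 4A 4B 4C 4D 4E 4F 50 51 52 B9 FB FC F9 FA FF
5C F7 53 54 55 56 57 58 59 5A B2 D4 D6 D2 D3 D5
30 31 32 33 34 35 36 37 38 39 B3 DB DC D9 DA 9F
"""

# Index = CP 1047 byte, value = ISO 8859-1 byte.
TO_LATIN1 = bytes.fromhex("".join(ROWS.split()))

# A lossless round trip requires every target value to occur exactly once.
assert len(TO_LATIN1) == 256 and len(set(TO_LATIN1)) == 256

# Invert the table: index = ISO 8859-1 byte, value = CP 1047 byte.
_inverse = bytearray(256)
for ebcdic_byte, latin1_byte in enumerate(TO_LATIN1):
    _inverse[latin1_byte] = ebcdic_byte
TO_CP1047 = bytes(_inverse)


def cp1047_to_latin1(data: bytes) -> bytes:
    """Translate CP 1047 bytes to ISO 8859-1 bytes."""
    return data.translate(TO_LATIN1)


def latin1_to_cp1047(data: bytes) -> bytes:
    """Translate ISO 8859-1 bytes to CP 1047 bytes."""
    return data.translate(TO_CP1047)


sample = "Grüße [ok]".encode("latin-1")
ebcdic = latin1_to_cp1047(sample)
assert cp1047_to_latin1(ebcdic) == sample  # value-preserving round trip
print(ebcdic.hex(" ").upper())
```

Using two 256-byte translation tables keeps both directions a single table lookup per byte, and the uniqueness assertion fails early if a value in the table was mistyped.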
Characters are shown with their Unicode equivalents. The eleven code points that differ from EBCDIC 1047 are 4A, 6A, 9D, 9F, B0, B7, B8, B9, BA, BB, and BE.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0_ | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | SEL | HT 0009 | RNL | DEL 007F | GE | SPS | RPT | VT 000B | FF 000C | CR 000D | SO 000E | SI 000F |
1_ | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | res/enp | NL 0085 | BS 0008 | POC | CAN 0018 | EM 0019 | UBS | CU1 | IFS 001C | IGS 001D | IRS 001E | ius/itb 001F |
2_ | DS | SOS | FS | WUS | byp/imp | LF 000A | ETB 0017 | ESC 001B | SA | SFE | sm/sw | CSP | MFA | ENQ 0005 | ACK 0006 | BEL 0007 |
3_ | | | SYN 0016 | IR | PP | TRN | NBS | EOT 0004 | SBS | IT | RFF | CU3 | DC4 0014 | NAK 0015 | | SUB 001A |
4_ | SP 0020 | NBSP 00A0 | â 00E2 | ä 00E4 | à 00E0 | á 00E1 | ã 00E3 | å 00E5 | ç 00E7 | ñ 00F1 | Ý 00DD | . 002E | < 003C | ( 0028 | + 002B | | 007C |
5_ | & 0026 | é 00E9 | ê 00EA | ë 00EB | è 00E8 | í 00ED | î 00EE | ï 00EF | ì 00EC | ß 00DF | ! 0021 | $ 0024 | * 002A | ) 0029 | ; 003B | ^ 005E |
6_ | - 002D | / 002F | Â 00C2 | Ä 00C4 | À 00C0 | Á 00C1 | Ã 00C3 | Å 00C5 | Ç 00C7 | Ñ 00D1 | Š 0160 | , 002C | % 0025 | _ 005F | > 003E | ? 003F |
7_ | ø 00F8 | É 00C9 | Ê 00CA | Ë 00CB | È 00C8 | Í 00CD | Î 00CE | Ï 00CF | Ì 00CC | ` 0060 | : 003A | # 0023 | @ 0040 | ' 0027 | = 003D | " 0022 |
8_ | Ø 00D8 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | « 00AB | » 00BB | ð 00F0 | ý 00FD | þ 00FE | ± 00B1 |
9_ | ° 00B0 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F | p 0070 | q 0071 | r 0072 | ª 00AA | º 00BA | æ 00E6 | ž 017E | Æ 00C6 | € 20AC |
A_ | µ 00B5 | ~ 007E | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | ¡ 00A1 | ¿ 00BF | Ð 00D0 | [ 005B | Þ 00DE | ® 00AE |
B_ | ¢ 00A2 | £ 00A3 | ¥ 00A5 | · 00B7 | © 00A9 | § 00A7 | ¶ 00B6 | Œ 0152 | œ 0153 | Ÿ 0178 | ¬ 00AC | š 0161 | ¯ 00AF | ] 005D | Ž 017D | × 00D7 |
C_ | { 007B | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | SHY 00AD | ô 00F4 | ö 00F6 | ò 00F2 | ó 00F3 | õ 00F5 |
D_ | } 007D | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F | P 0050 | Q 0051 | R 0052 | ¹ 00B9 | û 00FB | ü 00FC | ù 00F9 | ú 00FA | ÿ 00FF |
E_ | \ 005C | ÷ 00F7 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | ² 00B2 | Ô 00D4 | Ö 00D6 | Ò 00D2 | Ó 00D3 | Õ 00D5 |
F_ | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | ³ 00B3 | Û 00DB | Ü 00DC | Ù 00D9 | Ú 00DA | EO |
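Comparing the CP 924 table with the CP 1047 table cell by cell gives eleven differing code points, so a CP 924 decoding table can be sketched as a patch on top of CP 1047. The Python sketch below makes two assumptions: that the standard library's `cp037` codec matches EBCDIC 037, and that exchanging three pairs of code points in it reproduces CP 1047. The names `CP924_OVERRIDES`, `cp924_table`, `decode_cp924`, and `encode_cp924` are hypothetical.

```python
# Sketch: derive a CP 924 decoding table by patching CP 1047.
# Assumption: stdlib "cp037" with three pairwise swaps (5F/B0, AD/BA, BB/BD)
# reproduces the CP 1047 table.

# CP 924 characters at the code points that differ from CP 1047.
CP924_OVERRIDES = {
    0x4A: "\u00DD",  # Ý (CP 1047 has ¢)
    0x6A: "\u0160",  # Š (CP 1047 has ¦)
    0x9D: "\u017E",  # ž (CP 1047 has ¸)
    0x9F: "\u20AC",  # € (CP 1047 has ¤)
    0xB0: "\u00A2",  # ¢ (CP 1047 has ¬)
    0xB7: "\u0152",  # Œ (CP 1047 has ¼)
    0xB8: "\u0153",  # œ (CP 1047 has ½)
    0xB9: "\u0178",  # Ÿ (CP 1047 has ¾)
    0xBA: "\u00AC",  # ¬ (CP 1047 has Ý)
    0xBB: "\u0161",  # š (CP 1047 has ¨)
    0xBE: "\u017D",  # Ž (CP 1047 has ´)
}


def cp1047_table() -> str:
    """CP 1047 decoding table, derived from cp037 (see assumption above)."""
    table = list(bytes(range(256)).decode("cp037"))
    for a, b in ((0x5F, 0xB0), (0xAD, 0xBA), (0xBB, 0xBD)):
        table[a], table[b] = table[b], table[a]
    return "".join(table)


def cp924_table() -> str:
    """CP 924 decoding table: CP 1047 with the eleven overrides applied."""
    table = list(cp1047_table())
    for code_point, char in CP924_OVERRIDES.items():
        table[code_point] = char
    return "".join(table)


TABLE_924 = cp924_table()


def decode_cp924(data: bytes) -> str:
    """Decode CP 924 bytes to a Unicode string."""
    return "".join(TABLE_924[b] for b in data)


def encode_cp924(text: str) -> bytes:
    """Encode text as CP 924; raises ValueError for unmapped characters."""
    return bytes(TABLE_924.index(ch) for ch in text)


# The patched table should cover exactly the ISO 8859-15 repertoire.
latin9 = bytes(range(256)).decode("iso8859_15")
assert set(TABLE_924) == set(latin9)
```

Three of the eleven overrides (4A, B0, BA) only move ¢, ¬, and Ý to new positions; the other eight replace Latin-1 characters with the Latin-9 additions, which is why the repertoire check against ISO 8859-15 holds.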