Code page 903

Code page 903
MIME / IANA: IBM903
Alias(es): cp903 [1]
Extends: JIS-Roman
Extensions: Code page 897, Code page 1042
Other related encoding(s): Code page 904

Code page 903 (CCSID 903) [2] is a single-byte code page used in China as the single-byte component of certain Simplified Chinese character encodings. [3] Despite this, it follows ISO 646-JP (the Roman half of JIS X 0201) rather than GB 1988 (ISO 646-CN): it replaces the ASCII backslash at 0x5C with the yen/yuan sign (¥), whereas GB 1988 replaces the ASCII dollar sign at 0x24. It also uses the same C0 replacement graphics as code page 897. [4] Combined with the double-byte code page 928, it makes up the two code sets of IBM code page 936.
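The single difference from ASCII described above can be illustrated with a minimal Python sketch. The function name `decode_cp903_printable` is hypothetical, and the sketch is not a complete codec: it handles only bytes 0x20 to 0x7D, mapping 0x5C to U+00A5 as stated in the text, and it deliberately rejects the C0 range and the remaining positions, which this sketch makes no assumptions about.

```python
YEN_SIGN = "\u00a5"  # U+00A5 YEN SIGN, which code page 903 places at 0x5C


def decode_cp903_printable(data: bytes) -> str:
    """Decode bytes 0x20-0x7D as code page 903 (illustrative sketch only).

    Assumption: every byte in this range matches ASCII except 0x5C.
    Bytes outside the range are rejected rather than guessed at.
    """
    out = []
    for b in data:
        if b == 0x5C:
            # The one replacement named in the text: backslash becomes yen/yuan.
            out.append(YEN_SIGN)
        elif 0x20 <= b <= 0x7D:
            # Same as ASCII in this sketch, including the dollar sign at 0x24.
            out.append(chr(b))
        else:
            raise ValueError(f"byte 0x{b:02X} is outside the range this sketch handles")
    return "".join(out)


if __name__ == "__main__":
    print(decode_cp903_printable(b"A\\100 $5"))
```

Because 0x24 stays a dollar sign, text that is plain ASCII apart from backslashes decodes unchanged except for the yen/yuan sign.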

Codepage layout

Code page 903 [4] [5]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL BS LF FF CR
1x DC1 DC3 CAN
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ ¥ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } SUB
  Differences from ASCII

References

  1. "Character Sets". Internet Assigned Numbers Authority (IANA). 2018-12-12.
  2. "CCSID 903 information document". Archived from the original on 2016-03-27.
  3. "Code page 903 information document". Archived from the original on 2016-03-17.
  4. "Code Page CPGID 00903 (pdf)" (PDF). IBM.
  5. "Code Page CPGID 00903 (txt)". IBM.