JIS X 0213

Language(s)	Japanese, English, Ainu, Russian ; Partial support: Greek, Chinese
Standard	JIS X 0213
Classification	ISO 2022, DBCS, CJK encoding
Extends	JIS X 0208
Encoding formats	Shift_JIS-2004 ; ISO-2022-JP-2004 ; EUC-JIS-2004
Preceded by	JIS X 0208, JIS X 0212

Last updated November 20, 2024

JIS X 0213 is a Japanese Industrial Standard defining coded character sets for encoding the characters used in Japan. It extends JIS X 0208. The first edition was published in 2000, with revisions in 2004 (JIS2004) and 2012.^[1]^[2]^[3]^[4] Besides adding a number of special characters and characters with diacritical marks, it adds 3,695 kanji to JIS X 0208 as of the 2004 edition. The full name of the standard is 7-bit and 8-bit double byte coded extended KANJI sets for information interchange (7ビット及び8ビットの2バイト情報交換用符号化拡張漢字集合, Nana-Bitto Oyobi Hachi-Bitto no Ni-Baito Jōhō Kōkan'yō Fugōka Kakuchō Kanji Shūgō).

JIS X 0213 has two "planes", each a 94×94 character table. Plane 1 is a superset of JIS X 0208 and contains kanji levels 1 to 3 along with non-kanji characters: hiragana, katakana (including letters used to write the Ainu language), the Latin, Greek and Cyrillic alphabets, digits, symbols and so on. Plane 2 contains only the level 4 kanji. The standard defines 11,233 characters in total, and each can be encoded in two bytes.
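The plane layout can be checked with a little arithmetic. The sketch below is illustrative only: the helper names `capacity` and `kuten_to_7bit` are hypothetical, and the offset of 0x20 assumes the usual ISO 2022 convention of placing rows and cells 1 to 94 in the 7-bit graphic range 0x21 to 0x7E.

```python
PLANES = 2          # JIS X 0213 has two planes
ROWS = CELLS = 94   # each plane is a 94x94 table
DEFINED = 11_233    # defined characters, as stated in the text


def capacity() -> int:
    """Total number of code positions across both planes."""
    return PLANES * ROWS * CELLS


def kuten_to_7bit(row: int, cell: int) -> bytes:
    """Return the two 7-bit bytes for a row/cell position within one plane."""
    if not (1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("row and cell must each be in 1..94")
    # Adding 0x20 moves 1..94 into the graphic range 0x21..0x7E.
    return bytes((0x20 + row, 0x20 + cell))


if __name__ == "__main__":
    print(capacity())                      # total code positions
    print(f"{DEFINED / capacity():.1%}")   # share of positions in use
    print(kuten_to_7bit(4, 2).hex())       # row 4, cell 2 (hiragana あ)
```

Setting the high bit of both bytes gives the 8-bit form that EUC-style encodings use for plane 1, which is why row 4, cell 2 appears as the bytes A4 A2 there.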

JIS X 0213 largely replaced the rarely used JIS X 0212-1990 "supplementary" standard, which included 5,801 kanji and 266 non-kanji. Of the additional 3,695 kanji in JIS X 0213, all but 952 were already in JIS X 0212.

JIS X 0213 defines several 7-bit and 8-bit encodings, including EUC-JIS-2004, ISO-2022-JP-2004 and Shift_JIS-2004. It also defines, for each character, the mapping from these encodings to ISO/IEC 10646 (Unicode).
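As a rough illustration of how the three encodings differ, the sketch below encodes one character with the codecs that ship in Python's standard library (`euc_jis_2004`, `shift_jis_2004`, `iso2022_jp_2004`). The helper names `encode_all` and `is_7bit` are hypothetical, and the Python codecs are one implementation, not the standard itself.

```python
CODECS = ("euc_jis_2004", "shift_jis_2004", "iso2022_jp_2004")


def encode_all(text: str) -> dict[str, bytes]:
    """Encode the same text with each JIS X 0213 based codec."""
    return {name: text.encode(name) for name in CODECS}


def is_7bit(data: bytes) -> bool:
    """True if no byte has the high bit set."""
    return all(b < 0x80 for b in data)


if __name__ == "__main__":
    for name, data in encode_all("あ").items():
        # ISO-2022-JP-2004 stays 7-bit by switching sets with escape sequences.
        print(f"{name:16} {data.hex(' ')}  7-bit: {is_7bit(data)}")
```

Decoding each byte string with the codec that produced it returns the same Unicode text, which is the per-character mapping to ISO/IEC 10646 in practice.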

Unicode version 3.2 incorporated every JIS X 0213 character except those that can be represented with combining characters. Because about 300 of the kanji are in Unicode Plane 2 (the Supplementary Ideographic Plane), Unicode implementations that support only the Basic Multilingual Plane cannot handle all JIS X 0213 characters. For most applications, however, this is not an issue.
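Both cases can be observed with Python's `shift_jis_2004` codec. This is a sketch under assumptions: the helper `profile` is hypothetical, and the two sample characters (か followed by a combining handakuten, and the kanji U+20B9F) are chosen as examples that the codec is believed to treat as single JIS X 0213 characters.

```python
def profile(text: str, codec: str = "shift_jis_2004") -> dict:
    """Summarize how one JIS X 0213 character looks on the Unicode side."""
    encoded = text.encode(codec)
    return {
        "code_points": [f"U+{ord(ch):04X}" for ch in text],
        "encoded_bytes": len(encoded),
        # Code points above U+FFFF lie outside the Basic Multilingual Plane.
        "needs_supplementary_plane": any(ord(ch) > 0xFFFF for ch in text),
        "round_trips": encoded.decode(codec) == text,
    }


SEMI_VOICED_KA = "\u304b\u309a"   # か + combining handakuten (two code points)
PLANE2_KANJI = "\U00020B9F"       # a kanji in Unicode Plane 2

if __name__ == "__main__":
    print(profile(SEMI_VOICED_KA))
    print(profile(PLANE2_KANJI))
```

In both cases the JIS side uses a single two-byte code, while the Unicode side needs either two code points or a code point beyond U+FFFF.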

The 2004 edition of JIS X 0213 changed the recommended renderings of 168 kanji.^[5] Ten additional kanji were added in JIS X 0213:2004.^[6]
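The effect of the 2004 additions can be seen by comparing two Python codecs. The sketch assumes that CPython's `euc_jisx0213` follows the 2000 edition while `euc_jis_2004` follows the 2004 edition, and that U+4FF1 and U+20B9F are among the ten added kanji; the helper `encodable` is hypothetical.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the codec can encode the given character."""
    try:
        ch.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# Assumed to be two of the ten kanji added in JIS X 0213:2004.
SAMPLES = ("\u4ff1", "\U00020B9F")

if __name__ == "__main__":
    for ch in SAMPLES:
        print(
            f"U+{ord(ch):04X}",
            "2000 edition:", encodable(ch, "euc_jisx0213"),
            "2004 edition:", encodable(ch, "euc_jis_2004"),
        )
```

Text that uses these characters therefore needs a 2004-aware codec on both the writing and the reading side.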

Related Research Articles

In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language. Strictly speaking, the term means either:

ISO/IEC 2022, Information technology: Character code structure and extension techniques, is an ISO/IEC standard in the field of character encoding. It is equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202. Originating in 1971, it was most recently revised in 1994.

Shift JIS is a character encoding for the Japanese language, originally developed by the Japanese company ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1.

Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese (characters).

Mojikyō: Character encoding scheme

Mojikyō, also known by its full name Konjaku Mojikyō, is a character encoding scheme created to provide a complete index of characters used in the Chinese, Japanese, Korean, Vietnamese Chữ Nôm and other historical Chinese logographic writing systems. The Mojikyō Institute, which published the character set, also published computer software and TrueType fonts to accompany it. The Mojikyō Institute, chaired by Tadahisa Ishikawa (石川忠久), originally had its character set and related software and data redistributed on CD-ROMs sold in Kinokuniya stores.

TRON Code is a multi-byte character encoding used in the TRON project. It is similar to Unicode but does not use Unicode's Han unification process: each character from each CJK character set is encoded separately, including archaic and historical equivalents of modern characters. This means that Chinese, Japanese, and Korean text can be mixed without any ambiguity as to the exact form of the characters; however, it also means that many characters with equivalent semantics will be encoded more than once, complicating some operations.

〒 is the service mark of Japan Post and its successor, Japan Post Holdings, the postal operator in Japan. It has also been used as the Japanese postal code mark since postal codes were introduced in 1968. Historically, it was used by the Ministry of Communications, which operated the postal service. The mark is a stylized katakana syllable te (テ), from the word teishin. The mark was introduced on February 8, 1887.

JIS X 0201: Japanese single byte character encoding

JIS X 0201, a Japanese Industrial Standard developed in 1969, was the first Japanese electronic character set to become widely used. The character set was initially known as JIS C 6220 before the JIS category reform. It had two forms, a 7-bit encoding and an 8-bit encoding, and the 8-bit form was dominant until Unicode replaced it. The full name of this standard is 7-bit and 8-bit coded character sets for information interchange (7ビット及び8ビットの情報交換用符号化文字集合).

Half-width kana are katakana characters displayed compressed at half their normal width, instead of the usual square (1:1) aspect ratio. For example, the usual (full-width) form of the katakana ka is カ while the half-width form is ｶ. Additionally, half-width hiragana is included in Unicode, and it is usable on the Web or in e-books via CSS's font-feature-settings: "hwid" 1 with Adobe-Japan1-6 based OpenType fonts. Finally, half-width kanji is usable on modern computers, and is used in some receipt printers, electric bulletin boards and old computers.

き, in hiragana, or キ, in katakana, is one of the Japanese kana, each of which represents one mora. Both represent [ki] and are derived from a simplification of the 幾 kanji. The hiragana character き, like さ, is drawn with the lower line either connected or disconnected.

け, in hiragana, or ケ, in katakana, is one of the Japanese kana, each of which represents one mora. Both represent [ke]. The shapes of these kana come from the kanji 計 and 介, respectively.

こ, in hiragana, or コ, in katakana, is one of the Japanese kana, each of which represents one mora. Both represent [ko]. The shape of these kana comes from the kanji 己.

す, in hiragana, or ス, in katakana, is one of the Japanese kana, each of which represents one mora. Their shapes come from the kanji 寸 and 須, respectively. Both kana represent the sound [sɯ]. In the Ainu language, the katakana ス can be written as small ㇲ to represent a final s and is used to emphasize the pronunciation of [s] rather than the normal [ɕ].

JIS X 0212 is a Japanese Industrial Standard defining a coded character set for encoding supplementary characters for use in Japanese. This standard is intended to supplement JIS X 0208. It is numbered 953 or 5049 as an IBM code page.

JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language. The official title of the current standard is 7-bit and 8-bit double byte coded KANJI sets for information interchange. It was originally established as JIS C 6226 in 1978, and has been revised in 1983, 1990, and 1997. It is also called Code page 952 by IBM. The 1978 version is also called Code page 955 by IBM.

Microsoft Windows code page 932, also called Windows-31J amongst other names, is the Microsoft Windows code page for the Japanese language, which is an extended variant of the Shift JIS Japanese character encoding. It contains standard 7-bit ASCII codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding.

Volume 1 of the Association of Radio Industries and Businesses (ARIB) STD-B24 standard for Broadcast Markup Language specifies, amongst other details, a character encoding for use in Japanese-language broadcasting. It was introduced on 1999-10-26. The latest revision is version 6.3 as of 2016-07-06.

Several mutually incompatible versions of the Extended Binary Coded Decimal Interchange Code (EBCDIC) have been used to represent the Japanese language on computers, including variants defined by Hitachi, Fujitsu, IBM and others. Some are variable-width encodings, employing locking shift codes to switch between single-byte and double-byte modes. Unlike other EBCDIC locales, the lowercase basic Latin letters are often not preserved in their usual locations.

EPWING is the standard format for electronic dictionaries mainly used for Japanese. A subset of EPWING V1 is standardized as JIS X 4081.

Ghost characters are erroneous kanji included in the Japanese Industrial Standard JIS X 0208. Twelve of the 6,355 kanji characters are ghost characters.

References

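[1] "日本工業標準調査会：データベース-JIS詳細表示" [Japanese Industrial Standards Committee: Database, JIS details]. 2012-02-20. Retrieved 15 Mar 2015.
[2] "日本工業標準調査会：データベース-JIS規格詳細表示" [Japanese Industrial Standards Committee: Database, JIS standard details]. 2000-01-20. Retrieved 15 Mar 2015.
[3] "日本工業標準調査会：データベース-JIS規格詳細表示" [Japanese Industrial Standards Committee: Database, JIS standard details]. 2004-02-20. Retrieved 15 Mar 2015.
[4] "日本工業標準調査会：データベース-JIS規格詳細表示" [Japanese Industrial Standards Committee: Database, JIS standard details]. 2008-10-01. Retrieved 15 Mar 2015.
[5] http://kakijun.jp/main/jis2004.html (in Japanese)
[6] Lunde, Ken (2014-04-07). "JIS X 0212 versus JIS X 0213". CJK Type Blog. Adobe Inc. Archived from the original on 2021-11-04. Retrieved 2021-11-04.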

External links

JIS X 0213 Plane 1 code table Archived 2018-11-07 at the Wayback Machine
JIS X 0213 Plane 2 code table Archived 2020-08-02 at the Wayback Machine
JIS X 0213 online code table
Mapping tables between JIS X 0213 encodings and Unicode

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.


