Katakana (Unicode block)

Katakana
Range	U+30A0..U+30FF (96 code points)
Plane	BMP
Scripts	Katakana (93 char.); Common (3 char.)
Major alphabets	Japanese; Ainu
Assigned	96 code points
Unused	0 reserved code points
Source standards	JIS X 0208
Unicode version history
1.0.0 (1991)	90 (+90)
1.1 (1993)	94 (+4)
3.2 (2002)	96 (+2)
Unicode documentation
	Code chart ∣ Web page
	Note: ^[1]^[2]

Last updated October 10, 2024

Katakana is a Unicode block containing katakana characters for the Japanese and Ainu languages.

Block

Katakana ^[1] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+30Ax	゠	ァ	ア	ィ	イ	ゥ	ウ	ェ	エ	ォ	オ	カ	ガ	キ	ギ	ク
U+30Bx	グ	ケ	ゲ	コ	ゴ	サ	ザ	シ	ジ	ス	ズ	セ	ゼ	ソ	ゾ	タ
U+30Cx	ダ	チ	ヂ	ッ	ツ	ヅ	テ	デ	ト	ド	ナ	ニ	ヌ	ネ	ノ	ハ
U+30Dx	バ	パ	ヒ	ビ	ピ	フ	ブ	プ	ヘ	ベ	ペ	ホ	ボ	ポ	マ	ミ
U+30Ex	ム	メ	モ	ャ	ヤ	ュ	ユ	ョ	ヨ	ラ	リ	ル	レ	ロ	ヮ	ワ
U+30Fx	ヰ	ヱ	ヲ	ン	ヴ	ヵ	ヶ	ヷ	ヸ	ヹ	ヺ	・	ー	ヽ	ヾ	ヿ
Notes 1. ^ As of Unicode version 16.0
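The chart layout above follows directly from the block range: each row covers 16 consecutive code points starting at U+30A0. The sketch below rebuilds the rows and adds a simple membership test. The names `in_katakana_block` and `chart_rows` are illustrative helpers, not part of any standard library, and the range constants are taken from the infobox.

```python
# Block boundaries as listed in the infobox (U+30A0..U+30FF).
KATAKANA_START = 0x30A0
KATAKANA_END = 0x30FF


def in_katakana_block(ch: str) -> bool:
    """Return True if the single character ch lies in U+30A0..U+30FF."""
    return KATAKANA_START <= ord(ch) <= KATAKANA_END


def chart_rows():
    """Return (row label, 16 characters) pairs in code chart order."""
    rows = []
    for base in range(KATAKANA_START, KATAKANA_END + 1, 16):
        label = f"U+{base >> 4:03X}x"  # e.g. 0x30A0 -> "U+30Ax"
        cells = [chr(base + offset) for offset in range(16)]
        rows.append((label, cells))
    return rows


if __name__ == "__main__":
    for label, cells in chart_rows():
        print(label, *cells, sep="\t")
```

Note that block membership is not the same as script membership: three code points in the block carry the Common script property, so a range check like this answers "is it in the block", not "is it a katakana letter".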

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Katakana block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	Document
1.0.0	U+30A1..30F6, 30FB..30FE	90			(to be determined)
1.1	U+30F7..30FA	4			(to be determined)
3.2	U+30A0, 30FF	2	L2/99-238		Consolidated document containing 6 Japanese proposals, 1999-07-15
				N2092	Addition of forty eight characters, 1999-09-13
			L2/00-024		Shibano, Kohji (2000-01-31), JCS proposal revised
			L2/00-098, L2/00-098-page5	N2195	Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15
			L2/00-234	N2203 (rtf, txt)	Umamaheswaran, V. S. (2000-07-21), "8.20", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21–24
			L2/00-298	N2258	Sato, T. K. (2000-09-04), JIS X 0213 symbols part-2
			L2/00-342	N2278	Sato, T. K.; Everson, Michael; Whistler, Ken; Freytag, Asmus (2000-09-20), Ad hoc Report on Japan feedback N2257 and N2258
			L2/01-050	N2253	Umamaheswaran, V. S. (2001-01-21), "7.16 JIS X0213 Symbols", Minutes of the SC2/WG2 meeting in Athens, September 2000
			L2/01-114	N2328	Summary of Voting on SC 2 N 3503, ISO/IEC 10646-1: 2000/PDAM 1, 2001-03-09
a. ^ Proposed code points and character names may differ from final code points and names
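As a consistency check, the per-version ranges in the table can be tallied against the counts in the version history (90, 94, 96). This is a minimal sketch: the `ADDED` mapping is transcribed by hand from the "Final code points" column, and the helper name `count` is illustrative.

```python
# Inclusive (first, last) ranges per Unicode version, transcribed from the table.
ADDED = {
    "1.0.0": [(0x30A1, 0x30F6), (0x30FB, 0x30FE)],
    "1.1": [(0x30F7, 0x30FA)],
    "3.2": [(0x30A0, 0x30A0), (0x30FF, 0x30FF)],
}


def count(ranges):
    """Number of code points covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


# Every code point covered by any version's additions.
covered = {
    cp
    for ranges in ADDED.values()
    for first, last in ranges
    for cp in range(first, last + 1)
}

running = 0
for version, ranges in ADDED.items():
    added = count(ranges)
    running += added
    print(f"{version}\t{running} (+{added})")
```

The running totals match the version history, and the three sets of ranges together cover every code point in U+30A0..U+30FF with no overlap.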

Related Research Articles

Katakana is a Japanese syllabary, one component of the Japanese writing system along with hiragana, kanji and in some cases the Latin script.

Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters.

Mojikyō, also known by its full name Konjaku Mojikyō, is a character encoding scheme created to provide a complete index of characters used in the Chinese, Japanese, Korean, Vietnamese Chữ Nôm and other historical Chinese logographic writing systems. The Mojikyō Institute, which published the character set, also published computer software and TrueType fonts to accompany it. The Mojikyō Institute, chaired by Tadahisa Ishikawa (石川忠久), originally had its character set and related software and data redistributed on CD-ROMs sold in Kinokuniya stores.

In CJK computing, graphic characters are traditionally classed into fullwidth and halfwidth characters. Unlike monospaced fonts, a halfwidth character occupies half the width of a fullwidth character, hence the name.

In the Unicode standard, a plane is a contiguous group of 65,536 (2¹⁶) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10₁₆ of the first two positions in six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 16.0, five of the planes have assigned code points (characters), and seven are named.
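Because each plane holds 2¹⁶ code points, the plane number is simply the code point shifted right by 16 bits, which is the same as reading the first two digits of the six-position form. The sketch below illustrates this; `plane_of` and `six_digit` are hypothetical helper names chosen for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point in plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) containing code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16  # each plane spans 2**16 code points


def six_digit(code_point: int) -> str:
    """Format a code point in six-position hexadecimal form (U+hhhhhh)."""
    return f"U+{code_point:06X}"


if __name__ == "__main__":
    for cp in (0x30A0, 0x30FF, 0x1B000, MAX_CODE_POINT):
        print(six_digit(cp), "plane", plane_of(cp))
```

Every code point in the Katakana block (U+30A0..U+30FF) gives plane 0, consistent with the "Plane: BMP" entry in the infobox.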

Hiragana is a Unicode block containing hiragana characters for the Japanese language.

Katakana Phonetic Extensions is a Unicode block containing additional small katakana characters for writing the Ainu language, in addition to characters in the Katakana block.

CJK Unified Ideographs Extension C is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese submitted to the Ideographic Research Group between 2002 and 2006, plus five "urgently needed" characters added in Unicode versions 14.0 and 15.0, some of which had previously been mistakenly unified with other characters.

CJK Unified Ideographs Extension D is a Unicode block containing uncommon CJK ideographs for Chinese, Japanese, Korean, and Vietnamese, some of which are in current use. Much smaller than most Unicode blocks for CJK unified ideographs, Extension D consists of characters which were submitted to the Ideographic Research Group as "urgently needed characters" between 2006 and 2009. Characters submitted during the same period which were needed less urgently were included in CJK Unified Ideographs Extension E instead.

CJK Compatibility Ideographs is a Unicode block created to contain mostly Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings. However, it also contains 12 unified ideographs sourced from Japanese character sets from IBM.

Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.

CJK Compatibility is a Unicode block containing square symbols encoded for compatibility with East Asian character sets. In Unicode 1.0, it was divided into two blocks, named CJK Squared Words (U+3300–U+337F) and CJK Squared Abbreviations (U+3380–U+33FF). The square forms can have different presentations when they are used in horizontal or vertical text. For example, the characters U+333E ㌾ SQUARE BORUTO and U+3327 ㌧ SQUARE TON should look different in horizontal text and in vertical right-to-left text: ㌧㌾

Kana Supplement is a Unicode block containing one archaic katakana character and 255 hentaigana characters. Additional hentaigana characters are encoded in the Kana Extended-A block.

Kanbun is a Unicode block containing annotation characters used in Japanese copies (kanbun) of Classical Chinese texts, to indicate reading order.

Enclosed Ideographic Supplement is a Unicode block containing forms of characters and words from Chinese, Japanese and Korean enclosed within or stylised as squares, brackets, or circles. It contains three such characters containing one or more kana, and many containing CJK ideographs. Many of its characters were added for compatibility with the Japanese ARIB STD-B24 standard. Six symbols from Chinese folk religion were added in Unicode version 10.

Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode. It is the second-to-last block of the Basic Multilingual Plane, followed only by the short Specials block at U+FFF0–FFFF. Its block name in Unicode 1.0 was Halfwidth and Fullwidth Variants.

Kana Extended-A is a Unicode block containing hentaigana and historic kana characters. Additional hentaigana characters are encoded in the Kana Supplement block.

Small Kana Extension is a Unicode block containing additional small variants for the Hiragana and Katakana syllabaries, in addition to those in the Hiragana, Katakana and Katakana Phonetic Extensions blocks.

Kana Extended-B is a Unicode block containing Taiwanese kana.

References

[1] "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
[2] "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
