Katakana Phonetic Extensions

Range: U+31F0..U+31FF (16 code points)
Plane: BMP
Scripts: Katakana
Major alphabets: Ainu
Assigned: 16 code points
Unused: 0 reserved code points
Unicode version history: 3.2 (2002): 16 (+16)
Note: [1] [2]

Katakana Phonetic Extensions is a Unicode block containing small katakana characters for writing the Ainu language, supplementing the characters in the Katakana block.

Further small katakana are present in the Small Kana Extension block.
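Because the block occupies the single contiguous range U+31F0..U+31FF, testing whether a character belongs to it reduces to a code point comparison. The following is a minimal sketch in Python; the function names are illustrative and do not come from any standard API, and the character names printed depend on the Unicode data shipped with the Python build in use.

```python
import unicodedata

# Block boundaries as stated for this block: U+31F0..U+31FF (16 code points).
BLOCK_START = 0x31F0
BLOCK_END = 0x31FF


def in_katakana_phonetic_extensions(ch: str) -> bool:
    """Return True if ch is a single character in U+31F0..U+31FF."""
    return len(ch) == 1 and BLOCK_START <= ord(ch) <= BLOCK_END


def block_characters() -> list[str]:
    """Return every character of the block in code point order."""
    return [chr(cp) for cp in range(BLOCK_START, BLOCK_END + 1)]


if __name__ == "__main__":
    for ch in block_characters():
        # The default value avoids a ValueError if a name is unavailable.
        name = unicodedata.name(ch, "<unnamed>")
        print(f"U+{ord(ch):04X} {ch} {name}")
```

A check such as `in_katakana_phonetic_extensions("\u31f0")` would be one way to route Ainu-specific small kana to a font that covers them, since not every Japanese font includes this block.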

Katakana Phonetic Extensions [1]
Official Unicode Consortium code chart (PDF)
        0 1 2 3 4 5 6 7 8 9 A B C D E F
U+31Fx  ㇰ ㇱ ㇲ ㇳ ㇴ ㇵ ㇶ ㇷ ㇸ ㇹ ㇺ ㇻ ㇼ ㇽ ㇾ ㇿ
Notes
1. As of Unicode version 13.0

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Katakana Phonetic Extensions block:

Version: 3.2
Final code points [a]: U+31F0..31FF
Count: 16

L2 ID       WG2 ID  Document
L2/99-238           Consolidated document containing 6 Japanese proposals, 1999-07-15
            N2092   Addition of forty eight characters, 1999-09-13
L2/99-365           Moore, Lisa (1999-11-23), Comments on JCS Proposals
L2/00-024           Shibano, Kohji (2000-01-31), JCS proposal revised
L2/99-260R          Moore, Lisa (2000-02-07), "JCS Proposals", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999
L2/00-297   N2257   Sato, T. K. (2000-09-04), JIS X 0213 symbols part-1
L2/00-342   N2278   Sato, T. K.; Everson, Michael; Whistler, Ken; Freytag, Asmus (2000-09-20), Ad hoc Report on Japan feedback N2257 and N2258
L2/01-050   N2253   Umamaheswaran, V. S. (2001-01-21), "7.16 JIS X0213 Symbols", Minutes of the SC2/WG2 meeting in Athens, September 2000
L2/01-114   N2328   Summary of Voting on SC 2 N 3503, ISO/IEC 10646-1: 2000/PDAM 1, 2001-03-09

  a. Proposed code points and character names may differ from final code points and names.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.