Wave dash

The wave dash (〜, Unicode U+301C) is a fullwidth character encoded in Japanese character sets, most often used to indicate a range.

Vertical wave dash

The wave dash is also used in vertical text layout. In both Unicode and JIS C 6226, the vertical form is obtained by rotating and flipping the horizontal glyph. [1] [2]

Horizontal and vertical writing in East Asian scripts: writing conventions

Many East Asian scripts can be written horizontally or vertically. Chinese, Japanese and Korean scripts can be oriented in either direction because they consist mainly of disconnected logographic or syllabic units, each occupying a square block of space. Text can therefore run horizontally from left to right, horizontally from right to left, vertically from top to bottom, or even vertically from bottom to top.

Code reference

Wave dash in character set standards

| Standard | Release | Code point | Ku-Ten / Ku-Men-Ten | Glyph | Note |
|---|---|---|---|---|---|
| Unicode 1.0 | 1991 | U+301C WAVE DASH | | Wave Dash2.svg | The glyph differed from the original in JIS C 6226 and JIS X 0208. |
| Unicode 8.0 | 2015 | U+301C WAVE DASH | | Wave Dash.svg | The glyph was corrected; see "Errata Fixed in Unicode 8.0.0", The Unicode Consortium, 6 Oct 2014. |
| JIS C 6226 | 1978 | | 1-33 | Wave Dash.svg | The wave in the original glyph was less pronounced than shown here. [3] |
| JIS X 0208 | 1990 | | 1-33 | Wave Dash.svg | |
| JIS X 0213 | 2000 | | 1-1-33 | Wave Dash.svg | |

Wave dash in each encoding [4]

| Encoding | Code | Note |
|---|---|---|
| ISO-2022-JP | 0x2141 | |
| Shift JIS | 0x8160 | |
| EUC-JP | 0xA1C1 | = 0x2141 + 0x8080 |
| UTF-8 | 0xE3809C | |
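
The byte values listed for each encoding can be derived from the Ku-Ten position 1-33 with a little arithmetic. The following Python sketch illustrates this under the usual JIS X 0208 conventions; the function names are hypothetical and chosen only for this example, and the Shift JIS step uses the commonly documented byte transformation rather than anything stated in the tables.

```python
def kuten_to_jis(ku: int, ten: int) -> int:
    """Map a JIS X 0208 row/cell (ku-ten) pair to its 7-bit JIS code.

    Each of the two bytes is the row or cell number plus 0x20.
    """
    return ((ku + 0x20) << 8) | (ten + 0x20)


def jis_to_euc(jis: int) -> int:
    """EUC-JP sets the high bit of both bytes, which equals adding 0x8080."""
    return jis + 0x8080


def jis_to_sjis(jis: int) -> int:
    """Apply the usual JIS X 0208 to Shift JIS byte transformation."""
    j1, j2 = jis >> 8, jis & 0xFF
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 <= 0x5E else 0xB0)
    if j1 % 2:  # odd row: second byte lands in 0x40-0x9E, skipping 0x7F
        s2 = j2 + (0x1F if j2 <= 0x5F else 0x20)
    else:       # even row: second byte lands in 0x9F-0xFC
        s2 = j2 + 0x7E
    return (s1 << 8) | s2


jis = kuten_to_jis(1, 33)
print(f"ISO-2022-JP 0x{jis:04X}")               # 0x2141
print(f"Shift JIS   0x{jis_to_sjis(jis):04X}")  # 0x8160
print(f"EUC-JP      0x{jis_to_euc(jis):04X}")   # 0xA1C1
print("UTF-8       0x" + "\u301c".encode("utf-8").hex().upper())  # 0xE3809C
```

In ISO-2022-JP the two bytes 0x21 0x41 only carry this meaning after an escape sequence has switched the stream to JIS X 0208, which the sketch does not model.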

Related Research Articles

Unicode: character encoding standard

Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium, and as of March 2019 the most recent version, Unicode 12.0, contains a repertoire of 137,993 characters covering 150 modern and historic scripts, as well as multiple symbol sets and emoji. The character repertoire of the Unicode Standard is synchronized with ISO/IEC 10646, and both are code-for-code identical.

The tilde is a grapheme with several uses. The name of the character came into English from Spanish and from Portuguese, which in turn came from the Latin titulus, meaning "title" or "superscription".

Yen sign: currency sign used by the Chinese yuan (CNY) and the Japanese yen (JPY); a capital Y with one or two horizontal strokes

The yen or yuan sign (¥) is a currency sign used by the Japanese yen and the Chinese yuan currencies. This monetary symbol resembles a Latin letter Y with a single or double horizontal stroke. The symbol is usually placed before the value it represents, for example ¥50, unlike the kanji/Chinese character, which is more commonly used in Japanese and Chinese and is written following the amount: 50円 in Japan and 50元 in China.

Japanese language and computers

Using the Japanese language on computers raises a number of adaptation issues, some unique to Japanese and others common to languages with a very large number of characters. English needs only a small number of characters, so a single byte (2^8 = 256 possible values) is enough to encode one English character. Japanese has far more than 256 characters and therefore cannot be encoded in a single byte; it is instead encoded using two or more bytes, in a so-called "double byte" or "multi-byte" encoding. The problems that arise relate to transliteration and romanization, character encoding, and input of Japanese text.
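
The single-byte limit can be illustrated with a short Python sketch. It uses Latin-1 as a stand-in for a one-byte-per-character encoding and the kanji 日 (U+65E5) as a sample character; both choices, and the helper names, are assumptions made for this example only.

```python
# One byte holds 8 bits, so it can distinguish 2**8 = 256 values.
VALUES_PER_BYTE = 2 ** 8


def byte_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


def fits_single_byte(text: str) -> bool:
    """Check whether `text` can be stored in Latin-1 (one byte per character)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(VALUES_PER_BYTE)             # 256
print(fits_single_byte("A"))       # True
print(fits_single_byte("\u65e5"))  # False: the kanji has no one-byte code

# Compare the size of an ASCII letter and a kanji in three Japanese-capable encodings.
for encoding in ("shift_jis", "euc_jp", "utf-8"):
    print(encoding, byte_length("A", encoding), byte_length("\u65e5", encoding))
```

The ASCII letter takes one byte in every case, while the kanji takes two bytes in Shift JIS and EUC-JP and three in UTF-8.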

ISO/IEC 2022, Information technology: Character code structure and extension techniques, is an ISO standard specifying

Shift JIS is a character encoding for the Japanese language, originally developed by a Japanese company called ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1. 0.4% of all web pages used Shift JIS in September 2018, a decline from 1.3% in July 2014.

Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese.

A variable-width encoding is a type of character encoding scheme in which codes of differing lengths are used to encode a character set for representation in a computer. Most common variable-width encodings are multibyte encodings, which use varying numbers of bytes (octets) to encode different characters. (Some authors, notably in Microsoft documentation, use the term multibyte character set, which is a misnomer, because representation size is an attribute of the encoding, not of the character set.)
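
As a rough illustration of a variable-width encoding, the Python sketch below splits a UTF-8 byte string into per-character sequences by looking only at each lead byte. UTF-8 is chosen here simply because it appears in the encoding table above; the function names are hypothetical, and the sketch assumes well-formed input without validating continuation bytes.

```python
from typing import List


def utf8_sequence_length(lead: int) -> int:
    """Return the number of bytes in a UTF-8 sequence, given its lead byte."""
    if lead < 0x80:
        return 1  # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError(f"0x{lead:02X} is not a valid UTF-8 lead byte")


def split_utf8(data: bytes) -> List[bytes]:
    """Split well-formed UTF-8 data into one byte string per character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


# Latin letter, e with acute accent, wave dash, and an emoji.
sample = "A\u00e9\u301c\U0001F600".encode("utf-8")
print([len(chunk) for chunk in split_utf8(sample)])  # [1, 2, 3, 4]
```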

TRON Code is a multi-byte character encoding used in the TRON project. It is similar to Unicode but does not use Unicode's Han unification process: each character from each CJK character set is encoded separately, including archaic and historical equivalents of modern characters. This means that Chinese, Japanese, and Korean text can be mixed without any ambiguity as to the exact form of the characters; however, it also means that many characters with equivalent semantics will be encoded more than once, complicating some operations.

Chōonpu

The chōonpu, also known as chōonkigō (長音記号), onbiki (音引き), bōbiki (棒引き), or Katakana-Hiragana Prolonged Sound Mark by the Unicode Consortium, is a Japanese symbol that indicates a chōon, or a long vowel of two morae in length. Its form is a horizontal or vertical line in the center of the text with the width of one kanji or kana character. It is written horizontally in horizontal text and vertically in vertical text. The chōonpu is usually used to indicate a long vowel sound in katakana writing, rarely in hiragana writing, and never in romanized Japanese. The chōonpu is a distinct mark from the dash, and in most Japanese typefaces it can easily be distinguished. In horizontal writing it is similar in appearance to, but should not be confused with, the kanji character 一 ("one").

JIS X 0201: Japanese single-byte character encoding

JIS X 0201, a Japanese Industrial Standard developed in 1969, was the first Japanese electronic character set to become widely used. It is either 7-bit encoding or 8-bit encoding, although 8-bit encoding is dominant for modern use. The full name of this standard is 7-bit and 8-bit coded character sets for information interchange (7ビット及び8ビットの情報交換用符号化文字集合).

Meiryo is a Japanese sans-serif gothic typeface. Microsoft bundled Meiryo with Office Mac 2008 as part of the standard install, and it replaced MS Gothic as the default system font for Windows Vista on Japanese systems.

Half-width kana are katakana characters displayed at half their normal width, instead of the usual square (1:1) aspect ratio. For example, the usual (full-width) form of the katakana ka is カ while the half-width form is ｶ. Half-width hiragana is not available in Unicode, although it can be displayed on the web or in e-books via the CSS declaration font-feature-settings: "hwid" 1 with Adobe-Japan1-6 based OpenType fonts. Half-width kanji is not available on modern computers, although it is used in some receipt printers, electric bulletin boards, and older computers.
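
In Unicode, half-width katakana have their own code points, and compatibility normalization maps them to the ordinary full-width letters. The short Python sketch below shows this with the standard unicodedata module; the use of NFKC here is an illustration chosen for this example, not something the paragraph above prescribes.

```python
import unicodedata

half_ka = "\uff76"  # HALFWIDTH KATAKANA LETTER KA
full_ka = "\u30ab"  # KATAKANA LETTER KA

print(unicodedata.name(half_ka))
print(unicodedata.name(full_ka))

# NFKC folds the half-width letter into its full-width counterpart.
print(unicodedata.normalize("NFKC", half_ka) == full_ka)  # True

# A half-width letter followed by the half-width voiced sound mark
# normalizes to a single full-width voiced letter (ga).
half_ga = "\uff76\uff9e"
print(unicodedata.normalize("NFKC", half_ga) == "\u30ac")  # True
```

Because this folding is lossy, text that has passed through NFKC no longer records which width the author typed.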

JIS X 0213

JIS X 0213 is a Japanese Industrial Standard defining coded character sets for encoding the characters used in Japan. This standard extends JIS X 0208. The first version was published in 2000 and revised in 2004 (JIS2004) and 2012. As well as adding a number of special characters, characters with diacritic marks, etc., it included an additional 3,625 kanji. The full name of the standard is 7-bit and 8-bit double byte coded extended KANJI sets for information interchange.

Quotation marks, also known as quotes, quote marks, quotemarks, speech marks, inverted commas, or talking marks, are punctuation marks used in pairs in various writing systems to set off direct speech, a quotation, or a phrase. The pair consists of an opening quotation mark and a closing quotation mark, which may or may not be the same character.

Halfwidth and fullwidth forms: alternative width characters in East Asian typography

In CJK computing, graphic characters are traditionally classed into fullwidth and halfwidth characters. With fixed-width fonts, a halfwidth character occupies half the width of a fullwidth character, hence the name.
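
The width class of a character can be looked up in the Unicode East Asian Width property. The sketch below is a minimal illustration with Python's unicodedata module; the display_width helper is hypothetical and deliberately simplified (it treats Fullwidth and Wide characters as two columns and everything else, including combining marks, as one).

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate fixed-width columns: 2 for Fullwidth/Wide characters, 1 otherwise."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in text
    )


samples = {
    "A": "Latin capital A",
    "\uff21": "fullwidth Latin capital A",
    "\u301c": "wave dash",
    "\uff76": "halfwidth katakana ka",
    "\u30ab": "katakana ka",
}
for ch, label in samples.items():
    print(label, unicodedata.east_asian_width(ch))

print(display_width("A\u301cB"))  # 4: two narrow letters plus one wide wave dash
```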

JIS X 0212 is a Japanese Industrial Standard defining a coded character set for encoding supplementary characters for use in Japanese. This standard is intended to supplement JIS X 0208. It is numbered 953 or 5049 as an IBM code page.

JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language. The official title of the current standard is 7-bit and 8-bit double byte coded KANJI sets for information interchange. It was originally established as JIS C 6226 in 1978, and has been revised in 1983, 1990, and 1997. It is also called Code page 952 by IBM. The 1978 version is also called Code page 955 by IBM.

KS X 1001, formerly called KS C 5601, is a South Korean coded character set standard to represent hangul and hanja characters on a computer.

Microsoft Windows code page 932, also called Windows-31J amongst other names, is the Microsoft Windows code page for the Japanese language, which is an extended variant of the Shift JIS Japanese character encoding. It contains standard 7-bit ASCII codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding.
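
The one-byte and two-byte cases can be observed with Python's built-in cp932 codec. This is only a small sketch; the describe helper and the three sample characters are choices made for this example.

```python
def describe(ch: str):
    """Return (byte count, whether the first byte has its high bit set) in cp932."""
    data = ch.encode("cp932")
    return len(data), bool(data[0] & 0x80)


print(describe("A"))       # (1, False): plain ASCII
print(describe("\uff76"))  # (1, True): half-width katakana ka, a single byte
print(describe("\u3042"))  # (2, True): hiragana a needs a second byte
```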

References

  1. Ken Lunde (1999). CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing. O'Reilly Media, Inc. pp. 345–346, 348. ISBN 978-1-56592-224-2.
  2. Unicode Vertical Text Layout, Table 4, "Glyph Changes for Vertical Orientation". www.unicode.org.
  3. Katsuhiro Ogata. History of wave dash in several standards (in Japanese). Internet Watch. http://internet.watch.impress.co.jp/docs/special/20150307_691658.html
  4. JIS X 0208 (1990) to Unicode. www.unicode.org, 1994.