Japanese input method

A Surface Type Cover glows in the dark at night, showing the standard JIS keyboard layout used by most keyboards sold in Japan.
VJE Japanese input method for DOS

Japanese input methods are used to input Japanese characters on a computer.


There are two main methods of inputting Japanese on computers. One is via a romanized version of Japanese called rōmaji (literally "Roman character"), and the other is via keyboard keys corresponding to the Japanese kana. Some systems may also work via a graphical user interface, or GUI, where the characters are chosen by clicking on buttons or image maps.

Japanese keyboards

Microsoft's gaming keyboard for the Japanese market
Apple MacBook Pro Japanese Keyboard
70s Kanji keyboard (a subsystem common to the IBM 3278 Model 52 Display and the IBM 5924-T01 Kanji Keypunch) used before kana-to-kanji conversion was invented

Japanese keyboards (as shown on the second image) have both hiragana and Roman letters indicated. The JIS, or Japanese Industrial Standard, keyboard layout keeps the Roman letters in the English QWERTY layout, with numbers above them. Many of the non-alphanumeric symbols are the same as on English-language keyboards, but some symbols are located in other places. The hiragana symbols are also ordered in a consistent way across different keyboards. For example, the Q, W, E, R, T, Y keys correspond to た, て, い, す, か, ん (ta, te, i, su, ka, and n) respectively when the computer is used for direct hiragana input.
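The direct kana mapping described above can be sketched as a simple lookup table. The snippet below is only an illustration: it covers just the six keys named in the text, and the names `JIS_KANA_KEYS` and `direct_kana` are hypothetical.

```python
# Partial key-to-kana table: only the six keys mentioned in the text.
# A real JIS layout assigns kana to nearly every key on the keyboard.
JIS_KANA_KEYS = {
    "q": "た", "w": "て", "e": "い",
    "r": "す", "t": "か", "y": "ん",
}


def direct_kana(keys: str) -> str:
    """Translate key presses to kana; keys not in the table pass through."""
    return "".join(JIS_KANA_KEYS.get(key.lower(), key) for key in keys)


print(direct_kana("tq"))  # かた
```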

Input keys

Since Japanese input requires switching between Roman and hiragana entry modes, and also conversion between hiragana and kanji (as discussed below), there are usually several special keys on the keyboard. This varies from computer to computer, and some OS vendors have striven to provide a consistent user interface regardless of the type of keyboard being used. On non-Japanese keyboards, option- or control- key sequences can perform all of the tasks mentioned below.

On most Japanese keyboards, one key switches between Roman characters and Japanese characters. Sometimes, each mode (Roman and Japanese) may even have its own key, in order to prevent ambiguity when the user is typing quickly.

There may also be a key to instruct the computer to convert the latest hiragana characters into kanji, although usually the space key serves the same purpose since Japanese writing doesn't use spaces.

Some keyboards have a mode key to switch between different forms of writing. This of course would only be the case on keyboards that contain more than one set of Japanese symbols. Hiragana, katakana, halfwidth katakana, halfwidth Roman letters, and fullwidth Roman letters are some of the options. A typical Japanese character is square while Roman characters are typically variable in width. Since all Japanese characters occupy the space of a square box, it is sometimes desirable to input Roman characters in the same square form in order to preserve the grid layout of the text. These Roman characters that have been fitted to a square character cell are called fullwidth, while the normal ones are called halfwidth. In some fonts these are fitted to half-squares, like some monospaced fonts, while in others they are not. Often, fonts are available in two variants, one with the halfwidth characters monospaced, and another one with proportional halfwidth characters. The name of the typeface with proportional halfwidth characters is often prefixed with "P" for "proportional".
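As a rough illustration of the fullwidth and halfwidth distinction for Roman characters, the sketch below converts between the two forms using Unicode code points. It relies on the fact that the fullwidth forms of printable ASCII sit at a fixed offset (0xFEE0) from their ASCII counterparts; the function names are hypothetical, and this is not how any particular IME implements its mode key.

```python
import unicodedata

FULLWIDTH_OFFSET = 0xFEE0      # U+FF01..U+FF5E mirror ASCII U+0021..U+007E
IDEOGRAPHIC_SPACE = "\u3000"   # fullwidth counterpart of the ASCII space


def to_fullwidth(text: str) -> str:
    """Map printable ASCII characters to their fullwidth forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append(IDEOGRAPHIC_SPACE)
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def to_halfwidth_roman(text: str) -> str:
    """Fold fullwidth Roman characters back to ASCII via NFKC.

    Note: NFKC also turns halfwidth katakana into ordinary katakana,
    so this is only suitable when that side effect is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


print(to_fullwidth("ABC 123"))
print(to_halfwidth_roman(to_fullwidth("ABC 123")))  # ABC 123
```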

Finally, a keyboard may have a special key to tell the OS that the last kana entered should not be converted to kanji. Sometimes this is just the Return/Enter key.

Thumb-shift keyboards

A thumb-shift keyboard is an alternative design, popular among professional Japanese typists. Like a standard Japanese keyboard, it has hiragana characters marked in addition to Latin letters, but the layout is completely different. Most letter keys have two kana characters associated with them, which allows all the characters to fit in three rows, as in Western layouts. In place of the space bar on a conventional keyboard, there are two additional modifier keys operated with the thumbs: one is used to enter the alternate character marked on the key, and the other is used for voiced sounds. The semi-voiced sounds are entered either with the conventional shift key, operated by the little finger, or, for characters that have no voiced variant, with the thumb key otherwise used for voiced sounds.

The kana-to-kanji conversion is done in the same way as when using any other type of keyboard. There are dedicated conversion keys on some designs, while on others the thumb shift keys double as such.

Rōmaji input

As an alternative to direct input of kana, a number of Japanese input method editors allow Japanese text to be entered using rōmaji, which can then be converted to kana or kanji. This method does not require the use of a Japanese keyboard with kana markings.
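The rōmaji-to-kana step can be sketched as a greedy longest-match lookup. The example below is a minimal illustration under stated assumptions: the table is deliberately tiny, the handling of doubled consonants (small っ) and of a lone n (ん) follows common wāpuro-style conventions, and the names `ROMAJI_TO_HIRAGANA` and `romaji_to_hiragana` are hypothetical rather than taken from any real IME.

```python
# Deliberately small table; a real IME covers every kana and many variants.
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "si": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "chi": "ち", "ti": "ち", "tsu": "つ", "tu": "つ",
    "te": "て", "to": "と",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
}


def romaji_to_hiragana(text: str) -> str:
    """Convert lowercase romaji to hiragana by greedy longest match."""
    out = []
    i = 0
    while i < len(text):
        # A doubled consonant (other than n) becomes the small tsu (sokuon).
        if (i + 1 < len(text) and text[i] == text[i + 1]
                and text[i] not in "aiueon"):
            out.append("っ")
            i += 1
            continue
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            # An n that does not start a syllable becomes ん;
            # anything else is passed through unchanged.
            out.append("ん" if text[i] == "n" else text[i])
            i += 1
    return "".join(out)


print(romaji_to_hiragana("nekko"))   # ねっこ
print(romaji_to_hiragana("kantan"))  # かんたん
```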

Mobile phones

Keitai input

Japanese mobile phone keypad (Model Samsung 708SC)

The primary system used to input Japanese on earlier generations of mobile phones is based on the numerical keypad. Each number is associated with a particular sequence of kana, such as ka, ki, ku, ke, ko for '2', and the button is pressed repeatedly to get the correct kana – each key corresponds to a column in the gojūon (5 row × 10 column grid of kana), while the number of presses determines the row. [2] Dakuten and handakuten marks, punctuation, and other symbols can be added by other buttons in the same way. Kana to kanji conversion is done via the arrow and other keys.
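The repeated-press scheme can be sketched as indexing into a per-key kana sequence. This is a simplified illustration: the key assignments beyond '2' are an assumption based on the usual gojūon order, real handsets also cycle through small kana and symbols, and the names `KEYPAD` and `multitap` are hypothetical.

```python
# Assumed key assignments following gojūon order; only '2' is given in the text.
KEYPAD = {
    "1": "あいうえお",
    "2": "かきくけこ",
    "3": "さしすせそ",
    "4": "たちつてと",
    "5": "なにぬねの",
    "6": "はひふへほ",
    "7": "まみむめも",
    "8": "やゆよ",
    "9": "らりるれろ",
    "0": "わをん",
}


def multitap(presses: list[tuple[str, int]]) -> str:
    """Turn (key, number_of_presses) pairs into kana.

    The key picks the kana sequence, and the press count picks the
    position within it, wrapping around when the sequence is exhausted.
    """
    out = []
    for key, count in presses:
        sequence = KEYPAD[key]
        out.append(sequence[(count - 1) % len(sequence)])
    return "".join(out)


# '5' pressed four times, then '2' pressed five times
print(multitap([("5", 4), ("2", 5)]))  # ねこ
```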

Flick input

Flick input is a Japanese input method used on smartphones. The key layout is the same as the Keitai input, but rather than pressing a key repeatedly, the user can swipe from the key in a certain direction to produce the desired character. [2] Japanese smartphone IMEs such as Google Japanese Input, POBox and S-Shoin all support flick input.
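Flick input can be sketched as replacing the press count with a swipe direction. The direction-to-vowel assignment below (tap for a, left for i, up for u, right for e, down for o) is an assumption reflecting a commonly seen arrangement and is not stated in the text; only three keys are included, and the names are hypothetical.

```python
# Three example keys; each string lists the kana in a, i, u, e, o order.
FLICK_ROWS = {
    "1": "あいうえお",
    "2": "かきくけこ",
    "3": "さしすせそ",
}

# Assumed gesture-to-vowel mapping (not specified in the article).
DIRECTION_TO_INDEX = {"tap": 0, "left": 1, "up": 2, "right": 3, "down": 4}


def flick(key: str, direction: str = "tap") -> str:
    """Return the kana produced by one tap or swipe on a key."""
    return FLICK_ROWS[key][DIRECTION_TO_INDEX[direction]]


def flick_text(gestures: list[tuple[str, str]]) -> str:
    """Concatenate the kana for a sequence of (key, direction) gestures."""
    return "".join(flick(key, direction) for key, direction in gestures)


print(flick_text([("3", "up"), ("3", "left")]))  # すし
```

Compared with repeated presses, each kana takes exactly one gesture, which is the main appeal of the method.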

Flick input

Godan layout

In addition to the industry-standard QWERTY and 12-key layouts, Google Japanese Input offers a 15-key Godan keyboard layout, an alphabet layout optimized for romaji input. The letters fit in a grid of five rows by three columns. The left column holds the five vowels, in the same order as the columns in the gojūon table (a, i, u, e, o), while the central and right columns hold the letters for the nine main voiceless consonants of kana, in the same order as the rows in the gojūon table (k, s, t, n, [special]; h, m, y, r, w). Other characters are typed by flick gestures:

  • The other twelve Latin consonants not needed for composing kana (b, c, d, f, g, j, l, p, q, v, x, z) are entered by swiping the voiceless consonant keys up, right, or left (k for q or g; s for j or z; t for c or d; h for f, b, or p; m for l; y for x; w for v).
  • The main voiced kana are composed as in romaji, by typing (without swiping) the voiceless consonant in one of the last two columns, then swiping the vowel in the first column.
  • The other voiced kana (with handakuon or small forms) are composed by typing the voiceless consonant, then swiping the vowel, then swiping the [special] key (in the middle of the last row) to select the handakuon (swipe left or right) or the small kana form (swipe up).
  • Small kana can be written by swiping to l or x and then typing the desired letter. For example, the input fa, and the input hu or fu followed by la or xa, both produce ふぁ/ファ (fa), as in ファミコン (Famikon).
  • Decimal digits are entered by swiping down on the keys in the first three rows (digits 1 to 9) or on the middle key of the fourth row (digit 0).
  • The four main punctuation marks are entered by swiping r at the end of the fourth row (down for the comma, left for the full stop, up for the question mark, right for the exclamation mark).
  • Other signs or input controls may be entered by typing or swiping the unused positions of other keys. The tactile version of the layout, however, adds keys in two additional columns for typing space, Enter, and Backspace, moving the input cursor left or right, converting the previous character between hiragana and katakana, and selecting other input modes.
  • Typing c produces か, く, こ when followed by a, u, and o respectively, and し, せ when followed by i and e respectively.
  • To write a sokuon before ち, the accepted inputs are ltu, ltsu, xtu, or xtsu, followed by ti or chi. The input tchi does not work.
  • The [special] key carries ゛, ゜ and 小 (dakuten, handakuten, small).

Unlike the 12-key input, repeating a key in Godan is not interpreted as a gesture to cycle through kana with different vowels; instead, it is interpreted as a repeated romaji letter, behaving the same as in the QWERTY layout mode. [3]
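The Godan grid described above can be written down as data to check that the 14 letters on the keys plus the 12 swiped consonants cover the whole Latin alphabet. This is a sketch only: the swipe directions for individual letters are not modelled, the row-major assignment of digits 1 to 9 is an assumption (the text says only that they sit on the first three rows), and all names are hypothetical.

```python
import string

# Five rows by three columns: vowels, then the two consonant columns.
GODAN_GRID = [
    ["a", "k", "h"],
    ["i", "s", "m"],
    ["u", "t", "y"],
    ["e", "n", "r"],
    ["o", "[special]", "w"],
]

# Extra Latin consonants reached by swiping a consonant key (from the text).
SWIPE_CONSONANTS = {
    "k": ["q", "g"], "s": ["j", "z"], "t": ["c", "d"],
    "h": ["f", "b", "p"], "m": ["l"], "y": ["x"], "w": ["v"],
}


def reachable_letters() -> set[str]:
    """Letters available by tapping a key or swiping a consonant key."""
    tapped = {key for row in GODAN_GRID for key in row if len(key) == 1}
    swiped = {c for extras in SWIPE_CONSONANTS.values() for c in extras}
    return tapped | swiped


def digit_keys() -> dict[str, str]:
    """Map each key to the digit produced by swiping down on it.

    Assumes digits 1 to 9 run left to right, top to bottom over the
    first three rows; 0 is on the middle key of the fourth row.
    """
    first_three_rows = [key for row in GODAN_GRID[:3] for key in row]
    digits = {key: str(n) for n, key in enumerate(first_three_rows, start=1)}
    digits[GODAN_GRID[3][1]] = "0"
    return digits


print(reachable_letters() == set(string.ascii_lowercase))  # True
```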

Japanese keyboard layout and input comparison chart
Layout | Desktop | Keitai | Smart phone | Cycling input | Flick input | Romaji input
12 key | No      | Yes    | Yes         | Yes           | Yes         | No
QWERTY | Yes     | No     | Yes         | No            | No          | Yes
Godan  | No      | No     | Yes         | No            | Yes         | Yes

Other

Other consumer devices in Japan which allow for text entry via on-screen programming, such as digital video recorders and video game consoles, allow the user to toggle between the numerical keypad and a full keyboard (QWERTY, or ABC order) input system.

Kana to kanji conversion

After the kana have been input, they are either left as they are, or converted into kanji (Chinese characters). The Japanese language has many homophones, and conversion of a kana spelling (representing the pronunciation) into a kanji (representing the standard written form of the word) is often a one-to-many process. The kana to kanji converter offers a list of candidate kanji writings for the input kana, and the user may use the space bar or arrow keys to scroll through the list of candidates until they reach the correct writing. On reaching the correct written form, pressing the Enter key, or sometimes the "henkan" key, ends the conversion process. This selection can also be controlled through the GUI with a mouse or other pointing device.
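The one-to-many nature of the conversion can be illustrated with a toy candidate list. In the sketch below, each further press of the space bar moves to the next candidate, and the unconverted kana is offered last; the two dictionary entries are small examples of homophones, and the names `CANDIDATES` and `convert` are hypothetical.

```python
# Toy dictionary: one kana reading maps to several kanji writings.
CANDIDATES = {
    "かみ": ["紙", "神", "髪"],   # paper, god, hair
    "はし": ["橋", "箸", "端"],   # bridge, chopsticks, edge
}


def convert(reading: str, extra_space_presses: int = 0) -> str:
    """Return the candidate shown after starting conversion and then
    pressing the space bar `extra_space_presses` more times.

    The kana reading itself is kept as the final candidate, and the
    list wraps around once the end is reached.
    """
    candidates = CANDIDATES.get(reading, []) + [reading]
    return candidates[extra_space_presses % len(candidates)]


print(convert("かみ"))     # 紙
print(convert("かみ", 2))  # 髪
```

Pressing Enter would then commit whichever candidate is currently shown.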

If hiragana is required, pressing the Enter key immediately after the characters are entered ends the conversion process and leaves the hiragana as typed. If katakana is required, it is usually presented as an option along with the kanji choices. Alternatively, on some keyboards, pressing the muhenkan (無変換, "no conversion") button switches between katakana and hiragana.

Operation of a typical IME

Sophisticated kana to kanji converters (known collectively as input method editors, or IMEs) allow conversion of multiple kana words into kanji at once, freeing the user from having to do a conversion at each stage. The user can convert at any stage of input by pressing the space bar or henkan button, and the converter attempts to guess the correct division of words. Some IME programs display a brief definition of each word in order to help the user choose the correct kanji.

Sometimes the kana to kanji converter may guess the correct kanji for all the words, but if it does not, the cursor (arrow) keys may be used to move backwards and forwards between candidate words, or digit keys can be used to select one of them directly (without pressing cursor keys multiple times and pressing Enter to confirm the choice). If the selected word boundaries are incorrect, the word boundaries can be moved using the control key (or shift key, e.g. on iBus-Anthy) plus the arrow keys.
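To make the idea of guessing word boundaries concrete, the sketch below splits an unspaced kana string by greedy longest match against a toy lexicon. This is a stand-in technique chosen for brevity: production converters typically weigh many possible segmentations rather than committing greedily, and the lexicon and function names are hypothetical.

```python
# Toy lexicon mapping kana readings to a preferred written form.
LEXICON = {
    "きょう": "今日",  # today
    "は": "は",        # topic particle
    "あめ": "雨",      # rain
}
MAX_WORD_LENGTH = max(len(word) for word in LEXICON)


def segment(kana: str) -> list[str]:
    """Split kana into words, always taking the longest lexicon match."""
    words = []
    i = 0
    while i < len(kana):
        longest = min(MAX_WORD_LENGTH, len(kana) - i)
        for length in range(longest, 0, -1):
            piece = kana[i:i + length]
            if piece in LEXICON:
                break
        else:
            piece = kana[i]  # unknown character: keep it as its own word
        words.append(piece)
        i += len(piece)
    return words


def convert_sentence(kana: str) -> str:
    """Segment the input and replace each known word with its written form."""
    return "".join(LEXICON.get(word, word) for word in segment(kana))


print(segment("きょうはあめ"))           # ['きょう', 'は', 'あめ']
print(convert_sentence("きょうはあめ"))  # 今日は雨
```

When a guess like this is wrong, the boundary-moving keys described above let the user correct the split by hand.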

Learning systems

Modern systems learn the user's preferences for conversion and put the most recently selected candidates at the top of the conversion list, and also remember which words the user is likely to use when considering word boundaries.
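One simple way to model this learning behaviour is a move-to-front list: whenever a candidate is chosen, it moves to the top for that reading. The class below is a minimal sketch under that assumption and does not describe how any specific IME stores its history; `LearningConverter` is a hypothetical name.

```python
class LearningConverter:
    """Candidate lists that promote the most recently selected writing."""

    def __init__(self, dictionary: dict[str, list[str]]):
        # Copy the lists so the caller's dictionary is not modified.
        self.dictionary = {k: list(v) for k, v in dictionary.items()}

    def candidates(self, reading: str) -> list[str]:
        """Return the current candidate order for a reading."""
        return list(self.dictionary.get(reading, []))

    def select(self, reading: str, choice: str) -> None:
        """Record a selection by moving it to the front of its list."""
        entries = self.dictionary[reading]
        entries.remove(choice)
        entries.insert(0, choice)


converter = LearningConverter({"かみ": ["紙", "神", "髪"]})
converter.select("かみ", "髪")
print(converter.candidates("かみ"))  # ['髪', '紙', '神']
```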

Predictive systems

The systems used on mobile phones go even further, and try to guess entire phrases or sentences. After a few kana have been entered, the phone automatically offers entire phrases or sentences as possible completion candidates, jumping beyond what has been input. This is usually based on words sent in previous messages.
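Phrase prediction from message history can be sketched as prefix matching over previously sent phrases, ranked by how often each was used. This is an illustrative simplification: phrases are stored here as plain kana strings, real systems work with richer data, and the name `PhrasePredictor` is hypothetical.

```python
from collections import Counter


class PhrasePredictor:
    """Suggest completions from phrases seen in earlier messages."""

    def __init__(self) -> None:
        self.history: Counter[str] = Counter()

    def record(self, phrase: str) -> None:
        """Remember that a phrase was sent once more."""
        self.history[phrase] += 1

    def suggest(self, prefix: str, limit: int = 3) -> list[str]:
        """Return up to `limit` phrases starting with `prefix`,
        most frequently used first (ties broken alphabetically)."""
        matches = [
            (count, phrase)
            for phrase, count in self.history.items()
            if phrase.startswith(prefix)
        ]
        matches.sort(key=lambda item: (-item[0], item[1]))
        return [phrase for _, phrase in matches[:limit]]


predictor = PhrasePredictor()
for sent in ["おはようございます", "おはようございます", "おはよう", "おやすみなさい"]:
    predictor.record(sent)
print(predictor.suggest("おは"))  # ['おはようございます', 'おはよう']
```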

References

  1. Hensch, Kurt (2004). IBM History of Far Eastern Languages in Computing: National Language Support Since 1961; [Looking to East Asia]. Kurt Hensch. pp. 66–67. ISBN 978-3-937267-03-6.
  2. Which Japanese input method on iPhone is more popular, Kana or Romaji?, retrieved 2015-01-31
  3. "Godan キーボードとはなんですか?" [What is Godan Keyboard?]. Google Japan (in Japanese).