UniHan IME

Last updated

UniHan IME is an input method based on the framework of IIIMF developed by Hong Kong Sun Wah Hi-Tech Ltd.. UniHan IME is an input method interface that maps the keyboard keys string to the Han character in the latest version of Unicode Table.

UniHan is the CJKV characters section which occupied more than half the storage space of the Unicode Table. There are more than 75,000 characters coded in version 6.0.0 in year 2010. The Chinese, Japanese, Korean and Vietnamese shared the Han characters for naming for more than thousand years. The input methods for the Han characters from Unicode are mainly keyboard typing, mouse pointing on screen or hand writing on pad. The popular methods are the pinyin keyboard method and the hand writing method. A complete font set for Unihan version 6.0.0 is yet to come and so is the Unihan IME.

A similar IME called 8 Steps Unihan was developed by 8 Steps Unihan company in Melbourne, Australia. [1] The 8StepsA font coupled with Microsoft windows 10 SimSunExtB font are able to display all the characters in Unihan 10.0. which includes the extension F character set. The glyphs that are repeated have been linked together and only one of the linked code is used by the IME so that all the displayable characters are unique.

Related Research Articles

<span class="mw-page-title-main">Unicode</span> Character encoding standard

Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium designed to support the use of text written in all of the world's major writing systems. Version 15.1 of the standard defines 149813 characters and 161 scripts used in various ordinary, literary, academic, and technical contexts.

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are a feature shared in common by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese.

The yen and yuan sign (¥) is a currency sign used for the Japanese yen and the Chinese yuan currencies when writing in Latin scripts. This character resembles a capital letter Y with a single or double horizontal stroke. The symbol is usually placed before the value it represents, for example: ¥50, or JP¥50 and CN¥50 when disambiguation is needed. When writing in Japanese and Chinese, the Japanese kanji and Chinese character is written following the amount, for example 50円 in Japan, and 50元 or 50圆 in China.

In computing, Chinese character encodings can be used to represent text written in the CJK languages—Chinese, Japanese, Korean—and (rarely) obsolete Vietnamese, all of which use Chinese characters. Several general-purpose character encodings accommodate Chinese characters, and some of them were developed specifically for Chinese.

<span class="mw-page-title-main">Cangjie input method</span> Chinese keyboard entry system

The Cangjie input method is a system for entering Chinese characters into a computer using a standard computer keyboard. In filenames and elsewhere, the name Cangjie is sometimes abbreviated as cj.

<span class="mw-page-title-main">Japanese language and computers</span>

In relation to the Japanese language and computers many adaptation issues arise, some unique to Japanese and others common to languages which have a very large number of characters. The number of characters needed in order to write in English is quite small, and thus it is possible to use only one byte (28=256 possible values) to encode each English character. However, the number of characters in Japanese is many more than 256 and thus cannot be encoded using a single byte - Japanese is thus encoded using two or more bytes, in a so-called "double byte" or "multi-byte" encoding. Problems that arise relate to transliteration and romanization, character encoding, and input of Japanese text.

The International Alphabet of Sanskrit Transliteration (IAST) is a transliteration scheme that allows the lossless romanisation of Indic scripts as employed by Sanskrit and related Indic languages. It is based on a scheme that emerged during the 19th century from suggestions by Charles Trevelyan, William Jones, Monier Monier-Williams and other scholars, and formalised by the Transliteration Committee of the Geneva Oriental Congress, in September 1894. IAST makes it possible for the reader to read the Indic text unambiguously, exactly as if it were in the original Indic script. It is this faithfulness to the original scripts that accounts for its continuing popularity amongst scholars.

The Hong Kong Supplementary Character Set is a set of Chinese characters – 4,702 in total in the initial release—used in Cantonese, as well as when writing the names of some places in Hong Kong.

<span class="mw-page-title-main">Korean language and computers</span> Input and use of Korean on computers

The writing system of the Korean language is a syllabic alphabet of character parts organized into character blocks representing syllables. The character parts cannot be written from left to right on the computer, as in many Western languages. Every possible syllable in Korean would have to be rendered as syllable blocks by a font, or each character part would have to be encoded separately. Unicode has both options; the character parts ㅎ (h) and ㅏ (a), and the combined syllable 하 (ha), are encoded.

<span class="mw-page-title-main">Chinese Character Code for Information Interchange</span> Character encoding standard

The Chinese Character Code for Information Interchange or CCCII is a character set developed by the Chinese Character Analysis Group in Taiwan. It was first published in 1980, and significantly expanded in 1982 and 1987.

A Unicode font is a computer font that maps glyphs to code points defined in the Unicode Standard. The vast majority of modern computer fonts use Unicode mappings, even those fonts which only include glyphs for a single writing system, or even only support the basic Latin alphabet. Fonts which support a wide range of Unicode scripts and Unicode symbols are sometimes referred to as "pan-Unicode fonts", although as the maximum number of glyphs that can be defined in a TrueType font is restricted to 65,535, it is not possible for a single font to provide individual glyphs for all defined Unicode characters. This article lists some widely used Unicode fonts that support a comparatively large number and broad range of Unicode characters.

Unihan is an effort by Unicode/ISO 10646 to map Han characters into a single set, ignoring regional variations.

The Chinese, Japanese and Korean (CJK) scripts share a common background, collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 15.1, Unicode defines a total of 97,680 characters.

Bopomofo, or Mandarin Phonetic Symbols, also named Zhuyin, is a Chinese transliteration and writing system for Mandarin Chinese and other related languages and dialects. More commonly used in Taiwanese Mandarin, it may also be used to transcribe other varieties of Chinese, particularly other varieties of Mandarin Chinese dialects, as well as Taiwanese Hokkien. Consisting of 37 characters and five tone marks, it transcribes all possible sounds in Mandarin.

Sinhala language software for computers have been present since the late 1980s but no standard character representation system was put in place which resulted in proprietary character representation systems and fonts. In the wake of this CINTEC introduced Sinhala within the UNICODE standard. ICTA concluded the work started by CINTEC for approving and standardizing Sinhala Unicode in Sri Lanka.

<span class="mw-page-title-main">Unicode input</span> Input characters using their Unicode code points

Unicode input is the insertion of a specific Unicode character on a computer by a user; it is a common way to input characters not directly supported by a physical keyboard. Unicode characters can be produced either by selecting them from a display or by typing a certain sequence of keys on a physical keyboard. In addition, a character produced by one of these methods in one web page or document can be copied into another. In contrast to ASCII's 96 element character set, Unicode encodes hundreds of thousands of graphemes (characters) from almost all of the world's written languages and many other signs and symbols besides.

Menksoft is an IT company in Inner Mongolia, who developed Menksoft Mongolian IME, the most widely used Mongolian language input method editor (IME) in Inner Mongolia.

<span class="mw-page-title-main">Intelligent Input Bus</span> Framework for multilingual input

The Intelligent Input Bus is an input method (IM) framework for multilingual input in Unix-like operating-systems. The name "Bus" comes from its bus-like architecture.

Bengali input methods refer to different systems developed to type the characters of the Bengali script for Bengali language and others, using a typewriter or a computer keyboard.

The Vietnamese language is written with a Latin script with diacritics which requires several accommodations when typing on phone or computers. Software-based systems are a form of writing Vietnamese on phones or computers with software that can be installed on the device or from third-party software such as UniKey. Telex is the oldest input method devised to encode the Vietnamese language with its tones. Other input methods may also include VNI and VIQR. VNI input method is not to be confused with VNI code page.

References

  1. "Home". 8suh.com.