Unihan (disambiguation)


Unihan is an effort by Unicode/ISO 10646 to map Han characters into a single set, ignoring regional variations.

Unihan may also refer to:

Related Research Articles

Unicode Character encoding standard

Unicode is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines 143,859 characters covering 154 modern and historic scripts, as well as symbols, emoji, and non-visual control and formatting codes.

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are a feature shared in common by written Chinese (hanzi), Japanese (kanji), and Korean (hanja).

In digital typography, the TrueType font Arial Unicode MS is an extended version of the font Arial. Compared to Arial, it has a greater line height, omits kerning pairs and adds enough glyphs to cover a large subset of Unicode 2.1, thus supporting most Microsoft code pages but also requiring much more storage space. It also adds Ideographic layout tables, but unlike Arial, it mandates no smoothing in the 14–18 point range, and contains Roman (upright) glyphs only; there is no oblique (italic) version. Arial Unicode MS was previously distributed with Microsoft Office, but this ended with the 2016 version. It is bundled with Mac OS X v10.5 and later. It may also be purchased separately from Ascender Corporation, which licenses the font from Microsoft.

Michael Everson American-Irish linguist, typesetter and font designer, and publisher

Michael Everson is an American and Irish linguist, script encoder, typesetter, font designer and publisher. He runs a publishing company called Evertype, through which he has published over a hundred books since 2006.

Open-source Unicode typefaces

A few projects exist to provide free and open-source Unicode typefaces, i.e. Unicode typefaces which are open-source and designed to contain glyphs of all Unicode characters, or at least a broad selection of Unicode scripts. There are also numerous projects aimed at providing only a certain script, such as the Arabeyes Arabic font. The advantage of targeting only some scripts with a font is that certain Unicode characters should be rendered differently depending on which language they are used in, and that a font that only includes the characters a certain user needs will be much smaller in file size compared to one with many glyphs. Unicode fonts in modern formats such as OpenType can in theory cover multiple languages by including multiple glyphs per character, though very few actually cover more than one language's forms of the unified Han characters.

Chinese Character Code for Information Interchange Character encoding standard

The Chinese Character Code for Information Interchange or CCCII is a character set developed by the Chinese Character Analysis Group in Taiwan. It was first published in 1980, and significantly expanded in 1982 and 1987.

Mojikyō Character encoding scheme

Mojikyō, also known by its full name Konjaku Mojikyō, is a character encoding scheme. The Mojikyō Institute, which publishes the character set, also published computer software and TrueType fonts to go along with it. The Mojikyō Institute, chaired by Tadahisa Ishikawa (石川忠久), originally had its character set and related software and data redistributed on CD-ROM by Kinokuniya. Conceptualized in 1996, the first version of the CD-ROM was released in July 1997. For a time, it even offered a web subscription, "Mojikyō WEB" (文字鏡WEB) which had more up-to-date characters.

Kangxi radical Unicode character block

The 214 Kangxi radicals, also known as the Zihui radicals, form a system of radicals (部首) of Chinese characters. The radicals are numbered in stroke count order. They are the most popular system of radicals for dictionaries that order Traditional Chinese characters by radical and stroke count. They are officially part of the Unicode encoding system for CJKV characters, in their standard order, under the coding block "Kangxi radicals", while their graphic variants are contained in the "CJK Radicals Supplement". Thus, a reference to "radical 61", for example, without additional context, refers to the 61st radical of the Kangxi Dictionary, 心; xīn "heart".
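The relationship between a radical number and its position in the "Kangxi radicals" block can be sketched in a few lines of Python. This is a minimal illustration, assuming that the block starts at U+2F00 and lists the 214 radicals in their standard order; the helper name `kangxi_radical` is hypothetical and not part of any library.

```python
import unicodedata

# Assumption: the Kangxi Radicals block begins at U+2F00 (radical 1)
# and holds the 214 radicals consecutively in their standard order.
KANGXI_BLOCK_START = 0x2F00
KANGXI_COUNT = 214


def kangxi_radical(number: int) -> str:
    """Return the Kangxi Radicals block character for a radical number (1-214)."""
    if not 1 <= number <= KANGXI_COUNT:
        raise ValueError(f"radical number must be in 1..{KANGXI_COUNT}")
    return chr(KANGXI_BLOCK_START + number - 1)


radical = kangxi_radical(61)
print(f"U+{ord(radical):04X}", unicodedata.name(radical))

# Compatibility normalization (NFKC) folds the radical character
# into the ordinary unified ideograph used in running text.
unified = unicodedata.normalize("NFKC", radical)
print(f"U+{ord(unified):04X}", unified)
```

The radical character and the everyday ideograph are distinct code points, so text processing that compares or searches characters may want to normalize first.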

In Unicode, two glyphs are said to be Z-variants if they share the same etymology but have slightly different appearances and different Unicode code points. For example, the Unicode characters U+8AAA 說 and U+8AAC 説 are Z-variants. The notion of Z-variance is only applicable to the "CJKV scripts"—Chinese, Japanese, Korean and Vietnamese—and is a subtopic of Han unification.
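As a rough illustration of what "different Unicode code points" means in practice, the Python sketch below checks whether the two characters from the example ever become equal under the standard Unicode normalization forms. The helper name `equal_under_any_normalization` is made up for this example; it only uses the standard-library `unicodedata` module.

```python
import unicodedata

NORMALIZATION_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def equal_under_any_normalization(a: str, b: str) -> bool:
    """Return True if a and b compare equal after any standard normalization form."""
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in NORMALIZATION_FORMS
    )


first, second = "\u8aaa", "\u8aac"  # the two Z-variants from the example above
for ch in (first, second):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Both are separately encoded unified ideographs, so normalization
# does not merge them; matching them requires variant data instead.
print(equal_under_any_normalization(first, second))
```

Software that wants to treat such pairs as equivalent therefore cannot rely on normalization alone and needs an explicit variant mapping.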

The CEDICT project was started by Paul Denisowski in 1997 and is maintained by a team on mdbg.net under the name CC-CEDICT, with the aim to provide a complete Chinese to English dictionary with pronunciation in pinyin for the Chinese characters.

A Unicode font is a computer font that maps glyphs to code points defined in the Unicode Standard. The vast majority of modern computer fonts use Unicode mappings, even those fonts which only include glyphs for a single writing system, or even only support the basic Latin alphabet. Fonts which support a wide range of Unicode scripts and Unicode symbols are sometimes referred to as "pan-Unicode fonts", although as the maximum number of glyphs that can be defined in a TrueType font is restricted to 65,535, it is not possible for a single font to provide individual glyphs for all defined Unicode characters. This article lists some widely used Unicode fonts that support a comparatively large number and broad range of Unicode characters.

Unihan font was developed by Ross Paterson in 1993.

The Chinese, Japanese and Korean (CJK) scripts share a common background; their characters are collectively known as CJK characters. In the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 13.0, Unicode defines a total of 92,856 CJK Unified Ideographs.

Ken Lunde

Ken Roger Lunde is an American specialist in information processing for East Asian languages.

Radical 213

Radical 213 meaning "turtle" is one of only two of the 214 Kangxi radicals that are composed of 16 strokes.

UniHan IME is an input method based on the IIIMF framework, developed by Hong Kong Sun Wah Hi-Tech Ltd. It provides an input method interface that maps strings of keyboard keys to Han characters in the latest version of the Unicode table.

Universal Character Set characters Complete list of the characters available on most computers

The Unicode Consortium (UC) and the International Organisation for Standardisation (ISO) collaborate on the Universal Character Set (UCS). The UCS is an international standard to map characters used in natural language, mathematics, music, and other domains to machine-readable values. By creating this mapping, the UCS enables computer software vendors to interoperate and transmit UCS-encoded text strings from one to another. Because it is a universal map, it can be used to represent multiple languages at the same time. This avoids the confusion of using multiple legacy character encodings, which can result in the same sequence of codes having multiple meanings and thus be improperly decoded if the wrong one is chosen.
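The decoding problem described above can be reproduced with a short Python sketch. The specific legacy encodings used here (GBK and Latin-1) and the sample string are chosen purely for illustration and are not named in the text; `decode_as` is a hypothetical helper.

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes with the given encoding, replacing undecodable sequences."""
    return data.decode(encoding, errors="replace")


text = "你好"

# The same byte sequence means different things under different
# legacy encodings, so picking the wrong one yields garbled text.
legacy_bytes = text.encode("gbk")
print(decode_as(legacy_bytes, "gbk"))      # decoded with the right encoding
print(decode_as(legacy_bytes, "latin-1"))  # decoded with the wrong encoding

# With a single universal mapping, the encoded form round-trips
# without having to guess which regional character set was intended.
utf8_bytes = text.encode("utf-8")
print(decode_as(utf8_bytes, "utf-8"))
```

Nothing in the byte sequence itself records which legacy encoding produced it, which is why the wrong guess silently produces different characters rather than an error.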

The Vietnamese language is written in a Latin script with diacritics, which requires several accommodations when typing on phones or computers. Software-based systems are the most popular way of typing Vietnamese. Telex is the oldest input method devised to encode the Vietnamese language and is often set as the default on virtual keyboards on phones and touchscreen devices. Other input methods include VNI and VIQR, which are better suited to the physical keyboards of desktop computers and laptops.

KS X 1002 is a South Korean character set standard established to supplement KS X 1001. It consists of a total of 7,649 characters.