VNI Software Company is a developer of education, entertainment, office, and utility software packages. It is known for developing an encoding (VNI encoding) and a popular input method (VNI Input) for Vietnamese on computers. VNI is often available on computer systems for typing Vietnamese, alongside the TELEX input method. Most commonly, VNI is used on computers with physical keyboards, whilst TELEX is more common on phones and touchscreens.
VNI is a family-owned company based in Westminster, California. It was founded in 1987 by Hồ Thành Việt to develop software that eases Vietnamese language use on computers. Among its products were the VNI Encoding and the VNI Input Method. The VNI Input Method has since become one of the two most popular input methods for Vietnamese, alongside TELEX, which is better suited to phones and touchscreens, whilst VNI has found more use on computers with physical keyboards.
In the 1990s, Microsoft recognized the potential of VNI's products and incorporated the VNI Input Method into Windows 95 Vietnamese Edition and MSDN, which were in use worldwide.
Because Microsoft had used these technologies without authorization, VNI took Microsoft to court over the matter. Microsoft settled the case out of court, withdrew the input method from its entire product line, and developed its own input method. Although virtually unknown, that input method has appeared in every Windows release since Windows 98. [1]
Starting with Windows 10 version 1903, the VNI Input Method (as "Vietnamese Number Key-based") is natively supported, along with the Telex input method. [2]
Despite the growing popularity of Unicode in computing, the VNI Encoding (see below) is still in wide use by Vietnamese speakers both in Vietnam and abroad. All professional printing facilities in the Little Saigon neighborhood of Orange County, California continue to use the VNI Encoding when processing Vietnamese text. For this reason, print jobs submitted using the VNI Character Set are compatible with local printers.
VNI invented, popularized, and commercialized an input method and an encoding, the VNI Character Set, to assist computer users entering Vietnamese on their computers. The user can type using only ASCII characters found on standard computer keyboard layouts. Because the Vietnamese alphabet uses a complex system of diacritics for tones and for additional letters, a keyboard that covered all possible characters directly would need 133 alphanumeric keys and a Shift key. [3]
Originally, VNI's input method utilized function keys (F1, F2, ...) to enter the tone marks, which later turned out to be problematic, as the operating system used those keys for other purposes. VNI then turned to the numerical keys along the top of the keyboard (as opposed to the numpad) for entering tone marks. This arrangement survives today, but users also have the option of customizing the keys used for tone marks.
With VNI Tan Ky mode on, the user can type in diacritical marks anywhere within a word, and the marks will appear at their proper locations. For example, the word trường, which means 'school', can be typed in several ways.
The conventional method follows handwriting and spelling convention: the base (truong) is written first, and the tonal marks are then added one by one.
With the release of VNI Tan Ky 4 in the 1990s, VNI freed users from having to remember where to correctly insert tone marks within a word, because, as long as the user enters all the required characters and tone marks, the software will group them correctly. This feature is especially useful for newcomers to the language.
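The grouping behaviour described above can be illustrated with a minimal Python sketch. It is not VNI's actual algorithm. It assumes the standard VNI number-key assignments (1 to 5 for the tone marks, 6 for the circumflex, 7 for the horn, 8 for the breve, 9 for đ), which the text above does not list, and it uses a deliberately simplified rule for choosing the vowel that carries the tone. The function name `vni_compose` is hypothetical, and only lowercase input is handled.

```python
import unicodedata

# Standard VNI number keys (an assumption here; the text above does not list them).
TONE_KEYS = {"1": "\u0301", "2": "\u0300", "3": "\u0309",
             "4": "\u0303", "5": "\u0323"}
SHAPE_KEYS = {"6": ("\u0302", "aeo"),   # circumflex
              "7": ("\u031b", "ou"),    # horn
              "8": ("\u0306", "a")}     # breve
VOWELS = "aeiouy"


def vni_compose(keys: str) -> str:
    """Compose one lowercase word from VNI keystrokes typed in any order."""
    letters = [c for c in keys if not c.isdigit()]
    digits = [c for c in keys if c.isdigit()]
    marks = [""] * len(letters)          # combining marks collected per letter

    tone = ""
    for d in digits:
        if d in TONE_KEYS:
            tone = TONE_KEYS[d]          # the last tone key typed wins
        elif d in SHAPE_KEYS:
            mark, targets = SHAPE_KEYS[d]
            for i, c in enumerate(letters):
                if c in targets and not marks[i]:
                    marks[i] = mark      # simplification: mark every eligible letter
        elif d == "9" and "d" in letters:
            letters[letters.index("d")] = "\u0111"   # d -> đ

    if tone:
        vowel_idx = [i for i, c in enumerate(letters) if c in VOWELS]
        shaped = [i for i in vowel_idx if marks[i]]
        if shaped:                       # prefer a vowel that already has a shape mark
            target = shaped[-1]
        elif vowel_idx and (letters[-1] not in VOWELS or len(vowel_idx) == 1):
            target = vowel_idx[-1]       # closed syllable or single vowel
        elif vowel_idx:
            target = vowel_idx[-2]       # open syllable with several vowels
        else:
            target = None
        if target is not None:
            marks[target] += tone

    word = "".join(l + m for l, m in zip(letters, marks))
    return unicodedata.normalize("NFC", word)
```

Under these assumptions, `vni_compose("truong72")` and `vni_compose("tru7o72ng")` both yield trường, because the digits are separated from the base letters before any mark is placed. Real input methods apply fuller spelling rules (for example for "qu" and "gi"), which this sketch ignores.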
VNI Auto Accent is the company's most recent software release (2006), with the purpose of alleviating repetitive strain injury (RSI) caused by prolonged use of computer keyboards. Auto Accent helps reduce the number of keystrokes needed to type each word by automatically adding diacritical marks for the user. The user must still enter every base letter in the word.
The VNI Encoding uses up to two bytes to represent one Vietnamese vowel character, with the second byte supplying additional diacritical marks. This removes the need to replace control characters with Vietnamese characters, a problematic system found in TCVN1 (VSCII-1) and in VISCII, or to use two different fonts, one containing lowercase characters and the other uppercase characters, as is sometimes done for TCVN3 (VSCII-3). A similar approach is taken by Windows-1258 and VSCII-2.
This solution is more portable between different versions of Windows and between different platforms. However, using multiple bytes to represent one written character increases the file size. The increase can usually be offset by compressing the data into a file format such as ZIP.
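As a rough illustration of the two-byte scheme, the Python sketch below decodes a few VNI byte sequences into Unicode. The byte values are assumptions based on the commonly used Windows layout of the VNI character set and cover only a small subset of it; `vni_to_unicode` is a hypothetical helper name, not part of any VNI product.

```python
import unicodedata

# Assumed subset of the VNI (Windows) layout: bytes that act as combining marks.
VNI_MARKS = {
    0xE2: "\u0302",        # circumflex
    0xEA: "\u0306",        # breve
    0xF9: "\u0301",        # acute
    0xF8: "\u0300",        # grave
    0xFB: "\u0309",        # hook above
    0xF5: "\u0303",        # tilde
    0xEF: "\u0323",        # dot below
    0xE1: "\u0302\u0301",  # circumflex + acute
    0xE0: "\u0302\u0300",  # circumflex + grave
    0xE4: "\u0302\u0323",  # circumflex + dot below
}
# Assumed subset: bytes that stand for whole letters.
VNI_LETTERS = {0xF6: "\u01b0", 0xF4: "\u01a1", 0xF1: "\u0111"}  # ư, ơ, đ


def vni_to_unicode(data: bytes) -> str:
    """Decode VNI-encoded bytes (subset only) into NFC-normalized Unicode."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))           # 0x00-0x7F follow ASCII
        elif b in VNI_MARKS:
            out.append(VNI_MARKS[b])     # mark attaches to the preceding letter
        elif b in VNI_LETTERS:
            out.append(VNI_LETTERS[b])
        else:
            out.append("\ufffd")         # byte not covered by this sketch
    return unicodedata.normalize("NFC", "".join(out))
```

With these assumed values, the seven-byte input `b"tr\xf6\xf4\xf8ng"` decodes to the six-character word trường, which also shows where the extra file size comes from.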
The VNI encoding was used extensively in the south of Vietnam, and sometimes used overseas, while TCVN 5712 was dominant in the north. [4]
Points 0x00 through 0x7F follow ASCII.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | ||||||||||||||||
Bx | ||||||||||||||||
Cx | ◌̂̀ | ◌̂́ | ◌̂ 0302 | ◌̂̃ | ◌̣̂ | ◌̂̉ | Ỉ 1EC8 | | ◌̆̀ | ◌̆́ | ◌̆ 0306 | ◌̣̆ | Ì | Í | Ỵ 1EF4 | ◌̣ 0323 |
Dx | | Đ 0110 | Ị 1ECA | Ĩ 0128 | Ơ 01A0 | ◌̃ 0303 | Ư 01AF | | ◌̀ 0300 | ◌́ 0301 | ◌̆̉ | ◌̉ 0309 | ◌̆̃ | | | |
Ex | ◌̂̀ | ◌̂́ | ◌̂ 0302 | ◌̂̃ | ◌̣̂ | ◌̂̉ | ỉ 1EC9 | | ◌̆̀ | ◌̆́ | ◌̆ 0306 | ◌̣̆ | ì | í | ỵ 1EF5 | ◌̣ 0323 |
Fx | | đ 0111 | ị 1ECB | ĩ 0129 | ơ 01A1 | ◌̃ 0303 | ư 01B0 | | ◌̀ 0300 | ◌́ 0301 | ◌̆̉ | ◌̉ 0309 | ◌̆̃ | | | |
A version intended for use on Macintosh systems has a different arrangement (corresponding to the difference in arrangement between Windows-1252 and Mac OS Roman).
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
8x | ◌̣̂ | ◌̂̉ | ◌̆́ | Đ 0110 | Ư 01AF | ◌̆̃ | ◌̂́ | ◌̂̀ | ◌̂ 0302 | ◌̣̂ | ◌̂̃ | ◌̂̉ | ◌̆́ | ◌̆̀ | ||
9x | ◌̆ 0306 | ◌̣̆ | í 00ED | ì 00EC | ỵ 1EF5 | ◌̣ 0323 | đ 0111 | ĩ 0129 | ị 1ECB | ơ 01A1 | ư 01B0 | ◌̃ 0303 | ◌̆̉ | ◌́ 0301 | ◌̉ 0309 | ◌̆̃ |
Ax | Ỉ 1EC8 | ◌̀ 0300 | ||||||||||||||
Bx | ỉ 1EC9 | ◌̀ 0300 | ||||||||||||||
Cx | ◌̂̀ | ◌̂̃ | ◌̃ 0303 | |||||||||||||
Dx | ||||||||||||||||
Ex | ◌̂ 0302 | ◌̆ 0306 | ◌̂́ | ◌̣̆ | ◌̆̀ | Í 00CD | Ỵ 1EF4 | ◌̣ 0323 | Ì 00CC | Ĩ 0128 | Ơ 01A0 | |||||
Fx | Ị 1ECA | ◌̆̉ | ◌̉ 0309 | ◌́ 0301 |
The VNI encoding for use on DOS does not use separate characters for diacritics, instead replacing certain ASCII punctuation characters with tone-marked uppercase letters (compare ISO 646).
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | Ỵ 1EF4 | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | Á 00C1 | _ |
6x | À 00C0 | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | Ặ 1EB6 | Ả 1EA2 | Ã 00C3 | Ạ 1EA0 | DEL |
8x | Ấ 1EA4 | ẻ 1EBB | é 00E9 | â 00E2 | ẽ 1EBD | à 00E0 | ẹ 1EB9 | Ầ 1EA6 | ê 00EA | ế 1EBF | è 00E8 | ề 1EC1 | Ẩ 1EA8 | ì 00EC | ể 1EC3 | ễ 1EC5 |
9x | Ẫ 1EAA | ỏ 1ECF | õ 00F5 | ô 00F4 | ọ 1ECD | ò 00F2 | ố 1ED1 | ù 00F9 | ồ 1ED3 | ổ 1ED5 | ỗ 1ED7 | ộ 1ED9 | ủ 1EE7 | ũ 0169 | ụ 1EE5 | ư 01B0 |
Ax | á 00E1 | í 00ED | ó 00F3 | ú 00FA | ứ 1EE9 | ừ 1EEB | ử 1EED | ữ 1EEF | ự 1EF1 | ỉ 1EC9 | ĩ 0129 | ị 1ECB | ệ 1EC7 | đ 0111 | Đ 0110 | Ậ 1EAC |
Bx | Ắ 1EAE | Ằ 1EB0 | Ẳ 1EB2 | Ẵ 1EB4 | É 00C9 | È 00C8 | Ẻ 1EBA | Ẽ 1EBC | Ẹ 1EB8 | Ế 1EBE | Ề 1EC0 | Ể 1EC2 | Ễ 1EC4 | Ệ 1EC6 | Í 00CD | Ì 00CC |
Cx | Ỉ 1EC8 | Ĩ 0128 | Ị 1ECA | Ó 00D3 | Ò 00D2 | Ỏ 1ECE | Õ 00D5 | Ọ 1ECC | Ố 1ED0 | Ồ 1ED2 | Ổ 1ED4 | Ỗ 1ED6 | Ộ 1ED8 | Ớ 1EDA | Ờ 1EDC | Ở 1EDE |
Dx | Ỡ 1EE0 | Ợ 1EE2 | Ú 00DA | Ù 00D9 | Ủ 1EE6 | Ũ 0168 | Ụ 1EE4 | Ứ 1EE8 | Ừ 1EEA | Ử 1EEC | Ữ 1EEE | Ự 1EF0 | Ý 00DD | Ỳ 1EF2 | Ỷ 1EF6 | Ỹ 1EF8 |
Ex | ả 1EA3 | ã 00E3 | ạ 1EA1 | ấ 1EA5 | ầ 1EA7 | ẩ 1EA9 | ẫ 1EAB | ậ 1EAD | ă 0103 | ắ 1EAF | ằ 1EB1 | ẳ 1EB3 | ẵ 1EB5 | ặ 1EB7 | ý 00FD | ỳ 1EF3 |
Fx | ỷ 1EF7 | ỹ 1EF9 | ỵ 1EF5 | ơ 01A1 | ớ 1EDB | ờ 1EDD | ở 1EDF | ỡ 1EE1 | ợ 1EE3 | Ô 00D4 | Ơ 01A0 | Ư 01AF | Ă 0102 | Â 00C2 | Ê 00CA | NBSP |
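The ISO 646-style substitution in the ASCII range of the chart above can be sketched in Python as follows. The mapping covers only the seven reassigned positions in rows 4x to 7x; bytes at 0x80 and above are left out for brevity, and the function name `decode_dos_ascii_range` is hypothetical.

```python
# Bytes in the ASCII range that the DOS chart reassigns to uppercase letters.
DOS_OVERRIDES = {
    0x40: "\u1ef4",  # Ỵ replaces @
    0x5E: "\u00c1",  # Á replaces ^
    0x60: "\u00c0",  # À replaces `
    0x7B: "\u1eb6",  # Ặ replaces {
    0x7C: "\u1ea2",  # Ả replaces |
    0x7D: "\u00c3",  # Ã replaces }
    0x7E: "\u1ea0",  # Ạ replaces ~
}


def decode_dos_ascii_range(data: bytes) -> str:
    """Decode bytes below 0x80 per the DOS chart; other bytes become U+FFFD."""
    return "".join(
        DOS_OVERRIDES.get(b, chr(b)) if b < 0x80 else "\ufffd"
        for b in data
    )
```

A consequence of this design is that text containing those punctuation characters cannot be represented unchanged: the byte that plain ASCII reads as `^`, for instance, is displayed as Á.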
The use of Vietnamese Quoted-Readable (VIQR), a convention for writing in Vietnamese using ASCII characters, began during the Vietnam War, when typewriters were the main tool for word processing. Because the U.S. military required a way to represent Vietnamese scripts accurately on official documents, VIQR was invented for the military.[ citation needed ] Due to its longstanding use, VIQR was a natural choice for computer word processing, prior to the appearance of VNI, VPSKeys, VSCII, VISCII, and Unicode. It is still widely used[ when? ] for information exchange on computers, but is not desirable for design and layout, due to its cryptic appearance.
VIQR's main issue was the difficulty of reading VIQR text, especially for inexperienced computer users. VNI created and released a free font called VNI-Internet Mail, which utilized a variant of the VIQR notation and VNI's combining character technique to give VIQR text a more natural appearance by replacing certain ASCII punctuation with combining characters.
The following table compares VNI-Internet Mail to other codified VIQR or VIQR-like conventions.
Diacritical mark | RFC 1456 VIQR notation [7] | VSCII-MNEM notation [8] | VNI Internet Mail notation [6] | Example |
---|---|---|---|---|
Breve | ( | < | \| | A\| displayed as Ă |
Circumflex | ^ | > | ^ | E^ displayed as Ê |
Horn | + | * | * | U* displayed as Ư |
Acute | ' | ' | ' | O' displayed as Ó |
Grave | ` | ! | ` | O` displayed as Ò |
Hook above | ? | ? | { | O{ displayed as Ỏ |
Tilde | ~ | " | ~ | O~ displayed as Õ |
Dot below | . | . | } | O} displayed as Ọ |
Barred D | DD | DD | D_ | D_ displayed as Đ |
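The VNI Internet Mail column of the table can be turned into a small renderer that mimics what the font does visually: each punctuation mark after a vowel is replaced by a Unicode combining character. This is a minimal Python sketch, not VNI's implementation. The lowercase form `d_` for đ, the limit of two marks per vowel, and the function name `render_vni_mail` are assumptions made for illustration.

```python
import re
import unicodedata

# VNI Internet Mail punctuation -> Unicode combining marks (from the table above).
MARKS = {
    "|": "\u0306",  # breve
    "^": "\u0302",  # circumflex
    "*": "\u031b",  # horn
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "{": "\u0309",  # hook above
    "~": "\u0303",  # tilde
    "}": "\u0323",  # dot below
}
# A vowel followed by one or two mark characters.
PATTERN = re.compile(r"([AEIOUYaeiouy])([|^*'`{~}]{1,2})")


def render_vni_mail(text: str) -> str:
    """Render VNI Internet Mail notation as NFC-normalized Unicode text."""
    # D_ is from the table; the lowercase d_ is an assumption.
    text = text.replace("D_", "\u0110").replace("d_", "\u0111")

    def attach(match: re.Match) -> str:
        vowel, marks = match.group(1), match.group(2)
        return vowel + "".join(MARKS[m] for m in marks)

    return unicodedata.normalize("NFC", PATTERN.sub(attach, text))
```

For example, `render_vni_mail("Vie^}t")` gives Việt. Because the sketch treats every apostrophe after a vowel as an acute accent, it would mangle ordinary English text; a practical converter would need an escape convention.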