Chu Bong-Foo | |
---|---|
Chinese | 朱邦復 |
Born | 1937 |
Nationality | Chinese |
Known for | Inventing the Tsang-chieh (Cangjie) input method for computers |
Fields | Computer science |
Chu Bong-Foo (born 1937) is the inventor of Tsang-chieh (Cangjie), a widely used Chinese input method. His input method, created in 1976 and released into the public domain in 1982, sped up the computerization of Chinese society. Chu spent his childhood in Taiwan and has worked in Brazil, the United States, Taiwan, Shenzhen and Macau.
Chu was born in 1937 in Huanggang, Hubei, to father Chu Wan-in, also called Chu Huai-ping (Chinese: 朱懷冰). His family led a wandering life during the turbulent days of mainland China and finally settled in Taiwan, where he studied at a local high school. He was an imaginative teenager who spent so much time reading fiction that it hurt his studies. Later he also became interested in cinema. After graduating from Taiwan Provincial Agriculture Institute and completing his military service, he taught briefly at an elementary school in Hualien. During this period he witnessed the poverty of the countryside and developed a sense of mission for rural development and cultural improvement. Finding teaching not to his taste, he went to Brazil to develop his career, only to find life there more difficult. Over that period he took up several jobs. It was also during these turbulent times that Chu flirted with the hippie lifestyle and studied at a local conservatory. [1]
His work on Tsang-chieh did not begin until 1972, when he was working at "Cultural Abril", a publishing house in Brazil. From then on, he dedicated his life to modernizing Chinese information technology. He saw for himself how the Brazilians could translate and publish foreign literature in just one day, while the Chinese took at least a year. The technology of the time, coupled with the complexity of the Chinese script, required a painstaking process of picking type pieces from an enormous Chinese character set. Moreover, publishers frequently encountered characters that were not part of their standard character set. Consequently, printing in Chinese was significantly slower than printing in other languages. In 1973, upon his return to Taiwan, he assembled a team to research an efficient method for character lookup using the 26 letter keys on a standard keyboard.
Existing methods of looking up a Chinese character, such as by radical, zhuyin, or romanization, give only ambiguous results. On the other hand, while the Chinese script has no alphabet, most characters are compounds of a common set of components. Chu assumed that it was possible to encode Chinese characters with a group of "Chinese alphabets" that could be mapped onto a common keyboard. After studying dictionary cut-outs and conducting many tests, the team released a table of 8,000 encoded characters in 1976. The result was unsatisfactory for general use, but it did prove that Chinese could be encoded in this way.
Chu then enlisted more help, including that of Shen Hung-lian (沈紅蓮) from the Department of Chinese Literature, National Taiwan University. At the same time, Chu also learned about An Wang's encoding scheme. On one hand, Wang's scheme further confirmed the feasibility of the encoding approach. On the other hand, it inspired Chu to think that his encoding scheme should not only be convenient for looking up a character, but should also take the form of the characters into account to make it possible to compose (draw) the character from a code. Chu assumed this could be achieved with the following three steps:
To achieve these steps, the team employed a principle similar to the "pictophonetic compounds" principle of Chinese. In 1977 the team released the first generation of the method that would later be named "Tsang-chieh". The team selected a set of fewer than 2,000 components to compose about 12,000 common characters. Each component is represented by a permutation of 1 to 3 of 26 "Chinese alphabets" (also called "radicals"). Each "alphabet" maps to a particular letter key on a standard QWERTY keyboard. [2]
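The letter-to-radical mapping can be illustrated with a minimal sketch. The table below uses the key assignments commonly documented for present-day Cangjie (keys A to Y only; the Z key is left out), which is an assumption here, since the 1977 first generation may have differed. The names `CANGJIE_KEYS`, `radicals_for` and `SAMPLE_CODES` are hypothetical and exist only for illustration.

```python
# Letter key -> Cangjie radical, per the commonly documented modern layout
# (assumption: the first-generation 1977 table is not reproduced here).
CANGJIE_KEYS = {
    "A": "日", "B": "月", "C": "金", "D": "木", "E": "水",
    "F": "火", "G": "土", "H": "竹", "I": "戈", "J": "十",
    "K": "大", "L": "中", "M": "一", "N": "弓", "O": "人",
    "P": "心", "Q": "手", "R": "口", "S": "尸", "T": "廿",
    "U": "山", "V": "女", "W": "田", "X": "難", "Y": "卜",
}


def radicals_for(code: str) -> str:
    """Translate a Latin-letter code into its sequence of radicals."""
    return "".join(CANGJIE_KEYS[letter] for letter in code.upper())


# Hypothetical three-entry lookup table: typed code -> resulting character.
SAMPLE_CODES = {"AB": "明", "DD": "林", "VND": "好"}

for code, character in SAMPLE_CODES.items():
    print(code, radicals_for(code), character)
```

A real implementation would replace `SAMPLE_CODES` with a full code table and would also need the decomposition rules that decide which components are picked for each character; the sketch covers only the key mapping.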
In 1978, he implemented the method on computers, making it a Chinese input method. The ROC Defense Minister Chiang Wei-kuo gave the input method the name "Tsang-chieh". Chu put the Tsang-chieh method in the public domain in a bold effort to promote Chinese computing, essentially giving up his rights to any royalties. As a result, many later Chinese systems came bundled with a free copy of the Tsang-chieh input method, removing the greatest barrier to effective Chinese input systems. Since then, many adaptations of Chu's method have also appeared.
Over generations of upgrades, Chu's Tsang-chieh has included more and more characters. The fifth generation, released in 1985, included 60,000 characters.
During the development of the Tsang-chieh method, Chu found that his invention was not only an input method but also a character encoding method for computing systems. Unlike An Wang's encoding method of the time, or later methods such as Big5 and Unicode, the Tsang-chieh method does not sort characters by usage frequency, stroke count, or radical; it is based on their composition and inspired by the "pictophonetic compounds" principle of Chinese.
Chu therefore began to develop a theory, which he would later call "Chinese DNA", "Alphabets of Chinese Language", or the "Chinese character gene" theory. The theory states that the forms selected by Chu are the "genes" of Chinese, and that a proper arrangement of these "genes" can provide all functions of the characters. According to the theory, the Tsang-chieh method is therefore very useful as a character encoding, since it contains not only an ordered set of characters but also precise references to the shapes, pronunciations and semantics of the characters. On this view, the system is an efficient base for a variety of Chinese information technology: smart dictionaries; operating systems and application software; programming languages; hardware architectures for PCs and embedded systems; and even strong artificial intelligence. [3] [4]
In 1979, he invented a character generator program, which takes Tsang-chieh encoded data and dynamically generates Chinese characters for screen display. In the same year, Chu's team collaborated with the Acer company, and the program was incorporated into the firmware of a "Chinese computer". Later the generator was also used in the "Tsang-chieh controller board", which enabled an Apple II computer to display Chinese characters in its hi-res graphics mode. A particularly interesting "feature" of this early system was that it would also accept and generate characters not explicitly included in the codepage but implied by the rules of Tsang-chieh. [5]
Since then, Chu has held unique views on Chinese information technology. He considered input using ordinary keyboards more feasible and compatible than speech recognition, handwriting recognition, or specialized keyboards. However, many of his other opinions have been at odds with consensus:
In the early 1990s, when the Chinese version of Microsoft Windows 3.0 attempted to enter the Taiwanese market, Chu and some partners competed with it and advocated greater independence for Chinese information technology. Working in Shenzhen with a group of developers, Chu produced a software application for Chinese integration called "Juzhen" (Chinese: 聚珍) to stand up against this strong force. It was released into the public domain and distributed through the Rexun magazine. Against the financially strong Microsoft, the odds were not in Chu's favor. However, Chu's engine had the advantage of compactness: each font containing 13,095 characters took up at most one megabyte and fit on a floppy disk, compared with the 3–5 megabytes required by competitors' products. This advantage led a sizeable number of technology companies to open discussions with Chu about a transfer of technology rights. Soon after, Jinmei (金梅), Zangzhu (藏珠) and other budget font makers swamped the market, forcing prices down and ensuring that every user could afford original copies of Chinese typefaces.
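The size comparison above can be restated as storage per character. The following is a rough sketch under two assumptions not stated in the source: that "megabyte" means 2**20 bytes, and that the competing fonts covered the same 13,095 characters. The function name `bytes_per_character` is hypothetical.

```python
CHARACTERS_PER_FONT = 13_095   # figure given in the text
BYTES_PER_MEGABYTE = 2 ** 20   # assumption: binary megabyte


def bytes_per_character(font_megabytes: float) -> float:
    """Average storage per character for a font of the given size."""
    return font_megabytes * BYTES_PER_MEGABYTE / CHARACTERS_PER_FONT


# Upper bound for Chu's engine (at most 1 MB per font).
print(f"Chu's engine: at most {bytes_per_character(1):.0f} bytes per character")

# Range quoted for competitors (3 to 5 MB), assuming the same character count.
low, high = bytes_per_character(3), bytes_per_character(5)
print(f"Competitors: about {low:.0f} to {high:.0f} bytes per character")
```

Under these assumptions the budget works out to roughly 80 bytes per character for Chu's engine against roughly 240 to 400 bytes for the competing products.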
After the "Juzhen" system, Chu left Taiwan for Macau. In 1999, he was appointed vice chairman of Culturecom Corporation. [6]
From 1999, when Chu became a vice chairman of the Hong Kong- and Macau-based Culturecom Corporation, his team cooperated with Culturecom until 2006, when Culturecom terminated the partnership.
Several products and technologies were developed during the partnership, resulting in a series of e-book devices with names such as 文昌 and 蒼頡. The core of the device is the "Culturecom 1610", a RISC system-on-a-chip "Chinese CPU" that includes a character generator. The device also features a "Cholesterol" (cholesteric) LCD, which saves electricity. Like India's Simputer, the device has a simple architecture and a low cost. Chu's team designed it as an affordable electronic textbook for the poor rural population. They also wished it to be the platform for a rural wireless network project named "eTown". However, as of 2006, these ideals had not been realized. [7] In 2002, some details of the product were released under the LGPL by the two parties. [8] Although the device did not take off as expected, its technologies were used by some other companies in their products, such as Kolin's i-library.
During this period, Chu's team was also interested in virtual cinematography and released several feature-length animated films.
Chu also elaborated further on his "Chinese DNA" theory. Using this theory as a basis, Chu's team claimed to be developing: