The writing system of the Korean language is a syllabic alphabet of character parts (jamo) organized into character blocks (geulja) representing syllables. Unlike the letters of many Western languages, the character parts cannot simply be written one after another from left to right on a computer. Either every possible Korean syllable must be rendered as a syllable block by a font, or each character part must be encoded separately. Unicode has both options; the character parts ㅎ (h) and ㅏ (a), and the combined syllable 하 (ha), are encoded.
In RFC 1557, a method known as ISO-2022-KR for seven-bit encoding of Korean characters in email was described. Where eight bits are allowed, EUC-KR encoding is preferred. These two encodings combine US-ASCII (ISO 646) with the Korean standard KS X 1001:1992 [1] (previously named KS C 5601:1987). Another character set, KPS 9566 (similar to KS X 1001), is used in North Korea.
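The relationship between the two encodings can be sketched in a few lines of Rust. This is a minimal illustration, not a complete converter: it assumes the input is already valid EUC-KR, emits the RFC 1557 header once at the start, and the function name `euc_kr_to_iso2022_kr` is hypothetical. Each KS X 1001 byte loses its high bit, and the Korean runs are bracketed by the SO and SI control characters.

```rust
/// Escape sequence ESC $ ) C, which RFC 1557 uses to announce KS X 1001.
const HEADER: [u8; 4] = [0x1B, b'$', b')', b'C'];
const SO: u8 = 0x0E; // shift out: the following bytes are KS X 1001
const SI: u8 = 0x0F; // shift in: the following bytes are ASCII

/// Converts EUC-KR bytes to seven-bit ISO-2022-KR.
/// Assumption: `input` is well-formed EUC-KR (not checked here).
pub fn euc_kr_to_iso2022_kr(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() + HEADER.len() + 2);
    out.extend_from_slice(&HEADER);
    let mut in_korean = false;
    for &byte in input {
        // In EUC-KR, every byte of a KS X 1001 character has the high bit set.
        let is_korean = byte >= 0x80;
        if is_korean != in_korean {
            out.push(if is_korean { SO } else { SI });
            in_korean = is_korean;
        }
        // Clearing the high bit gives the seven-bit form of the same byte.
        out.push(byte & 0x7F);
    }
    if in_korean {
        out.push(SI); // return to ASCII before the end of the text
    }
    out
}
```

For example, 한글 is C7 D1 B1 DB in EUC-KR; the sketch turns it into the header followed by 0E 47 51 31 5B 0F.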
The international Unicode standard contains special characters for the Korean language in the Hangul phonetic system. Unicode supports two methods. The method used by Microsoft Windows assigns each of the 11,172 syllable combinations its own code point and a preformed font glyph. The other method encodes the letters (jamo) and lets the software combine them correctly. The Windows method requires more font memory but allows better shapes, which makes it preferable for documents, since it is complicated to create stylistically correct combinations.
Another possibility is stacking a sequence of medial(s) (jungseong) and a sequence of final(s) (jongseong) or a Middle Korean pitch mark (if needed) on top of the sequence of initial(s) (choseong) if the font has medial and final jamos with zero-width spacing inserted to the left of the cursor or caret, thus appearing in the right place below (or to the right of) the initial. If a syllable has a horizontal medial (ㅗ, ㅛ, ㅜ, ㅠ or ㅡ), the initial will probably appear further left in a complete syllable than in preformed syllables due to the space that must be reserved for a vertical medial, making aesthetically poor what may be the only way to display Middle Korean hangul text without resorting to images, romanization, replacement of obsolete jamo or non-standard encodings. However, most current fonts do not support this.
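To make the two Unicode representations concrete, the following sketch writes the syllable 한 both ways. The constant and function names are illustrative only. The code point values come from the Hangul Syllables block (U+AC00 to U+D7A3) and the Hangul Jamo block.

```rust
/// First and last code points of the Hangul Syllables block.
pub const SYLLABLE_FIRST: u32 = 0xAC00; // 가
pub const SYLLABLE_LAST: u32 = 0xD7A3; // 힣

/// 19 initials x 21 medials x 28 finals (the 28 includes "no final").
pub const SYLLABLE_COUNT: u32 = 19 * 21 * 28;

/// Returns true if `c` is one of the precomposed syllables.
pub fn is_precomposed_syllable(c: char) -> bool {
    (SYLLABLE_FIRST..=SYLLABLE_LAST).contains(&(c as u32))
}

/// The syllable as a single precomposed code point, U+D55C.
pub const PRECOMPOSED: &str = "\u{D55C}";

/// The same syllable as three conjoining jamo: U+1112, U+1161, U+11AB.
pub const JAMO_SEQUENCE: &str = "\u{1112}\u{1161}\u{11AB}";
```

The two strings are meant to display identically, but they are different sequences of code points (one character versus three, or 3 versus 9 bytes in UTF-8), so a plain string comparison treats them as unequal unless the text is normalized first. Rust's standard library does not perform that normalization itself.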
The Unicode standard also has attempted to create a unified CJK character set which can represent Chinese (Hanzi) and the Japanese (Kanji) and Korean (Hanja) derivatives of this script through Han unification, which does not discriminate by language or region in rendering Chinese characters if the typographic traditions have not resulted in major differences in what a character looks like. Han unification has been criticized.
While it is unclear which was the first Korean typewriter (한글 타자기), the first Moa-Sugi style typewriter (모아쓰기, the form of hangul in which consonants and vowels are combined into syllable blocks; the standard form of Hangul used today) is thought to have been invented by Korean-American gyopo Lee Won-Ik (이원익) in 1914, when he modified the type of a Smith Premier 10 typewriter into Hangul. [2] [3] Alongside Lee Won-Ik's machine, Horace Grant Underwood's 1913 US-patented Hangul type, the Underwood, and the Hangul type of another Korean-American, Kim Jun-Sung, are also brought up when discussing the first Moa-Sugi type. [4]
In 1929, the first Dubeolsik typewriter was made by Song Ki-Ju, a student studying abroad in the US, and gained attention from the Donga Ilbo; however, it no longer exists. In 1934 he showcased another type, which was a modification of the Underwood portable. [5] [6] Song's 1934 typewriter is stored in the Hangul museum as the oldest existing Korean typewriter. [7] The invention led to the development of other typewriters in 1945 by Kim Joon Sung and in 1950 by Kong Byung Woo. [8]
In 1949, eye doctor Kong Byung-Woo made the first practical Hangul type able to write both in Moa-Sugi and horizontally. [9]
On a Korean computer keyboard, text is typically entered by pressing a key for the appropriate jamo; the operating system creates each composite character on the fly. Depending on the input method editor and keyboard layout, double consonants can be entered by holding the Shift key. When all the jamo making up a syllabic block have been entered, the user may initiate a conversion to hanja (or other special characters) using a keyboard shortcut or interface button; South Korean keyboards have a key for this. Subsequent semi-automated hanja conversion is supported in varying degrees by word processors.
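How jamo keystrokes can be grouped into syllable blocks is sketched below. This is a deliberately simplified illustration, not the algorithm of any particular input method: it assumes the keystrokes arrive as compatibility jamo, it does not merge compound vowels or compound finals from separate keys, and the names `compose` and `idx` are hypothetical. The one rule it does show is that a consonant after a vowel becomes a final unless another vowel follows, in which case it starts the next syllable.

```rust
const INITIALS: [char; 19] = [
    'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅅ',
    'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
];
const MEDIALS: [char; 21] = [
    'ㅏ', 'ㅐ', 'ㅑ', 'ㅒ', 'ㅓ', 'ㅔ', 'ㅕ', 'ㅖ', 'ㅗ', 'ㅘ', 'ㅙ',
    'ㅚ', 'ㅛ', 'ㅜ', 'ㅝ', 'ㅞ', 'ㅟ', 'ㅠ', 'ㅡ', 'ㅢ', 'ㅣ',
];
// Finals without the "no final" slot, so the final value is position + 1.
const FINALS: [char; 27] = [
    'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄹ', 'ㄺ', 'ㄻ', 'ㄼ', 'ㄽ', 'ㄾ', 'ㄿ',
    'ㅀ', 'ㅁ', 'ㅂ', 'ㅄ', 'ㅅ', 'ㅆ', 'ㅇ', 'ㅈ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
];

/// Position of `c` in `table`, if present.
fn idx(table: &[char], c: char) -> Option<u32> {
    table.iter().position(|&x| x == c).map(|i| i as u32)
}

/// Groups a stream of compatibility jamo into precomposed syllables.
pub fn compose(keys: &str) -> String {
    let keys: Vec<char> = keys.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < keys.len() {
        if i + 1 < keys.len() {
            if let (Some(l), Some(v)) = (idx(&INITIALS, keys[i]), idx(&MEDIALS, keys[i + 1])) {
                let mut t = 0; // 0 means "no final"
                let mut used = 2;
                if i + 2 < keys.len() {
                    if let Some(f) = idx(&FINALS, keys[i + 2]) {
                        // A consonant followed by a vowel opens the next syllable instead.
                        let vowel_follows =
                            i + 3 < keys.len() && idx(&MEDIALS, keys[i + 3]).is_some();
                        if !vowel_follows {
                            t = f + 1;
                            used = 3;
                        }
                    }
                }
                if let Some(s) = char::from_u32(0xAC00 + l * 588 + v * 28 + t) {
                    out.push(s);
                }
                i += used;
                continue;
            }
        }
        out.push(keys[i]); // anything that does not start a syllable passes through
        i += 1;
    }
    out
}
```

With this sketch, `compose("ㅎㅏㄴㄱㅡㄹ")` gives 한글, while `compose("ㅎㅏㄴㅏ")` gives 하나 because the ㄴ is followed by a vowel. A real input method makes this decision incrementally, revising the block on screen as each key arrives.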
When using a keyboard with another language, most operating systems require the user to type with an original Korean keyboard layout; the most common is Dubeolsik. In other languages, such as Japanese, text can be entered on non-native keyboards with romanization.
Operating systems such as Linux allow the setting engine/hangul/hangul-keyboard='ro', resulting in a romaja keyboard; typing "seonggye" results in 성계. [10] In this configuration, ㄲ is obtained by "gg" rather than ⇧ Shift+G. This allows keying "jasanGun" to obtain 자산군, instead of keying "jasangun" (which would provide 자상운).
Korean text input is related to the Korean typewriters (타자기) that preceded computers. According to Jang Bong Seon, Horace Grant Underwood made a Korean typewriter during the first decade of the 20th century. [11] In 1927, Song Ki Joo invented the first Dubeolsik typewriter in Chicago.
South Korea originally had a Nebeolsik standard, but Dubeolsik became standard in 1985. [12]
Some Korean fonts do not include hanja, and word processors do not allow a user to specify which font to use as a fallback for any hanja in a text; each hanja sequence must be manually formatted for a desired font.
Vertical text is supported poorly (or not at all) by HTML and most word processors. This is not an issue for modern Korean, which is usually written horizontally; until the second half of the 20th century, however, Korean was often written vertically. Fifteenth-century texts written in hangul had pitch marks to the left of syllables which are included in Unicode, although current fonts do not support them.
Programs designed for Korean language-related use include:
Hangul letters are detailed in several parts of Unicode:
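Pre-composed hangul syllables in the Unicode hangul syllables block are algorithmically defined with the following formula, in which initial, medial and final are the zero-based positions of the jamo in their respective orderings (a final of 0 means no final consonant):
[(initial × 588) + (medial × 28) + final] + 44032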
To find the code point of "한" in Unicode, note that the initial ㅎ has the value 18, the medial ㅏ has the value 0, and the final ㄴ has the value 4.
Substituting these values in the formula above yields [(18 × 588) + (0 × 28) + 4] + 44032 = 54620. The Unicode value of 한 is 54620 in decimal, &#54620; in numeric character reference, and U+D55C in hexadecimal Unicode notation.
With the module below, calling e.g. hangul::from_jamo('ㅎ', 'ㅏ', Some('ㄴ')) will return Some('한').
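    mod hangul {
        /// Initial consonants (choseong), in Unicode syllable order.
        const INITIAL_JAMO: [char; 19] = [
            'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅅ',
            'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
        ];

        /// Medial vowels (jungseong), in Unicode syllable order.
        const VOWEL_JAMO: [char; 21] = [
            'ㅏ', 'ㅐ', 'ㅑ', 'ㅒ', 'ㅓ', 'ㅔ', 'ㅕ', 'ㅖ', 'ㅗ', 'ㅘ', 'ㅙ',
            'ㅚ', 'ㅛ', 'ㅜ', 'ㅝ', 'ㅞ', 'ㅟ', 'ㅠ', 'ㅡ', 'ㅢ', 'ㅣ',
        ];

        /// Final consonants (jongseong); `None` stands for "no final".
        const FINAL_JAMO: [Option<char>; 28] = [
            None,
            Some('ㄱ'), Some('ㄲ'), Some('ㄳ'), Some('ㄴ'), Some('ㄵ'), Some('ㄶ'),
            Some('ㄷ'), Some('ㄹ'), Some('ㄺ'), Some('ㄻ'), Some('ㄼ'), Some('ㄽ'),
            Some('ㄾ'), Some('ㄿ'), Some('ㅀ'), Some('ㅁ'), Some('ㅂ'), Some('ㅄ'),
            Some('ㅅ'), Some('ㅆ'), Some('ㅇ'), Some('ㅈ'), Some('ㅊ'), Some('ㅋ'),
            Some('ㅌ'), Some('ㅍ'), Some('ㅎ'),
        ];

        /// Code point of the first precomposed syllable.
        const GA_LOCATION: u32 = '가' as u32; // = 44_032

        /// Combines jamo into a precomposed syllable, or returns `None`
        /// if any argument is not a valid jamo for its position.
        pub fn from_jamo(initial: char, medial: char, last: Option<char>) -> Option<char> {
            if !(self::INITIAL_JAMO.contains(&initial)
                && self::VOWEL_JAMO.contains(&medial)
                && self::FINAL_JAMO.contains(&last))
            {
                return None;
            }
            // [(initial × 588) + (medial × 28) + final] + 44032
            char::from_u32(
                self::GA_LOCATION
                    + 588 * (INITIAL_JAMO.iter().position(|&c| c == initial)? as u32)
                    + 28 * (VOWEL_JAMO.iter().position(|&c| c == medial)? as u32)
                    + FINAL_JAMO.iter().position(|&c| c == last)? as u32,
            )
        }
    }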
The Unicode Hangul Compatibility Jamo block has been allocated for compatibility with the KS X 1001 character set. It is usually used to represent hangul without distinguishing initials and finals.
The Hangul Jamo, Hangul Jamo Extended-A and Hangul Jamo Extended-B blocks contain initial, medial and final jamo, including obsolete jamo.
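The mapping from a precomposed syllable to the positional jamo of the Hangul Jamo block is purely arithmetic. The sketch below follows the constants that the Unicode Standard gives for this purpose (SBase, LBase, VBase, TBase); the function name `decompose` is illustrative, and the sketch covers only the modern syllables, not the obsolete jamo of the extended blocks.

```rust
const S_BASE: u32 = 0xAC00; // first precomposed syllable, 가
const L_BASE: u32 = 0x1100; // first initial (choseong) in Hangul Jamo
const V_BASE: u32 = 0x1161; // first medial (jungseong) in Hangul Jamo
const T_BASE: u32 = 0x11A7; // one before the first final (jongseong)
const V_COUNT: u32 = 21;
const T_COUNT: u32 = 28;
const N_COUNT: u32 = V_COUNT * T_COUNT; // 588 syllables per initial
const S_COUNT: u32 = 19 * N_COUNT; // 11,172 syllables in total

/// Splits a precomposed syllable into two or three conjoining jamo.
/// Returns `None` for characters outside the Hangul Syllables block.
pub fn decompose(syllable: char) -> Option<Vec<char>> {
    let s_index = (syllable as u32).checked_sub(S_BASE)?;
    if s_index >= S_COUNT {
        return None;
    }
    let initial = L_BASE + s_index / N_COUNT;
    let medial = V_BASE + (s_index % N_COUNT) / T_COUNT;
    let final_index = s_index % T_COUNT; // 0 means the syllable has no final
    let mut jamo = vec![char::from_u32(initial)?, char::from_u32(medial)?];
    if final_index != 0 {
        jamo.push(char::from_u32(T_BASE + final_index)?);
    }
    Some(jamo)
}
```

For 한 (U+D55C) this yields U+1112, U+1161 and U+11AB; for 하 (U+D558), which has no final, only the first two. Note that these are the positional jamo, not the Hangul Compatibility Jamo, which do not distinguish initials from finals.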
Hangul (word processor) shipped with fonts from Hanyang Information and Communication, which map obsolete hangul characters to Unicode's Private Use Areas. Despite the use of PUAs instead of dedicated code points, Hanyang's mapping was the most popular way to represent obsolete hangul in South Korea in 2007. With its Hangul 2010, however, Hancom deprecated Hanyang PUA code and began representing obsolete hangul characters with Unicode hangul jamo.