|c. 13th century BCE –present|
Oracle bone script
|ISO 15924||Hani(500),Han (Hanzi, Kanji, Hanja)|
|Literal meaning||"Han characters"|
Chinese characters (traditional Chinese and Japanese: 漢字; simplified Chinese: 汉字; pinyin: hànzì; Cantonese Jyutping: hon3 zi6; Wade–Giles: han4 tzŭ4; rōmaji: kanji; "Han characters") are logograms used to write the Chinese languages and several other languages historically influenced by Chinese culture. Chinese characters represent one of the four independent inventions of writing in human history to be universally accepted by scholars, and are the only one of these to be continuously used since its invention, with a documented history spanning more than three millennia. Over this span, the function and style of characters have evolved greatly. Most recently, countries using Chinese characters have standardised their forms and pronunciations. Broadly, simplified characters are used in mainland China, Singapore, and Malaysia, while traditional characters are used in Taiwan, Hong Kong, and Macau.
Chinese characters were historically adapted to write other languages spoken within the Sinosphere. Chinese characters in Japanese, Korean, and Vietnamese are known as kanji, hanja, and chữ Hán respectively. Of these, each also created their own characters for their internal use. These languages generally function very differently from Chinese, and belong to their own independent language families. In part due to this difference, Korean and Vietnamese are now written almost exclusively with alphabets designed to replace Chinese characters. This leaves Japanese as the sole major language unrelated to Chinese still written with Chinese characters.
Unlike in phonetic writing systems, where individual letters roughly correspond to phonemes, the Chinese writing system associates each logogram with a syllable. In fact, written characters, syllables, and morphemes (the basic units of meaning in a language) largely correspond one-to-one with one another in Chinese languages. However, written Chinese is not ideographic: characters fundamentally correspond to spoken syllables, not to the abstracted ideas themselves.
To a higher degree than most major languages, modern spoken Chinese has many homophones: the same spoken syllable can be represented by one of several different characters depending on context. Additionally, a particular character may have a range of different meanings; different readings of the same character may have different pronunciations and even etymologies. In Standard Chinese, one-fifth of the 2,400 most common characters have several pronunciations.
Chinese characters are used within several distinct writing systems that have developed throughout history, which may also include other elements such as punctuation, as well as rules with which characters are used. Numerous models attempting to explain how Chinese characters work to encode language have been presented by scholars. As models reflecting human language, any rules or categories for characters are imperfect. Broadly, in order to create meaning Chinese characters make use of the sounds of spoken language, the abstract ideas underpinning words, and graphical shape together, such that each dimension reinforces the others.
The Shuowen Jiezi was a hugely influential character dictionary written by the scholar Xu Shen c. 120 CE. In the dictionary's postface, Xu analyses what he sees as all the methods by which characters are created. This work introduced a categorisation scheme which would later become known as the liùshū (六書; 六书; 'six writings'). Mature formulations of this scheme stated that every character belonged to one of six categories, each mentioned with varying emphasis in the Shuowen Jiezi. For nearly two millennia afterwards, this framework would serve as the traditional lens through which characters were analysed throughout the Sinosphere. Xu based most of his analysis on examples of Qin seal script that were written down several centuries before his time—these were usually the oldest forms available to him, but Xu stated that he was aware of the existence of even older forms.
Modern scholars agree that the theory presented in the Shuowen Jiezi is problematic, failing to fully capture the nature of Chinese writing both in the present and at the time Xu was writing. However, the model has proven resilient and pervasive; it continues to serve as a guide for those studying writing systems that use Chinese characters. One of the most important features of the Shuowen Jiezi was its grouping of characters by radical: a component within a character that is generally considered to be of particular import. The Shuowen Jiezi recognised over 500 radicals; this number would be reduced substantially in future dictionaries, but the concept itself would remain ubiquitous.
Presented as a fundamental class upon which the rest of the writing system depends, a relatively small number of characters are pictograms, representational pictures of physical objects. In practice, their forms are highly stylised and simplified from centuries of iteration: examples include 日 ('Sun'), 月 ('moon'), and 木 ('tree'). Xu Shen placed approximately 4% of all characters into this category.
Over time, pictograms became increasingly stylised, simplified, and standardised, in order to make them easier to write. As character forms developed, distinct depictions of various physical objects within pictographs became reduced to instances of a single written component. As such, what a pictogram is depicting is often not immediately evident. For example, within a given character the radical ⼝ 'MOUTH' often carries a meaning related to mouths, but within 高 ('tall'), a pictogram of a tall building, it instead depicts a window, ultimately lending to the character's meaning of 'tallness'. In another instance, the same 'mouth' radical depicts the lip of a vessel in the modern form of the pictogram 畐 ('full').
Pictograms have often been extended from their original concrete meanings to take on additional layers of metaphor and synecdoche, which sometimes even displace the pictogram's original, literal meaning. Over time, this process sometimes creates excess ambiguity between graphically or phonetically similar characters, which is then usually resolved through adding additional components to disambiguate the characters in question. This can result in new pictograms, but usually results in other character types instead.
Also called simple indicatives, characters in this small category visually depict abstract concepts that lack corresponding physical forms, but nonetheless can be gestured towards intuitively. Examples include 上 ('up') and 下 ('down'), which were originally written as dots above and below a line and later evolved into their present forms, which are less potentially ambiguous in context, as well as 凸 ('convex'), 凹 ('concave'), and 平 ('flat and level'). Though few in number and limited in their scope, pictograms and ideograms form the basis from which more complex characters are derived.
Also translated as logical aggregates or associative idea characters, characters in this class are formed by combining two or more pictographs or ideographs to suggest a new, synthetic meaning. The canonical example is 明 ('bright'), often interpreted as the juxtaposition of the two brightest objects in the sky: 日 ('sun'), and 月 ('moon'), together expressing their shared quality of brightness. Though the historicity of this particular etymology has been contested in recent scholarship, it is definitively a canonical reading: for example, the common compound word 明白 means 'understanding', touching on the derived association of 明 with 'illumination'. The addition of the abbreviated 艹 'GRASS' radical on top results in the compound ideograph 萌 ('to sprout'), alluding to the heliotropic behaviour of plant life. Other commonly cited examples include 休 ('rest'), composed of pictographs ⼈ 'MAN' and ⽊ 'TREE', and 好 ('good'), composed of ⼥ 'WOMAN' and ⼦ 'CHILD'.
Xu Shen placed approximately 13% of characters in this category, but many of his examples are now believed to be phono-semantic compounds, whose origin has been obscured by subsequent changes in their form. Peter Boodberg and William Boltz go so far as to deny that any of the compound characters devised in ancient times were of this type, maintaining that now-lost "secondary readings" are responsible for the apparent absence of phonetic indicators, but their arguments have been rejected by other scholars.
In contrast, associative compound characters are common among kokuji, kanji originally coined in Japan. An example of a modern compound ideograph in the Chinese language is 砼 ('concrete'), combining the ⼈ 'MAN', ⼯ 'WORK', and ⽯ 'STONE' radicals.
A pivotal development in the history of Chinese writing was the initial application of the rebus principle, or phonetic borrowing, in which an existing character could be used to represent a totally unrelated word with a similar pronunciation. In logographies, the use of rebus as a device represents a stage at which the writing system may begin to acquire a deeper phonetic dimension, and thus become more expressive as a whole. Chinese characters used purely for their sound values are attested in manuscripts dating to the Eastern Zhou period, with swapping between different characters to represent the same spoken word sometimes occurring within a span of only a handful of lines: for example, 氏 (zhī) is used to write 是 (shì) and vice versa, and likewise with 勺 (sháo) for 趙 (zhào). At the time of writing, these characters were either homophonous, or nearly so.
Sometimes the old meaning of a borrowed character was subsequently lost completely, as with characters such as 自 (zì), which has lost its original meaning of 'nose' completely, and now exclusively has the meaning of 'oneself', or 萬 (wàn), which originally meant 'scorpion', but is now used only to mean the number 'ten thousand'.
When transcribing words of foreign origin, such as contemporary non-Chinese names, as well as the Buddhist terminology introduced to China in antiquity, Chinese characters are used for their phonetic value, in a rebus-like fashion. For example, in the name 罗马尼亚; 羅馬尼亞 (Luómǎníyà; 'Romania'), each character is only used for its sound value, and does not provide any particular meaning. This usage is similar to that of Japanese katakana and hiragana, although these syllabaries use a special set of simplified forms derived from Chinese characters, in order to clarify their purely phonetic role. Use of the rebus principle has also been observed with names written in other logographies, including both Egyptian hieroglyphs and the Maya script. However, the barrier between a character's pronunciation and meaning is never total: when transcribing into Chinese, phonetic characters are often chosen deliberately so as to create certain connotations. This is regularly done with corporate brand names: for example, Coca-Cola's Chinese name is 可口可乐; 可口可樂 (Kěkǒu Kělè; 'the mouth can be happy'), with the phonetic characters selected so as to possess a plausible meaning of "delicious and enjoyable".
Also known as semantic-phonetic compounds or picto-phonetic compounds, these characters are composed of at least two parts: the semantic component that suggests the general meaning of the compound, and the phonetic component that gives a hint as to the compound's pronunciation. Phono-semantic compounds are by far the largest class of characters within the traditional six-fold schema. In most cases, the semantic component is also the radical under which the character is categorised in dictionaries. In some cases, the phonetic component of a compound may be selected so as to contribute an additional layer of meaning to the compound: as a result, determining whether a given character is a phono-semantic compound or a purely ideographic compound is often non-trivial.
Examples of phono-semantic compounds include 河 (hé; 'river'), 湖 (hú; 'lake'), 流 (liú; 'stream'), 沖 (chōng; 'surge'), and 滑 (huá; 'slippery'). On the left-hand side of each, these characters have three short strokes: 氵, a reduced form of the ⽔ 'WATER' radical. In these cases, this indicates to the reader that the meaning of each character is related on some level to the concept of "water". On the other side of each character is the phonetic component: 湖 (hú) is pronounced identically to 胡 (hú) in Standard Chinese, 河 (hé) is pronounced similarly to 可 (kě), and 沖 (chōng) is pronounced similarly to 中 (zhōng). While the discrepancies in these examples are rather tame, over time the accumulation of sound changes often results in a given character's original composition seeming totally arbitrary to a modern reader.
Generally, while the phonetic components within some compounds do relate a precise pronunciation, most may only provide an approximation, even before the emergence of any later sound changes. Some may only share the initial or final sounds of their phonetic components. With those changes, some characters may eventually seem totally unrelated to their phonetic component in their sounds. Sometimes, this actually turns out to be an accurate assessment, when dealing with characters that have undergone re-borrowing or orthographic merger with another phonetically distinct character, such that the new form is not actually associated with its original pronunciation. However, a divergence simply due to the sum total of centuries of phonetic change in the spoken language is equally common. The table below lists characters that each use 也 for their phonetic part, save the final one, which uses a previous character in the list; it is apparent that none of them share its modern pronunciation. The Old Chinese pronunciation of 也 has been reconstructed by Baxter and Sagart (2014) as /*lAjʔ/, similar to that for each compound. The table illustrates numerous sound changes that have taken place since the Shang and Zhou dynasties, the time during which most of the characters below first entered the lexicon. For a modern reader the resulting drift is dramatic, to the point where the phonetic component in each character no longer provides any hint whatsoever as to its pronunciation.
|Character||Gloss||Semantic||Old Chinese||Middle Chinese||Standard Chinese||Cantonese||Japanese|
|也||PTC||—||/*lAjʔ/||yaeX||yě [jè]||jaa5 [jaː˩˧]||ya [ja̠]|
| || || ||/*Cə.lraj/||drje||chí [ʈʂʰǐ]||ci4 [tsʰiː˩]||chi [tɕi]|
| || || ||/*l̥ajʔ/||syeX||chí [ʈʂʰǐ]||ci4 [tsʰiː˩]||chi [tɕi]|
| || || ||/*l̥aj/||sye||shī [ʂí]||si1 [siː˥]||se [se̞]|
| || || ||/*[l]ˤej-s/||dijH||dì [tî]||dei6 [tei˨]||ji [dʑi]|
| ||3-PR||人 (亻, 𠂉)||/*l̥ˤaj/||tha||tā [tʰá]||taa1 [tʰaː˥]||ta [ta̠]|
| || || ||/*l̥ˤaj/||thaH||tuō [tʰwó]||to1 [tʰɔː˥]||ta [ta̠]|
Writing during the first century, Xu Shen placed approximately 82% of characters into this category. Within the 18th-century Kangxi Dictionary, the figure is closer to 90%, pointing to the historical proficiency of this technique in extending the Chinese vocabulary. The principle later saw direct adoption in the creation of new chữ Nôm characters in Vietnam.
This method is still used to form new characters: for example 鈈 (bù; 'plutonium') is the ⾦ 'GOLD' radical plus the phonetic 不 (bù)—described in Chinese as "不 gives sound, 金 gives meaning". Many Chinese names for chemical elements and other characters related to chemistry were formed in this way. In fact, it is possible to tell just by glancing at a Chinese periodic table which elements are metals (⾦ 'GOLD'), solid non-metals (⽯ 'STONE'), liquids (氵 'WATER'), or gases (⽓ 'STEAM') at standard temperature and pressure.
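The radical-based reading of the periodic table described above can be sketched as a simple lookup. This is only an illustration: both tables are hand-written assumptions covering four characters (the decomposition data does not come from any standard library), and the class labels simply restate the four groups named in the text.

```python
# Hypothetical lookup tables, written by hand for illustration only.
# Radicals are given in their full forms; 氵 is the reduced form of 水.
CLASS_BY_RADICAL = {
    "金": "metal",
    "石": "solid non-metal",
    "水": "liquid",
    "气": "gas",
}

RADICAL_OF = {
    "鈈": "金",  # plutonium
    "碳": "石",  # carbon
    "溴": "水",  # bromine, written with the reduced form 氵
    "氧": "气",  # oxygen
}


def element_class(char: str) -> str:
    """Return the class suggested by the semantic radical of an element character."""
    radical = RADICAL_OF[char]
    return CLASS_BY_RADICAL[radical]


for char in RADICAL_OF:
    print(char, element_class(char))
```

A real tool would need a full character decomposition dataset in place of the hypothetical `RADICAL_OF` table.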
Occasionally, a disyllabic word is written with two characters that contain the same radical, as in 蝴蝶 ('butterfly'), where both characters have the ⾍ 'INSECT' radical. A notable example regards the name for the pipa, a type of lute. The instrument's name 枇杷 was originally shared with one for the loquat, which has a shape reminiscent of the instrument. The name for the instrument was originally written with the 扌 'HAND' radical as 批把, referring to the upward and downward strokes made when playing the instrument. The name for the fruit was later changed to its present 枇杷, with the ⽊ 'TREE' radical; the name for the instrument became 琵琶, with 珡 ('guqin') added to both characters. In other cases, characters within a compound word sharing a radical may be a coincidence without any particular meaning.
The smallest category of characters is also the least understood. In the postface to the Shuowen Jiezi, Xu Shen gave the example pair of 考 (kǎo; 'to verify') and 老 (lǎo; 'old'), which have similar Old Chinese pronunciations of /*khuʔ/ and /*C-ruʔ/ respectively, suggesting they may once have been the same word meaning 'elderly person', only to later lexicalise into two separate words. However, a specific term for the character class does not appear in the actual body of the dictionary, and it is often omitted from modern classification systems.
The traditional Shuowen Jiezi schema presupposes either a phonetic or semantic purpose for every character component. More recently, using the lens of modern semiotics, many components have been identified as not functioning in either role: they are purely signs, or "pure forms". Basic examples of pure form characters are found with the numerals beyond four, e.g. 五 ('five') and 八 ('eight'), whose forms do not give visual hints to the quantities they represent. From this structural point of view, various systems have been proposed by modern scholars, with a straightforward example being a scheme of seven categories.
According to Yang, of the 3,500 frequently used characters in contemporary Standard Chinese, semantic characters are the rarest, accounting for about 5% of the lexicon, followed by pure form characters with 18%, and semantic–form and phonetic–form together accounting for 19%, with the remaining 58% being semantic–phonetic characters, loosely analogous to the traditional category of phono-semantic compounds.
In Chinese, there is a distinction between characters and words. In modern Chinese varieties, most words are compounds written with two or more characters. Written Chinese first emerged during the stage of the spoken language's development known as Old Chinese. In most cases, each Chinese character corresponds to a morpheme that was originally an independent word in Old Chinese. As a result, words that are cognate among modern Chinese varieties, which have each descended from Old Chinese, are generally written with the same character. Different readings of the same character are often related in both sound and meaning.
Classical Chinese is an ancient form of the written language which became the standard as Old Chinese was dying out. Its use was loosely analogous to that of Latin in pre-modern Europe; it remained the prestige written language of China until the 20th century, well after the spoken varieties descended from Old Chinese had diverged. Despite being a literary form, it retained many properties of spoken Old Chinese. Over time, with numerous sound mergers occurring throughout different varieties, the introduction of polysyllabic words increasingly served the function of reducing ambiguity between words that had since become homophonic. Today, it has been estimated that over two-thirds of the 3,000 most common words in modern Standard Chinese are polysyllables, with the vast majority of these being two-syllable words.
Words in Old Chinese were generally monosyllabic; as such, each character denoted an independent word. Affixes could be added to form a new word, which was often written with the same single character. In many cases, the pronunciations then diverged due to the systematic sound changes caused by the affixes. For example, many additional readings in modern varieties reflect the Middle Chinese 'departing tone' (qusheng, 去聲), the major source of the 4th tone in modern Standard Chinese. Many scholars now believe that this Middle Chinese tone is the reflex of an Old Chinese derivational suffix /*-s/ that served a range of semantic functions, possibly the only example of inflectional morphology extant in the otherwise analytic language. For example:
|宿||*sjuk||>||sjuwk||>||'to stay overnight'|
Another common sound change occurred between voiced and voiceless initials, though the phonemic voicing distinction has disappeared in most modern varieties. This is believed to reflect an Old Chinese de-transitivising prefix, but scholars disagree on whether the voiced or voiceless form reflects the original root. Note how the pairs of readings below reflect opposite transitivity from one another.
|*brats||>||bæjH||>||'to be defeated'|
|*djat||>||dzyet||>||'to be broken by bending'|
Multi-syllable words began entering the language during the Western Zhou period; it is estimated that between 25% and 30% of the vocabulary used in Warring States period texts is polysyllabic. The process has accelerated over the centuries as phonetic change has increased the number of homophones. The most common process of Chinese word formation after the Classical period has been to create compounds of existing words. Words have also been created by appending affixes to words, by reduplicating words, and by borrowing words from other languages. While polysyllabic words are generally written with one character per syllable, abbreviations are occasionally used.
Many compound words are composed from two near-synonymous characters, creating a new, less ambiguous form that is often used in variation with one of its component characters, depending on context. For example:
Equally common are nouns composed from a root and a particle suffix possessing no particular meaning, such as 子 (zǐ). These constructions serve to create a disyllabic word with the same meaning as the root character. As above, the root word usually, though not always, remains independent, in variation with the compound word.
Morphemic characters that have fallen out of use as independent words, and are now used only in compounds, are called bound forms.
Large-scale surveys by the PRC's Ministry of Education and State Language Commission have shown strong distribution patterns in the use of characters and words. This form of analysis is essential to the quantitative research of the Chinese language, with applications in pedagogy, publishing, and information processing.
The number of characters used in modern Chinese is stable, hovering around 10,000 in recent decades. By contrast, 80% of Chinese-language text is composed of just 590 characters, with 90% coverage achieved with 960 characters, and 99% with 2,400.
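Coverage figures of this kind come from ranking characters by frequency and accumulating their share of running text. The following is a minimal sketch of that calculation, not the method used in the surveys: the function name `chars_needed` and the toy sample string are assumptions made for illustration, and a real analysis would run over a large corpus.

```python
from collections import Counter


def chars_needed(text: str, target: float) -> int:
    """Return how many of the most frequent characters cover `target` of `text`.

    `target` is a fraction between 0 and 1; whitespace is ignored.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    covered = 0
    # Walk characters from most to least frequent, accumulating coverage.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank
    return len(counts)


# Toy sample only, not real corpus data.
sample = "的的的的一一一是是不"
print(chars_needed(sample, 0.8))
```

In the toy sample, three distinct characters are enough to reach 80% coverage, which mirrors on a small scale how a few hundred characters can account for most running text.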
According to Qiu Xigui, the broadest trend in the evolution of Chinese characters over their history has been simplification, both in graphical shape (字形; zìxíng), the "external appearances of individual graphs", and in graphical form (字体; 字體; zìtǐ), "overall changes in the distinguishing features of graphic[al] shape and calligraphic style, [...] in most cases refer[ring] to rather obvious and rather substantial changes". Generally, within every written language using Chinese characters before the modern era, the working lexicon within texts had considerable irregularities, with many variant forms and substitutions being used.
Several works of classical Chinese literature indicate that, prior to the invention of characters, knotted cords were used to keep records. The practice had some similarities to the Inca technique of quipu. Works that reference the practice include chapter 80 of the Tao Te Ching and the "Xici II" chapter within the Yijing.
According to tradition, Chinese characters were invented by Cangjie, a mythical figure said to have been a scribe to the legendary Yellow Emperor during the 3rd millennium BCE. Frustrated by the limitations of knotting, and inspired by his study of the animals of the world, the landscape of the earth, and the stars in the sky, Cangjie is said to have invented symbols called 字 (zì)—the first Chinese characters. The legend relates that on the day the characters were created, grain rained down from the sky and that night the people heard ghosts wailing and demons crying because the human beings could no longer be cheated.
In recent decades, a series of inscribed graphs and pictures have been found at Neolithic sites in China, including Jiahu (c. 6500 BCE), Dadiwan and Damaidi from the 6th millennium BCE, and Banpo (5th millennium BCE). Often these finds are accompanied by media reports that push back the purported beginnings of Chinese writing by thousands of years. However, because these marks occur singly without any implied context and are made crudely, Qiu Xigui concludes that "we do not have any basis for stating that these constituted writing nor is there reason to conclude that they were ancestral to Shang dynasty Chinese characters." However, they do demonstrate a history of sign use in the Yellow River valley from the Neolithic through to the Shang period.
The earliest known examples of writing directly ancestral to modern characters are a body of inscriptions made on bronze vessels and oracle bones during the late Shang dynasty (c. 1250–1050 BCE), with the very oldest dated to c. 1200 BCE. Oracle bones and the script they bore were first documented by modern scholars in 1899, after examples were discovered being sold as "dragon bones" for medicinal purposes, with the symbols carved into them identified as being Chinese writing. By 1928, the source of the bones had been traced to a village near Anyang in Henan, which was excavated by the Academia Sinica between 1928 and 1937. To date, over 150,000 such fragments have been found.
Oracle bone inscriptions are records of divinations performed in communication with royal ancestral spirits. The inscriptions range from a few characters in length at their shortest, to around 40 characters at their longest. The Shang king would communicate with his ancestors by means of scapulimancy, inquiring about subjects such as the royal family, military success, and weather forecasting. The interpreted answers would be recorded on the divination material itself.
Oracle bone script is a well-developed writing system, suggesting that the Chinese script's origins may lie earlier than the late second millennium BCE. Although these divinatory inscriptions are the earliest surviving evidence of ancient Chinese writing, it is widely believed that writing was used for many other non-official purposes, but that the materials upon which non-divinatory writing was done, likely wood and bamboo, were less durable than bones and shells, and have since decayed away.
The traditional notion of an orderly procession of scripts, with each suddenly invented and displacing the one previous, has been conclusively superseded by modern archaeological finds and scholarly research. More often, it was the case that two or more scripts coexisted in a given area, and that scripts evolved gradually. As early as the Shang dynasty, oracle bone script coexisted as a simplified form alongside the normal script in bamboo books (preserved in bronze inscriptions), as well as the elaborate pictorial forms, often clan emblems, found on many bronzes.
Based on studies of these bronze inscriptions, it is clear that the mainstream script evolved in a slow, unbroken fashion from the Shang to the Zhou dynasty, until assuming the form that is now known as small seal script in the state of Qin, without any sudden shifts. Meanwhile, other scripts had evolved during the late Zhou, especially in eastern and southern regions. These include decorative scripts such as the bird-worm seal script, and the regional 'ancient' forms of eastern Zhou states, preserved as variant forms in the Han-era Shuowen Jiezi.
Small seal script, which had evolved conservatively in the state of Qin during the Eastern Zhou, became standardised as the orthographic convention used throughout all of China by the imperial Qin dynasty. However, more than one script was in use at the time: a little-known, rectilinear, 'vulgar' form of the characters had coexisted alongside the more formal seal script for centuries in the Qin state; the popularity of this vulgar form grew as the practice of writing itself became more widespread. An immature form of clerical script called "early clerical" or "proto-clerical" had already developed by the Warring States period in the state of Qin based upon this vulgar form, with influence from seal script as well. The coexistence of the three scripts (small seal, vulgar, and proto-clerical, with the latter evolving gradually into clerical script) runs counter to the traditional belief that the Qin dynasty only used one script, and that the clerical script was suddenly invented during the early Han.
The proto-clerical script matured gradually, and by the early Han period its sophistication was comparable to small seal script. Recently discovered bamboo slips show the emergence of mature clerical script by the end of the reign of Emperor Wu of Han (141–87 BCE).
As in previous eras, multiple scripts were in use during the Han, although mature clerical script, also called 八分 (bāfēn), was dominant. An early type of cursive script was also in use at least as early as 24 BCE, incorporating cursive forms popular at the time, as well as elements from the vulgar writing that originated in Qin state. By the time of the Jin dynasty, this Han cursive style became known as 章草 (zhāngcǎo), sometimes known in English as 'clerical cursive', 'ancient cursive', or 'draft cursive'. Some believe this name, which uses the character 章 ('orderly'), arose because the style was considered by the Jin to be a more orderly form than what would become the modern form of cursive, called 今草 (jīncǎo), which had first emerged during the Jin, and is still used today.
Around the midpoint of the Eastern Han, a simplified and easier form of clerical script appeared, which Qiu terms 'neo-clerical' (新隶体; 新隸體; xīnlìtǐ). By the end of the Han, this had become the dominant daily script in use by scribes, though clerical script remained in use for formal works, such as engraved stelae. Qiu describes neo-clerical as a transitional form between clerical and regular script; it remained in use through the Three Kingdoms period, and into the Jin dynasty.
By the late Han, an early form of semi-cursive script had begun developing from a cursive form of neo-clerical script. This semi-cursive script was traditionally attributed to Liu Desheng (c. 147–188 CE), although such attributions refer to early masters of a script rather than to their actual inventors, since the scripts generally evolved into being over time. Qiu provides examples of early semi-cursive script, lending credence to its having popular origins, rather than being solely Liu's invention.
The design of regular script has been credited to Cao Wei calligrapher Zhong Yao (c. 151 –230), often called the "father of regular script". However, some scholars observe that one person could not have unilaterally developed a new script which went on to see universal adoption, but could only have been a crucial contributor to the style's gradual formation.[ citation needed ] The earliest surviving manuscripts written in regular script are copies of Zhong Yao's work, including at least one copied by Wang Xizhi, often called the "Sage of Calligraphy". Regular script developed out of a neatly written form of early semi-cursive, with the addition of a 'pause' (顿; 頓; dùn) technique to end horizontal strokes, plus heavy tails on strokes which are written the downward-right diagonal. Thus, early regular script emerged from a neat, formal form of semi-cursive, which had itself emerged from neo-clerical, a simplified, convenient form of clerical script. It matured further during the Eastern Jin in the hands of Wang Xizhi and his son Wang Xianzhi. However, it had not yet achieved widespread use, with most writers continuing to use the earlier neo-clerical and semi-cursive styles for daily writing, with the conservative clerical script also remaining in use on some stelae.
Meanwhile, modern cursive script slowly emerged during this period, under the influence of both semi-cursive and the newly emerged regular script. In the hands of a few master calligraphers such as Wang, modern cursive began to be formalised.
It was not until the Northern and Southern period that regular script acquired a dominant status. Nevertheless, it continued to evolve stylistically, only reaching full maturity during the early Tang dynasty. Some credit Ouyang Xun with producing the first examples of a mature regular script. After this point, though developments in calligraphy as an art form, as well as in the simplification of character forms, would continue, there would not be another major stylistic shift for the Chinese family of scripts.
Han unification is an ongoing effort by the Unicode Consortium to map each of the multiple character sets used within Chinese, Japanese, and Korean (together called the 'CJK languages') into a single set of unified characters equally usable in each language. The first release of the Unicode standard in 1991 was a major milestone of Han unification, and most text on the internet written in the relevant languages is now encoded with so-called CJK ideographs.
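As a small illustration of what unification means at the encoding level, the following sketch uses Python's standard `unicodedata` module to show that a character such as 漢 has one code point whether it appears in Chinese, Japanese, or Korean text, while the simplified form 汉 is encoded separately. The sample words and the helper name `describe` are chosen here purely for illustration.

```python
import unicodedata

# The same two characters, read as hànzì (Chinese), kanji (Japanese), hanja (Korean).
samples = {"zh": "漢字", "ja": "漢字", "ko": "漢字"}

def describe(ch: str) -> str:
    """Return the code point and Unicode character name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for lang, word in samples.items():
    # Every language tag prints identical code points: the encoding is shared.
    print(lang, [describe(ch) for ch in word])

# The simplified form is a separate code point, not a font-level variant.
print(describe("汉"))
```

Regional differences in how a unified character is drawn are therefore left to fonts and language tagging rather than to the code point itself.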
Broadly, Chinese characters are normally rectilinear units of uniform width. Within the square allotted to each character, most are constructed from smaller components, which are in turn drawn with a series of strokes. Strokes can be considered both the basic unit of handwriting and the basic unit of graphemic organisation within the system. Individual strokes are generally categorised according to technique and graphemic function, as exemplified by the Eight Principles of Yong. In the transition from seal to clerical script, many formerly bespoke, interlinked character components became discrete and regularised.
Characters are assembled according to predictable visual patterns, with some components usually not seen in certain positions within a character, and some taking distinct, visually congruous forms only when in a certain position—such as the ⼑ 'KNIFE' radical appearing as 刂 on the right side of characters, but as ⺈ at the top of characters. Both the order in which strokes are drawn within a given component, as well as the order that components are assembled into whole characters is largely fixed, lending predictability and order to the writing system as a whole. This is broadly summed up in practice with a few rules of thumb: generally, components and characters are assembled from left-to-right, and from top-to-bottom, with 'enclosing' components started before, then closed after, the components they enclose.
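Some of these positional forms are also visible in how text is encoded. The sketch below, which relies only on Python's standard `unicodedata` module, prints the code points of the three knife forms mentioned above and shows that compatibility normalisation (NFKC) folds the Kangxi radical ⼑ into the ordinary character 刀. The helper name `fold_radical` is made up for this example.

```python
import unicodedata

# ⼑ (Kangxi radical), 刂 (right-side form), ⺈ (top form)
knife_forms = ["\u2F11", "\u5202", "\u2E88"]

for ch in knife_forms:
    print(f"U+{ord(ch):04X}", ch, unicodedata.name(ch))

def fold_radical(ch: str) -> str:
    """Apply NFKC normalisation, which maps Kangxi radicals to unified ideographs."""
    return unicodedata.normalize("NFKC", ch)

print(fold_radical("\u2F11"))  # the ordinary character 刀 (U+5200)
```

Text-processing code that compares or searches characters may want to normalise first, since a radical code point and the corresponding ordinary character look alike but compare as different strings.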
For example, 字 is made up of two components, with each in turn composed of three strokes, drawn in the following order:
Over a character's history, graphical variants with identical meanings called allographs emerge via several processes, possibly to facilitate ease of handwriting, or to create a more 'correct' composition to the writer, according to the principles generally used to compose and explain characters. For example, individual components may be replaced with visually-, phonetically-, or semantically similar alternatives. For certain characters and components, different regions may prescribe different normative stroke orders, or even different allographs of the same character.
The boundary between character structure and style, and thus between allographs of the same character versus semantically distinct characters, is often non-trivial or unclear.
There are numerous styles, or "scripts" (书; 書; shū) in which Chinese characters can be written, each drawing from a broader historical tradition. Most that are used throughout the Sinosphere originated within China, but may have minor regional variations. Styles created outside China tend to remain localised in their use; these include the Japanese edomoji, and the Vietnamese lệnh thư script.
The oldest script style commonly used today is Qin-era seal script, though usually limited to use in the seals that lend the style its modern name. Though the art of carving traditional seals remains alive, few people are still able to comfortably read them today. Clerical and regular script styles are still ubiquitous in print; when writing by hand, semi-cursive styles are also widely used. Modern use of fully cursive script is largely informal: basic character shapes are suggested rather than explicitly realised, and abbreviation is sometimes extreme. Despite being cursive to the point where individual strokes are no longer differentiable and characters are often illegible to the untrained eye, cursive writing has historically been highly revered for the beauty and freedom that it is seen to embody. Some standard simplified forms, as well as the Japanese hiragana syllabary, are derived from cursive.
Chinese calligraphy is usually done with ink brush, and was considered one of the four arts to be mastered by Chinese scholars. The set of rules is deliberately minimalist, but each character has a set number of brushstrokes. Strict regularity is not required, since strokes may be accentuated for dramatic effect or individual style. Calligraphy was considered a means by which scholars could artfully express their thoughts and teachings.
'Song' typefaces (宋体; 宋體; sòngtǐ)—also called 'Ming', especially in Japan, Taiwan, and Hong Kong—are named for the respective periods whose printed styles are being imitated, considered to be periods during which woodblock printing flourished in China. Ming and sans-serif are the most popular in body text.
Sans-serif typefaces, called 'black form' (黑体; 黑體; hēitǐ) in Chinese and 'Gothic' (ゴシック体) in Japanese, are characterised by simple lines of even thickness for each stroke, akin to sans-serif styles in Western typography.
Typefaces that emulate regular script are also common, but not as common as Ming or sans-serif typefaces in body text. Most typefaces in the Song dynasty were regular script typefaces, which resembled a particular calligrapher's handwriting, while most modern regular script typefaces tend toward general-purpose use.
Most prominently, the Korean, Japanese and Vietnamese languages have historically been written with Chinese characters, used for record-keeping, histories, and official communications. In these languages, Chinese characters have often been used to represent Chinese loanwords. Some characters retained their phonetic elements based on their pronunciation in a historical variety of Chinese from which they were acquired. These adaptations of Chinese pronunciation are known as Sino-Xenic pronunciations, and have been useful in the linguistic reconstruction of Middle Chinese.
Chinese characters were used in Vietnam during the millennium of Chinese rule that began in 111 BCE; they were originally used for writing Classical Chinese, but were adapted around the 13th century to write the Vietnamese language, creating the chữ Nôm script.
Chinese characters arrived in Korea beginning in the 2nd century BCE, alongside influences such as Buddhism; over the following three centuries, their use became widespread. From Korea, the characters spread to Japan during the 5th century CE.
Currently, the only non-Chinese language normally written with Chinese characters is Japanese. Vietnam abandoned the use of chữ Nôm and Classical Chinese in the early 20th century in favour of a Latin alphabet, and Korea has largely replaced the use of hanja with hangul. Since education regarding Chinese characters is not mandatory in South Korea, the usage of hanja is rapidly disappearing.
In the Japanese writing system, Chinese characters used are known as kanji. Japanese historically borrowed many words from Chinese, which were written with their original characters, while native Japanese words were also written with orthographic borrowings of Chinese characters with similar meanings. Most kanji arrived via both borrowing processes, and thus have both native Japanese readings, known as kun'yomi, as well as Chinese-original readings, known as on'yomi. Moreover, Chinese words were often borrowed multiple times from different varieties and at different times, resulting in several distinct on'yomi readings for the same character. Modern Japanese uses kanji for most word stems, as well as hiragana and katakana, a pair of syllabaries collectively known as kana. Hiragana are used to write words, including grammatical inflections and particles, and katakana are used for transcribing non-Chinese loanwords, as well as for emphasis of native words, similar to how italics are used in languages written with the Latin script. The syllabaries were derived by simplifying Chinese characters selected to represent Japanese syllables; they differ from one another in part because each selected different characters for certain syllables, in addition to the different strategies employed to reduce the characters for easy writing. The angular katakana were obtained by selecting a smaller component from each character, while the curving hiragana were based on the cursive form of the entire character.
Because Japanese, unlike Chinese, is a synthetic language, many words consist of multiple syllables, and as such many kanji have multi-syllable pronunciations. For example, the kanji 刀 has a native kun'yomi reading of katana. In different contexts, it can also be read with the on'yomi reading tō, such as in the Chinese loanword 日本刀, nihontō, 'Japanese sword', whose pronunciation descends from the Chinese pronunciation at the time of borrowing. (In contemporary Standard Chinese, the word is pronounced rìběndào.) While modern loanwords from languages outside of the Sinosphere are usually written with katakana, loanwords prior to the Meiji era were typically written with unrelated kanji whose on'yomi had the same pronunciation as the syllables in the loanword. These spellings are called ateji: for example, 亜米利加 was written for modern アメリカ, Amerika, 'America', 歌留多 or 加留多 for modern カルタ, karuta, 'card', 'letter', and 天婦羅 or 天麩羅 for modern テンプラ, tenpura, 'tempura'. Only some ateji spellings are still in common use, such as 缶, kan, 'can'.
As early as the Gojoseon period, Classical Chinese was the dominant form of written communication in Korea. Although the hangul alphabet was invented by the Joseon king Sejong in 1443, it was not taken up by Korean literati, and did not come into widespread use until the late 19th century. Even today, much of the Korean vocabulary, especially in areas of science and sociology, comes directly from Chinese. However, due to the lack of tones in the Korean language, many dissimilar Sino-Korean words took on identical pronunciations, and as such are spelled identically in hangul. For example, the phonetic dictionary entry for 기사, gisa yields more than 30 different entries. In the past, this ambiguity had been efficiently resolved by parenthetically displaying the associated hanja. While hanja is sometimes used for Sino-Korean vocabulary, native Korean words are rarely, if ever, written in hanja.
When learning to write hanja, students are taught to memorise both native Korean and Sino-Korean pronunciations for each hanja. The collation of hanja is comparable to listing English words alongside their Latin equivalents, as in 'water; aqua', 'horse; equus', or 'gold; aurum'. Examples of listings include:
|Hanja||Native Korean||Sino-Korean||Meaning|
|水||물, mul||수, su||'water'|
|人||사람, saram||인, in||'person'|
|大||큰, keun||대, dae||'big'|
|小||작을, jakeul||소, so||'small'|
|下||아래, arae||하, ha||'down'|
|父||아비, abi||부, bu||'father'|
|韓||나라 이름, nara ireum||한, han||'Korea'|
Hanja are still used in South Korea, particularly in newspapers, weddings, place names, and the practice of calligraphy, although to nowhere near the extent of kanji use in Japanese society. At present, Chinese characters are sometimes used for the disambiguation of homophonous words. Additionally, their use still possesses connotations of erudition and cultural Confucianism; knowledge of Chinese characters is considered to be a high-class attribute by many Koreans, and an indispensable part of a classical education. There is a clear trend toward the exclusive use of hangul in ordinary South Korean contexts. The use of hanja has become a politically contentious issue in the country, with some urging a "purification" of the national language and culture by totally abandoning hanja and ending hanja education in schools, and instead exclusively using hangul throughout society and in public schools. Others support a revival of ordinary hanja use, such as was the case in the 1970s and 80s.
Hanja educational policy has swung back and forth within the country, often swayed by the inclinations of individual education ministers. Students in grades 7–12 are presently taught 1,800 characters, albeit with a principal focus on simple recognition, with the aim of achieving newspaper literacy. Hanja retains its prominence in Korean academia, as the vast majority of Korean documents, history, and literature (such as the Veritable Records of the Joseon Dynasty) were written in Classical Chinese using hanja. Therefore, a working knowledge of Chinese characters is still important for anyone wishing to interpret and study older Korean texts, or anyone who wishes to read scholarship in the humanities. Working knowledge of hanja is also useful for understanding the etymology of Sino-Korean vocabulary.
A 1949 law in North Korea apparently banned the use of all so-called foreign languages, which has been interpreted as including hanja, even the then-newly proposed New Korean Orthography. However, due to the country's isolation, accurate reports about its use of hanja are difficult to obtain. A textbook for university history departments published in the country in 1971 contained 3,323 distinct characters, and in the 1990s North Korean school children were still expected to learn 2,000 characters, more than in South Korea or Japan. A 2013 textbook appears to integrate the use of hanja in secondary school education. Currently, North Korea is estimated to teach around 3,000 hanja to North Korean students by the time they graduate university; in some cases, the characters appear within advertisements and newspapers, but cultural use is narrower than in the South, mostly restricted to dictionaries and textbooks.
Chinese characters are thought to have been first introduced to the Ryukyu Islands in 1265 by a Japanese Buddhist monk. After the Okinawan kingdoms became tributaries of Ming China, especially the Ryukyu Kingdom, Classical Chinese was used in court documents, but hiragana was mostly used for popular writing and poetry. After Ryukyu became a vassal of Japan's Satsuma Domain, Chinese characters became more popular, as well as the use of kanbun. In modern Okinawan, which is labelled as a dialect of Japanese by the Japanese government, katakana and hiragana are mostly used to write Okinawan, but Chinese characters are still used.
Until the early 20th century, Classical Chinese (Hán văn) was used in Vietnam for all official and scholarly writing. The chữ Nôm script began to be developed around the 13th century to record folk literature in the Vietnamese language. Chinese characters, called chữ Hán (𡨸漢), chữ Nho (𡨸儒), or Hán tự (漢字), are now limited to ceremonial use in Vietnam.
The oldest written Chinese text found in Vietnam is an inscription dated to the year 618, erected by local Sui officials in Thanh Hóa. Similar to Zhuang sawndip, some chữ Nôm characters were created by combining semantic character components with phonetic components that resembled Vietnamese syllables. This process resulted in a highly complex system whose use was limited to a small portion of the Vietnamese population, never more than 5%. The oldest chữ Nôm written alongside Chinese is a Buddhist inscription dated to 1209. Before 1945, the library of the French School of the Far East (EFEO) in Hanoi collected a total of around 20,000 Chinese and Vietnamese epigraphy rubbings from throughout Indochina. The oldest extant manuscript in Vietnamese is a late 15th-century bilingual copy of the Buddhist Sutra of Filial Piety, currently kept by the EFEO. It features Chinese text in larger characters, and an Old Vietnamese translation in smaller characters glossing the text. Every Hán Nôm book in Vietnam after the Phật thuyết is dated between the 17th and the 20th centuries, with most being hand-copied works, and few printed texts. By 1987, the library of the Institute of Hán-Nôm Studies in Hanoi had collected a total of 4,808 Hán Nôm manuscripts.
Classical Chinese and chữ Nôm fell out of use during the French colonial period, and were gradually replaced with the Vietnamese alphabet, which uses Latin characters and remains the primary writing system for Vietnamese. Contemporary use of chữ Hán in Vietnam is often connected with traditional culture, such as the practice of calligraphy.
Several minority languages of South and Southwest China were formerly written with scripts based on Chinese characters, but also included many locally created characters. The most extensive is the sawndip script used to write the Zhuang languages of Guangxi, which is still in use despite efforts to encourage the writing of Zhuang with a Latin-based alphabet. Other languages written with such scripts include Miao, Yao, Bouyei, Mulam, Kam, Bai, and Hani. All these languages are now officially written using Latin-based scripts. According to surveys, traditional sawndip script has twice as many users as the official Latin script.
The foreign dynasties that ruled northern China between the 10th and 13th centuries developed scripts that were inspired by Chinese characters but did not use them directly: the Khitan large script, Khitan small script, Tangut script, and Jurchen script—though Chinese characters were used to phonetically transcribe the language of the Jurchen people, renamed the 'Manchu' after the founding of the Qing dynasty. Other scripts within China that have adapted a few Chinese characters but are otherwise distinct include the Geba script, Sui script, Yi script, and the Lisu syllabary.
Along with the Persian and Arabic scripts, the Mongolian language was also written with Chinese characters phonetically transcribing Mongolian sounds. Notably, the only surviving copies of The Secret History of the Mongols were written in such a manner.
According to the 19th century missionary John Gulick:
"The inhabitants of other Asiatic nations, who have had occasion to represent the words of their several languages by Chinese characters, have as a rule used unaspirated characters for the sounds g, d, b. The Muslims from Arabia and Persia have followed this method ... The Mongols, Manchu, and Japanese also constantly select unaspirated characters to represent the sounds g, d, b, and j of their languages. These surrounding Asiatic nations, in writing Chinese words in their own alphabets, have uniformly used g, d, b, etc., to represent the unaspirated sounds."
In each region, the latest published standards for character forms are:
|Region||Standard||Characters||Year|
|China||Table of General Standard Chinese Characters||8105||2013|
|Hong Kong||List of Graphemes of Commonly-Used Chinese Characters||4762||2012|
|Taiwan||Chart of Standard Forms of Common National Characters||4808||1983|
|Taiwan||Chart of Standard Forms of Less-Than-Common National Characters||6341||1983|
|Taiwan||Chart of Rarely-Used National Characters||18388||2017|
|South Korea||Basic Hanja for Educational Use||1800||2000|
In addition to specificity in character size and shape, Chinese characters are written with very precise rules regarding the strokes employed, as well as their placement and ordering. Just as each region has standardised forms, each also has standard stroke orders. Most characters have only one standard stroke order, though some words may differ in stroke order by region, even occasionally resulting in different stroke counts.
There is often considerable overlap between the concepts of 'style' and 'form'; with the advent of Unicode, this distinction has challenged the process of Han unification. The designers of the Noto CJK family of typefaces, a collaboration between Google and Adobe, researched the regional distinctions in Chinese character forms extensively, so as to create a general-purpose, neutral typeface family and to avoid releasing fonts meant to write Japanese that looked "too Chinese", or vice versa.
With the use of woodblock printing, there was a considerable consolidation in forms prior to the standardisation efforts of the 20th century, especially during the Ming. These orthodox forms are in turn well-represented in touchstone reference works throughout the modern era, such as the 1716 Kangxi Dictionary and the 1915 Zhonghua Da Zidian.
One of the earliest proponents of character simplification was Lufei Kui, who proposed in 1909 that simplified characters should be used in education. In the years following the Xinhai Revolution and its associated May Fourth Movement, many anti-imperialist Chinese intellectuals sought ways to modernise China as quickly as possible. Traditional culture and values were challenged and subsequently blamed for societal and economic problems. Soon, people in the movement began pointing to the traditional Chinese writing system as an obstacle to the modernisation of China, proposing that it should either be reformed or abolished entirely. Lu Xun, a renowned 20th century author, stated 'If Chinese characters are not destroyed, then China will die'.
During the 1930s and 1940s, discussions on character simplification took place within the Kuomintang government, and a large number of the intelligentsia maintained that character simplification would help boost literacy throughout the country. In 1935, a table of 324 simplified characters collected by Qian Xuantong was introduced as the first official batch of simplified characters; however, it was rescinded in 1936 due to fierce opposition within the party.
Although most closely associated with the PRC, the modern process of character simplification began well before 1949. Cursive script forms were the source of inspiration for many of the simplified forms, while others were already used in print, albeit not for most formal works. With the goal of increasing functional literacy, a major concern at the time, discussions on character simplification took place among Chinese intelligentsia and within the Kuomintang (KMT) government during the Republican period. This earlier initiative to simplify the Chinese writing system was later inherited and implemented by the Communists after its subsequent abandonment by the KMT.
The use of traditional versus simplified characters varies greatly, and can depend on both the local customs and the medium. Before official reforms, character simplifications were not officially sanctioned and generally took the form of vulgar variants and idiosyncratic substitutions. Unofficial, often simplified forms would be used in everyday writing or for quick notes. Since the 1950s, the PRC has officially encouraged the use of simplified characters on the mainland. Along with the Republic of China, Hong Kong and Macau—at the time still under colonial rule—were not affected by the reform. There is no firm rule for which characters to use, and often it is determined by tastes and inclinations of the audience and writer.
In other Sinophone countries, the use of simplified characters is generally more common among younger people, while many older generations literate in Chinese still use traditional forms. Outside of China, Chinese-language shop signs are also often written using traditional characters.
Most simplified forms in use today are the direct result of PRC initiatives during the 1950s and 1960s. During the transitional years, while considerable confusion about the character forms was still rampant, transitional characters mixing simplified and yet-to-be simplified components appeared sporadically, then disappeared. Before largely settling on simplifying the existing system, some within the PRC, including Mao Zedong, also explored the total replacement of Chinese characters with a phonetic script, usually based on the Latin alphabet, culminating in projects such as Gwoyeu Romatzyh and Latinxua Sin Wenz.
The PRC initiated the first round of simplifications with two documents published in 1956 and 1965. The reforms both simplified the forms of many characters in use, and reduced the total number of characters in the lexicon. The majority of first round characters were drawn from conventional abbreviated or ancient forms. For example, the orthodox character 來 was written as 来 in the earlier clerical script; it used one fewer stroke, and was thus adopted as a simplified form. The 雲 ('cloud') character was written as 云 in the ancient oracle bone script. This simpler form had remained in use later as a phonetic loan with a meaning of 'to say', and with the original meaning of 'cloud' it was instead written with an added ⾬ 'RAIN' radical as a semantic indicator. When using simplified forms, these two characters are merged into 云.
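One practical consequence of merging 雲 and 云 is that converting traditional text to simplified is a plain lookup, while the reverse direction can be ambiguous. The following sketch makes that concrete with a tiny hand-written table containing only the characters discussed here; the table and function names are illustrative and do not represent any official conversion list.

```python
# Illustrative pairs only; a real conversion table has thousands of entries.
TRAD_TO_SIMP = {
    "來": "来",  # 'to come'
    "雲": "云",  # 'cloud'
    "云": "云",  # 'to say' (unchanged by simplification)
}

def simplify(text: str) -> str:
    """Map each character to its simplified form, leaving unknown characters as-is."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)

def traditional_candidates(ch: str) -> list[str]:
    """List every traditional character in the table that simplifies to ch."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == ch)

print(simplify("雲來"))
print(traditional_candidates("云"))  # two candidates: context is needed to choose
```

Because `traditional_candidates("云")` returns more than one character, a simplified-to-traditional converter has to look at the surrounding word to decide between them.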
A second round of simplifications was promulgated in 1977, but it was poorly received by the public, and fell out of official use very quickly, ultimately being formally rescinded in 1986. The second round of simplifications was unpopular in large part because the vast majority of its forms were completely new, in stark contrast to the many familiar variants present in the first round.
Two revised lists of simplified characters were published in 1988: the List of Commonly Used Characters in Modern Chinese having 2,500 common characters and 1,000 less common characters, and the Chart of Generally Utilised Characters of Modern Chinese with 7,000 characters, including those in the smaller list. In 2013, the revised Table of General Standard Chinese Characters replaced the 1988 lists as the new standard: it includes 8,105 characters, with 3,500 categorised as primary, 3,000 as secondary, and 1,605 as tertiary. GB 2312, an early version of the national encoding standard used in the PRC, has 6,763 code points; its modern, mandatory successor GB 18030 has a much higher number. The Chinese Proficiency Test (HSK) covers 2,663 characters and 5,000 words at its highest level, while the Chinese Proficiency Grading Standards for International Chinese Language Education would cover 3,000 characters and 11,092 words at the highest level.
Singapore underwent three successive rounds of character simplification promulgated by the Ministry of Education, with the first two having some simplifications that differed from those used in mainland China. The first round was published in 1969, and consisted of 498 simplified and 502 traditional characters. The second round in 1974 consisted of 2287 simplified characters, including 49 differences from the PRC system that were removed with the final round in 1976. In 1993, Singapore adopted the 1986 revisions made in mainland China.
Unlike in mainland China, where personal names may only be registered using simplified characters, Singapore parents have the option of registering their children's names in traditional characters.
Malaysia uses simplified characters in Chinese-language schools. Chinese-language newspapers in the country are published in either simplified or traditional characters; often, headlines are printed with traditional forms, and the body with simplified forms.
In the Philippines, most Chinese schools and businesses still use the traditional characters with bopomofo, owing to influence from Taiwan due to shared Hokkien heritage. Recently, however, more Chinese schools now use both simplified characters and pinyin. Since most readers of Chinese newspapers in the Philippines belong to the older generation, they are still published largely using traditional characters.
In Taiwan, the Ministry of Education's Chart of Standard Forms of Common National Characters lists 4,808 characters; the Chart of Standard Forms of Less-Than-Common National Characters lists another 6,341 characters. The Chinese Standard Interchange Code (CNS11643), the official national encoding standard, supported 48,027 characters in its 1992 version and currently encodes over 96,000 characters, while BIG-5, the most widely used non-Unicode encoding, supports only 13,053. The Test of Chinese as a Foreign Language (TOCFL) covers 8,000 words at its highest level. The Taiwan Benchmarks for the Chinese Language (TBCL), a guideline designed to describe levels of Chinese language proficiency, covers 3,100 characters and 14,425 words at the highest level.
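The limited repertoires of the older regional encodings can be observed directly. The sketch below checks whether a few characters can be encoded with Python's built-in `gb2312`, `big5`, and `gb18030` codecs; these are Python's implementations, which may differ in detail from the latest revisions of the standards, and the helper name `encodable` is invented for this example.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the codec has a byte sequence for the given character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

CODECS = ("gb2312", "big5", "gb18030", "utf-8")

# 汉 is the simplified form, 漢 the traditional form, 字 is shared by both.
for ch in ("汉", "漢", "字"):
    print(ch, {codec: encodable(ch, codec) for codec in CODECS})
```

In this sketch the simplified-only and traditional-only forms each fail in the other region's legacy codec, while the larger GB 18030 and UTF-8 repertoires accept all three.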
In Hong Kong, which uses traditional characters, the Education and Manpower Bureau's List of Graphemes of Commonly-Used Chinese Characters, containing 4,759 characters, is intended for use in elementary and junior secondary education.
Chinese-language signage in the United States and Canada most often uses traditional characters. There is some effort to get municipal governments to implement more simplified character signage due to recent immigration from mainland China. Most Chinese-language newspapers in North America are printed using traditional characters.
After World War II, the Japanese government also instituted a series of orthographic reforms. Some characters were given simplified forms called shinjitai; the older forms were then labelled the kyūjitai. The use of numerous variant forms was discouraged, and lists of characters to be learned during each grade of school were created: first the 1850-character tōyō kanji list in 1945, and then the 1945-character jōyō kanji list in 1981, with a 2136-character revision in 2010. The Japanese government restricts characters that can be used in names to the jōyō kanji plus an additional list of 983 jinmeiyō kanji historically prevalent in names. While these lists serve as a guideline, unlisted characters are still widely used by native Japanese speakers, such as the kyūjitai form of 'dragon' (龍) alongside the shinjitai form (竜).
Just as letters in the Latin script have characteristic shapes—for example, with lowercase letters mostly occupying the x-height, and certain letters having distinctive ascenders or descenders—Chinese characters characteristically occupy a roughly square area within which components are to fit, in order to maintain a uniform size and shape—especially with small printed characters in Ming and sans-serif styles. Beginners often practise writing on graph paper with grid lines; Chinese people sometimes use the term 'square-block characters' (方块字; 方塊字; fāngkuàizì), also translated as 'tetragraphs', in reference to written characters.
Despite standardisation, use of certain non-standard forms has been common historically, especially in handwriting. In older sources, even authoritative ones, variant characters are easily found. While orthodox forms were mandatory in official and semi-official printed works, many printers produced works of varying quality, with errata including the deletion of passages, the apparent forgery of earlier styles, as well as the non-normative use of characters, portrayed either as incorrect variant forms or as outright typos. In the preface to the Kangxi Dictionary, there are 30 variant characters which are not found in the dictionary itself.
In certain cases, compound words and set phrases may be represented by single-character contractions. Some of these can be considered logograms, where characters represent whole words rather than syllable-morphemes, though these are generally considered as non-standard ligatures or abbreviations instead—similar to scribal abbreviations such as an ampersand for the digraph et, or an ñ for the digraph nn. These usually see use in handwriting or decorations, but sometimes in print as well. These ligatures are called 合文; héwén, 合书; 合書; héshū or 合体字; 合體字; hétǐzì in Chinese; in the special case where two characters are combined, they are known as 'two-syllable characters' (双音节汉字; 雙音節漢字; shuāngyīnjié hànzì).
A commonly seen example is the 'double happiness' character 囍, formed as a ligature of 喜喜 and referred to by its disyllabic name 双喜; 雙喜; shuāngxǐ. In handwriting, numbers are very frequently squeezed into one space or combined—common ligatures include 廿; niàn; 'twenty', normally read as 二十; èrshí, 卅; sà; 'thirty', normally read as 三十; sānshí, and 卌; xì; 'forty', normally read as 四十; sìshí in Standard Mandarin, though other Chinese varieties may differ. For example, 廿 is given a monosyllabic reading of jaa6 in Cantonese. Calendars often use numeral ligatures in order to save space, and in modern printings of the traditional Chinese calendar, the use of 廿 is standard. Thus, one would generally write "21 March" as 三月廿一.
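The calendar convention described above can be sketched in a few lines of Python. This is only an illustration: the function names `ligature_number` and `format_date` are hypothetical, and writing every tens place as a single character (including round numbers such as 20 and 30) and limiting the range to 1–49 are simplifying assumptions, not rules taken from any standard.

```python
# Ones digits 1-9; index 0 is empty because a zero ones digit is not written.
DIGITS = ["", "一", "二", "三", "四", "五", "六", "七", "八", "九"]
# Tens place as one character; 廿, 卅 and 卌 are the ligatures named in the text.
TENS = {0: "", 1: "十", 2: "廿", 3: "卅", 4: "卌"}


def ligature_number(n: int) -> str:
    """Write 1..49 with a single character for the tens place (assumed range)."""
    if not 1 <= n <= 49:
        raise ValueError("this sketch only covers 1 to 49")
    tens, ones = divmod(n, 10)
    return TENS[tens] + DIGITS[ones]


def format_date(month: int, day: int) -> str:
    """Format a month/day pair in the compact calendar style."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return ligature_number(month) + "月" + ligature_number(day)


print(format_date(3, 21))  # 三月廿一
```

A real calendar may follow further conventions that this sketch ignores, so the output should be read as a demonstration of the ligatures rather than as a formatting reference.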
Examples of modern contractions include characters sometimes used to represent SI units. In Chinese these units are disyllabic and usually written with two characters, as with 'centimetre' (厘米; límǐ) from 厘; 'centi-' and 米; 'metre', or 'kilowatt' (千瓦; qiānwǎ). However, in the 19th century these were often written via compound characters, pronounced disyllabically, such as 瓩 for 千瓦 or 糎 for 厘米—some of these characters were also used in Japan, where they were pronounced with borrowed European readings instead. These have now fallen out of general use, but are occasionally seen. Less systematic examples include 图; 圕 (tuān), a contraction of 图书馆; 圖書館 (túshūguǎn; 'library'). Since polysyllabic characters are often non-standard, they are often excluded from character dictionaries.
The use of such contractions is as old as the characters themselves, and they have frequently been used in religious or ritual contexts. In the oracle bone script, personal names, ritual items, and even phrases are commonly contracted into single characters, such as 受又 (shòu yòu; 'receive blessings') becoming 祐 (yòu). A dramatic example is found in medieval manuscripts, where 'bodhisattva' (菩萨; 菩薩; púsà) is sometimes contracted to a single character composed of four 十 arranged in a 2×2 grid—derived from the 艹 'GRASS' radicals present in the original characters. For the sake of consistency and standardisation, the Chinese government has sought to limit the contemporary use of polysyllabic characters in public writing.
Conversely, with the erhua phenomenon in Mandarin varieties, expressed via the fusion of the diminutive 儿; ér suffix, some monosyllabic words may be written with two characters, such as in huār (花儿; 'flower').
Chinese characters are primarily morphosyllabic, meaning that there is usually a one-to-one correspondence between Chinese morphemes and spoken Chinese syllables, and therefore written Chinese characters. However, in modern Chinese varieties most common words are disyllabic and therefore dimorphemic. In modern Standard Chinese, 10% of morphemes are bound forms, only appearing in compound words. A few morphemes, however, are disyllabic, some of which even date back to Classical Chinese. Excluding loanwords, these are typically words for plants and small animals, usually written with a pair of phono-semantic compounds sharing a common radical. Examples are 蝴蝶 (húdié; 'butterfly') and 珊瑚 (shānhú; 'coral'): the first character of 'butterfly' and the second character of 'coral' each have 胡 for a phonetic component, with the ⾍ 'INSECT' and ⽟ 'JADE' radicals as their respective semantic components, also present within the other character of each word. Neither of the aforementioned hú characters exists as an independent morpheme, except as poetic abbreviations of the disyllabic words.
Often a rare, complex, or antiquated variant character will appear in personal or place names. As many computer encoding systems have historically included only the most common characters, this can create problems. As a representative example, the name of Taiwanese politician Yu Shyi-kun contains the rare character 堃 (kūn); printing this character is often nontrivial. Newspapers covering Yu that encountered difficulty with his name dealt with this problem in various ways, including using software to combine two extant characters into a visually similar amalgam, embedding a picture of the character instead of encoding it as text, or substituting a homophonic character with the expectation that the reader would make the correct inference. Generally, printed materials in Taiwan will annotate such a character with bopomofo. Japanese newspapers often replace obscure characters with katakana instead, as is accepted practice in Japanese style guides.
There are also extremely stroke-rich characters, which tend to be rare. A notable example is 𪚥 (zhé; 'verbose'), which contains 64 strokes and fell out of use by the end of the 5th century. This character may not necessarily be seen as the most complex or difficult, as it simply requires writing the 16-stroke character 龍 (lóng; 'dragon') four times within the space allotted for one. Another 64-stroke character created in the same manner is 𠔻 (zhèng), composed of the character 興 (xīng, xìng; 'flourish') in quadruplicate.
One of the most complex characters found in modern Chinese dictionaries is 齉 (nàng; 'snuffle') with 36 strokes. Other stroke-rich characters include the triplicated 靐 (bìng) with 39 strokes, and the quadruplicated 䨻 (bèng) with 52, both meaning 'the loud noise of thunder'—however, these are not commonly used. As an example, the most complex character that can be input with a representative IME is 龘 (dá; 'appearance of a dragon in flight'). It is composed of the ⿓ 'DRAGON' radical in triplicate, having a total of 16 × 3 = 48 strokes. Among the most complex characters presently in common use are 籲 (yù; 'to implore') with 32 strokes, 鬱 (yù; 'luxuriant', 'lush', 'gloomy')—also the character in the jōyō kanji list having the most strokes, with 29—豔 (yàn; 'colourful') with 28, and 釁 (xìn; 'quarrel') with 25. Also occasionally in modern use is 鱻 (xiān; 'fresh'), a variant of 鮮 with 33 strokes.
In Japanese, an 84-stroke kokuji exists, normally read taito. It is composed of the 'cloud' character 䨺 atop the aforementioned triple-'dragon' character, and likewise means 'appearance of a dragon in flight'. Its readings are otodo (おとど), taito (たいと), and daito (だいと).
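The stroke totals quoted for these reduplicated characters follow from simple multiplication, which the short Python sketch below reproduces. Only the 16-stroke count of 龍 is stated in the text; the counts used for 興, 雷, 魚, and 雲 are assumptions inferred by dividing the quoted totals, and the helper name `repeated_strokes` is hypothetical.

```python
# Strokes of each repeated component. 龍 = 16 is stated in the text; the other
# values are inferred by dividing the totals quoted in the text.
BASE_STROKES = {"龍": 16, "興": 16, "雷": 13, "魚": 11, "雲": 12}


def repeated_strokes(component: str, copies: int) -> int:
    """Total strokes of a character built from `copies` copies of `component`."""
    return BASE_STROKES[component] * copies


totals = {
    "龍 x 3": repeated_strokes("龍", 3),  # 48
    "龍 x 4": repeated_strokes("龍", 4),  # 64
    "興 x 4": repeated_strokes("興", 4),  # 64
    "雷 x 3": repeated_strokes("雷", 3),  # 39
    "雷 x 4": repeated_strokes("雷", 4),  # 52
    "魚 x 3": repeated_strokes("魚", 3),  # 33
    # taito: triple 雲 stacked on triple 龍 (assumed decomposition)
    "taito": repeated_strokes("雲", 3) + repeated_strokes("龍", 3),  # 84
}

for label, strokes in totals.items():
    print(label, strokes)
```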
In addition, there are a number of 'dialect characters' (方言字; fāngyánzì) that are not generally used in formal written Chinese but represent colloquial terms in various spoken varieties of the language. In general, it is common practice to use standard characters to transcribe previously-unwritten words in Chinese dialects when obvious cognates exist. However, when no obvious cognate exists due to factors like irregular sound changes or semantic drift over time, or an origin in a non-Chinese language, like a substratum or loanword, then characters to transcribe it are borrowed according to the rebus principle, or invented in an ad hoc manner. These new characters are generally phono-semantic compounds, e.g. Min Nan 侬 ('person'), although there are examples of compound ideographs, e.g. northeast Mandarin 孬 ('bad').
There may be several ways to write a dialectal word: often, one that is etymologically correct, and one or several that are based on the word's pronunciation, e.g. the etymological 觸祭 versus the phonetic 戳鸡 (7tshoq1ci) in Shanghainese, meaning 'eat'. Speakers of a dialect will generally recognise a dialectal word if it is transcribed according to pronunciation, while the etymologically correct form may be more difficult to recognise. For example, few Gan speakers would recognise the character 隑 as meaning 'to lean' in their dialect, because this sense of the character is now archaic in Standard Mandarin.
As an exception, written Cantonese is widespread in Hong Kong, even for certain formal documents, due to the former British colonial administration's recognition of Cantonese for use for official purposes. In Taiwan, there is also a body of semi-official characters used to represent Taiwanese Hokkien and Hakka. An example of a Hakka vernacular character is 㓾 (cii11, 'kill'). Other varieties of Chinese with a significant number of speakers (like Shanghainese Wu, Gan Chinese, and Sichuanese Mandarin) also have their own series of characters, but these are not often seen except on advertising billboards directed toward locals, and are not used in formal settings except to give precise transcriptions of witness statements in legal proceedings. Standard Chinese is the preferred written language within every region of mainland China.
Dozens of schemes have been devised for indexing Chinese characters and sorting them into dictionaries. Most of these are specific to the dictionary for which they were invented, and relatively few have seen widespread use. Often, character dictionaries incorporate several means by which users may locate entries. Traditionally, methods for organising and sorting Chinese dictionaries have been divided into form-based orders, which sort by graphical properties such as constituent components; sound-based orders, usually based on an extant transliteration scheme such as pinyin or bopomofo; and meaning-based orders.
Many Chinese-, Japanese-, and Korean-language character dictionaries are indexed using a technique known as radical-and-stroke sorting, in which characters are grouped by common components called radicals, with radicals in turn ordered by number of strokes. The characters under each radical heading are in turn listed in order of their total number of strokes. Grouping by radical was introduced in the Shuowen Jiezi, which used 540 radicals. The 214 Kangxi radicals were introduced in the Zihui in 1615, and later popularised by the 1716 Kangxi Dictionary.
For example, to locate the character 松 ('pine tree') in such a dictionary, the user first determines which part of the character is the radical—here, the radical is ⽊ 'TREE'. One then counts the number of strokes in the radical (four). Within the radical index, usually located on one of the dictionary's inside covers, the page number of the section heading for ⽊ 'TREE' is listed, alongside those of the other radicals with four strokes. The user can then turn to the appropriate section heading, which will have a sub-index with page numbers that correspond to the number of strokes present in the remainder of the character. The right half of 松 also contains four strokes—upon turning to the corresponding page number, the user can now scan the entries to locate the character in question. Some dictionaries have a sub-index listing every character containing a given radical: if the user knows the number of strokes in the non-radical portion of the character, they can use this to obtain the page number directly.
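The lookup procedure above can be sketched as a sort key plus a two-step lookup. The following Python example is a minimal illustration under stated assumptions: the `Entry` type, the tiny six-character table, and the `lookup` helper are all hypothetical, and breaking ties between radicals of equal stroke count by Kangxi radical number is one common convention rather than something the text prescribes.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    radical: str
    radical_number: int    # position among the 214 Kangxi radicals
    radical_strokes: int   # strokes in the radical itself
    residual_strokes: int  # strokes in the rest of the character


# Hypothetical miniature dictionary; a real one would hold thousands of entries.
ENTRIES = [
    Entry("松", "木", 75, 4, 4),
    Entry("他", "人", 9, 2, 3),
    Entry("明", "日", 72, 4, 4),
    Entry("村", "木", 75, 4, 3),
    Entry("好", "女", 38, 3, 3),
    Entry("木", "木", 75, 4, 0),
]


def sort_key(entry: Entry) -> tuple:
    """Order by radical stroke count, then radical, then remaining strokes."""
    return (entry.radical_strokes, entry.radical_number, entry.residual_strokes)


def lookup(radical: str, residual_strokes: int) -> list:
    """Mimic the manual search: pick the radical section, then the stroke group."""
    return [
        e.char
        for e in sorted(ENTRIES, key=sort_key)
        if e.radical == radical and e.residual_strokes == residual_strokes
    ]


print("".join(e.char for e in sorted(ENTRIES, key=sort_key)))  # 他好明木村松
print(lookup("木", 4))  # ['松']
```

Within a single radical section, ordering by residual strokes gives the same result as ordering by total strokes, since the radical's own stroke count is constant.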
Another form-based system is the four-corner method, in which characters are classified according to the shapes at each of the rectilinear character's corners. In modern Chinese, characters and words are also ordered by their frequency, as determined by use within a corpus, often with the aid of a computerised database. Important stroke-based sorting methods include stroke-count sorting, stroke-count-stroke-order sorting, GB stroke-based sorting and YES sorting.
Most modern Chinese dictionaries arrange the main character entries alphabetically according to pinyin spelling, but also provide a traditional radical-based index in the front of the dictionary. To find a character with an unknown pronunciation using one of these dictionaries, the reader determines the radical and stroke number of the character, as before, and locates the character in the radical index. The character's entry will give the character's pronunciation in pinyin or the page number of the main character entry.
Studies within China have suggested that literate individuals have an active vocabulary of between 3,000 and 4,000 characters, while specialists in fields like classical literature or history are estimated to have a working vocabulary of between 5,000 and 6,000 characters.
The Japanese Ministry of Education has designated 2,136 jōyō kanji to be taught in primary and secondary school. Today, a well-educated Japanese person may know upwards of 3,500 characters. The kanji kentei tests a speaker's ability to read and write kanji. Its highest level tests candidates on the full JIS X 0208 list, which includes over 6,000 kanji.
The South Korean Basic Hanja for Educational Use is a set of 1,800 characters standardised in 1972, with the first 900 hanja taught to middle school students, and the rest taught to high school students.
In March 1991, the Supreme Court of Korea published the 2,854-character Table of Hanja for Use in Personal Names. The list expanded gradually: by 2015 there were 8,142 hanja, including the set of basic hanja, permitted for use in Korean names.
Ostensibly, Chinese characters can be created and used arbitrarily, though they are unlikely to gain widespread use or inclusion in official character sets. Counting the entries within major Chinese dictionaries is a viable means of estimating the growth of the character inventory over time.
Estimates of the total number of characters in modern use can be sourced from encoding schemes and dictionaries: according to sources from mainland China, Taiwan, Hong Kong, Japan, and Korea, this number is likely around 15,000. For comparison, Unicode encodes over 90,000 CJK Unified Ideographs.
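As a rough illustration of how an encoding scheme can be queried, the Python sketch below uses the standard library's `unicodedata.name` to test whether a character is a CJK Unified Ideograph. The helper names `is_cjk_unified` and `count_cjk_unified` are hypothetical, and the result depends on the Unicode version bundled with the interpreter: characters added in later versions may be reported as unnamed and therefore not counted.

```python
import unicodedata


def is_cjk_unified(ch: str) -> bool:
    """True if the character's Unicode name marks it as a CJK Unified Ideograph."""
    # The second argument is returned instead of raising for unnamed characters.
    name = unicodedata.name(ch, "")
    return name.startswith("CJK UNIFIED IDEOGRAPH-")


def count_cjk_unified(text: str) -> int:
    """Count the CJK Unified Ideographs in a string."""
    return sum(is_cjk_unified(ch) for ch in text)


print(is_cjk_unified("松"))            # True
print(is_cjk_unified("か"))            # False (hiragana)
print(count_cjk_unified("漢字とkanji"))  # 2
```

This name-based test deliberately excludes compatibility ideographs, radicals, and kana, whose Unicode names begin differently.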
|Year||Dictionary||Number of characters|
|1915||Zhonghua Da Zidian||48,000|
|1989||Hanyu Da Zidian||54,678|
|2017||Dictionary of Chinese Character Variants||106,330|

|Year||Dictionary||Language||Number of characters|
|2003||Dai Kan-Wa Jiten||Japanese||50,305|
|2008||Han-Han Dae Sajeon||Korean||53,667|
Even the Zhonghua Zihai does not include characters in the Chinese family of scripts created to represent non-Chinese languages, except the unique characters in use in Japan and Korea. Characters formed by Chinese principles in other languages include the roughly 1,500 Japanese-made kokuji given in the Kokuji no Jiten, the Korean gukja, the over 10,000 sawndip characters still in use in Guangxi, and the almost 20,000 Nôm characters historically used in Vietnam. More divergent descendants include the Tangut script, which created over 5,000 characters with visually similar strokes to Chinese characters, but different principles of formation.
Modified radicals and new variants are two common reasons for the ever-increasing number of characters. There are about 300 radicals, with 100 being in common use. Creating a new character by modifying the radical is an easy way to disambiguate homographs among picto-phonetic compounds (xíngshēngzì). This practice began long before the Qin standardisation of Chinese script. The third-person personal pronoun 他 (tā) written with the ⼈ 'MAN' radical, traditionally used regardless of the target's gender or animacy, illustrates modifying signifiers in order to form new characters. In modern written Chinese, further graphical distinctions have been made between 她 ('she') with the ⼥ 'WOMAN' radical, 牠 ('it') with the 牜 'COW' radical, 它 ('it') with the ⼧ 'ROOF' radical, and 祂 ('He') with the 礻 'SPIRIT' radical, though all are pronounced identically as tā. One consequence of modifying radicals is the fossilisation of rare and obscure variant logographs, some of which are not even found in Classical Chinese texts. For instance, 和 (hé; 'harmony', 'peace'), which combines the ⽲ 'GRAIN' radical with the ⼝ 'MOUTH' radical, has variants 咊 with the components' positions reversed, and 龢 with ⿕ 'FLUTE' replacing ⼝ 'MOUTH'.
Kanji are the logographic Chinese characters taken from the Chinese script used in the writing of Japanese. They were made a major part of the Japanese writing system during the time of Old Japanese and are still used, along with the subsequently-derived syllabic scripts of hiragana and katakana. The characters have Japanese pronunciations; most have two, with one based on the Chinese sound. A few characters were invented in Japan by constructing character components derived from other Chinese characters. After the Meiji Restoration, Japan made its own efforts to simplify the characters, now known as shinjitai, by a process similar to China's simplification efforts, with the intention to increase literacy among the common folk. Since the 1920s, the Japanese government has published character lists periodically to help direct the education of its citizenry through the myriad Chinese characters that exist. There are nearly 3,000 kanji used in Japanese names and in common communication.
In a written language, a logogram, logograph, or lexigraph is a written character that represents a word or morpheme. Chinese characters are generally logograms, as are many hieroglyphic and cuneiform characters. The use of logograms in writing is called logography, and a writing system that is based on logograms is called a logography or logographic system. All known logographies have some phonetic component, generally based on the rebus principle.
Written Chinese comprises Chinese characters used to represent the Chinese languages. Chinese characters do not constitute an alphabet or a compact syllabary. Rather, the writing system is logosyllabic: a character generally corresponds to one syllable of spoken Chinese, and may either be a word on its own or a part of a polysyllabic word. The characters themselves are often composed of parts that may represent physical objects, abstract notions, or pronunciation. Literacy requires the memorization of a great number of characters: college-educated Chinese speakers know about 4,000. The large number of Chinese characters has in part led to the adoption of Western alphabets or other complementary systems as auxiliary means of representing Chinese.
A Chinese radical or indexing component is a graphical component of a Chinese character under which the character is traditionally listed in a Chinese dictionary. This component is often a semantic indicator similar to a morpheme, though sometimes it may be a phonetic component or even an artificially extracted portion of the character. In some cases the original semantic or phonological connection has become obscure, owing to changes in character meaning or pronunciation over time.
Traditional Chinese characters refer to one of several standard sets of characters used to write Chinese languages. In Taiwan, the set of traditional characters is regulated by Taiwan's Ministry of Education, standardized in the Standard Form of National Characters. These forms were predominant in written Chinese until the middle of the 20th century, when various countries that use Chinese characters began standardizing simplified sets of characters, often with characters that existed before as well-known variants of the predominant forms.
Stroke order is the order in which the strokes of a Chinese character are written. A stroke is a movement of a writing instrument on a writing surface. Chinese characters are used in various forms in Chinese, Japanese, and Korean. They are known as Hanzi in (Mandarin) Chinese, kanji in Japanese (かんじ), and Hanja in Korean (한자).
Man'yōgana is an ancient writing system that uses Chinese characters to represent the Japanese language. It was the first known kana system to be developed as a means to represent the Japanese language phonetically. The date of the earliest usage of this type of kana is not clear, but it was in use since at least the mid-7th century. The name "man'yōgana" derives from the Man'yōshū, a Japanese poetry anthology from the Nara period written with man'yōgana.
All Chinese characters are logograms, but can be further categorised based on the manner of their creation or derivation. Some characters may be analysed structurally as compounds created from smaller components, while some are not decomposable in this way. A small number of characters originate as pictographs and ideograms, but the vast majority are what are often called phono-semantic compounds.
Semi-cursive script, also known as running hand script, is a style of calligraphy which emerged in China during the Han dynasty. The style is used to write Chinese characters and is abbreviated slightly where a character’s strokes are permitted to be visibly connected as the writer writes, but not to the extent of the cursive style. This makes the style easily readable by readers who can read regular script and quickly writable by calligraphers who require ideas to be written down quickly. In order to produce legible work using the semi-cursive style, a series of writing conventions is followed, including the linking of the strokes, simplification and merging strokes, adjustments to stroke order and the distribution of text of the work.
Shinjitai are the simplified forms of kanji used in Japan since the promulgation of the Tōyō Kanji List in 1946. Some of the new forms found in shinjitai are also found in Simplified Chinese characters, but shinjitai is generally not as extensive in the scope of its modification.
In the Japanese language, ryakuji are colloquial simplifications of kanji.
Variant Chinese characters are Chinese characters that are homophones and synonyms. Most variants are allographs in most circumstances, such as casual handwriting. Some contexts require the usage of certain variants, such as in textbook editing.
The Tangut script is a logographic writing system used for writing the extinct Tangut language of the Western Xia dynasty. According to the latest count, 5863 Tangut characters are known, excluding variants. The Tangut characters are similar in appearance to Chinese characters, with the same type of strokes, but the methods of forming characters in the Tangut writing system are significantly different from those of forming Chinese characters. As in Chinese calligraphy, regular, running, cursive and seal scripts were used in Tangut writing.
The debate on traditional Chinese characters and simplified Chinese characters is an ongoing dispute concerning Chinese orthography among users of Chinese characters. It has stirred up heated responses from supporters of both sides in mainland China, Hong Kong, Macau, Taiwan, and among overseas Chinese communities with its implications of political ideology and cultural identity. Simplified characters here exclusively refer to those characters simplified by the People's Republic of China (PRC), instead of the concept of character simplification as a whole. The effect of simplified characters on the language remains controversial, decades after their introduction.
The Chinese family of scripts are writing systems descended from the Chinese oracle bone script and used for a variety of languages in East Asia. They include logosyllabic systems such as the Chinese script itself, and adaptations to other languages, such as kanji (Japanese), Hanja (Korean), chữ Hán and chữ Nôm (Vietnamese), Sawgun and Sawndip (Zhuang) and Bowen (Bai). More divergent are Tangut, Khitan large script, and its offspring Jurchen, as well as the Yi script, the Sui script and Geba script, which were inspired by Chinese although not directly descended from it. The partially deciphered Khitan small script may be another. In addition, various phonetic scripts descend from Chinese characters, of which the best known are the various kana syllabaries, the semi-syllabary bopomofo, nüshu, and lisu.
In Chinese calligraphy, Chinese characters can be written according to five major styles. These styles are intrinsically linked to the history of Chinese script.
Radical 140 or radical grass (艸部) meaning "grass" is one of 29 of the 214 Kangxi radicals that are composed of 6 strokes. It transforms into 艹 when appearing at the top of a character or component. In the Kangxi Dictionary and in modern standard Traditional Chinese as used in Taiwan, Hong Kong and Macau, 艹 consists of four strokes, while in Simplified Chinese and modern Japanese, 艹 consists of three strokes.
Radical 113 or radical spirit (示部) meaning ancestor or veneration is number 113 out of the 214 Kangxi radicals. It is one of the 23 radicals composed of 5 strokes. When appearing at the left side of a character, the radical transforms into 礻 in modern Chinese and Japanese jōyō kanji.
Radical 120 or radical silk (糸部) meaning "silk" is one of the 29 Kangxi radicals composed of 6 strokes.
Modern Chinese characters are the Chinese characters used in modern languages, especially in Standard Mandarin Chinese.
Often, the Chinese character can function as an independent unit in sentences, but sometimes it must be paired with another character or more to form a word. [...] Most words consist of two or more characters, and more than 80 per cent make use of lexical compounding of morphemes (Packard, 2000).
In the highest antiquity, government was carried on successfully by the use of knotted cords (to preserve the memory of things). In subsequent ages the sages substituted for these written characters and bonds. By means of these (the doings of) all the officers could be regulated, and (the affairs of) all the people accurately examined.
...East Asia had been among the first regions of the world to produce written records of the past. Well into modern times Chinese script, the common script across East Asia, served—with local adaptations and variations—as the normative medium of record-keeping and written historical narrative, as well as official communication. This was true, not only in China itself, but in Korea, Japan, and Vietnam.