An orthography is a set of conventions for writing a language. It includes norms of spelling, hyphenation, capitalization, word breaks, emphasis, and punctuation.


Most transnational languages in the modern period have a system of writing, and for most such languages a standard orthography has been developed, often based on a standard variety of the language, and thus exhibiting less dialect variation than the spoken language. [1] [2] Sometimes there may be variation in a language's orthography, as between American and British spelling in the case of English orthography. In some languages orthography is regulated by language academies, although for many languages (including English) there are no such authorities. Even in the latter languages, a significant amount of consensus arises naturally, although maximal consistency or standardization is reached only when conventions are prescriptively imposed through style guides.

Etymology and meaning

The English word orthography dates from the 15th century. It comes from the French orthographie, from Latin orthographia, which derives from Greek ὀρθός orthós, "correct", and γράφειν gráphein, "to write". [3]

Orthography is largely concerned with matters of spelling, and in particular the relationship between phonemes and graphemes in a language. [4] [5] Other elements that may be considered part of orthography include hyphenation, capitalization, word breaks, emphasis, and punctuation. [6] Orthography thus describes or defines the set of symbols used in writing a language, and the rules regarding how to use those symbols.

Most natural languages developed as oral languages, and writing systems have usually been crafted or adapted as ways of representing the spoken language. The rules for doing this tend to become standardized for a given language, leading to the development of an orthography that is generally considered "correct". In linguistics the term orthography is often used to refer to any method of writing a language, without judgment as to right and wrong, on the understanding that orthographic standardization exists on a spectrum of strength of convention. The original sense of the word, though, implies a dichotomy of correct and incorrect, and the word is still most often used to refer specifically to a thoroughly standardized, prescriptively correct way of writing a language. A distinction may be made here between etic and emic viewpoints: the purely descriptive (etic) approach simply considers any system that is actually used, whereas the emic view takes account of language users' perceptions of correctness.

Units and notation

Orthographic units, such as letters of an alphabet, are technically called graphemes. These are a type of abstraction, analogous to the phonemes of spoken languages; different physical forms of written symbols are considered to represent the same grapheme if the differences between them are not significant for meaning. For example, different forms of the letter "b" are all considered to represent a single grapheme in the orthography of, say, English.

Graphemes or sequences of them are sometimes placed between angle brackets, as in ⟨b⟩ or ⟨back⟩. This distinguishes them from phonemic transcription, which is placed between slashes (/b/, /bæk/), and from phonetic transcription, which is placed between square brackets ([b], [bæk]).
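The three bracketing conventions can be sketched in a few lines of Python. The function names are hypothetical and chosen only for illustration; the angle brackets used are U+27E8 and U+27E9, one common choice among several.

```python
def as_grapheme(text: str) -> str:
    """Mark text as a grapheme (or grapheme sequence) with angle brackets."""
    return f"⟨{text}⟩"


def as_phonemic(text: str) -> str:
    """Mark text as a phonemic transcription with slashes."""
    return f"/{text}/"


def as_phonetic(text: str) -> str:
    """Mark text as a phonetic transcription with square brackets."""
    return f"[{text}]"


# The same word shown at the three levels of representation.
print(as_grapheme("back"), as_phonemic("bæk"), as_phonetic("bæk"))
```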


The writing systems on which orthographies are based can be divided into a number of types, depending on what type of unit each symbol serves to represent. The principal types are logographic (with symbols representing words or morphemes), syllabic (with symbols representing syllables), and alphabetic (with symbols roughly representing phonemes). Many writing systems combine features of more than one of these types, and a number of detailed classifications have been proposed. Japanese is an example of a writing system that can be written using a combination of logographic kanji characters and syllabic hiragana and katakana characters; as with many non-alphabetic languages, alphabetic romaji characters may also be used as needed. [7]
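As a rough illustration of how one Japanese sentence mixes symbol types, the following Python sketch labels each character by the prefix of its Unicode character name. This is a simplification made for the example: the helper names are hypothetical, and a name-prefix test is only an approximation of script membership (it ignores, for instance, fullwidth Latin letters and combining voicing marks).

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Classify one character by the prefix of its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"      # logographic
    if name.startswith("HIRAGANA"):
        return "hiragana"   # syllabic
    if name.startswith("KATAKANA"):
        return "katakana"   # syllabic
    if name.startswith("LATIN"):
        return "latin"      # alphabetic (romaji)
    return "other"


def script_counts(text: str) -> Counter:
    """Count how many characters of each type occur in the text."""
    return Counter(script_of(ch) for ch in text)


# "I buy a camera": kanji for the word stems, hiragana for the
# particles and inflection, katakana for the loanword.
print(script_counts("私はカメラを買う"))
```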

Correspondence with pronunciation

Orthographies that use alphabets and syllabaries are based on the principle that the written symbols (graphemes) correspond to units of sound of the spoken language: phonemes in the former case, and syllables in the latter. However, in virtually all cases, this correspondence is not exact. Different languages' orthographies offer different degrees of correspondence between spelling and pronunciation. English orthography, French orthography and Danish orthography, for example, are highly irregular, whereas the orthographies of languages such as Russian, German and Spanish represent pronunciation much more faithfully, although the correspondence between letters and phonemes is still not exact. Finnish, Turkish and Serbo-Croatian orthographies are remarkably consistent, closely approximating the principle of "one letter per sound".

An orthography in which the correspondences between spelling and pronunciation are highly complex or inconsistent is called a deep orthography (or less formally, the language is said to have irregular spelling). An orthography with relatively simple and consistent correspondences is called shallow (and the language has regular spelling).
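The contrast between deep and shallow can be made concrete with a crude proxy: the average number of pronunciations per grapheme. The sketch below is not a standard measure of orthographic depth, and both tables are hand-made toy data rather than measurements of any real language; the "deep" entries simply reuse English examples mentioned in this article.

```python
def mean_pronunciations(mapping: dict[str, set[str]]) -> float:
    """Average number of distinct pronunciations per written unit."""
    return sum(len(sounds) for sounds in mapping.values()) / len(mapping)


# Toy "shallow" table: every grapheme has exactly one pronunciation.
SHALLOW_TOY = {"a": {"a"}, "k": {"k"}, "t": {"t"}}

# Toy "deep" table: some written units have several pronunciations.
DEEP_TOY = {"th": {"θ", "ð"}, "ed": {"t", "d", "ɪd"}, "b": {"b"}}

print(mean_pronunciations(SHALLOW_TOY))
print(mean_pronunciations(DEEP_TOY))
```

A value of 1.0 corresponds to the "one letter per sound" ideal in the spelling-to-sound direction; larger values indicate less predictable pronunciation.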

One of the main reasons for which spelling and pronunciation deviate is that sound changes taking place in the spoken language are not always reflected in the orthography, and hence spellings correspond to historical rather than present-day pronunciation. One consequence of this is that many spellings come to reflect a word's morphophonemic structure rather than its purely phonemic structure (for example, the English regular past tense morpheme is consistently spelled -ed in spite of its different pronunciations in various words). This is discussed further at Phonemic orthography § Morphophonemic features.
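The -ed example can be sketched as a small rule: the spelling stays constant while the pronunciation depends on the last sound of the stem. The function below follows a common textbook description and is a simplification; the set of voiceless consonants is abbreviated, the name is hypothetical, and some accents use /əd/ rather than /ɪd/.

```python
# Voiceless consonants (other than /t/) that can end an English verb stem.
VOICELESS = {"p", "k", "f", "s", "ʃ", "tʃ", "θ"}


def past_tense_suffix(final_phoneme: str) -> str:
    """Return the pronunciation of -ed given the stem's final phoneme."""
    if final_phoneme in {"t", "d"}:
        return "ɪd"   # wanted, needed
    if final_phoneme in VOICELESS:
        return "t"    # walked, kissed
    return "d"        # played, hugged (vowels and voiced consonants)


# Three pronunciations, one spelling.
for verb, final in [("walk", "k"), ("play", "eɪ"), ("want", "t")]:
    print(f"{verb}ed -> /{past_tense_suffix(final)}/")
```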

The syllabary systems of Japanese (hiragana and katakana) are examples of almost perfectly shallow orthographies—the kana correspond with almost perfect consistency to the spoken syllables, although with a few exceptions where symbols reflect historical or morphophonemic features: notably the use of ぢ ji and づ zu (rather than じ ji and ず zu, their pronunciation in standard Tokyo dialect) when the character is a voicing of an underlying ち or つ (see rendaku), and the use of は, を, and へ to represent the sounds わ, お, and え, as relics of historical kana usage.

The Korean hangul system was also originally an extremely shallow orthography, but as a representation of the modern language it frequently also reflects morphophonemic features.

For full discussion of degrees of correspondence between spelling and pronunciation in alphabetic orthographies, including reasons why such correspondence may break down, see Phonemic orthography.

Defective orthographies

An orthography based on the principle that symbols correspond to phonemes may, in some cases, lack characters to represent all the phonemes or all the phonemic distinctions in the language. This is called a defective orthography. An example in English is the lack of any indication of stress. Another is the digraph ⟨th⟩, which represents two different phonemes (as in then and thin) and replaced the old letters ⟨ð⟩ and ⟨þ⟩. A more systematic example is that of abjads like the Arabic and Hebrew alphabets, in which the short vowels are normally left unwritten and must be inferred by the reader.
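The abjad case can be illustrated by taking a fully pointed Hebrew word and removing the optional vowel points, which leaves the consonantal skeleton that readers normally see. This is a minimal sketch using Python's standard unicodedata module; the function name is hypothetical, and the filter is deliberately blunt, since it removes every nonspacing combining mark and not only vowel points.

```python
import unicodedata


def strip_points(text: str) -> str:
    """Remove all nonspacing combining marks (category Mn) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )


# "shalom" written with vowel points: shin + qamats + shin dot,
# lamed, vav + holam, final mem.
pointed = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"

# Only the four base letters remain; the reader must supply the vowels.
print(strip_points(pointed))
```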

When an alphabet is borrowed from its original language for use with a new language (as has been done with the Latin alphabet for many languages, or Japanese katakana for non-Japanese words), it often proves defective in representing the new language's phonemes. Sometimes this problem is addressed by the use of such devices as digraphs (such as sh and ch in English, where pairs of letters represent single sounds), diacritics (like the caron on the letters š and č, which represent those same sounds in Czech), or the addition of completely new symbols (as some languages have introduced the letter w to the Latin alphabet) or of symbols from another alphabet, such as the rune þ in Icelandic.
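The difference between the digraph and diacritic strategies shows up when written words are split into units. The sketch below greedily groups a small, assumed set of English digraphs and, separately, shows that a letter with a caron can be decomposed into a base letter plus a combining mark. The names are hypothetical, and the segmentation is purely spelling-based, so it wrongly groups the s and h of "mishap", where they belong to different sounds.

```python
import unicodedata

# A small illustrative set; real English has more multi-letter units.
DIGRAPHS = frozenset({"sh", "ch", "th"})


def segment(word: str, multigraphs=DIGRAPHS) -> list[str]:
    """Split a word into units, preferring a known digraph over one letter."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in multigraphs:
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


print(segment("church"))   # digraphs treated as single units
print(segment("mishap"))   # limitation: "sh" is grouped here too

# The diacritic route: one letter, decomposable into s + combining caron.
print([unicodedata.name(ch) for ch in unicodedata.normalize("NFD", "\u0161")])
```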

After the classical period, Greek developed a lowercase letter system that introduced diacritic marks to enable foreigners to learn pronunciation and, in some cases, grammatical features. However, as the pronunciation of letters changed over time, the diacritic marks were reduced to representing the stressed syllable. In Modern Greek typesetting, this system has been simplified to a single accent that indicates which syllable is stressed. [8]


  1. Ulrich Ammon (2004), "Standard variety", Sociolinguistics, 1, Walter de Gruyter, pp. 273–283, ISBN 978-3-11-014189-4
  2. Coulmas, Florian; Guerini, Federica (2012), "Literacy and Writing Reform", in Spolsky, Bernard (ed.), The Cambridge Handbook of Language Policy, Cambridge University Press, p. 454f
  3. orthography, Online Etymology Dictionary
  4. Seidenberg, Mark S. 1992. "Beyond Orthographic Depth in Reading: Equitable Division of Labor." In: Ram Frost & Leonard Katz (eds.). Orthography, Phonology, Morphology, and Meaning, pp. 85–118. Amsterdam: Elsevier, p. 93.
  5. Donohue, Mark. 2007. "Lexicography for Your Friends." In Terry Crowley, Jeff Siegel, & Diana Eades (eds.). Language Description, History and Development: Linguistic Indulgence in Memory of Terry Crowley. pp. 395–406. Amsterdam: Benjamins, p. 396.
  6. Coulmas, Florian. 1996. The Blackwell Encyclopedia of Writing Systems. Oxford: Blackwell, p. 379.
  7. Koda, Keiko; Zehler, Annette M. (Mar 3, 2008). Learning to Read Across Languages. Routledge. p. 17.
  8. Bulley, Michael. 2011. "Spelling Reform: A Lesson from the Greeks". English Today, 27(4), p. 71. doi:10.1017/S0266078411000575

Further reading