
Transliteration is a type of conversion of a text from one script to another that involves swapping letters (thus trans- + liter-) in predictable ways, such as Greek ⟨α⟩ → ⟨a⟩, Cyrillic ⟨д⟩ → ⟨d⟩, Greek ⟨χ⟩ → the digraph ⟨ch⟩, Armenian ⟨ն⟩ → ⟨n⟩ or Latin ⟨æ⟩ → ⟨ae⟩. [1]


For instance, for the Modern Greek term "Ελληνική Δημοκρατία", which is usually translated as "Hellenic Republic", the usual transliteration into Latin script is ⟨Ellīnikī Dīmokratia⟩, and the name for Russia in Cyrillic script, "Россия", is usually transliterated as ⟨Rossiya⟩; the transliterated form represents the same word and is pronounced the same way as "Россия".

Transliteration is the process of representing a word, phrase, or text in a different script or writing system. It is concerned primarily with representing the characters of the original, ideally accurately and unambiguously, rather than its sounds, although readers of the target script can often use it to approximate the original pronunciation. A transliteration does not change how the word is pronounced. Thus, in the Greek example above, ⟨λλ⟩ is transliterated ⟨ll⟩ though it is pronounced [l], exactly like a single ⟨λ⟩; ⟨Δ⟩ is transliterated ⟨D⟩ though it is pronounced [ð]; and ⟨η⟩ is transliterated ⟨ī⟩ though it is pronounced [i] (exactly like ⟨ι⟩) and is not long.

Transcription, conversely, seeks to capture sound rather than spelling, approximating it phonetically in the new script; "Ελληνική Δημοκρατία" corresponds to [elinicíðimokratía] [2] in the International Phonetic Alphabet. While the distinction between the letters is lost in the case of [i], note how the letter ⟨κ⟩ becomes either [c] or [k] depending on the vowel that follows it. Because [elinicíðimokratía] is a phonetic approximation, a transcription rather than a transliteration, a reading of it may also differ slightly from the original Greek pronunciation.

Angle brackets ⟨ ⟩ may be used to set off transliteration, as opposed to slashes / / for phonemic transcription and square brackets [ ] for phonetic transcription. Angle brackets may also be used to set off characters in the original script. Conventions and author preferences vary.


Systematic transliteration is a mapping from one system of writing into another, typically grapheme to grapheme. Most transliteration systems are one-to-one, so a reader who knows the system can reconstruct the original spelling.
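
The grapheme-to-grapheme idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table covers a handful of unaccented lowercase Greek letters taken from the examples in this article, not a complete standard, and the names `GREEK_TO_LATIN`, `transliterate` and `reverse_transliterate` are hypothetical.

```python
# Illustrative table only: a few unaccented lowercase Greek letters,
# not a complete or official transliteration standard.
GREEK_TO_LATIN = {
    "α": "a", "δ": "d", "ε": "e", "η": "ī", "ι": "i", "κ": "k",
    "λ": "l", "μ": "m", "ν": "n", "ο": "o", "ρ": "r", "τ": "t",
    "υ": "y", "χ": "ch",
}

# Because the table is one-to-one, it can simply be inverted.
LATIN_TO_GREEK = {latin: greek for greek, latin in GREEK_TO_LATIN.items()}


def transliterate(text):
    """Replace each known Greek letter; leave everything else unchanged."""
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in text)


def reverse_transliterate(text):
    """Rebuild the Greek spelling, trying longer Latin units (such as "ch") first."""
    units = sorted(LATIN_TO_GREEK, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for unit in units:
            if text.startswith(unit, i):
                out.append(LATIN_TO_GREEK[unit])
                i += len(unit)
                break
        else:
            # Character not covered by the table: copy it through.
            out.append(text[i])
            i += 1
    return "".join(out)


word = "ελληνικη"
latin = transliterate(word)
print(latin)
print(reverse_transliterate(latin) == word)
```

With this table, a round trip through `transliterate` and `reverse_transliterate` returns the original spelling. The reverse direction tries longer units first so that the digraph ⟨ch⟩ is read as one unit rather than as two separate letters.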

Transliteration is opposed to transcription, which maps the sounds of one language into a writing system. Still, most systems of transliteration map the letters of the source script to letters pronounced similarly in the target script, for some specific pair of source and target language. Transliteration may be very close to transcription if the relations between letters and sounds are similar in both languages. In practice, there are some mixed transliteration/transcription systems that transliterate a part of the original script and transcribe the rest.

For many script pairs, there are one or more standard transliteration systems. However, unsystematic transliteration is common.

Difference from transcription

In Modern Greek, the letters ⟨η⟩, ⟨ι⟩, ⟨υ⟩ and the letter combinations ⟨ει⟩, ⟨οι⟩, ⟨υι⟩ are pronounced [i] (except when pronounced as semivowels), and a modern transcription renders them all as ⟨i⟩; a transliteration, by contrast, distinguishes them, for example as ⟨ī⟩, ⟨i⟩, ⟨y⟩ and ⟨ei⟩, ⟨oi⟩, ⟨yi⟩. (As the ancient pronunciation of ⟨η⟩ was the long vowel [ɛː], it is often transliterated with a macron, here as ⟨ī⟩, even for modern texts.) On the other hand, ⟨ευ⟩ is sometimes pronounced [ev] and sometimes [ef], depending on the following sound. A transcription distinguishes the two, but a transliteration need not and renders both as ⟨eu⟩. The initial letter ⟨h⟩ reflecting the historical rough breathing in words such as Hellēnikē should logically be omitted in transcription from Koine Greek on, [3] and in transliteration from 1982 on, but it is nonetheless frequently encountered.

| Greek word | Transliteration | Transcription | English translation |
|---|---|---|---|
| Ελληνική Δημοκρατία | Ellīnikī Dīmokratia | Eliniki Dimokratia | Hellenic Republic |
| των υιών | tōn yiōn | ton ion | of the sons |
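
The contrast in the second example can be sketched in Python as follows. This is a minimal illustration, not a full romanization scheme: both tables are assumptions that cover only the letters needed here, accents are simply dropped, and the names `TRANSLITERATION`, `TRANSCRIPTION`, `strip_accents` and `convert` are hypothetical.

```python
import unicodedata

# Letter-by-letter table: distinct Greek letters stay distinct.
TRANSLITERATION = {
    "η": "ī", "ι": "i", "υ": "y", "ω": "ō",
    "ο": "o", "ε": "e", "τ": "t", "ν": "n",
}

# Sound-based table: several spellings collapse to the same output.
TRANSCRIPTION = {
    "ει": "i", "οι": "i", "υι": "i",
    "η": "i", "ι": "i", "υ": "i",
    "ω": "o", "ο": "o", "ε": "e", "τ": "t", "ν": "n",
}


def strip_accents(text):
    """Drop combining marks such as the Greek tonos (a simplification)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def convert(text, table):
    """Apply a table, trying two-letter units before single letters."""
    text = strip_accents(text.lower())
    out = []
    i = 0
    while i < len(text):
        for size in (2, 1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # Not in the table (for example a space): copy it through.
            out.append(text[i])
            i += 1
    return unicodedata.normalize("NFC", "".join(out))


phrase = "των υιών"
print(convert(phrase, TRANSLITERATION))
print(convert(phrase, TRANSCRIPTION))
```

For "των υιών" the two tables give the forms shown in the table above. The transcription table maps ⟨η⟩, ⟨ι⟩, ⟨υ⟩, ⟨ει⟩, ⟨οι⟩ and ⟨υι⟩ all to ⟨i⟩, so the original spelling can no longer be recovered from its output, whereas the transliteration table keeps the letters apart.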


A simple example of the difficulties of transliteration is the Arabic letter qāf. In literary Arabic it is pronounced approximately like English [k], except that the tongue makes contact not on the soft palate but on the uvula; the pronunciation varies, however, between dialects of Arabic. In English the letter is sometimes transliterated as "g", sometimes as "q" or " ' " (since in Egypt it is silent), and rarely even as "k". [4] Another example is the Russian letter "Х" (kha). It is pronounced as the voiceless velar fricative /x/, like the Scottish pronunciation of ⟨ch⟩ in "loch". This sound is not present in most forms of English and is often transliterated as "kh", as in Nikita Khrushchev. Many languages have phonemic sounds, such as click consonants, that are quite unlike any phoneme in the language into which they are being transliterated.

Some languages and scripts present particular difficulties to transcribers. These are discussed on separate pages.




  1. "Transliteration". Retrieved 26 April 2021.
  2. Use of the acute accent to mark stress rather than tone is not formally IPA-compliant, but serves in this example to parallel orthography.
  3. See Koine Greek phonology.
  4. "Language log".