Scientific transliteration of Cyrillic

Scientific transliteration, variously called academic, linguistic, international, or scholarly transliteration, is an international system for transliteration of text from the Cyrillic script to the Latin script (romanization). This system is most often seen in linguistics publications on Slavic languages.

Scientific transliteration of Cyrillic into Latin was first introduced in 1898 as part of the standardization process for the Preußische Instruktionen (PI) in 1899.

Details

The scientific transliteration system is roughly as phonemic as the orthography of the language being transliterated. The deviations are щ, where the transliteration makes clear that two phonemes are involved, and џ, where it fails to represent the (monophonemic) affricate with a single letter. The system is based on Gaj's Latin alphabet used in Serbo-Croatian, in which each letter corresponds directly to a Cyrillic letter in the Bosnian, Montenegrin and Serbian official standards, and which was itself heavily based on the earlier Czech alphabet. The Cyrillic letter х, representing the sound [x] as in Bach, was romanized h in Serbo-Croatian, but in German-speaking countries the native digraph ch was used instead. [1] The system was codified in the 1898 Prussian Instructions for libraries, or Preußische Instruktionen (PI), which were adopted in Central Europe and Scandinavia. Scientific transliteration can also be used to romanize the early Glagolitic alphabet, which has a close correspondence to Cyrillic.

Scientific transliteration is often adapted to serve as a phonetic alphabet. [2]

Scientific transliteration was the basis for the ISO 9 transliteration standard. Linguistic transliteration tries to preserve the original language's pronunciation to a certain degree. That concept was still found in ISO/R 9:1968, but the latest version of the ISO standard (ISO 9:1995) has abandoned it and is now restricted to a one-to-one mapping of letters. ISO 9:1995 thus allows for unambiguous reverse transliteration into the original Cyrillic text and is language-independent.
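The difference in reversibility can be illustrated with a minimal Python sketch. It uses only three rows of the table in this article (ш, ч, щ); the function names and the input strings are hypothetical and chosen for illustration, and the sketch is not a full implementation of either system.

```python
# Fragments of the two mappings, taken from the table in this article.
SCIENTIFIC = {"ш": "š", "ч": "č", "щ": "šč"}
ISO9 = {"ш": "š", "ч": "č", "щ": "ŝ"}


def romanize(text, table):
    """Replace each Cyrillic letter found in `table`; leave other characters alone."""
    return "".join(table.get(ch, ch) for ch in text)


def reverse_one_to_one(text, table):
    """Invert a mapping in which every Latin value is a single character."""
    inverse = {latin: cyrillic for cyrillic, latin in table.items()}
    return "".join(inverse.get(ch, ch) for ch in text)


# In scientific transliteration, щ and the sequence шч collide.
print(romanize("щ", SCIENTIFIC) == romanize("шч", SCIENTIFIC))  # True

# With the ISO 9 values they stay distinct, so the text can be restored.
print(reverse_one_to_one(romanize("шч щ", ISO9), ISO9) == "шч щ")  # True
```

The character-by-character inversion works here only because each ISO 9 value in this fragment is a single Latin character.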

The previous official Soviet romanization system, GOST 16876-71, was also based on scientific transliteration but used Latin h for Cyrillic х instead of Latin x or ssh and sth for Cyrillic Щ, and had a number of other differences. Most countries using Cyrillic script have now adopted GOST 7.79 instead, which is not the same as ISO 9 but close to it.

Representing all of the necessary diacritics on computers requires Unicode, Latin-2, Latin-4, or Latin-7 encoding.
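Whether a given transliterated string fits one of these encodings can be checked directly. The sketch below assumes that "Latin-2", "Latin-4" and "Latin-7" refer to ISO 8859-2, ISO 8859-4 and ISO 8859-13, and uses the corresponding Python codec names; the helper name `encodable_in` is hypothetical.

```python
# Assumed correspondence between the names used above and Python codecs.
CODECS = {
    "Latin-2": "iso8859_2",
    "Latin-4": "iso8859_4",
    "Latin-7": "iso8859_13",
    "Unicode (UTF-8)": "utf_8",
}


def encodable_in(text):
    """Report, per encoding, whether every character of `text` can be encoded."""
    result = {}
    for label, codec in CODECS.items():
        try:
            text.encode(codec)
            result[label] = True
        except UnicodeEncodeError:
            result[label] = False
    return result


print(encodable_in("čšž"))  # the háček letters č, š, ž
print(encodable_in("ʹ"))    # the prime used for the soft sign
```

A string that mixes letters from several columns of the table may fail in every legacy encoding while still being representable in Unicode.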

Table

Prussian Instructions, scientific transliteration, and ISO 9
Columns: Cyrillic; scientific transliteration (Church Slavonic, Bulgarian, Russian, Belarusian, Ukrainian, Serbian, Macedonian); PI [3]; ISO 9
А а aaaaaaaaa
Б б bbbbbbbbb
В в vvvvvvvvv
Г г ggghhggg (h BEUK)g
Ґ ґ g [lower-alpha 1] gġg̀ (g BEUK)
Д д ddddddddd
Ѓ ѓ ǵ (ģ)ǵ
Ђ ђ đ (dj)ďđ
Е е eeeeeeee
Ё ё ëëëë
Є є ejejeê
Ж ж žžžžžžžžž
З з zzzzzzzzz
Ѕ ѕ ʒ (dz)dz
И и iiiyiiii
I і ii [lower-alpha 1] iiīì
Ї ї iji (ï)ï
Й й jjjjjj
Ј ј jjjǰ
К к kkkkkkkkk
Л л lllllllll
Љ љ lj (ļ/ľ)lj (ļ/ľ)ľľ
М м mmmmmmmmm
Н н nnnnnnnnn
Њ њ nj (ň/ń/ņ)nj (ň/ń/ņ)ńň
О о ooooooooo
П п ppppppppp
Р р rrrrrrrrr
С с sssssssss
Т т ttttttttt
Ќ ќ ḱ (ķ)
Ћ ћ ǵććć
У у uuuuuuuu
ОУ оу u
Ў ў ŭ (w)ŭ
Ф ф fffffffff
Х х xhx (ch)x (ch)x (ch)hhchh
Ц ц ccccccccc
Ч ч ččččččččč
Џ џ dž (ģ)ǵ
Ш ш ššššššššš
Щ щ šč (št)štščščšč (št BG)ŝ
Ъ ъ ъ (ǔ)ǎʺ- [lower-alpha 2] BG)ʺ
Ы ы y (ū)yyy [lower-alpha 1] yy
Ь ь ь (ǐ)jʹʹʹʹʹ
Ѣ ѣ ěě [lower-alpha 1] ě [lower-alpha 1] ě [lower-alpha 1] ě [lower-alpha 1] ěě
Э э èèėè
Ю ю jujujujujujuû
Я я jajajajajaâ
 ʼ  ʼ
Ѡ ѡ o, ô
Ѧ ѧ ę
Ѩ ѩ
Ѫ ѫ ǫă [lower-alpha 1] ăǎ
Ѭ ѭ [lower-alpha 1]
Ѯ ѯ ks
Ѱ ѱ ps
Ѳ ѳ th (θ)f [lower-alpha 1] f [lower-alpha 1] f [lower-alpha 1]
Ѵ ѵ ü(i) [lower-alpha 1] (i) [lower-alpha 1] (i) [lower-alpha 1]
Ѥ ѥ je
Ꙗ ꙗ ja
  1. Archaic letter.
  2. Indicated by - (hyphen) if medial, disregarded if final.

( ) Letters in parentheses are older or alternative transliterations. The Ukrainian and Belarusian apostrophe is not transcribed. The early Cyrillic letter koppa (Ҁ, ҁ) was used only for transliterating Greek and for its numeric value, and is thus omitted. Prussian Instructions and ISO 9:1995 are provided for comparison.
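As an illustration of how one column of the table can be applied, here is a minimal Python sketch for modern Russian. The values follow the Russian column above, using the primary value x for х rather than the alternative ch. The function name `transliterate` and the capitalization rule (only the first Latin letter of a multi-letter value is capitalized) are assumptions made for this example, not part of the system's definition.

```python
# Russian column of the scientific transliteration table (lowercase letters).
RUSSIAN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "ë",
    "ж": "ž", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "c", "ч": "č", "ш": "š", "щ": "šč", "ъ": "ʺ",
    "ы": "y", "ь": "ʹ", "э": "è", "ю": "ju", "я": "ja",
}


def transliterate(text: str) -> str:
    """Romanize Russian text letter by letter; other characters pass through."""
    out = []
    for ch in text:
        latin = RUSSIAN.get(ch.lower())
        if latin is None:
            out.append(ch)  # not a modern Russian letter: keep unchanged
        elif ch.isupper():
            out.append(latin[:1].upper() + latin[1:])  # Ю -> Ju, Щ -> Šč
        else:
            out.append(latin)
    return "".join(out)


print(transliterate("Хрущёв"))  # Xruščëv
print(transliterate("Юрий"))    # Jurij
```

Because each Cyrillic letter is looked up independently, the same approach works for the other columns by swapping in a different dictionary.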

Unicode encoding is:

Notes

  1. Hans H. Wellisch (1978), The Conversion of Scripts: Its Nature, History, and Utilization, New York City: Wiley, p. 257, Wikidata Q104231343
  2. Timberlake 2004, p. 24.
  3. Hans H. Wellisch (1978), The Conversion of Scripts: Its Nature, History, and Utilization, New York City: Wiley, pp. 260–262, Wikidata Q104231343
