Romanization of Bulgarian

The new system is not always used properly. One of these signposts in Sofia shows the name of the district of Lozenets written according to the international scientific system of transliteration (c = ts), but in the other the name of the Irish journalist James Bourchier has been "relatinised" according to the official Bulgarian system (Dzheyms Baucher), even though the system obviously does not apply to names that have authentic Roman spellings.

Romanization of Bulgarian is the transliteration of Bulgarian text from its conventional Cyrillic orthography into the Latin alphabet. Romanization serves various purposes, such as rendering proper names and place names in foreign-language contexts, or writing Bulgarian informally in environments where Cyrillic is not easily available. Bulgarian authorities use romanization officially in, for instance, identity documents and road signage. Several different transliteration standards exist, one of which was chosen and made mandatory for common use by the Bulgarian authorities in a law of 2009. [1] [2] [3]



The various romanization systems differ with respect to 12 of the 30 letters of the modern Bulgarian alphabet. The remaining 18 have the same mapping in all romanization schemes: а→a, б→b, в→v, г→g, д→d, е→e, з→z, и→i, к→k, л→l, м→m, н→n, о→o, п→p, р→r, с→s, т→t, ф→f. The systems differ on the other 12 letters: ж, й, у, х, ц, ч, ш, щ, ъ, ь, ю and я.


Three different systems have been adopted officially by Bulgarian authorities at overlapping times.

ISO/R 9:1968

An older system in the tradition of common Slavic scientific transliteration was adopted by the Council of Orthography and Transcription of Geographical Names in Sofia in 1972 and subsequently by the UN in 1977. [4] It is identical to the system codified in ISO/R 9:1968 and uses diacritic letters (č, š, ž) as well as j and c. It was adopted in 1973 as the Bulgarian state standard BDS 1596:1973 which, although still formally valid, [5] is no longer used in practice, [6] having been superseded by the 2009 Transliteration Act.


The second system was a French-oriented transliteration of personal and place names in the documents issued by the Bulgarian Ministry of Interior for travel abroad, used until 1999. [7]

English-oriented digraph systems

Systems based on a radically different principle, which avoids diacritics and is optimized for compatibility with English sound-letter correspondences, [8] have come into official use in Bulgaria since the mid-1990s. These systems characteristically use ch, sh, zh rather than č, š, ž, and y rather than j.


One such system was proposed in Danchev et al.'s English Dictionary of Bulgarian Names of 1989. [9]

Streamlined System

A similar system (differing from the former in the treatment of the letters ъ and у and of the digraphs ай, ей, ой and уй), called the "Streamlined System" by Ivanov (2003) [8] and Gaidarska (1998), [10] was adopted in 1995 [11] for use in Bulgarian-related place names in Antarctica by the Antarctic Place-names Commission of Bulgaria. [12] Another system along similar lines, differing from the Antarctic one only in the treatment of ц (ts vs. c), was adopted by the Bulgarian authorities for use in identity documents in 1999; [8] after an amendment in 2000, the official Bulgarian system became identical with that of the Antarctica Commission.

The new official Bulgarian system does not allow for unambiguous mapping back into Cyrillic, since unlike most other systems it does not distinguish between ъ and а (both rendered as a). It also does not distinguish between the digraph values of zh=ж, sh=ш and the value of the same Roman strings in rendering accidental clusters of separate Cyrillic letters zh=зх and sh=сх, as they occur in words like изход (izhod) or схема (shema). [6]
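The loss of information described above can be illustrated with a minimal Python sketch. It assumes the letter values of the official system as given in the comparison table of this article; the helper names `romanize` and `collisions` are hypothetical, and the strings "ижод" and "шема" are constructed for the demonstration rather than taken from a dictionary.

```python
from collections import defaultdict

# Letter values of the official system, as listed in this article's
# comparison table (ъ and а share "a", ь and й share "y").
CYRILLIC = "абвгдежзийклмнопрстуфхцчшщъьюя"
LATIN = "a b v g d e zh z i y k l m n o p r s t u f h ts ch sh sht a y yu ya".split()
OFFICIAL = dict(zip(CYRILLIC, LATIN))


def romanize(word: str) -> str:
    """Letter-by-letter romanization of a lowercase Cyrillic word."""
    return "".join(OFFICIAL.get(ch, ch) for ch in word.lower())


def collisions(words):
    """Group words by romanized form and keep the forms reached more than once."""
    groups = defaultdict(list)
    for word in words:
        groups[romanize(word)].append(word)
    return {latin: srcs for latin, srcs in groups.items() if len(srcs) > 1}


# "ижод" and "шема" are made-up strings that show the zh/sh ambiguity.
sample = ["сън", "сан", "изход", "ижод", "схема", "шема", "дом"]
for latin, sources in collisions(sample).items():
    print(latin, "<-", ", ".join(sources))
```

Each printed line shows one Latin string that can be read back as more than one Cyrillic string, which is why the mapping cannot be inverted without extra information.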

New Orthographic Dictionary system

A modification of the system using a diacritic was proposed in the authoritative New Orthographic Dictionary of the Bulgarian Language [13] in 2002, with ъ rendered as ă rather than a. However, that proposal was not adopted for official usage, and failed to become established in popular practice.

Streamlined System with -ia-exception

An exception to the rules was introduced by the Bulgarian authorities in 2006, mandating the transliteration of word-final -ия as -ia rather than -iya in given names and geographical names (such as Ilia and Maria, or Bulgaria, Sofia and Trakia). [14] [15] [16] In 2009, a law passed by the Bulgarian parliament made this system mandatory for all official use and for some types of private publications, and extended the ia-exception rule to every word-final -ия. [1]
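As an illustration, the Streamlined System with the ia-exception can be sketched in a few lines of Python. This is a simplified sketch, not an official implementation: the function names are hypothetical, the letter values are those given for the official system in this article, the handling of the word "България" follows the footnoted exception, and the casing rule (keep only an initial capital) is an assumption made for brevity.

```python
import re

# Letter values of the official (Streamlined) system as listed in this article.
CYRILLIC = "абвгдежзийклмнопрстуфхцчшщъьюя"
LATIN = "a b v g d e zh z i y k l m n o p r s t u f h ts ch sh sht a y yu ya".split()
OFFICIAL = dict(zip(CYRILLIC, LATIN))


def romanize_word(word: str) -> str:
    """Romanize one Cyrillic word, applying the word-final -ия rule."""
    lower = word.lower()
    if lower == "българия":
        latin = "bulgaria"             # footnoted exception: ъ -> u in this word
    else:
        latin = "".join(OFFICIAL.get(ch, ch) for ch in lower)
        if lower.endswith("ия"):
            latin = latin[:-3] + "ia"  # replace the regular "iya" ending
    # Simplified casing (an assumption): keep only an initial capital.
    return latin.capitalize() if word[:1].isupper() else latin


def romanize(text: str) -> str:
    """Romanize every run of Cyrillic letters, leaving other characters alone."""
    return re.sub(r"[А-Яа-я]+", lambda m: romanize_word(m.group()), text)


print(romanize("София, Мария, Лозенец"))
```

Applied to the sample sentence in the "Romanization sample" section, the sketch reproduces the official transliteration given there.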

The Streamlined System was adopted by the UN [17] [18] in 2012, and by the BGN and PCGN [19] [20] in 2013.

According to Arenstein, “The international roots of the Bulgarian romanization system strike at the core of one of romanization’s most perplexing paradoxes: an impulse to redefine and distinguish national identity while also ensuring the accessibility of that identity to outside groups. In other words, instilling nationalism with a sense of internationalism.” [3]

Streamlined System with unambiguous reverse mapping

A variant of the Streamlined System allowing for unambiguous mapping back into Cyrillic was proposed by Ivanov, Skordev and Dobrev in 2010 to be used in cases when the retrieval of the original Cyrillic forms is essential. [6] For that purpose, certain Cyrillic letters and combinations of letters are transliterated as follows: ъ→`a, ь→`y, зх→z|h, йа→y|a, йу→y|u, сх→s|h, тс→t|s, тш→t|sh, тщ→t|sht, шт→sh|t, шц→sh|ts, ия (in final position, if the ia-exception rule is applied) →i|a. The standard transliteration form of a given text is obtained from its unambiguously reversible one by simply removing the additional symbols ` and |.
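The rules listed above can be sketched in Python as follows. The encoder follows the letter and cluster rules exactly as stated in this section; the decoder is an assumption added for illustration (a greedy longest-match reading between the `|` separators), since only the forward mapping is described here. All function names are hypothetical, and the sketch handles lowercase words only.

```python
# Same letter values as the official system, except that ъ and ь carry
# the ` marker, as described in the text.
CYR = "абвгдежзийклмнопрстуфхцчшщъьюя"
LAT = "a b v g d e zh z i y k l m n o p r s t u f h ts ch sh sht `a `y yu ya".split()
TO_LATIN = dict(zip(CYR, LAT))
# Letter pairs that would otherwise merge into another letter's spelling.
CLUSTERS = {"зх", "йа", "йу", "сх", "тс", "тш", "тщ", "шт", "шц"}


def encode_word(word: str) -> str:
    """Reversible romanization of one lowercase word (ia-exception applied)."""
    final_ia = word.endswith("ия")
    stem = word[:-2] if final_ia else word
    out = []
    for i, ch in enumerate(stem):
        out.append(TO_LATIN.get(ch, ch))
        if stem[i:i + 2] in CLUSTERS:
            out.append("|")            # keep the two letters apart
    return "".join(out) + ("i|a" if final_ia else "")


def to_standard(reversible: str) -> str:
    """Drop the extra symbols to obtain the standard transliteration."""
    return reversible.replace("`", "").replace("|", "")


FROM_LATIN = {lat: cyr for cyr, lat in TO_LATIN.items()}
TOKENS = sorted(FROM_LATIN, key=len, reverse=True)  # longest spellings first


def decode_word(text: str) -> str:
    """Assumed decoder: greedy longest match within each |-separated chunk."""
    final_ia = text.endswith("i|a")
    if final_ia:
        text = text[:-3]
    out = []
    for chunk in text.split("|"):
        i = 0
        while i < len(chunk):
            for tok in TOKENS:
                if chunk.startswith(tok, i):
                    out.append(FROM_LATIN[tok])
                    i += len(tok)
                    break
            else:
                out.append(chunk[i])   # pass unknown characters through
                i += 1
    return "".join(out) + ("ия" if final_ia else "")


for w in ["изход", "схема", "съвест", "софия", "штаб"]:
    r = encode_word(w)
    print(w, r, to_standard(r), decode_word(r))
```

Removing the two marker symbols from the reversible form gives the ordinary official spelling, so the variant stays compatible with the standard system while keeping enough information to restore the Cyrillic original.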


Systems along similar lines to the new official Bulgarian system, though with differences regarding the letters х, ъ, ь, ю and я, have also been in use in the ALA-LC Romanization scheme of the Library of Congress, British Standard 2979:1958, the now-superseded 1952 BGN/PCGN romanization of the United States and British geographic naming institutions, and the 1917 system of the British Academy. [21]

The ISO 9 standard, in its 1995 version, introduced another romanization system that works with a consistent one-to-one reversible mapping, resorting to rare diacritic combinations such as â, û and ŝ.

The GOST 7.79-2000 "Rules of transliteration of Cyrillic script by Latin alphabet" contains an unambiguous and reversible ASCII-compatible transliteration system for Bulgarian: й→j, х→x, ц→c or cz, ч→ch, ш→sh, щ→sth, ъ→a`, ь→`, ю→yu, я→ya.

Archaic letters

The archaic Cyrillic letters ѣ and ѫ, which were part of the pre-1945 orthography of Bulgarian, are variously transcribed as ⟨i͡e, e⟩, as ⟨ya, ě⟩, and as ⟨u̐, ŭǎ⟩, respectively, in the ALA/LC, BGN/PCGN and ISO 9 standards.

Comparison table

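Cyrillic | ISO 9 | ISO 9 (1968) | ALA/LC [22] | British Standard (1958) [23] | Official Bulgarian (2006); UN (2012); BGN/PCGN (2013) | Danchev [9]
(Values are given in column order; a value shared by adjacent systems is listed once.)
а | a
б | b
в | v
г | g
д | d
е | e
ж | ž | zh
з | z
и | i
й | j | ĭ | y | y, i
к | k
л | l
м | m
н | n
о | o
п | p
р | r
с | s
т | t
у | u | ou
ф | f
х | h | kh | h
ц | c | t͡s | ts
ч | č | ch
ш | š | sh
щ | ŝ | št | sht
ъ | ǎ | ŭ (ʺ) [24] | ŭ | a [25] | u
ь | ʹ | j | ʹ | ʼ, ' | ʼ | y
ю | û | ju | i͡u | yu
я | â | ja | i͡a | ya [26]
Archaic letters
ѫ | ǎ | ū | ŭ
ѣ | ě | i͡e | ê | e, ya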

Romanization sample

Bulgarian Cyrillic: Всички хора се раждат свободни и равни по достойнство и права. Те са надарени с разум и съвест и следва да се отнасят помежду си в дух на братство.

Official transliteration: Vsichki hora se razhdat svobodni i ravni po dostoynstvo i prava. Te sa nadareni s razum i savest i sledva da se otnasyat pomezhdu si v duh na bratstvo.

English: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

Personalized and stylized writing

Some people and companies prefer to use or retain personalized spellings of their own names in Latin. Examples are politicians Ivan Stancioff (instead of "Stanchov") and Simeon Djankov [27] (instead of "Dyankov"), and beer brand Kamenitza (instead of Kamenitsa). The freedom of using different Roman transliterations of personal names is guaranteed by Article 2(2) of the governmental 2010 Regulation for Issuing of Bulgarian Personal Documents. [28]

Informal writing

Sometimes, especially in e-mail or text messaging, the Cyrillic alphabet is not available and people are forced to write in Roman script. This often does not follow the official or any other of the standards listed above, but rather is an idiosyncratic Bulgarian form of text speak. While most letters are straightforward, several can take different forms. The letter variants listed below are often used interchangeably with some or all of the above standards, often in the same message.

Cyrillic letter | Latin variant | Examples | Notes
ж | j, zh, z, (rarely: w) | plaj (плаж, beach); kozha (кожа, skin); vezliv (вежлив, polite) | j - the sound of ж is represented by j in French, the English sound of j is also similar; zh - official transliteration; z - shortened version of zh or stripped version of ž
й | i, y, j | iod (йод, iodine); mayoneza (майонеза, mayonnaise); Jordan (Йордан, name Yordan) | j - more rarely used, but especially in words that are foreign to Bulgarian and with j in Latin script
ц | c | carevica (царевица, corn) | c almost exclusively represents ц despite the official transliteration of the Cyrillic letter being ts
ч | 4, ch | 4ovek (човек, human) | In Bulgarian the number 4 is chetiri (четири); additionally and perhaps more importantly the glyph ⟨4⟩ looks similar to ⟨ч⟩; this is also used in Volapuk encoding
ш | 6, sh, (rarely: w) | ka6on (кашон, box) | In Bulgarian the number 6 is shest (шест)
щ | 6t, sht, (rarely: 7) | sno6ti (снощи, yesterday at night / yesternight) | 6t - a combination 6+t to represent the sound of щ; 7 - since in the Cyrillic alphabet щ follows ш, 7 can be used as it follows 6 (ш)
ъ | a, u, y, 1, (rarely: @, `) | sanuva (сънува, dreams); pyzel (пъзел, puzzle); v1n (вън, outside) | 1 - the number may resemble the letter ъ.
ю | u, y, yu, ju, iu | zumbyul (зюмбюл, hyacinth) | As a single letter diphthong the letter ю has many variations
я | q, ya, ja, ia | konqk (коняк, cognac) | As a single letter diphthong the letter я has many variations but the most common is the single letter q as it resembles я.

There is no set rule, and people often vary the combinations within a single message, so that "ъ" may be presented as "u", "a" or "y" in three adjacent words, and "щ" can be "sht" in one word, and "6t" in the next, and "ю" may be written differently in the same word. Conversely, "j" could be used to represent "й", "ж" and even "дж" in adjacent words, while "y" can be used for "ъ" in one word and for "й" in the next.

This unofficial email/SMS language is often referred to as "shlyokavitsa". [29] [30] [31] [32] [33] [34] The use of Latinised Bulgarian, while ubiquitous in personal communication, is frowned upon in certain internet contexts, and many websites' comment sections and internet forums have rules stating that posts in Roman script will be deleted. [35] [36] [37]



  1. State Gazette # 19, 13 March 2009. ISSN 0205-0900 (in Bulgarian)
  2. G. Selvelli. Su alcuni aspetti ideologici dei sistemi di traslitterazione degli alfabeti cirillici nei Balcani. Studi Slavistici XII (2015). pp. 159–180. (in Italian)
  3. Arenstein, B. (2018). "Scripted History: Hebrew Romanization in Interwar British Mandate Palestine" (PDF). Columbia University. Retrieved November 19, 2019.
  4. UNGEGN Working Group on Romanization Systems, ("Bulgarian" version 3.0, March 2009)
  5. BDS 1596:1973 Transliteration of Bulgarian words with Latin characters. Bulgarian Institute for Standardization (BDS) website.
  6. L. Ivanov, D. Skordev and D. Dobrev. The New National Standard for the Romanization of Bulgarian. Mathematica Balkanica. New Series Vol. 24, 2010, Fasc. 1-2. pp. 121-130. ISSN 0205-3217
  7. Focus News Agency. 2 February 2008. (in Bulgarian)
  8. L.L. Ivanov, On the Romanization of Bulgarian and English, Contrastive Linguistics, XXVIII, 2003, 2, pp. 109-118. ISSN 0204-8701; Errata, id., XXIX, 2004, 1, p. 157.
  9. A. Danchev, M. Holman, E. Dimova and M. Savova. An English Dictionary of Bulgarian Names: Spelling and Pronunciation. Sofia: Nauka i Izkustvo Publishers, 1989. 288 pp.
  10. M. Gaidarska. The Current State of the Transliteration of Bulgarian Names into English in Popular Practice, Contrastive Linguistics, XXII, 1998, 112, pp. 69-84. ISSN 0204-8701
  11. Inventaire des outils de romanisation. Bibliothèque et Archives Canada. Ottawa 2006. (in French) (2019 edition, in English)
  12. L.L. Ivanov, Toponymic Guidelines for Antarctica, Antarctic Place-names Commission of Bulgaria, Sofia, 1995.
  13. V. Stankov (ed.). New Orthographic Dictionary of the Bulgarian Language. Bulgarian Academy of Sciences. Sofia: Hazel Publishers, 2002. p. 51. ISBN 978-954-8283-61-8
  14. Regulations for the issuing of Bulgarian identity documents (Amendment), State Gazette #83 of 2006. ISSN 0205-0900 (in Bulgarian)
  15. Ministry of Regional Development and Public Works, Ordinance #3 of 26 October 2006 on the Transliteration of the Bulgarian Geographical Names in Latin Alphabet, State Gazette # 94, 21 November 2006. ISSN 0205-0900 (in Bulgarian)
  16. United Nations Document E/CONF.98/CRP.71. 17 August 2007.
  17. Report on the Current Status of United Nations Romanization Systems for Geographical Names. Compiled by the UNGEGN Working Group on Romanization Systems, February 2013.
  18. Romanization System In Bulgaria. Tenth United Nations Conference on the Standardization of Geographical Names. New York, 2012.
  19. Romanization system for Bulgarian: BGN/PCGN 2013 System. National Geospatial-Intelligence Agency, September 2014.
  20. Romanization system for Bulgarian: BGN/PCGN 2013 Agreement.
  21. British Academy. Transliteration of Slavonic: Report of the Committee for the Transliteration into English of Words Belonging to Russian and Other Slavonic Languages. Proceedings of the British Academy, Vol. VIII (1917). 20 pp.
  22. "Bulgarian" (PDF). ALA-LC Romanization Tables (The Library of Congress). Retrieved 2021-02-26.
  23. BS 2979:1958 Transliteration of Cyrillic and Greek Characters, BSI Group, 30 July 1958, Wikidata Q105693940
  24. ALA-LC: Ъ in final position is romanized as ʺ (mainly found in pre-1945 texts).
  25. except in the word Bulgaria (2006 official system)
  26. except in word-final -ия (2006 official system)
  27. Simeon Djankov will give a lecture at the London Business School, Ministry of Finance (Bulgaria), 10 Feb 2012. Retrieved Mar 2013.
  28. Regulation for Issuing of Bulgarian Personal Documents. Government Decree #13 of 8 February 2010.
  29. Also "maymunitsa" or "Metodievitsa". The form of the word shlyòkavitsa follows the pattern of alphabet names, such as "Kirilitsa" (Cyrillic), "Glagolitsa" (Glagolitic) and "Latinitsa" (Roman script), but is also a synonym for a low-quality rakia (bg:shlyokavitsa).
  30. Thoughts on Cyrillic and Shlyokavitsa (in Bulgarian), Dnevnik, 7 Feb 2013. Retrieved Mar 2013.
  31. From Glagolitic to Shlyokavitsa (in Bulgarian), Ivan Popov's Blog, 15 Dec 2011. Retrieved Mar 2013.
  32. On the Shlyokavitsa initiative (in Bulgarian), April 2004. Retrieved Mar 2013.
  33. "Shlyokavitsa Initiative" homepage (in Bulgarian). Retrieved Mar 2013.
  34. Satirical Cartoons – "Directorate for the Utilization of Sofia" – Щ deathbed (6, t: "We're here, dad!", Щ: "My children..."), Jul 2011; Alphabet scroll (= "а б в...ц ч...ш щ"), May 2011. Retrieved Mar 2013.
  35. Chitanka literature website (in Bulgarian). Retrieved Mar 2013.
  36. Ovo forum (rule 1.a) (in Bulgarian), 28 Aug 2010. Retrieved Mar 2013.
  37. Sheep Place/Bubbalog blog rules (in Bulgarian). Retrieved Mar 2013.
