Romanization of Armenian

There are various systems of romanization of the Armenian alphabet.


Transliteration systems

Hübschmann-Meillet (1913)

In linguistic literature on Classical Armenian, the commonly used transliteration is that of Hübschmann-Meillet (1913). It uses the combining dot above, U+0307, to mark the aspirates: ṫ, ċ, č̇, ṗ, k̇. Some documents instead use the similar Latin dasia diacritic, the combining reversed comma above U+0314, which is easier to distinguish visually: t̔, c̔, č̔, p̔, k̔.
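As a rough illustration of the two notations, the following Python sketch attaches either combining mark to a base letter and normalizes the result to NFC. The function name `aspirate` is an assumption made for this example and is not part of any standard.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
DASIA = "\u0314"      # COMBINING REVERSED COMMA ABOVE

def aspirate(base: str, mark: str = DOT_ABOVE) -> str:
    """Append a combining mark to `base` and return the NFC form.

    `aspirate` is a hypothetical helper for this sketch.
    """
    return unicodedata.normalize("NFC", base + mark)

# "\u010d" is c with caron, the base of the third aspirate.
for base in ["t", "c", "\u010d", "p", "k"]:
    dotted = aspirate(base)
    rough = aspirate(base, DASIA)
    # Show both spellings and the code points behind the dotted one.
    print(dotted, rough, [f"U+{ord(ch):04X}" for ch in dotted])
```

Some of the dotted forms have precomposed code points and others remain a base letter followed by U+0307, which is one reason why rendering quality can differ between the letters of the same series.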

However, support for these combining diacritics was long poor in common applications, fonts and rendering systems, so some documents were published with spacing fallbacks written after the letter instead of above it: the spacing dot above ˙ U+02D9 or the turned comma ʻ U+02BB, and sometimes the spacing Greek spiritus asper ῾ U+1FFE, the spacing grave accent ˋ U+02CB (even though it is too flat), the ASCII backquote ` U+0060, or, where no confusion was possible, the ASCII apostrophe ' U+0027.

The preferred character today is the modifier letter left half ring ʿ U+02BF, or the modifier letter ʻ U+02BB or ʽ U+02BD. The last is the spacing variant of the dasia diacritic and is historically a correct adaptation to the Latin script of the Greek spiritus asper (see rough breathing); it also has excellent support in many Latin fonts because it is simply a reversed comma.
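A minimal sketch of how legacy documents might be unified on one aspirate mark is given below. It assumes that U+02BF is the chosen target and that the input is a Hübschmann-Meillet transliteration; the function name `normalize_aspirates` and the decision to convert the combining marks as well are choices made for this example.

```python
import unicodedata

PREFERRED = "\u02bf"  # MODIFIER LETTER LEFT HALF RING

# Marks mentioned in the text, combining and spacing alike.
# The ASCII apostrophe U+0027 is left out on purpose: it is too
# common in ordinary text to be converted blindly.
ASPIRATE_MARKS = {
    "\u0307",  # COMBINING DOT ABOVE
    "\u0314",  # COMBINING REVERSED COMMA ABOVE
    "\u02d9",  # DOT ABOVE (spacing)
    "\u02bb",  # MODIFIER LETTER TURNED COMMA
    "\u02bd",  # MODIFIER LETTER REVERSED COMMA
    "\u1ffe",  # GREEK DASIA
    "\u02cb",  # MODIFIER LETTER GRAVE ACCENT
    "\u0060",  # GRAVE ACCENT (ASCII backquote)
}

def normalize_aspirates(text: str) -> str:
    """Replace every listed aspirate mark with U+02BF (hypothetical helper).

    Assumes Huebschmann-Meillet input: in other schemes a dot above
    can belong to an unrelated letter and must not be converted.
    """
    # NFD exposes marks hidden inside precomposed letters such as U+1E6B.
    decomposed = unicodedata.normalize("NFD", text)
    replaced = "".join(PREFERRED if ch in ASPIRATE_MARKS else ch
                       for ch in decomposed)
    return unicodedata.normalize("NFC", replaced)

print(normalize_aspirates("t\u0314 p\u02bb k`"))
```

Ordinary apostrophes pass through unchanged, so a document that used U+0027 as its aspirate mark would still need manual review.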

In addition, the system leaves some ambiguities unresolved for modern vernacular Armenian, which has two dialects, both using two possible orthographies (moreover, modern publications use the modern orthography for Classical Armenian).

BGN/PCGN (1981)

BGN/PCGN romanization (1981) uses a right single quotation mark to express aspirates, t’, ch’, ts’, p’, k’, the opposite of the original rough breathing diacritic. [1]

This romanization was taken up by ISO (1996) and is considered obsolete. The system is a loose transcription and is not reversible without dictionary lookup, notably where a single Armenian letter is romanized as a digraph (these non-reversible or ambiguous romanizations are shown in red cells in the table below).
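The loss of reversibility can be illustrated with a short Python sketch. It uses only a small subset of the BGN/PCGN values listed in the table of this article, and the helper names `romanize` and `preimages` are assumptions made for the example. The Armenian letter sequences are used purely as strings and need not be real words.

```python
# Subset of BGN/PCGN values, as read from the table in this article.
BGN_SUBSET = {
    "դ": "d", "զ": "z", "ձ": "dz",
    "տ": "t", "ս": "s", "ծ": "ts",
}

def romanize(text, table=BGN_SUBSET):
    """Map each Armenian letter to its romanization (unknowns pass through)."""
    return "".join(table.get(ch, ch) for ch in text)

def preimages(latin, table=BGN_SUBSET):
    """List every Armenian string over the subset that romanizes to `latin`."""
    if not latin:
        return [""]
    results = []
    for armenian, roman in table.items():
        if latin.startswith(roman):
            for rest in preimages(latin[len(roman):], table):
                results.append(armenian + rest)
    return results

# The single letter and the two-letter sequence give the same Latin string.
print(romanize("ձ"), romanize("դզ"))
# More than one candidate means the romanization cannot be reversed
# without extra knowledge such as a dictionary.
print(preimages("dz"))
```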

Some Armenian letters have several romanizations, depending on their context:

ISO 9985 (1996)

ISO 9985 (1996) is the international standard for transliteration of the modern Armenian alphabet. As with the BGN/PCGN romanization, the right single quotation mark is used to denote most of the aspirates.

This system is reversible because it avoids digraphs and returns to the Hübschmann-Meillet conventions, although some diacritics for vowels are also modified.

The aspirate series is not treated consistently in ISO 9985: while pʼ, tʼ, cʼ, kʼ are romanized with an apostrophe-like mark, aspirated չ č is not, and instead its unaspirated counterpart ճ is transcribed č̣ with an underdot appearing nowhere else in the system. Note that in this scheme, č (signifying չ) collides with the Hübschmann-Meillet transliteration (where it signifies ճ).
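The collision described above can be checked mechanically. The sketch below encodes only the two letters concerned, with values read from this article, and looks for a Latin string that stands for different Armenian letters in the two schemes. The function name `collisions` is an assumption made for this example.

```python
C_CARON = "\u010d"  # c with caron

# Only the two letters discussed above; values as given in this article.
HUBSCHMANN_MEILLET = {"ճ": C_CARON, "չ": C_CARON + "\u0314"}
ISO_9985 = {"ճ": C_CARON + "\u0323", "չ": C_CARON}

def collisions(scheme_a, scheme_b):
    """Return Latin strings that denote different letters in the two schemes."""
    inverse_a = {roman: armenian for armenian, roman in scheme_a.items()}
    return {
        roman: (inverse_a[roman], armenian)
        for armenian, roman in scheme_b.items()
        if roman in inverse_a and inverse_a[roman] != armenian
    }

# Maps the shared Latin string to (letter in scheme A, letter in scheme B).
print(collisions(HUBSCHMANN_MEILLET, ISO_9985))
```

A text containing this Latin letter therefore cannot be converted back to Armenian unless the scheme in use is known.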

This system is recommended for international bibliographic text interchange, where it works very well with the common ISO/IEC 8859-2 Latin encoding used in Central Europe. It is also the basis of the simplified romanizations used to localize Armenian toponymy or to transliterate personal names.

ALA-LC (1997)

ALA-LC romanization (1997) is largely compatible with BGN/PCGN, but returns to expressing aspirates with a left single quotation mark. The character is in fact the modifier letter left half ring ʿ U+02BF (US-MARC hexadecimal code B0), which is also used to denote ayin in Arabic, so documents may contain either the preferred left half ring or, sometimes, the ASCII backquote ` U+0060.

This standard uses different transliterations for Classical/Eastern Armenian and for Western Armenian: for the affected Armenian consonants, the romanizations within the pairs b and p, g and k, d and t, dz and ts, and ch and j are swapped.
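The swap can be sketched in Python as follows. The Classical/Eastern values are those read from the table in this article for the ten affected letters; the Western values are derived by exchanging the members of each pair. The names `EASTERN`, `WESTERN` and `romanize` are assumptions made for this example, and letters outside the subset are passed through unchanged.

```python
# Classical/Eastern ALA-LC values for the affected consonants,
# as read from the table in this article.
EASTERN = {
    "բ": "b", "գ": "g", "դ": "d", "ծ": "ts", "կ": "k",
    "ձ": "dz", "ճ": "ch", "պ": "p", "ջ": "j", "տ": "t",
}

# The five pairs named in the text; each member maps to its partner.
PAIRS = [("b", "p"), ("g", "k"), ("d", "t"), ("dz", "ts"), ("ch", "j")]
SWAP = {}
for first, second in PAIRS:
    SWAP[first] = second
    SWAP[second] = first

# Western values follow by swapping within each pair.
WESTERN = {armenian: SWAP[roman] for armenian, roman in EASTERN.items()}

def romanize(word, western=False):
    """Romanize the affected consonants only (hypothetical helper)."""
    table = WESTERN if western else EASTERN
    return "".join(table.get(ch, ch) for ch in word)

print(romanize("պ"), romanize("պ", western=True))
```

Working from the Armenian letters rather than from an already romanized string avoids having to guess whether a Latin sequence such as "ts" stands for one letter or two.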

In all cases, and to make this romanization less ambiguous and reversible,

ASCII-only input methods

On various Armenian websites, non-standard transliterators have appeared that allow modern Western or Eastern Armenian text to be entered using only ASCII characters. They are not proper transliterators, but they can be convenient for users who do not have Armenian keyboards.

Although these input methods are commonly used, they do not follow any approved international or Armenian standard, so they are not recommended for the romanization of Armenian. The input methods recognize the Latin digraphs zh, dz, gh, tw, sh, vo, ch, rr for Classical or Eastern Armenian, and zh, dz, tz, gh, vo, ch, rr for Western Armenian, but offer no way to disambiguate words in which a digraph should not be recognized.

Some Armenian letters are entered as Latin digraphs, which may also be followed by an ASCII single quote, the only letter modifier recognized. The quote does not always mean that the intended Armenian letter is aspirated (for the input ch' it may mean the reverse), and it is also used as a vowel modifier. Because of these ambiguities, text must sometimes be corrected by entering an intermediate dummy character before the second Latin letter or quote and then removing the dummy character, so that the automatic input converter keeps the Armenian letters distinct.
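The digraph problem and the dummy-character workaround can be modelled with a small greedy converter. This is only a sketch: the table is a subset of the Eastern ASCII values listed in this article, the `|` separator stands in for the dummy character, and real input methods convert keystrokes interactively rather than whole strings.

```python
# Subset of the Eastern ASCII input values from the table in this article.
ASCII_TO_ARMENIAN = {
    "zh": "ժ", "dz": "ձ", "gh": "ղ", "sh": "շ", "ch": "չ",
    "z": "զ", "h": "հ", "d": "դ", "g": "գ", "s": "ս", "a": "ա",
}
SEPARATOR = "|"  # stand-in for the dummy character; never reaches the output

def convert(ascii_text):
    """Greedy conversion: a digraph always wins over two single letters."""
    out = []
    i = 0
    while i < len(ascii_text):
        if ascii_text[i] == SEPARATOR:
            i += 1  # the dummy character only breaks up a digraph
            continue
        for length in (2, 1):
            chunk = ascii_text[i:i + length]
            if chunk in ASCII_TO_ARMENIAN:
                out.append(ASCII_TO_ARMENIAN[chunk])
                i += len(chunk)
                break
        else:
            out.append(ascii_text[i])  # unknown character, copied as is
            i += 1
    return "".join(out)

print(convert("zh"))   # read as the digraph
print(convert("z|h"))  # the separator keeps the two letters apart
```

Without the separator there is no input that yields the two separate letters, which is the ambiguity the paragraph above describes.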

Transliteration tables

Some Armenian letters are pronounced very differently in Classical or Eastern Armenian and in Western Armenian, so the usage of Armenian letters differs between the two sub-branches of the language.

The table below makes this visible by showing transliterations specific to Classical or Eastern Armenian on a green background and those specific to Western Armenian on a blue background. Other letters are transliterated independently of the language branch. Cells with a red background contain context-dependent transliterations, which may in some cases create ambiguities. Only the ISO 9985 and Hübschmann-Meillet romanizations avoid context-dependent, ambiguous digraphs for simple Armenian letters that are not ligatures, but ISO 9985 represents aspirated consonants inconsistently and is incompatible with all other systems for one pair of letters.

Armenian scriptcapitalԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄ
Romanization of Classical or Eastern Armenian ASCII inputabgdeze'y't'zhilxc'khdzghtwm
Hübschmann-Meilletêət̔, ṫžcjłč
ISO 9985 ēët’çġč̣
BGN/PCGN e, yeeyzhkhtsdzghch
ALA-LC e, yēětʿ
Romanization of Western Armenian ALA-LC pktdzgtsj
ASCII inputee'yt'xtz
Armenian scriptcapitalՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔՕՖՈՒ 
minusculeյնշոչպջռսվտրցւփքօֆուև (եւ)
Romanization of Classical or Eastern Armenian ASCII inputynshvochpjrrsvtrcwp'k', qofuev
Hübschmann-Meilletšoč̔, č̇ǰc̔, ċp̔, ṗk̔, k̇ô
ISO 9985čc’p’k’òowew
BGN/PCGN sho, voch’jrrts’ouev, yev
ALA-LC y, hochʿtsʿpʿkʿōew, ev
Romanization of Western Armenian ALA-LC bchd
ASCII inputh'vochch'rrcp'k', qoev

In the table above, the last two columns refer to digraphs, not isolated letters, although both are considered letters in the Reformed orthography. The last column shows the ligature և, which the Classical orthography uses only as an isolated symbol for the short Armenian word ew (meaning "and") and its derivations, much like the ampersand (&) in the Latin script; in the Reformed orthography it is also used in the middle and at the end of words instead of եվ. The letters that this ligature represents receive the same transliteration when they are written as a digraph: ew in Classical Armenian, ev in the Reformed orthography. The consonant involved used to be w and is now v.

Armenian script also uses some other digraphs that are often written as optional ligatures, in lowercase only (five of them are encoded in Unicode solely for full round-trip compatibility with some legacy encodings). When present, these ligatures, which are purely typographic and carry no semantic distinction in normal Armenian texts, must be romanized by decomposing them into their component letters.
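As a sketch of the decomposition step, the five compatibility ligatures can be expanded with Unicode compatibility normalization before any romanization table is applied. The code below assumes that the five ligatures in question are the code points U+FB13 to U+FB17 of the Alphabetic Presentation Forms block; the function name `decompose_ligatures` is an assumption made for this example.

```python
import unicodedata

# Armenian presentation-form ligatures (assumed range for this sketch).
FIRST_LIGATURE = "\ufb13"  # ARMENIAN SMALL LIGATURE MEN NOW
LAST_LIGATURE = "\ufb17"   # ARMENIAN SMALL LIGATURE MEN XEH

def decompose_ligatures(text):
    """Expand the five typographic ligatures into their component letters.

    Only U+FB13..U+FB17 are touched; the ligature U+0587 is left alone
    so that a later step can decide how to transliterate it.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch)
        if FIRST_LIGATURE <= ch <= LAST_LIGATURE else ch
        for ch in text
    )

for code in range(0xFB13, 0xFB18):
    ligature = chr(code)
    parts = decompose_ligatures(ligature)
    print(unicodedata.name(ligature), [f"U+{ord(ch):04X}" for ch in parts])
```

Restricting the normalization to this range avoids side effects that full NFKD would have on other characters in the same text.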
