Romanization of Armenian

There are various systems of romanization of the Armenian alphabet.

Transliteration systems

Hübschmann-Meillet (1913)

In linguistic literature on Classical Armenian, the commonly used transliteration is that of Hübschmann-Meillet (1913). It uses the combining dot above (U+0307) to mark the aspirates: ṫ, cḣ, č̇, ṗ, k̇. Some documents instead use the similar Latin dasia diacritic (U+0314), a reversed comma placed above the letter, which is easier to distinguish visually: t̔, ch̔, č̔, p̔, k̔.

However, support for these combining diacritics was poor for a long time in common applications, computer fonts and rendering systems. Some documents were therefore published with spacing fallbacks written after the letter instead of above it: the modifier letter dot above ˙ U+02D9 or the turned comma ʻ U+02BB, sometimes the spacing Greek spiritus asper ῾ U+1FFE, the spacing grave accent ˋ U+02CB (even though it is too flat), or even the ASCII backquote ` U+0060 or the ASCII apostrophe ' U+0027 where no confusion was possible.

The preferred character today is the modifier letter left half-ring ʿ U+02BF, or the modifier letter ʻ U+02BB or ʽ U+02BD, which is the spacing variant of the dasia diacritic (it is also historically a correct adaptation to the Latin script of the Greek spiritus asper, see rough breathing). It has the advantage of excellent support in many Latin fonts because it is also a simple reversed comma.
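
The fallback characters listed above can be folded into a single preferred mark when cleaning up older transliterated text. The following is a minimal sketch, not an established tool: the function name `normalize_aspirates` is hypothetical, and it assumes the input is Hübschmann-Meillet style text in which the dot above, the dasia and the quote-like characters occur only as aspirate marks (it would also rewrite an ordinary apostrophe or a dotted letter such as ġ).

```python
import unicodedata

PREFERRED = "\u02BF"  # modifier letter left half-ring

# Combining marks and spacing fallbacks named in the text above.
ASPIRATE_MARKS = [
    "\u0307",  # combining dot above
    "\u0314",  # combining reversed comma above (dasia)
    "\u02D9",  # modifier letter dot above
    "\u02BB",  # modifier letter turned comma
    "\u02BD",  # modifier letter reversed comma
    "\u1FFE",  # spacing Greek spiritus asper
    "\u02CB",  # spacing grave accent
    "\u0060",  # ASCII backquote
    "\u0027",  # ASCII apostrophe
]
TABLE = {ord(mark): PREFERRED for mark in ASPIRATE_MARKS}


def normalize_aspirates(text: str) -> str:
    """Replace every listed aspirate mark with the left half-ring."""
    # NFD exposes marks hidden in precomposed letters (e.g. U+1E6B).
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.translate(TABLE))
```

Decomposing first means that a precomposed letter with a dot above and a base letter followed by U+0307 are handled the same way.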

In addition, some ambiguities were left unresolved for modern vernacular Armenian, which has two dialects, each of which can be written in two orthographies (moreover, modern publications use the modern orthography for Classical Armenian).

BGN/PCGN (1981)

BGN/PCGN romanization (1981) uses a right single quotation mark to express aspirates, t’, ch’, ts’, p’, k’, the opposite of the original rough breathing diacritic. [1]

This romanization was taken up by ISO (1996) and is considered obsolete. The system is a loose transcription and is not reversible without dictionary lookup, notably because single Armenian letters are romanized as digraphs (these non-reversible or ambiguous romanizations are shown in red cells in the table below).
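
The loss of reversibility can be illustrated with a few letters. The sketch below is only an illustration: it uses a small subset of the BGN/PCGN values from the table further down, the helper names `romanize` and `collisions` are hypothetical, and the inputs are bare letter sequences rather than attested words.

```python
# Subset of BGN/PCGN values taken from the transliteration table.
BGN_SUBSET = {
    "տ": "t", "ս": "s", "ծ": "ts",
    "զ": "z", "հ": "h", "ժ": "zh",
}


def romanize(text: str) -> str:
    """Romanize letter by letter using the subset above."""
    return "".join(BGN_SUBSET[ch] for ch in text)


def collisions(candidates):
    """Group Armenian sequences that share one Latin output."""
    seen = {}
    for cand in candidates:
        seen.setdefault(romanize(cand), []).append(cand)
    return {lat: arm for lat, arm in seen.items() if len(arm) > 1}
```

Here `collisions(["ծ", "տս", "ժ", "զհ"])` groups ծ with տս under "ts" and ժ with զհ under "zh", so the Latin text alone does not determine the Armenian spelling.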

Some Armenian letters have several romanizations, depending on their context:

ISO 9985 (1996)

ISO 9985 (1996) is the international standard for transliteration of the modern Armenian alphabet. As with the BGN/PCGN romanization, the right single quotation mark is used to denote most of the aspirates.

This system is reversible because it avoids digraphs and returns to the Hübschmann-Meillet system (although some vowel diacritics are also modified).

The aspirate series is not treated consistently in ISO 9985: while pʼ, tʼ, cʼ, kʼ are romanized with an apostrophe-like mark, aspirated չ č is not, and instead its unaspirated counterpart ճ is transcribed č̣ with an underdot appearing nowhere else in the system. Note that in this scheme, č (signifying չ) collides with the Hübschmann-Meillet transliteration (where it signifies ճ).
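
A round trip through a reversible scheme can be sketched as follows. This is an illustration under stated assumptions, not a full ISO 9985 implementation: it covers only eight consonants with the values shown in the table below, uses U+2019 for the apostrophe-like mark as printed there, and the function names are hypothetical.

```python
import unicodedata

# Subset of ISO 9985 values from the transliteration table.
ISO_SUBSET = {
    "տ": "t", "թ": "t\u2019",
    "կ": "k", "ք": "k\u2019",
    "ծ": "\u00E7", "ց": "c\u2019",
    "չ": "\u010D", "ճ": "\u010D\u0323",  # č and č with underdot
}
REVERSE = {unicodedata.normalize("NFC", v): k for k, v in ISO_SUBSET.items()}
# No two Armenian letters share a Latin value in this subset.
assert len(REVERSE) == len(ISO_SUBSET)


def to_latin(text: str) -> str:
    return "".join(ISO_SUBSET[ch] for ch in text)


def to_armenian(text: str) -> str:
    """Invert to_latin by longest match, so t + mark wins over t."""
    text = unicodedata.normalize("NFC", text)
    keys = sorted(REVERSE, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(REVERSE[key])
                i += len(key)
                break
        else:
            raise ValueError(f"unexpected character at index {i}")
    return "".join(out)
```

Because the apostrophe-like mark and the underdot never stand alone, longest-match decoding recovers the original letters for any input built from this subset.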

This system is recommended for international bibliographic text interchange (it is also the basis of simplified romanizations used to localize Armenian toponymy or to transliterate personal names), where it works very well with the common ISO/IEC 8859-2 Latin encoding used in Central Europe.

ALA-LC (1997)

ALA-LC romanization (1997) is largely compatible with BGN/PCGN, but returns to expressing aspirates with a left single quotation mark. This is in fact the modifier letter left half-ring ʿ U+02BF (US-MARC hexadecimal code B0), which is also used to denote ayin in Arabic, so some documents contain either the preferred left half-ring or, sometimes, the ASCII backquote ` U+0060.

This standard uses different transliteration schemes for Classical/Eastern Armenian and for Western Armenian: for the affected Armenian consonants, the romanizations are swapped within the pairs b vs. p, g vs. k, d vs. t, dz vs. ts and ch vs. j.
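
The swap can be written down compactly. The sketch below is illustrative only: it starts from the Classical/Eastern ALA-LC values of the ten affected consonants as given in the table below and derives the Western values by exchanging each pair; the names `EASTERN`, `SWAPPED_PAIRS` and `WESTERN` are hypothetical.

```python
# Classical/Eastern ALA-LC values for the affected consonants.
EASTERN = {
    "բ": "b", "պ": "p",
    "գ": "g", "կ": "k",
    "դ": "d", "տ": "t",
    "ձ": "dz", "ծ": "ts",
    "ճ": "ch", "ջ": "j",
}

# The pairs named in the text above.
SWAPPED_PAIRS = [("b", "p"), ("g", "k"), ("d", "t"), ("dz", "ts"), ("ch", "j")]

swap = {}
for first, second in SWAPPED_PAIRS:
    swap[first] = second
    swap[second] = first

# Western values: same letters, each romanization exchanged with its partner.
WESTERN = {letter: swap[latin] for letter, latin in EASTERN.items()}
```

For example, `WESTERN["բ"]` is "p" and `WESTERN["ծ"]` is "dz", matching the Western ALA-LC row of the table.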

In all cases, and to make this romanization less ambiguous and reversible,

ASCII-only input methods

On various Armenian websites, non-standard transliterators have appeared that allow modern Western or Eastern Armenian text to be entered using only ASCII characters. They are not proper transliterators, but they can be convenient for users who do not have Armenian keyboards.

Although these input methods are commonly used, they do not adhere to any approved international or Armenian standard, so they are not recommended for the romanization of Armenian. Note that the input methods recognize the Latin digraphs zh, dz, gh, tw, sh, vo, ch, rr for Classical or Eastern Armenian, and zh, dz, tz, gh, vo, ch, rr for Western Armenian, but offer no way to disambiguate words where the digraphs should not be recognized.

Some Armenian letters are entered as Latin digraphs, which may also be followed by an ASCII single quote (the only letter modifier recognized). This quote does not always mean that the intended Armenian letter is aspirated (the reverse may hold for the input ch'), and it is also used as a vowel modifier. Because of these ambiguities, texts must be corrected by entering an intermediate dummy character before the second Latin letter or quote and then removing the dummy character, so that the automatic input converter keeps the Armenian letters distinct.
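
The digraph ambiguity and the dummy-character workaround can be modelled with a greedy converter. This is a hedged sketch, not the code of any actual website: it uses five Classical/Eastern ASCII values from the table below, assumes longest-match conversion, and treats a hypothetical `|` as the dummy character that is dropped from the output.

```python
# Subset of the Classical/Eastern ASCII input values.
ASCII_SUBSET = {"z": "զ", "h": "հ", "zh": "ժ", "d": "դ", "dz": "ձ"}
DUMMY = "|"  # hypothetical separator standing in for the dummy character


def convert(text: str) -> str:
    """Convert ASCII input greedily, preferring digraphs."""
    keys = sorted(ASCII_SUBSET, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        if text[i] == DUMMY:
            i += 1  # the dummy only breaks up a digraph
            continue
        for key in keys:
            if text.startswith(key, i):
                out.append(ASCII_SUBSET[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # pass unknown characters through
            i += 1
    return "".join(out)
```

With this sketch, `convert("zh")` yields ժ, while `convert("z|h")` yields զհ; similarly `convert("dzh")` yields ձհ, and դժ can only be obtained as `convert("d|zh")`.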

Transliteration tables

Some Armenian letters are pronounced very differently in Classical or Eastern Armenian and in Western Armenian, so the usage of Armenian letters differs between the two sub-branches of the language.

This is made visible in the table below by coloring transliterations specific to Classical or Eastern Armenian with a green background, and those for Western Armenian with a blue background. Other letters are transliterated independently of the language branch. Cells with a red background contain context-dependent transliterations, which may in some cases create ambiguities. Only the ISO 9985 and Hübschmann-Meillet romanizations avoid context-dependent ambiguous digraphs for transcribing simple Armenian letters that are not ligatures, but the former is inconsistent in its representation of aspirated consonants and incompatible with all other systems for a pair of letters.

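Armenian script
capital: Ա Բ Գ Դ Ե Զ Է Ը Թ Ժ Ի Լ Խ Ծ Կ Հ Ձ Ղ Ճ Մ
code point: 0531 0532 0533 0534 0535 0536 0537 0538 0539 053A 053B 053C 053D 053E 053F 0540 0541 0542 0543 0544
minuscule: ա բ գ դ ե զ է ը թ ժ ի լ խ ծ կ հ ձ ղ ճ մ
code point: 0561 0562 0563 0564 0565 0566 0567 0568 0569 056A 056B 056C 056D 056E 056F 0570 0571 0572 0573 0574
Romanization of Classical or Eastern Armenian
ASCII input: a b g d e z e' y' t' zh i l x c' k h dz gh tw m
Hübschmann-Meillet: Է ê; Ը ə; Թ t̔, ṫ; Ժ ž; Ծ c; Ձ j; Ղ ł; Ճ č
ISO 9985: Է ē; Ը ë; Թ t’; Ծ ç; Ղ ġ; Ճ č̣
BGN/PCGN: Ե e, ye; Է e; Ը y; Ժ zh; Խ kh; Ծ ts; Ձ dz; Ղ gh; Ճ ch
ALA-LC: Ե e, y; Է ē; Ը ě; Թ tʿ
Romanization of Western Armenian
ALA-LC: Բ p; Գ k; Դ t; Ծ dz; Կ g; Ձ ts; Ճ j
ASCII input: Ե e; Է e'; Ը y; Թ t'; Խ x; Ձ tz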
 
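Armenian script
capital: Յ Ն Շ Ո Չ Պ Ջ Ռ Ս Վ Տ Ր Ց Ւ Փ Ք Օ Ֆ ՈՒ
code point: 0545 0546 0547 0548 0549 054A 054B 054C 054D 054E 054F 0550 0551 0552 0553 0554 0555 0556 0548+0552
minuscule: յ ն շ ո չ պ ջ ռ ս վ տ ր ց ւ փ ք օ ֆ ու և (եւ)
code point: 0575 0576 0577 0578 0579 057A 057B 057C 057D 057E 057F 0580 0581 0582 0583 0584 0585 0586 0578+0582 0587
Romanization of Classical or Eastern Armenian
ASCII input: y n sh vo ch p j rr s v t r c w p' k', q o f u ev
Hübschmann-Meillet: Շ š; Ո o; Չ č̔, č̇; Ջ ǰ; Ց c̔, ċ; Փ p̔, ṗ; Ք k̔, k̇; Օ ô
ISO 9985: Չ č; Ց c’; Փ p’; Ք k’; Օ ò; ՈՒ ow; և ew
BGN/PCGN: Շ sh; Ո o, vo; Չ ch’; Ջ j; Ռ rr; Ց ts’; Օ o; ՈՒ u; և ev, yev
ALA-LC: Յ y, h; Ո o; Չ chʿ; Ց tsʿ; Փ pʿ; Ք kʿ; Օ ō; և ew, ev
Romanization of Western Armenian
ALA-LC: Պ b; Ջ ch; Տ d
ASCII input: Յ h'; Ո vo; Չ ch; Ջ ch'; Ռ rr; Ց c; Փ p'; Ք k', q; Օ o; և ev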

Note that in the table above, the last two columns refer to digraphs, not isolated letters (however, they are considered letters in the Reformed orthography). The last column shows the ligature that, in the Classical orthography, is used only as an isolated symbol for the short Armenian word ew (meaning "and") and its derivations, much like the ampersand (&) in the Latin script; in the Reformed orthography, it is also used in the middle and at the end of words instead of եվ. The same transliteration, ew (Classical Armenian) or ev (Reformed orthography), is used for the letters this ligature represents when they are written as a digraph: it used to refer to the w consonant and now refers to the v consonant.

Armenian script also uses some other digraphs that are often written as optional ligatures, in lowercase only (five of them are encoded in Unicode only for full roundtrip compatibility with some legacy encodings); when present, these ligatures (which are purely typographic and carry no semantic distinction in normal Armenian texts) must be romanized by decomposing their component letters.
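
Decomposing the five encoded ligatures before romanization can be done with Unicode compatibility normalization. The sketch below assumes that the five ligatures meant here are the Armenian presentation forms U+FB13 to U+FB17; the function name `decompose_ligatures` is hypothetical. It deliberately leaves և (U+0587) untouched, since that character is handled separately above.

```python
import unicodedata

# Armenian ligatures in the Alphabetic Presentation Forms block.
ARMENIAN_LIGATURES = range(0xFB13, 0xFB18)  # U+FB13 .. U+FB17


def decompose_ligatures(text: str) -> str:
    """Replace each Armenian ligature with its component letters."""
    return "".join(
        unicodedata.normalize("NFKD", ch) if ord(ch) in ARMENIAN_LIGATURES else ch
        for ch in text
    )
```

Restricting the normalization to these code points avoids the side effects that applying NFKD to a whole string would have on other characters.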


References

  1. "Romanization of Armenian: BGN/PCGN 1981 System" (PDF). Retrieved 10 March 2024.
