There are various systems of romanization of the Armenian alphabet.
In linguistic literature on Classical Armenian, the commonly used transliteration is that of Hübschmann-Meillet (1913). It uses a combining dot above (U+0307) to express the aspirates: ṫ, cḣ, č̇, ṗ, k̇. Some documents were instead published using the Greek rough breathing diacritic (U+0314), a reversed comma combining above the letter, which is easier to distinguish visually and serves a similar purpose in Greek: t̔, ch̔, č̔, p̔, k̔.
However, computer support for these combining diacritics (fonts, rendering systems, availability in common applications) was poor for a long time, so some documents fell back on their spacing variants (so-called “modifier letters”), written after the letter instead of above it: the spacing dot above ⟨˙⟩ (U+02D9) or the spacing turned comma ⟨‘⟩ (U+02BB), sometimes the spacing Greek-script rough breathing ⟨῾⟩ (U+1FFE), the spacing grave accent ⟨ˋ⟩ (U+02CB) or the ASCII backquote ⟨`⟩ (U+0060) even though they are too flat, or even the ASCII apostrophe ⟨'⟩ (U+0027) where no confusion was possible.
The preferred character today is the spacing left half-ring ⟨ʿ⟩ (U+02BF), or the spacing turned comma ⟨‘⟩ (U+02BB, which has the shape of a single left quotation mark), or the spacing reversed comma ⟨ʽ⟩ (U+02BD, which is the Latin-script equivalent of the Greek-script rough breathing). The spacing turned comma has the advantage of excellent support in many Latin fonts because it is visually identical to the left single quote ⟨‘⟩.
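The variety of marks described above can be unified mechanically. The following is a minimal sketch, not part of any standard: the function name `unify_aspirate_marks` and the choice of U+02BF as the target are assumptions made for illustration. The ASCII apostrophe is deliberately left alone, because it cannot be told apart from ordinary punctuation without context.

```python
import unicodedata

# Marks listed in the text as ways of writing the aspirate sign.
COMBINING_MARKS = {"\u0307", "\u0314"}  # dot above, reversed comma above
SPACING_FALLBACKS = {
    "\u02d9",  # spacing dot above
    "\u02bb",  # modifier letter turned comma
    "\u02bd",  # modifier letter reversed comma
    "\u1ffe",  # Greek dasia (spacing rough breathing)
    "\u02cb",  # modifier letter grave accent
    "\u0060",  # ASCII backquote
}
PREFERRED = "\u02bf"  # modifier letter left half ring (assumed target)


def unify_aspirate_marks(text: str, preferred: str = PREFERRED) -> str:
    """Rewrite every listed aspirate mark as one spacing character."""
    # NFD splits precomposed letters such as U+1E6B into base + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    marks = COMBINING_MARKS | SPACING_FALLBACKS
    unified = "".join(preferred if ch in marks else ch for ch in decomposed)
    # NFC recomposes what is left, e.g. c + caron back into U+010D.
    return unicodedata.normalize("NFC", unified)
```

Because a combining mark is stored after its base letter, replacing it in place yields the spacing mark in the expected position after the letter. The sketch assumes the aspirate mark is the last combining mark on its letter and that the input contains only romanized Armenian, since a dot above or a backquote may mean something else in other text.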
Also, some ambiguities remain unresolved when these systems are applied to modern vernacular Armenian, which has two dialects, both using two possible orthographies (in addition, the modern orthography is used for Classical Armenian in modern publications).
BGN/PCGN romanization (1981) uses a right single quotation mark (more accurately, a modifier letter apostrophe) to express aspirates, tʼ, chʼ, tsʼ, pʼ, kʼ, the opposite of the original rough breathing diacritic. [1]
This romanization was taken up by ISO (1996) and is considered obsolete. This system is a loose transcription and is not reversible (without using dictionary lookup), notably for single Armenian letters romanized into digraphs (these non-reversible, or ambiguous romanizations are shown in a red cell in the table below).
Some Armenian letters have several romanizations, depending on their context:
ISO 9985 (1996) is the international standard for transliteration of the modern Armenian alphabet. As with the BGN/PCGN romanization, the apostrophe is used to denote most of the aspirates.
This system is reversible because it avoids the use of digraphs and returns to the Hübschmann-Meillet scheme (although some diacritics for vowels are modified).
The aspirate series is not treated consistently in ISO 9985: while pʼ, tʼ, cʼ, kʼ are romanized with an apostrophe-like mark, aspirated չ č is not, and instead its unaspirated counterpart ճ is transcribed č̣ with an underdot appearing nowhere else in the system. Note that in this scheme, č (signifying չ) collides with the Hübschmann-Meillet transliteration (where it signifies ճ).
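The reversibility claim and the č / č̣ distinction can be illustrated with a small round trip. This is a sketch over a partial table only (the aspirates, the pair ճ / չ, and the vowel ի); the names `to_iso` and `from_iso` are hypothetical, and the apostrophe-like mark is assumed to be U+02BC.

```python
import unicodedata


def _nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


# Partial ISO 9985 table for illustration; not the full standard.
ISO_9985_PARTIAL = {
    "\u0569": "t\u02bc",       # թ -> t + modifier apostrophe
    "\u0573": "\u010d\u0323",  # ճ -> č with dot below
    "\u0579": "\u010d",        # չ -> č
    "\u0581": "c\u02bc",       # ց -> c + modifier apostrophe
    "\u0583": "p\u02bc",       # փ -> p + modifier apostrophe
    "\u0584": "k\u02bc",       # ք -> k + modifier apostrophe
    "\u056b": "i",             # ի -> i
}

TO_LATIN = {arm: _nfc(lat) for arm, lat in ISO_9985_PARTIAL.items()}
TO_ARMENIAN = {lat: arm for arm, lat in TO_LATIN.items()}
# Longest tokens first, so that č + dot below is not read as plain č.
TOKENS = sorted(TO_ARMENIAN, key=len, reverse=True)


def to_iso(text: str) -> str:
    """Romanize letter by letter; unknown characters pass through."""
    return "".join(TO_LATIN.get(ch, ch) for ch in text)


def from_iso(text: str) -> str:
    """Invert to_iso by greedy longest-match tokenization."""
    text = _nfc(text)
    out, i = [], 0
    while i < len(text):
        for tok in TOKENS:
            if text.startswith(tok, i):
                out.append(TO_ARMENIAN[tok])
                i += len(tok)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

For the letter sequence ճիչ, `to_iso` gives č̣ič and `from_iso` restores the original, since each Armenian letter maps to a distinct Latin token. Both sides are normalized to NFC so that differently encoded but equivalent diacritic sequences compare equal.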
This system is recommended for international bibliographic text interchange (it is also the basis of simplified romanizations used to localize Armenian toponymy or to transliterate personal names), where it works very well with the common ISO/IEC 8859-2 Latin encoding used in Central Europe.
ALA-LC romanization (1997) is largely compatible with BGN/PCGN, but returns to expressing aspirates with a left single quotation mark (in fact the modifier letter left half-ring ʿ U+02BF, US-MARC hexadecimal code B0, that is also used to denote ayin in Arabic, so some documents may contain either the preferred left half-ring, or sometimes the ASCII backquote ` U+0060).
This standard uses different transliterations for Classical/Eastern Armenian and for Western Armenian: for the affected Armenian consonants, the members of the pairs b vs. p, g vs. k, d vs. t, dz vs. ts and ch vs. j are swapped.
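The swap can be written down directly. In the sketch below, the Eastern values are those given for ALA-LC in the table further down, and the Western values are derived by exchanging the members of each pair; the names `PAIRS`, `ALA_LC_EASTERN`, `ALA_LC_WESTERN` and `romanize_consonant` are hypothetical and only the ten affected consonants are covered.

```python
# Pairs whose members are exchanged between the two branches.
PAIRS = [("b", "p"), ("g", "k"), ("d", "t"), ("dz", "ts"), ("ch", "j")]

SWAP = {}
for first, second in PAIRS:
    SWAP[first] = second
    SWAP[second] = first

# ALA-LC values for Classical/Eastern Armenian (affected consonants only).
ALA_LC_EASTERN = {
    "բ": "b", "գ": "g", "դ": "d", "ծ": "ts", "կ": "k",
    "ձ": "dz", "ճ": "ch", "պ": "p", "ջ": "j", "տ": "t",
}

# Western values follow from the Eastern ones by applying the swap.
ALA_LC_WESTERN = {arm: SWAP[lat] for arm, lat in ALA_LC_EASTERN.items()}


def romanize_consonant(letter: str, western: bool = False) -> str:
    """Return the ALA-LC value of one affected consonant."""
    table = ALA_LC_WESTERN if western else ALA_LC_EASTERN
    return table[letter]
```

For example, `romanize_consonant("բ")` gives b, while `romanize_consonant("բ", western=True)` gives p.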
In all cases, and to make this romanization less ambiguous and reversible,
On various Armenian websites, non-standard transliterators have appeared that allow modern Western or Eastern Armenian text to be entered using ASCII-only characters. They are not proper transliterators but can be convenient for users who do not have Armenian keyboards.
Although these input methods are commonly used, they do not adhere to any approved international or Armenian standard, so they are not recommended for the romanization of Armenian. Note that the input methods recognize the Latin digraphs zh, dz, gh, tw, sh, vo, ch, rr for Classical or Eastern Armenian, and zh, dz, tz, gh, vo, ch, rr for Western Armenian, but offer no way to disambiguate words where the digraphs should not be recognized.
Some Armenian letters are entered as Latin digraphs, which may also be followed by an ASCII single quote (the only letter modifier recognized). This quote does not always mean that the intended Armenian letter is aspirated (the reverse may hold for the input ch'), and it is also used as a vowel modifier. Because of these ambiguities, texts must be corrected by entering an intermediate dummy character before the second Latin letter or quote and then removing the dummy character, so that the automatic input converter keeps the Armenian letters distinct.
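The digraph ambiguity and the dummy-character workaround can be modeled with a greedy converter. This is a sketch of the behavior described, not the code of any actual website: the function `ascii_to_armenian`, the partial table, and the use of `|` as the dummy character are all assumptions for illustration.

```python
# Partial Eastern ASCII input table, taken from the table in this article.
ASCII_EASTERN_PARTIAL = {
    "a": "ա",
    "z": "զ",
    "h": "հ",
    "zh": "ժ",
    "t": "տ",
    "t'": "թ",
    "c": "ց",
    "c'": "ծ",
}

# Try longer keys first, which is what makes digraphs win over single letters.
KEYS = sorted(ASCII_EASTERN_PARTIAL, key=len, reverse=True)


def ascii_to_armenian(text: str) -> str:
    """Greedy longest-match conversion; unknown characters pass through."""
    out, i = [], 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(ASCII_EASTERN_PARTIAL[key])
                i += len(key)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# "zh" is always read as the digraph, so the sequence զ + հ cannot be typed directly.
digraph = ascii_to_armenian("zh")
# Workaround: type a dummy character between the letters, then delete it.
separate = ascii_to_armenian("z|h").replace("|", "")
```

Here `digraph` is the single letter ժ, whereas `separate` is the two-letter sequence զհ. The same trick keeps a letter apart from a following quote, as in `t|'`.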
Some Armenian letters have very different phonetic sounds between Classical or Eastern Armenian and Western Armenian, so that the usage of Armenian letters is different between the two sub-branches of the language.
This is made visible in the table below by coloring transliterations specific to Classical or Eastern Armenian on a green background, and those for Western Armenian on a blue background. Other letters are transliterated independently of the language branch. However, cells with a red background contain transliterations that are context dependent and may in some cases create ambiguities. Only the ISO 9985 and Hübschmann-Meillet romanizations avoid context-dependent ambiguous digraphs for transcribing simple Armenian letters that are not ligatures, but the former is inconsistent in its representation of aspirated consonants and incompatible with all other systems for a pair of letters.
Armenian script | capital | Ա | Բ | Գ | Դ | Ե | Զ | Է | Ը | Թ | Ժ | Ի | Լ | Խ | Ծ | Կ | Հ | Ձ | Ղ | Ճ | Մ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0531 | 0532 | 0533 | 0534 | 0535 | 0536 | 0537 | 0538 | 0539 | 053A | 053B | 053C | 053D | 053E | 053F | 0540 | 0541 | 0542 | 0543 | 0544 | ||
minuscule | ա | բ | գ | դ | ե | զ | է | ը | թ | ժ | ի | լ | խ | ծ | կ | հ | ձ | ղ | ճ | մ | |
0561 | 0562 | 0563 | 0564 | 0565 | 0566 | 0567 | 0568 | 0569 | 056A | 056B | 056C | 056D | 056E | 056F | 0570 | 0571 | 0572 | 0573 | 0574 | ||
Romanization of Classical or Eastern Armenian | ASCII input | a | b | g | d | e | z | e' | y' | t' | zh | i | l | x | c' | k | h | dz | gh | tw | m |
Hübschmann-Meillet | ê | ə | t̔, ṫ | ž | c | j | ł | č | |||||||||||||
ISO 9985 | ē | ë | tʼ | ç | ġ | č̣ | |||||||||||||||
BGN/PCGN | e, ye | e | y | zh | kh | ts | dz | gh | ch | ||||||||||||
ALA-LC | e, y | ē | ě | tʿ | |||||||||||||||||
Romanization of Western Armenian | ALA-LC | p | k | t | dz | g | ts | j | |||||||||||||
ASCII input | e | e' | y | t' | x | tz | |||||||||||||||
Armenian script | capital | Յ | Ն | Շ | Ո | Չ | Պ | Ջ | Ռ | Ս | Վ | Տ | Ր | Ց | Ւ | Փ | Ք | Օ | Ֆ | ՈՒ | |
0545 | 0546 | 0547 | 0548 | 0549 | 054A | 054B | 054C | 054D | 054E | 054F | 0550 | 0551 | 0552 | 0553 | 0554 | 0555 | 0556 | 0548 0552 | |||
minuscule | յ | ն | շ | ո | չ | պ | ջ | ռ | ս | վ | տ | ր | ց | ւ | փ | ք | օ | ֆ | ու | և (եւ) | |
0575 | 0576 | 0577 | 0578 | 0579 | 057A | 057B | 057C | 057D | 057E | 057F | 0580 | 0581 | 0582 | 0583 | 0584 | 0585 | 0586 | 0578 0582 | 0587 | ||
Romanization of Classical or Eastern Armenian | ASCII input | y | n | sh | vo | ch | p | j | rr | s | v | t | r | c | w | p' | k', q | o | f | u | ev |
Hübschmann-Meillet | š | o | č̔, č̇ | ǰ | r̄ | c̔, ċ | p̔, ṗ | k̔, k̇ | ô | ||||||||||||
ISO 9985 | č | ṙ | cʼ | pʼ | kʼ | ò | ow | ew | |||||||||||||
BGN/PCGN | sh | o, vo | chʼ | j | rr | tsʼ | o | u | ev, yev | ||||||||||||
ALA-LC | y, h | o | chʿ | ṛ | tsʿ | pʿ | kʿ | ō | ew, ev | ||||||||||||
Romanization of Western Armenian | ALA-LC | b | ch | d | |||||||||||||||||
ASCII input | h' | vo | ch | ch' | rr | c | p' | k', q | o | ev |
Note that in the table above, the last two columns refer to digraphs, not isolated letters (although they are considered letters in the Reformed orthography). The last column displays the ligature that, in the Classical orthography, is used only as an isolated symbol for the short Armenian word ew (meaning "and") and its derivations, in a way similar to the ampersand (&) in the Latin script; in the Reformed orthography, it is also used in the middle and at the end of words instead of եվ. The same transliteration, ew (Classical Armenian) or ev (Reformed orthography), is used for the letters this ligature represents when they are written as a digraph: it formerly referred to the w consonant and now refers to the v consonant.
Armenian script also uses some other digraphs that are often written as optional ligatures, in lowercase only (five of them are encoded in Unicode only for full roundtrip compatibility with some legacy encodings); when present, these ligatures (which are purely typographic and carry no semantic distinction in normal Armenian texts) must be romanized by decomposing their component letters.
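The five encoded ligatures occupy U+FB13 to U+FB17 and carry Unicode compatibility decompositions, so the decomposition step can be sketched with Python's standard `unicodedata` module. The function name `decompose_armenian_ligatures` is hypothetical; normalization is applied only to those five code points so that the rest of the text is left untouched.

```python
import unicodedata

# The five Armenian presentation-form ligatures encoded in Unicode.
ARMENIAN_LIGATURES = range(0xFB13, 0xFB18)


def decompose_armenian_ligatures(text: str) -> str:
    """Replace each encoded ligature with its component letters."""
    return "".join(
        unicodedata.normalize("NFKD", ch) if ord(ch) in ARMENIAN_LIGATURES else ch
        for ch in text
    )
```

After this step, every character in the text is a plain Armenian letter and can be passed to any of the letter-by-letter romanizations described above.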