Romani alphabets

The Romani language has for most of its history been an entirely oral language, with no written form in common use. Although the first example of written Romani dates from 1542, [1] it was not until the twentieth century that vernacular writing by native Romani people arose.

Printed anthologies of Romani folktales and poems began to appear in Eastern Europe in the 20th century, using the respective national scripts (Latin or Cyrillic). [2] Written Romani in the 20th century used the writing systems of the respective host societies, mostly Latin alphabets (Romanian, Italian, French, etc.).

Standardization

Currently, there is no single standard orthography used by both scholars and native speakers. Efforts of language planners have been hampered by the significant dialectal divisions in Romani: the absence of a standard phonology, in turn, makes the selection of a single written form problematic.

In an effort to overcome this, during the 1980s and 1990s Marcel Courthiade proposed a model for orthographic unification based on the adoption of a meta-phonological orthography, which "would allow dialectal variation to be accommodated at the phonological and morpho-phonological level". [1] This system was presented in 1990 to the International Romani Union, which adopted it as the organization's "official alphabet". This recognition by the International Romani Union allowed Courthiade's system to qualify for funding from the European Commission.

Despite being used in several publications, such as the grammar of Romani compiled by Gheorghe Sarău [3] and the Polish publication Informaciaqo lil, [4] the IRU standard has yet to find a broad base of support from Romani writers. One reason for the reluctance to adopt this standard, according to Canadian Rom Ronald Lee, is that the proposed orthography contains a number of specialised characters not regularly found on European keyboards, such as θ and ʒ. [5]

Instead, the most common pattern among native speakers is for individual authors to use an orthography based on the writing system of the dominant contact language: thus Romanian in Romania, Hungarian in Hungary and so on. A currently observable trend, however, appears to be the adoption of a loosely English-oriented orthography, developed spontaneously by native speakers for use online and through email. [1]

Descriptive linguistics has, however, a long and established tradition of transcription. [1] Despite small differences between individual linguists in the representation of certain phonemes, most adhere to a system which Hancock terms Pan-Vlax. [4]

Latin script

The overwhelming majority of academic and non-academic literature produced currently in Romani is written using a Latin-based orthography. [1] There are three main systems that are likely to be encountered: the Pan-Vlax system, the International Standard and various Anglicised systems. [4]

Pan-Vlax

In most recent descriptive literature, a variety of orthography which Ian Hancock terms Pan-Vlax will likely be used. [4] This orthography is not a single standardised form, but rather a set of orthographical practices which exhibit a basic "core" of shared graphemes and a small amount of divergence in several areas. The Pan-Vlax script is based on the Latin script, augmented by the addition of several diacritics common to the languages of eastern Europe, such as the caron. Sometimes stress is indicated with an acute accent.

In the following table, the most common variants of the graphemes are shown. The phonemes used in the table are somewhat arbitrary and are not specifically based on any one dialect (for example, the phoneme denoted /d͡ʒ/ in the table can be realised as /ʒ/, /ʐ/ or /ɟ/, depending on dialect):

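Romani "Pan-Vlax" alphabet

| Grapheme | Phoneme | Example |
|---|---|---|
| A a | /a/ | ambról 'pear' |
| B b | /b/ | barvaló 'rich' |
| C c | /ts/ | círdel 'he pulls' |
| Č č | /t͡ʃ/ | čačó 'true' |
| Čh čh | /t͡ʃʰ/ | čhavó 'boy' |
| D d | /d/ | drom 'road' |
| Dž dž | /d͡ʒ/ | džukél 'dog' |
| E e | /e/ | eftá 'seven' |
| F f | /f/ | fóro 'town' |
| G g | /ɡ/ | gadžó 'non-Rom' |
| H h | /h/ | herdelézi 'Saint George's Day' |
| Ch ch | /x/ | chal 'he eats' |
| I i | /i/ | iló 'heart' |
| J j | /j/ | jag 'fire' |
| K k | /k/ | kaj 'where' |
| Kh kh | /kʰ/ | kham 'sun' |
| L l | /l/ | lačhó 'good' |
| M m | /m/ | manúš 'man' |
| N n | /n/ | nasúl 'bad' |
| O o | /o/ | ochtó 'eight' |
| P p | /p/ | paramísi 'fairy tale' |
| Ph ph | /pʰ/ | phabáj 'apple' |
| R r | /r/ | raklí 'non-Romani girl' |
| S s | /s/ | somnakáj 'gold' |
| Š š | /ʃ/ | šukár 'beautiful' |
| T t | /t/ | tató 'hot' |
| Th th | /tʰ/ | them 'land' |
| U u | /u/ | umál 'field' |
| V v | /ʋ/ | vurdón 'cart' |
| Z z | /z/ | zor 'power' |
| Ž ž | /ʒ/ | žója 'Thursday' |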
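The grapheme table above can be read as a lookup from spellings to phonemes. The following is a minimal Python sketch, not part of any cited source, of how a Pan-Vlax word might be split into graphemes (preferring digraphs such as čh or kh over single letters) and mapped to the table's phoneme values. The function names are hypothetical, acute accents are assumed to mark stress only, and the output inherits the table's somewhat arbitrary, dialect-neutral phoneme choices.

```python
import unicodedata

# Grapheme -> phoneme pairs copied from the Pan-Vlax table above.
_TABLE = {
    "a": "a", "b": "b", "c": "ts", "č": "t͡ʃ", "čh": "t͡ʃʰ", "d": "d",
    "dž": "d͡ʒ", "e": "e", "f": "f", "g": "ɡ", "h": "h", "ch": "x",
    "i": "i", "j": "j", "k": "k", "kh": "kʰ", "l": "l", "m": "m",
    "n": "n", "o": "o", "p": "p", "ph": "pʰ", "r": "r", "s": "s",
    "š": "ʃ", "t": "t", "th": "tʰ", "u": "u", "v": "ʋ", "z": "z",
    "ž": "ʒ",
}
# Normalise the keys so composed and decomposed input both match.
PAN_VLAX = {unicodedata.normalize("NFC", g): p for g, p in _TABLE.items()}

ACUTE = "\u0301"  # combining acute accent, assumed to mark stress only


def strip_stress(word: str) -> str:
    """Remove acute stress marks while keeping carons intact."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def graphemes(word: str) -> list[str]:
    """Split a word into Pan-Vlax graphemes, trying digraphs first."""
    word = strip_stress(word.lower())
    result, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in PAN_VLAX:
            result.append(pair)
            i += 2
        elif word[i] in PAN_VLAX:
            result.append(word[i])
            i += 1
        else:
            raise ValueError(f"no table entry for {word[i]!r}")
    return result


def phonemes(word: str) -> str:
    """Concatenate the table's phoneme value for each grapheme."""
    return "".join(PAN_VLAX[g] for g in graphemes(word))
```

For example, `graphemes("čhavó")` yields the four graphemes čh, a, v, o. Always preferring the digraph is a simplification: the sketch cannot tell a true digraph from two adjacent single letters, and it rejects the variant graphemes (such as ř or ə) that fall outside the shared core.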

The use of the above graphemes is relatively stable and universal, taking into account dialectal mergers and so on. However, in certain areas there is somewhat more variation. A typically diverse area is in the representation of sounds not present in most varieties of Romani. For example, the centralised vowel phonemes of several varieties of Vlax and Xaladitka, when they are indicated separately from the non-centralised vowels, can be represented using ə, ъ or ă. [4] Another particularly variant area is the representation of palatalised consonants, which are absent from a number of dialects. Some variant graphemes for /tʲ/ include tj, ty, ć, čj and t᾿. [1] Finally, the representation of the second rhotic, which in several dialects has been merged with /r/, tends to vary between ř, rr, and rh, and sometimes even gh, with the first two being the most frequently found variants. [4]

International Standard

The International Standard orthography, as devised by Marcel Courthiade and adopted by the International Romani Union, uses conventions similar to those of the Pan-Vlax system outlined above. Several of the differences are simply graphical: carons are replaced with acute accents (turning č š ž into ć ś ź), and acute accents with grave accents. Its most distinctive features, however, are the "meta-notations", which are intended to cover cross-dialectal phonological variation, particularly in degrees of palatalisation; the "morpho-graphs", which represent the morphophonological alternation of case suffixes [6] in different phonological environments; [7] and a double dot (¨) to indicate a centralised vowel.
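The purely graphical differences can be illustrated with a short Python sketch. It assumes Pan-Vlax input in which the acute accent marks stress, and it covers only the two substitutions named in this section (caron to acute on c, s and z; acute to grave elsewhere). The function name is hypothetical, and the result is not a full conversion to the International Standard, since meta-notations and morpho-graphs are left untouched.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent
GRAVE = "\u0300"  # combining grave accent
CARON = "\u030c"  # combining caron


def to_iru_graphics(text: str) -> str:
    """Apply only the graphical accent swaps described in this section."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch == ACUTE:
            # A Pan-Vlax stress mark becomes a grave accent.
            out.append(GRAVE)
        elif ch == CARON and out and out[-1].lower() in ("c", "s", "z"):
            # č š ž become ć ś ź; carons on other letters are kept.
            out.append(ACUTE)
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))
```

Working on the decomposed (NFD) form lets both swaps happen in a single pass, so an acute produced from a caron is never turned into a grave by mistake.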

The "meta-notations" are ćh, ʒ, and the caron (ˇ; named ćiriklo after the word for bird), the realisation of which varies by dialect. The first two are respectively pronounced as /t͡ʃʰ/ and /d͡ʒ/ in the first stratum but /ɕ/ and /ʑ/ in the third stratum. [8] The caron on a vowel represents palatalisation; ǒ and ǎ are pronounced /o/ and /a/ in Lovaricka, but /jo/ and /ja/ in Kalderash. [4]
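As a rough illustration of how the meta-notations defer pronunciation to the reader's dialect, the following Python sketch expands ćh, ʒ and the caron vowels ǎ and ǒ according to two caller-supplied settings. The function and parameter names are hypothetical, only the values stated in this section are encoded, and the test strings used with it are artificial sequences rather than attested words.

```python
import unicodedata

# Realisations of the two consonantal meta-notations, keyed by stratum.
STRATUM_VALUES = {
    1: {"\u0107h": "t\u0361\u0283\u02b0",   # ćh -> t͡ʃʰ
        "\u0292": "d\u0361\u0292"},         # ʒ  -> d͡ʒ
    3: {"\u0107h": "\u0255",                # ćh -> ɕ
        "\u0292": "\u0291"},                # ʒ  -> ʑ
}

# Caron vowels named in the text, mapped to their plain base vowel.
CARON_VOWELS = {"\u01ce": "a", "\u01d2": "o"}  # ǎ, ǒ


def realise_meta(word: str, stratum: int, caron_as_glide: bool) -> str:
    """Expand meta-notations; all other letters are passed through."""
    word = unicodedata.normalize("NFC", word).lower()
    for notation, value in STRATUM_VALUES[stratum].items():
        word = word.replace(notation, value)
    for vowel, base in CARON_VOWELS.items():
        # caron_as_glide=True gives the /jo/, /ja/ reading described for
        # Kalderash; False gives the plain /o/, /a/ reading.
        word = word.replace(vowel, "j" + base if caron_as_glide else base)
    return word
```

The same written form therefore yields different outputs depending on the settings passed in, which is the accommodation of dialectal variation that the meta-notations are meant to provide.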

The three "morpho-graphs" are ç, q and θ, which represent the initial phonemes of a number of case suffixes. They are realised as /s/, /k/ and /t/ after a vowel and as /ts/, /ɡ/ and /d/ after a nasal consonant.
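The alternation rule for the morpho-graphs is simple enough to sketch in Python. The version below looks only at the letter immediately before each morpho-graph and treats m and n as the nasal consonants, which is an assumption; contexts that this section does not describe are left unchanged. The function name is hypothetical, and the output is a loose mix of spelling and phonemic symbols, not a full transcription.

```python
import unicodedata

# morpho-graph -> (value after a vowel, value after a nasal consonant)
REALISATIONS = {
    "\u00e7": ("s", "ts"),      # ç
    "q": ("k", "\u0261"),       # q -> k or ɡ
    "\u03b8": ("t", "d"),       # θ
}
VOWELS = set("aeiou")
NASALS = set("mn")  # assumption: only m and n are treated as nasals


def base_letter(ch: str) -> str:
    """Return the lower-case letter with any diacritics removed."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def realise_morphographs(word: str) -> str:
    """Replace each morpho-graph according to the preceding letter."""
    word = unicodedata.normalize("NFC", word)
    out = []
    for i, ch in enumerate(word):
        pair = REALISATIONS.get(ch.lower())
        prev = base_letter(word[i - 1]) if i > 0 else ""
        if pair and prev in VOWELS:
            out.append(pair[0])
        elif pair and prev in NASALS:
            out.append(pair[1])
        else:
            out.append(ch)  # not a morpho-graph, or context not covered
    return "".join(out)
```

Applied to the title word Informaciaqo mentioned earlier, the q follows a vowel and so comes out as k, giving Informaciako.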

Anglicised

The English-based orthography commonly used in North America is, to a degree, an accommodation of the Pan-Vlax orthography to English-language keyboards: graphemes with diacritics are replaced with digraphs, such as the substitution of ts ch sh zh for c č š ž. [4] This particular orthography seems to have arisen spontaneously as Romani speakers communicated by email, a medium in which graphemes outside the Latin-1 charset were until recently difficult to type. [1] It is also the orthography recommended by Romani scholar and activist Ronald Lee. [5]
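The four substitutions named above can be expressed as a small Python sketch. It handles only ts, ch, sh and zh for c, č, š and ž; leaving the Pan-Vlax digraph ch (for /x/) untouched is an assumption made here to avoid turning it into "tsh", not a documented convention. The function name is hypothetical.

```python
import re
import unicodedata

# The four substitutions named in the text.
SUBSTITUTIONS = {"c": "ts", "\u010d": "ch", "\u0161": "sh", "\u017e": "zh"}

# Match the digraph "ch" first so that its "c" is not rewritten on its own.
PATTERN = re.compile("ch|[c\u010d\u0161\u017e]", re.IGNORECASE)


def anglicise(text: str) -> str:
    """Replace c, č, š and ž with ts, ch, sh and zh."""
    text = unicodedata.normalize("NFC", text)

    def replace(match: re.Match) -> str:
        found = match.group(0)
        if found.lower() == "ch":
            return found  # assumption: the /x/ digraph is passed through
        digraph = SUBSTITUTIONS[found.lower()]
        return digraph.capitalize() if found.isupper() else digraph

    return PATTERN.sub(replace, text)
```

Under these assumptions čhavó becomes chhavó and círdel becomes tsírdel, with stress marks kept as they are. The output "ch" is ambiguous between an original č and a passed-through ch, so the mapping cannot be reversed mechanically.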

Romani in Macedonia

Romani in Macedonia is written with the following alphabet, which is used in the Macedonian educational system for Romani-speaking students: [9]

A a, B b, C c, Ć ć, Č č, D d, Dž dž, E e
F f, G g, GJ gj, H h, I i, J j, K k, Kh kh
L l, Lj lj, M m, N n, Nj nj, O o, P p, Ph ph
R r, S s, Š š, T t, Th th, U u, V v, Y y
X x, Z z

Kepeski & Jusuf (1980) noted that the following alphabet is used by Romani people in Macedonia and Serbia (Kosovo): [10]

A a, Ä ä, B b, C c, Č č, Kj kj (Ćć), D d, Gj gj (Ǵǵ)
Dž dž, E e, F f, G g, H h, X x, I i, J j
K k, L l, Lj lj, M m, N n, Nj nj, O o, P p
Q q, R r, S s, Š š, T t, U u, V v, Z z
Ž ž

Finnish Romani

Finnish Romani (or Finnish Kalo) is written with the following alphabet: [11]

A a, B b, (C c), D d, E e, F f, G g, H h
Ȟ ȟ, I i, J j, K k, L l, M m, N n, O o
P p, (Q q), R r, S s, Š š, T t, U u, V v
(W w), (X x), Y y, (Z z), Ž ž, (Å å), Ä ä, Ö ö

The letters in parentheses are only used in loanwords and are therefore not always part of the alphabet. The digraphs , kh, ph, th, and are used, but are not letters of the alphabet. Š and Ž are only used in these digraphs.

Cyrillic script

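Cyrillic alphabet of Kalderash dialect [10]
Upper case: А Б В Г Ғ Д Е Ё Ж З И Й К Кх Л М Н О П Пх Р Рр С Т Тх У Ф Х Ц Ч Ш Ы Ь Э Ю Я
Lower case: а б в г ғ д е ё ж з и й к кх л м н о п пх р рр с т тх у ф х ц ч ш ы ь э ю я

Cyrillic alphabet of Ruska Roma dialect [12]
Upper case: А Б В Г Ґ Д Е Ё Ж З И Й К Л М Н О П Р С Т У Ф Х Ц Ч Ш Ы Ь Э Ю Я
Lower case: а б в г ґ д е ё ж з и й к л м н о п р с т у ф х ц ч ш ы ь э ю я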

Greek script

In Greece, Romani is mostly written with the Greek alphabet, although very little seems to be written in Romani there. [13]

Arabic script

The Arabic script has also been used, for example in Iran. [13] [14] More importantly, the first periodical produced by Roma for Roma was printed in the Arabic script in the 1920s in Edirne, Turkey. It was called "Laćo", which means "good". [13]

Comparison of alphabets

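Columns, in order: IPA | Latin: Hungarian (Lovari; Carpathian) | 1971 Romani World Congress | American Romani | Pan-Vlax | Macedonian (Official Teaching Alphabet; Folk Alphabet, Kepeski & Jusuf (1980) [10] [15]) | International Romani Union Standard | Finnish Romani [11] | ? | Cyrillic: Kalderash dialect [10] | Ruska dialect [12]

Each row gives the IPA value followed by the spellings in column order; a spelling shared by adjacent orthographies appears only once.

[a] | A | А [16], Я [17]
[ɑ] | A
[æ] | Ä
[b] | B | Б
[ts] | C | Ts | C | C, Ç [18] | Ц
[t͡ʃ] | Ch | Č | Ć | Ч
[t͡ʃʰ] | Chh | Čh | Ćh [19] | Чх
[d] | D | D, Θ [20] | D | Д
[dz] | Dz
[d͡ʒ] | Dzh | Ʒ [21] | Дж
[ɟ] | Dy | Gj | Gj (Ǵǵ)
[e] | E | Э [16], Е [17]
[f] | F | Ф
[ɡ] | G | G, Q [22] | G | Г [23] | Ғ | Ґ
[h] | H | Г
[x] | X [24] | X | X [24] | X | Ȟ | Х
[i] | I | Ы [16], И [17]
[ɨ] | Ï | І
[j] | J | Y | J | J | Й
[k] | K | K | K, Q [22] | K | К
[kʰ] | Kh | Kh | Кх
[l] | L | Л
[ʎ] | Ly | Lj
[m] | M | М
[n] | N | Н
[ɲ] | Ny | Nj
[o] | O | О [16], Ё [17]
[ø] | Ö [25] | Ö
[p] | P | П
[pʰ] | Ph | Ph | Пх
[r] | R | Р
[ɽ], [ɻ], [rː], [ʀ] | Ř, Rr, Rh, Gh [26] | Rr | Рр
[s] | S | S, Ç [18] | S | С
[ʃ] | Sh | Š | Ś | Ш
[ɕ] | Ś | Ćh [19]
[t] | T | T, Θ [20] | Т
[tʰ] | Th | Th | Тх
[c] | Ty | Tj, Ty, Ć, Čj, T’ [26] | Ć | Kj (Ć)
[u] | U | У [16], Ю [17]
[y] | Ü | Ü [25] | Y
[v] | V | В
[ʋ] | V
[z] | Z | З
[ʒ] | Zh | Ž | Ź | Ж
[ʑ] | Ź | Ʒ [21] | Ӂ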

Notes

  1. Matras (2002)
  2. Bagchi (2016)
  3. Sarău (1994)
  4. Hancock (1995)
  5. Lee (2005:272)
  6. Whether these endings are to be analysed as postpositions or case endings is still a matter of debate in Romani linguistics. See, for example, Hancock (1995) and Matras (2002) for varying approaches.
  7. Matras (1999)
  8. Courthiade (2009:43–44)
  9. Petrovski (2021)
  10. Everson (2001)
  11. Granqvist (2011)
  12. Serghievsky & Barannikov (1938)
  13. Bakker & Kyuchukov (2000:90)
  14. Djonedi (1996)
  15. Phonetic assignment provisional (not in source)
  16. After hard consonants
  17. After soft (palatal) consonants
  18. Represents /s/ after vowels and /tˢ/ after nasals.
  19. Represents /t͡ʃʰ/ in the first stratum and /ɕ/ in the third stratum.
  20. Represents /t/ after vowels and /d/ after nasals.
  21. Represents /d͡ʒ/ in the first stratum and /ʑ/ in the third stratum.
  22. Represents /k/ after vowels and /ɡ/ after nasals.
  23. As in Russian, this orthography does not distinguish between /ɡ/ and /h/.
  24. This is the Greek letter Chi and was ordered alphabetically after H.
  25. Courthiade (2009:496–499)
  26. Only exists in some dialects and varies according to dialects.

