Early Romani

Early Romani [1] (sometimes referred to as Late Proto-Romani [2] ) is the latest common predecessor of all forms of the Romani language. It was spoken before the Roma people dispersed throughout Europe. It is not directly attested, but rather reconstructed on the basis of shared features of existing Romani varieties. Early Romani is thought to have been spoken in the Byzantine Empire between the 9th-10th and 13th-14th centuries. [3]

Phonology

Vowels

The vowels were as follows: [4]

The vowels of Early Romani

| | Front | Central | Back |
|---|---|---|---|
| Close | i | | u |
| Mid | e | | o |
| Open | | a | |

Consonants

The consonants were as follows: [5]

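Early Romani consonants

| | Labial | Alveolar | Post-alveolar/Palatal | Velar | Glottal |
|---|---|---|---|---|---|
| Nasal | m | n | | | |
| Stop | p b | t d | | k ɡ | |
| Affricate | | ts (dz) | tʃʰ | | |
| Fricative | f v | s z | ʃ (ʒ) | x ([χ]?) | h |
| Approximant | | l | j | | |
| Trill | | r ɽr | | | |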

The sound conventionally designated ř had originated from Indo-Aryan retroflex stops and appears to have still been a retroflex ([ ɽr ]) in Early Romani, judging from a retroflex reflex preserved in at least one dialect and from the diversity of reflexes in different dialects, which include geminated apical trills [rː] . Nonetheless, Yaron Matras also considers it possible that Early Romani had already shifted the place of articulation to a uvular, i.e. had acquired the modern Kalderash pronunciation [ ʁ]. On the other hand, the possibility has also been entertained that there may still have been not just one, but several retroflexes in Early Romani, including a nasal and a lateral. [6]

Dentals may have been allophonically palatalised before /i/. [7]

The following Latin letters are used in this article [8] to designate sounds in ways different from the IPA symbols:

| letter | c | č | čh | kh | ph | ř | š | ž |
|---|---|---|---|---|---|---|---|---|
| phoneme in IPA | /ts/ | /tʃ/ | /tʃʰ/ | /kʰ/ | /pʰ/ | /ɽr/ | /ʃ/ | /ʒ/ |

Stress

Stress was on the final syllable in the native lexical stratum (čhavó 'boy'), except that certain suffixes were not counted as part of the word for the purposes of stress placement, so the stress was placed before them instead (čhavés-ke 'for the boy'). These were the Layer II case markers (e.g. -ke, 'for'), the vocative markers, the present/future marker -a and the remoteness marker -asi. The mediopassive suffix did not receive stress either, e.g. díkh-jol 'is seen'. The special behaviour of these suffixes was due to the fact that they had originally been independent words. In addition, original compound verbs ending in -d- 'to give' had stress on the original first compound member (váz-dav 'I lift'). In the foreign lexical component, words could be stressed on any syllable in accordance with the pronunciation in the source language (fóros 'town'), but when native suffixes were added, the stems received final stress like native stems (forós-ke 'for the town'). [9]
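The stress rule for suffixed words can be illustrated with a minimal Python sketch. The function name `place_stress` and the split of a word into a stress-bearing core and a list of extrametrical suffixes are assumptions made for this illustration; the caller decides which suffixes are extrametrical, and every vowel letter is treated as a syllable nucleus. Compound verbs of the váz-dav type and unsuffixed foreign words such as fóros are not covered.

```python
VOWELS = "aeiou"
ACUTE = {"a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú"}

def place_stress(core, extrametrical=()):
    """Stress the last vowel of the core morphemes.

    core          -- morphemes that count for stress (hypothetical segmentation)
    extrametrical -- suffixes that do not count, e.g. Layer II markers
    """
    word = "".join(core)
    # Scan from the right for the last vowel of the stress-bearing part.
    for i in range(len(word) - 1, -1, -1):
        if word[i] in VOWELS:
            word = word[:i] + ACUTE[word[i]] + word[i + 1:]
            break
    # Extrametrical suffixes are appended unstressed, separated by hyphens.
    return "-".join([word, *extrametrical])

print(place_stress(["čhav", "o"]))           # native noun, no suffix
print(place_stress(["čhav", "es"], ["ke"]))  # Layer II marker is skipped
print(place_stress(["for", "os"], ["ke"]))   # foreign stem with native suffix
print(place_stress(["dikh"], ["jol"]))       # mediopassive suffix is skipped
```

Passing the extrametrical suffixes explicitly avoids confusing homophonous endings, such as the nominative plural -a and the present/future marker -a.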

Notable morphonological processes

The consonant /s/ appears to have exhibited an optional alternation with /h/ in certain morphemes in Early Romani, a variation pattern inherited from late Middle Indo-Aryan. These must have included the 2nd singular ending -es when followed by the suffix -a (producing -eha alongside the older -esa), and, due to analogy, the 1st plural ending -as in front of -a (producing -aha alongside the older -asa), the instrumental plural case ending after vowels (-V-ha alongside older -V-sa) and the copula having variants beginning in h- alongside the older s-. Many dialects have extended this pattern to many more forms and have generalised the /h/ variants, whereas others have only retained the conservative forms with /s/ without any trace of the alternation. [10]

The vowel /i/ was desyllabified to a semivowel /j/ before vowel-initial suffixes: sg. buti 'work' - pl. butja. [11]

Grammar

The morphology exhibited a split between two strata - native (including both inherited words and loans from before the immigration into the Byzantine Empire) and foreign (predominantly loans from Byzantine Greek and some from Slavic; later borrowings from other languages also join this group in descendant dialects). Words of the two strata were often formed and declined somewhat differently.

Nominal morphology

Early Romani nominals had two genders, masculine and feminine, two numbers - singular and plural, and eight cases - nominative, accusative (oblique), vocative, dative, ablative, locative, instrumental and genitive. The nominal phrases also expressed definiteness by means of a definite article.

Partly like other Modern Indo-Aryan languages, the grammatical morphemes in Romani noun declension are classified into three layers - Layer I (remainders of Old Indo-Aryan inflectional endings), Layer II (a set of originally separate words turned into new postposed inflectional elements) and Layer III (adpositions). Layer I suffixes are portmanteau morphs that simultaneously express case (nominative, oblique or vocative) and number, have different variants according to the gender of the word and exhibit some unpredictable lexical variation that makes it possible to speak of declension classes. Layer II suffixes express only case and have largely the same form. [12]

Layer I

The most common endings can be summarised as follows: [13]

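| stratum | gender | nominative singular | nominative plural | oblique singular | oblique plural | vocative singular | vocative plural |
|---|---|---|---|---|---|---|---|
| native | masculine | -o | -e | -es | -en | -éja | -ále |
| native | masculine | -∅ | -a | -es | -en | -a | -ále |
| native | feminine | -i/-∅ | -a | -a | -en | -e | -ále |
| foreign | masculine | -Vs | -i (but -ja if V = i) | -Vs | -en | -Vna, -V | -ále |
| foreign | feminine | -a | -a? | -a | -en | -o? | -ále |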

Native feminine stems had a tendency to exhibit /j/ in front of the vowel of the suffix outside of the nominative singular: -j-a, -j-en etc. This was always the case if the nominative singular ended in -i.

The following is a complete list of Early Romani declension classes largely [14] as reconstructed by Viktor Elšík (with terminology adapted for this article): [15]

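| stratum | gender | stem type | type | nominative singular | nominative plural | oblique singular | oblique plural | vocative singular | vocative plural | example |
|---|---|---|---|---|---|---|---|---|---|---|
| native | masculine | zero-stem | main type | -∅ | -a | -es | -en | -a | -ále | kher 'house' |
| native | masculine | zero-stem | -ipen/ibe(n) | -∅ | -a | -as* | -en | --- | --- | čačipen 'truth' |
| native | masculine | zero-stem | zero plural (exceptional) | -∅ | -∅ | -es | -en | --- | --- | vast 'hand' |
| native | masculine | o-stem | | -o | -e | -es | -en | -éja | -ále | šero 'head' |
| native | masculine | i-stem | | -i | -j-a | -j-es | -j-en | --- | --- | pani 'water' |
| native | feminine | zero-stem | non-jotating | -∅ | -a | -a | -en | -e | -ále | džuv 'louse' |
| native | feminine | zero-stem | jotating | -∅ | -j-a | -j-a | -j-en | -e | -j-ále | suv 'needle' |
| native | feminine | i-stem | | -i | -j-a | -j-a | -j-en | -ije | -j-ále | piri 'pot' |
| foreign | masculine | o-stem | | -o(s)** | -i (-a) | -os*** | -en | -ona? | -ále? | foros 'town' |
| foreign | masculine | u-stem | | -u(s)** | -i (-a) | -us | -en | -u? | -ále? | papus 'grandfather' |
| foreign | masculine | i-stem | | -i(s)** | -j-a | -is | -en | -i(na)? | -j-ále? | sapunis 'soap' |
| foreign | feminine | a-stem | | -a | -es? | -a | -en | -o? | -ále? | cipa 'skin' |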

* - The stems formed with the suffixes -ipen and -ibe(n) dropped the -e- before endings: oblique -ipn-as, -ibn-as, nominative plural -ipn-a, -ibn-a. [16]

** - In foreign-stratum masculine words in -o(s), -u(s), -i(s), the variants of the nominative without -s found in some dialects might be due to a late reinterpretation of the -s as an oblique ending by analogy with native stratum words. [16] However, the forms without -s might be original in words that were of neuter gender in Greek such as kókalo 'bone', since these, too, were adapted as masculine words in Romani. Present-day dialects have either only the forms with -s or only the forms without -s, but if the latter interpretation is correct, then both rules would be the result of a later generalisation. [17] Note, moreover, that originally neuter Greek words like kókalo also seem to have retained a Greek plural in -a: kókala 'bones'. [18]

*** - However, the oblique form of the abstract nouns formed with the suffix -im-o ended in -im-as. [16] They retained a Greek plural in -im-ata. [16]

Layer II

The Layer II suffixes are added to the appropriate Layer I oblique case form. After the plural oblique Layer I suffix ending in /n/, the initial voiceless consonant of the suffixes became voiced and the sibilant turned into an affricate. The forms were as follows: [19]

| case | main form | form after /n/ |
|---|---|---|
| dative | -ke | -ge |
| locative | -te | -de |
| ablative | -tar | -dar |
| instrumental | -sa | -ca |
| genitive | -ker- | -ger- |
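The alternation between the main forms and the forms after /n/ can be sketched in Python. The names `LAYER_II`, `AFTER_N` and `attach_layer_ii` are hypothetical, and the oblique forms čhaves and čhaven are built from the endings -es and -en listed for Layer I; the genitive output is a bare stem that would still need an adjectival ending.

```python
# Main forms of the Layer II suffixes, as given in the table above.
LAYER_II = {
    "dative": "ke",
    "locative": "te",
    "ablative": "tar",
    "instrumental": "sa",
    "genitive": "ker",
}

# After oblique plural /n/: voiceless stops are voiced, /s/ becomes the affricate c.
AFTER_N = {"k": "g", "t": "d", "s": "c"}

def attach_layer_ii(oblique, case):
    """Attach a Layer II suffix to a Layer I oblique form."""
    suffix = LAYER_II[case]
    if oblique.endswith("n"):
        suffix = AFTER_N[suffix[0]] + suffix[1:]
    return f"{oblique}-{suffix}"

print(attach_layer_ii("čhaves", "dative"))        # oblique singular
print(attach_layer_ii("čhaven", "dative"))        # oblique plural
print(attach_layer_ii("čhaven", "instrumental"))
```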

The genitive took the inflectional endings of adjectives and agreed with the modified noun: -ker-o, -ker-e, etc (an example of Suffixaufnahme). The genitive suffix may also have had an optional short variant -k-/-g- besides -ker-/-ger-, as seen in several modern dialects, with or without a difference in function. If there was a difference, the long form may have been more emphatic and preferred when genitives were placed after the noun or nominalised. [20]

Layer III

Layer III words in Early Romani were prepositions (as they mostly are in contemporary dialects as well). Some inherited prepositions were andar 'out of', andre 'in(to)', angle 'in front of', astjal 'for, because of', dži 'until', karig 'towards', (ka)tar 'from', ke 'at, to', mamuj 'against', maškar 'between', pal 'behind', paš 'next to, by', perdal 'across, through', te 'at, to', tel 'under', truja(l) 'past, around', upral/opral 'from the top of', upre/opre 'above, on, over', and vaš 'for'. The pairs andre-andar, angle-anglal, ke/te-katar/tar formed locative-ablative pairs, but there were no special directive prepositions - the locative ones were used to express direction as well. Certain prepositions ending in vowels dropped them before the definite article: e.g. ke- + -o > ko. [21] [22]

Case use

The bare oblique case was used: [23]

1. as an accusative (direct object) case with animate nouns and with pronouns, whereas inanimate nouns used the nominative;

2. to express possession: man si kher 'I have a house';

3. to express the indirect object of the verb 'to give', i.e. as a dative case.

The instrumental was used also as a comitative case, meaning 'together with' as well as 'by means of'. [24]

Adjuncts to almost all prepositions required the noun to be in the locative case, [25] at least if animate, but may have taken the nominative case if inanimate, as commonly found in modern dialects. [26] However, bi 'without' took the genitive and vaš 'for' took the dative. [27]

Adjective declension

Adjectives used attributively or predicatively were normally declined as follows: [28]

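| | nominative singular masculine | nominative singular feminine | nominative plural | oblique singular masculine | oblique singular feminine | oblique plural | vocative [29] | examples |
|---|---|---|---|---|---|---|---|---|
| native | -o | -i | -e | -e | -a | -e | -e | baro 'big' |
| foreign | -o | -o | -a | -on-e | -on-a | -on-e | -e | zeleno 'green' |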

A small group of adjectives such as šukar 'pretty' ended in a consonant and were indeclinable.

Nominalised adjectives were declined like nouns: e.g. e phures-ke 'for the old one'.

The comparative and superlative were expressed by the form constructed with the suffix -eder. [30]

Pronouns

The personal pronouns were: [31]

| | nominative emphatic | nominative non-emphatic | oblique | possessive |
|---|---|---|---|---|
| 1st singular | me | | man | minř- (< mir-?) |
| 2nd singular | tu | | tut | tir- |
| 3rd singular masculine | ov (av) | lo (to) | (o)les | (o)les-ker- |
| 3rd singular feminine | oj (aj) | li (ti) | (o)la | (o)la-ker- |
| reflexive | --- | --- | pes | pes-ker- (or possibly pinř- [32] ) |
| 1st plural | amen | | amen- | amar- |
| 2nd plural | tumen | | tumen- | tumar- |
| 3rd plural | ol (*al), on | le (te) | (o)len- | (o)len-ger- |

The possessive forms inflected and agreed with the modified noun like adjectives: tir-o, tir-i, tir-e, etc. In the 3rd person, there were two sets of nominative forms - the emphatic and the non-emphatic pronouns, the latter being commonly used anaphorically and encliticised. [33] The reflexive was used only in the third person. [34]

The demonstrative pronouns had a four-term system that contrasted deictic use (for immediately present referents, expressed by the vowel a) and anaphoric use (for discussed referents, expressed by the vowel o), as well as plain use (for normal indication, expressed by the consonant d) and specific use (for emphasis and contrast with other referents, expressed by the consonant k). The inflection pattern in the nominative was somewhat unique. [35] The forms were as follows (sources differ on whether the consonants in parentheses were present): [36] [37]

| | nominative singular masculine | nominative singular feminine | nominative plural | oblique singular masculine | oblique singular feminine | oblique plural |
|---|---|---|---|---|---|---|
| proximate plain | ada-va | ada-ja | ada-la | ada-le(s) | ada-la | ada-le(n) |
| proximate specific | aka-va | aka-ja | aka-la | aka-le(s) | aka-la | aka-le(n) |
| remote plain | odo-va | odo-ja | odo-la | odo-le(s) | odo-la | odo-le(n) |
| remote specific | oko-va | oko-ja | oko-la | oko-le(s) | oko-la | oko-le(n) |
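The four-term system is compositional: a vowel marks proximate versus remote, a consonant marks plain versus specific, and an ending marks gender, number and case. The short Python sketch below shows this; the function name `demonstrative` and the slot labels are invented for the example, and the oblique endings are given with their final consonants, which is only one of the two readings mentioned above.

```python
DEIXIS = {"proximate": "a", "remote": "o"}
SPECIFICITY = {"plain": "d", "specific": "k"}
ENDINGS = {
    "nom_sg_m": "va",
    "nom_sg_f": "ja",
    "nom_pl": "la",
    "obl_sg_m": "les",  # assumption: variant with final -s
    "obl_sg_f": "la",
    "obl_pl": "len",    # assumption: variant with final -n
}

def demonstrative(deixis, specificity, slot):
    """Build a demonstrative as vowel + consonant + vowel + ending."""
    vowel = DEIXIS[deixis]
    consonant = SPECIFICITY[specificity]
    return f"{vowel}{consonant}{vowel}-{ENDINGS[slot]}"

for deixis in DEIXIS:
    for specificity in SPECIFICITY:
        print(deixis, specificity, demonstrative(deixis, specificity, "nom_sg_m"))
```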

In addition, the following more archaic and simpler demonstrative forms must have still had some limited (less emphatic) use in Early Romani, since they are preserved in various dialects [38] and even retain the default function in Epiros Romani to this day: [39]

| | nominative singular masculine | nominative singular feminine | nominative plural | oblique singular masculine | oblique singular feminine | oblique plural |
|---|---|---|---|---|---|---|
| proximate | a-va | a-ja | a-la | a-les | a-la | a-len |
| remote | o-va | o-ja | o-la | o-les | o-la | o-len |


Corresponding adverbs were adaj 'here', odoj 'there', akaj 'precisely here' and okoj 'precisely there'. [40] A related temporal adverb was akana 'now'. [41] 'Such' was asav-. [42]

Interrogative pronouns were kon (obl. kas-) 'who', kaj 'where' (katar 'where from?'), kana 'when', so 'what', sav- 'which, what sort of' (declined as an adjective), sar 'how' and keti 'how much'. For 'why' the dative of so was used: sos-ke. There may also have been an interrogative kibor 'how big'. [43] The interrogatives could also be used as relative pronouns, especially kaj, which also occurred in the sense of 'which' as well as 'where' and thus as a more or less general 'subordinator' and 'relativiser' of clauses (as well as in the sense of 'that' as a complementiser: 'I think that ...'). [44]

Indefinite pronouns could be formed in several ways. The word kaj (rarely daj) 'some, any' could be preposed to other expressions to express indefiniteness (e.g. kaj-jekh 'anyone > anybody', kaj-či 'anything'). The word či 'something, anything' could apparently be postposed to other expressions (still retaining the same meaning), as seen in kaj-či and kaj-ni-či 'anything'. So could, possibly, an indefinite particle -ni, as seen in kaj-ni 'wherever' and in kaj-ni-či. The postposed particle -moni expressed free-choice indefinite constructions such as kon-moni 'whoever', či-moni 'whatever', kaj-moni 'wherever'. Finally, there may have been a preposed particle vare-, which had been borrowed from Romanian (unusually for Early Romani) and was added to interrogative pronouns: vare-so 'something'. [45] [46]

Totality was expressed by the particle sa 'everything, all, always', savořo 'all' and the Slavic-derived vsako 'every'. [47]

Definite article

Early Romani had a definite article, which was also used, as in Greek, with proper nouns and to express generic reference in various constructions (e.g. content or origin, lit. 'made out of the X'). The exact forms are difficult to reconstruct due to great dialectal variation. According to Yaron Matras' account, [48] the Early Romani forms were:

| nominative singular masculine | nominative singular feminine | nominative plural | oblique singular masculine | oblique singular feminine | oblique plural |
|---|---|---|---|---|---|
| o (< *ov) | i (< *oj) or e [42] | ol | (o)le | (o)la | (o)le |

The numeral jekh 'one' could be used to express indefiniteness, but its use was not obligatory.

Numerals

The numerals from 1 to 10 were: [49]

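| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|
| jekh | duj | trin | štar | pandž | šov | efta | oxto | enja | deš |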

The teens were formed according to the pattern 'ten-and-unit' [49] using the conjunction -u- 'and' borrowed from an Iranian language, little used elsewhere in Early Romani: e.g. deš-u-trin for 13, [50] except for teens containing the Greek-derived units 7, 8 and 9: thus deš-efta for 17. [51] Thus:

| 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|
| deš-u-jekh | deš-u-duj | deš-u-trin | deš-u-štar | deš-u-pandž | deš-u-šov | deš-efta | deš-oxto | deš-enja |

Of the tens, 30 and probably 40 and 50 were borrowed into Early Romani from Greek, [52] while the others were formed with native roots, mostly with the morpheme -var meaning 'times', i.e. 'X times 10': [53]

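| 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
|---|---|---|---|---|---|---|---|---|---|
| deš | biš | trianda | saranda | peninda | šov-var-deš | efta-var-deš | oxto-var-deš | enja-var-deš | šel |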

Combinations of tens between 30 and 90 and single digits were formed not with -u- but with thaj 'and' (the usual Romani conjunction with that meaning): trianda-thaj-jekh for 31, if a conjunction was used at all. The combinations with biš (20) also used -thaj- according to Peter Bakker, [54] while Viktor Elšík and Yaron Matras consider -u- to be a possibility as well. [55]
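The formation rules for the numerals from 1 to 99 can be put together in a short Python sketch. The function names are hypothetical. For the twenties the sketch follows the -thaj- pattern attributed to Peter Bakker, although -u- is considered possible as well, and it always inserts the conjunction, although the text notes that it may have been omitted.

```python
UNITS = {1: "jekh", 2: "duj", 3: "trin", 4: "štar", 5: "pandž",
         6: "šov", 7: "efta", 8: "oxto", 9: "enja"}
GREEK_UNITS = {7, 8, 9}  # Greek-derived units take no -u- in the teens
SIMPLE_TENS = {10: "deš", 20: "biš", 30: "trianda", 40: "saranda", 50: "peninda"}

def tens_word(n):
    """Word for a multiple of ten from 10 to 90."""
    if n in SIMPLE_TENS:
        return SIMPLE_TENS[n]
    return f"{UNITS[n // 10]}-var-deš"  # 60-90: 'X times ten'

def numeral(n):
    """Reconstructed cardinal for 1..99 under the assumptions stated above."""
    if not 1 <= n <= 99:
        raise ValueError("only 1..99 are covered by this sketch")
    if n < 10:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return tens_word(n)
    if tens == 1:
        link = "-" if unit in GREEK_UNITS else "-u-"
        return f"deš{link}{UNITS[unit]}"
    return f"{tens_word(tens * 10)}-thaj-{UNITS[unit]}"

for n in (13, 17, 31, 60, 75):
    print(n, numeral(n))
```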

The native cardinal numerals, namely the ones for 1-6, 10, 20 and 100, inflected in modifier position like adjectives ending in a consonant: e.g. deš-e 'ten (oblique)'. The Greek-derived ones (7-9 and 30-50) did not. [56]

Ordinal numerals, apart from avgo 'first', were regularly derived from the cardinals with the suffix -to: e.g. efta-to 'seven-th' and even duj-to 'second'; [30] the word for 'third' may have had the slightly irregular form tri-to due to Greek influence. [57] The ordinals in -to were declined as foreign-stratum adjectives. [42]

Multiplicatives were formed with -var 'times': trin-var 'three times'. [30] Half was paš. [58]

Verbal morphology

The Early Romani verb inflected in tense (including aspect) and mood and agreed with the subject (and possibly the object) in person, number and sometimes gender. The basic structure of the Early Romani verb could be summarised with the following verb chain (note that not all slots need to be occupied): [59]

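Verb chain slots (stem slots first, then inflectional suffixes):

| slot | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| function | root | loan-adaption marker | valency markers | past stem marker | person & number agreement and tense | remoteness and modality |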

For the stem-forming suffixes in slots 2-3, see the section on Word Formation below.

Stems

Each verb had two stems: a present (imperfective) one and a past (perfective) one.

The overwhelming majority of present stems ended in a consonant (e.g. ker- 'do') and some could consist only of a consonant (e.g. l- 'take'), while a small number ended in a vowel, which was normally /a/ (e.g. xa- 'eat').

The past stems, which were originally the Old Indo-Aryan past participles, were usually formed by adding one of several suffixes to the present stem. Usually, they were:

  1. after vowels: -l-; e.g. xa-l- 'eat'
  2. after /v/ and the voiced dental sonorants /r/, /l/ and /n/: -d-; e.g. ker-d- 'do'
  3. after other consonants (e.g. /kʰ/, /tʃ/, /s/, /ʃ/): -t-; e.g. dikh-t- 'see', beš-t- 'sit'
  4. in motion verbs (av- 'come', ačh- 'stay', ušt- 'stand'): -il-, e.g. av-il- 'come' [60]
  5. if the present stem was formed with the mediopassive suffix -jov-, that suffix was replaced by -il-, e.g. ker-d-jov- > ker-d-(j)-il- 'be done'
  6. in foreign-stratum intransitive verbs: -il-: -is-áv-il- > -is-á-jl-
  7. after roots consisting of a single consonant (including original compounds ending in -d- 'give'): variably -in- or -∅-: d-in- or d- 'give'
  8. In verbs expressing psychological state ending in /a/: variably -n-, -n-il-, -n-d-il-, etc.: dara-n/nil/ndil- 'fear'.

After /m/, the original -t- may have begun to be gradually replaced by -l- already in Early Romani, as it is replaced after other consonants as well in many descendant dialects. [61]
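The phonologically conditioned part of these rules (1 to 3), together with the small class of motion verbs (rule 4), can be sketched in Python. The function name `past_stem` is hypothetical, stems are given in the orthography of this article, and the remaining rules (mediopassives, foreign-stratum verbs, single-consonant roots, psychological verbs) as well as the irregular verbs are deliberately left out.

```python
VOWELS = "aeiou"
VOICED_TRIGGERS = "vrln"            # /v/ and the sonorants /r/, /l/, /n/
MOTION_VERBS = {"av", "ačh", "ušt"}  # 'come', 'stay', 'stand'

def past_stem(present):
    """Default past stem for a native present stem (rules 1-4 only)."""
    if present in MOTION_VERBS:
        return f"{present}-il"       # rule 4 overrides the phonological default
    last = present[-1]
    if last in VOWELS:
        return f"{present}-l"        # rule 1
    if last in VOICED_TRIGGERS:
        return f"{present}-d"        # rule 2
    return f"{present}-t"            # rule 3

for stem in ("xa", "ker", "dikh", "beš", "av"):
    print(stem, "->", past_stem(stem))
```

The motion-verb check comes first because av- ends in /v/ and would otherwise receive -d-.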

Irregular alternations between the past and the present stem were found in dža- : gel- 'go', kal- : klist 'raise', mer- : mul- 'die', per- : pel- 'fall', rov- : run- 'cry', sov- : sut- 'sleep'. The pair ov- : ul- 'become, be' was due to a contraction of the regular ov-il- to ul-. [62]

The copula varied between using the stem s-/h- and the extended s/h-in- in the present tense, according to some scholars, [63] whereas others believe that the short forms are the original ones. [64] However, it used suppletive stems in the subjunctive and future tense: usually ov- 'become' and occasionally av- 'come'. [65] It can be said to also have a suppletive past stem ul-, [66] although the regularly constructed imperfect forms (see below) could be used in a past sense.

Person and number agreement

The agreement markers used with the present and with the past stem were different: ker-av '(that) I make', but kerd-j-om 'I made'. The present agreement markers were as follows:

| | singular | plural |
|---|---|---|
| 1st person | -av | -as |
| 2nd person | -es | -en |
| 3rd person (native stratum) | -el | -en |
| 3rd person (foreign stratum) | -i [67] | -en |

The initial vowel of the endings was omitted after verb stems ending in a vowel: xa-s '(that) you eat'.
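A minimal Python sketch of the present (subjunctive) paradigm for native-stratum verbs follows, including the loss of the ending's initial vowel after vowel-final stems. The function name `present_form` is hypothetical, and the 3rd plural ending -en is taken from the form soven 'they sleep' cited later in this article.

```python
VOWELS = "aeiou"
PRESENT_ENDINGS = {
    (1, "sg"): "av", (2, "sg"): "es", (3, "sg"): "el",
    (1, "pl"): "as", (2, "pl"): "en", (3, "pl"): "en",
}

def present_form(stem, person, number):
    """Present stem + agreement marker (native stratum only)."""
    ending = PRESENT_ENDINGS[(person, number)]
    if stem[-1] in VOWELS:
        ending = ending[1:]  # drop the initial vowel after a vowel-final stem
    return stem + ending

print(present_form("ker", 1, "sg"))  # '(that) I make'
print(present_form("xa", 2, "sg"))   # '(that) you eat'
```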

The past agreement markers were as follows:

| | singular | plural |
|---|---|---|
| 1st person | -j-om | -j-am |
| 2nd person | -j-al (-j-an) | -j-an |
| 3rd person (transitive verbs) | -j-as | -e |
| 3rd person (intransitive verbs, masculine) | -o | -e |
| 3rd person (intransitive verbs, feminine) | -i | -e |

The past agreement markers were preceded by -/j/- (1st sing. kerd-j-om 'I made', etc.) except for the endings of the 3rd person plural and intransitive singular -e, -o, and -i (e.g. 3rd pl. kerd-e 'they made'), which are, in fact, identical to the forms of the past participle. Like a participle, the intransitive singular ending agrees with the gender of the subject (masc. gel-o 'he went', fem. gel-i 'she went'). [68] It is also thought possible that the element -in- may have occurred optionally before 3rd plural ending -e. [69]

Exceptionally, the copula used the past agreement markers in the present tense: s-(in-j)-om 'I am', etc., except for the third person form, which was si for both numbers.

It has been speculated whether there might have been a set of 3rd person object agreement markers of the form -os 'him', -i 'her' and -e 'them' appended to the subject agreement markers (e.g. dikht-jas-os 'she saw him') and used in cases when there was no emphasis on the object. Such a system is preserved today in a single dialect, Epiros Romani, but is also similar to the ones found in Domari and the Dardic languages. [70] However, a plausible phonetic development leading to this is not easy to reconstruct. [71]

Tenses and moods

The last slot in the verb chain could be either empty or occupied by the present-future indicative particle -a or the remoteness particle -asi. By combining different stems and ending sets with different particles, the following forms were produced:

| | -∅ | indicative -a | remoteness -asi |
|---|---|---|---|
| present stem + present endings | Subjunctive (ker-él) | Present-Future (ker-él-a) | Imperfect (ker-él-asi) |
| past stem + past endings | Past (kerd-jás) | --- | Pluperfect (kerd-jás-asi) |

The Imperative consisted of the present stem alone in the singular (ker!) and coincided with the 2nd plural subjunctive for the plural (kerén!).

The Pluperfect apparently used the 'transitive' 3rd singular ending -jas before -asi even with intransitives (gel-jás-asi).
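The way stems and final particles combine into tense and mood categories can be sketched as a small lookup in Python. The names `tense_form`, `PARTICLES` and `LABELS` are hypothetical, the input is an already inflected form written without stress marks, and the combination of a past form with -a is rejected because no such category is listed.

```python
PARTICLES = {"none": "", "indicative": "a", "remoteness": "asi"}
LABELS = {
    ("present", "none"): "Subjunctive",
    ("present", "indicative"): "Present-Future",
    ("present", "remoteness"): "Imperfect",
    ("past", "none"): "Past",
    ("past", "remoteness"): "Pluperfect",
}

def tense_form(inflected, stem_type, particle):
    """Return (category label, form) for an inflected verb plus final particle."""
    key = (stem_type, particle)
    if key not in LABELS:
        raise ValueError("past stem + -a is not a reconstructed combination")
    suffix = PARTICLES[particle]
    form = f"{inflected}-{suffix}" if suffix else inflected
    return LABELS[key], form

print(tense_form("ker-el", "present", "none"))
print(tense_form("ker-el", "present", "indicative"))
print(tense_form("kerd-jas", "past", "remoteness"))
```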

The Subjunctive was used in clauses expressing purpose, constructions expressing wishes and the like: te keráv 'that I do' (in functions where many languages use an infinitive, a feature of the Balkan Sprachbund). The Past could be used to express a completed action in the future as well: dži kaj kerdjám 'until we have done it', so its meaning has been described as perfective and aspectual rather than temporal. The 'remote' tenses Imperfect and Pluperfect could also be used to express meanings such as conditional, hypothetical or counterfactual actions: te džanélasi 'if he knew it', mangdjómasi 'I would like to ask', te džandjásasi 'if he had known it'. [72]

Non-finite forms

The past participle of native-stratum verbs consisted of the past stem and the usual adjective endings: kerd-o 'done', bešt-o 'seated, sitting'. The meaning was passive in transitive verbs. The past participle of foreign-stratum verbs ended in -(i)men, which was originally indeclinable. [73]

There were two gerunds, both expressing actions simultaneous with that of the main verb:

The inflected gerund consisted of the present stem, the suffix -(i)nd- and adjective endings: ker-ind-o 'doing'. [44] It had an inherently non-perfective meaning.

The non-inflected gerund consisted of the present stem and the suffix -i and was aspectually neutral: pučh-i 'having asked'.

There was no infinitive; instead, the language used the finite subjunctive introduced by the complementiser particle te (which could also mean 'if'), and the subjunctive agreed with it in person and number: darava te vakerav 'I'm afraid to talk'. [74] [75]

Other expressions of modality

For ability, an impersonal verb was used: an inherited word ašti and the Persian šaj 'it is possible' appear to have co-existed. The negation was našti. [76] Another view is that ašti is a later innovation produced in several dialects by analogy from našti. [77]

For volition, the verb kam- 'to want' was used.

For necessity, the copula s- was inflected and combined with te and the subjunctive: [76] ol si te soven 'they have to sleep', me s(inj)om te sovav 'I have to sleep'. [78]

There were two negating particles: an indicative one, na, and a subjunctive-imperative one, ma: na sovela 'he doesn't sleep' vs ma sov 'don't sleep!' and ma te sovel 'may he not sleep!'. [79] The copula is likely to have acquired a suppletive negative counterpart already in Early Romani: si 'is' vs (na)naj 'is not', [80] although the original Early Romani form may have been the regular na si (> na-hi > naj). [81]

Word formation

Word formation was mostly suffixing.

Nominal suffixes

There were also:

Adjectival suffixes and prefixes

Adverbs

Locative adverbs (also used to express direction) could be formed by the addition of -e with original locative meaning (andr-e 'inside') and -al with original ablative meaning (andr-al 'from the inside > inside'). They often correspond to prepositions without these suffixes, or just coincide with them (with or without adverbial suffixes); see the Layer III section. Adverbs could also be formed from adjectives by adding -es. [83] The following locative adverbs are reconstructed: [84]

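| locative | ablative ('from') | meaning |
|---|---|---|
| andre | andral | 'inside' |
| angle | anglal | 'in front' |
| avri | avrjal | 'outside' |
| dur | dural | 'far away' |
| mamuj | mamujal | 'beyond' |
| maškare | maškaral | 'in-between' |
| opre/upre | opral/upral | 'above' |
| pale | palal | 'behind' |
| paše | pašal | 'nearby' |
| tele | telal | 'below' |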

Among the other notable adverbs are the Greek-derived pale 'again', palpale 'back', tasja 'tomorrow', komi 'still' and panda 'still' < 'always'. [85] Further, there were the inherited particles vi and nina meaning 'also, even' (vi... vi... could also be used as 'both ... and ...' [86] ), the Greek-derived moni 'only', as well as atoska 'then'. [87]

Verbal suffixes

| | Present | Past |
|---|---|---|
| Transitives | -(V)z-, -(V)n- + -ker/ar- | -Vs- + -ker-d-/-ar-d- |
| Intransitives | -(V)s- + -áv- | -Vs- + -á-jl/(n)dil- |

Conjunctions

Among coordinating conjunctions, there were thaj 'and' [90] and vaj 'or', but it is impossible to reconstruct with certainty the word for 'but' due to later borrowings at least in all dialects that have dispersed outside of the former Byzantine territory. [91] The conjunction u 'and' seems to have been used especially, but not only, in some numerals (see above). [92] Important and multi-purpose subordinating conjunctions were te 'to', 'in order to', 'if', 'that' (for non-factual clauses) and kaj, originally 'where', but also a general marker of relative clauses 'that, which', as well as 'that' (for factual clauses). [93]

Syntax

The object was generally placed after the verb (VO), unless it was moved to the front of the clause for contrastive purposes, whereas the subject could either precede or follow the verb (SV or VS), with SV expressing emphasis on the subject or its prominence and VS signalling continuity. However, in clauses introduced by the conjunction te 'to, in order to, if', the verb followed immediately after te. Pronominal objects tended to be placed immediately after the verb, before other objects or subjects. Interrogative clauses did not differ from affirmative ones in their word order. Attributes (adjectives, genitives, numerals and demonstratives) were usually placed before the nouns they modified. The language used prepositions. [94] [44]

As already mentioned, possession was expressed with the possessor in the accusative: man si grast 'I have a horse'. There were also constructions with external possessors in the accusative: man dukhala o šero 'my head hurts'. [95]

If the head noun of a dependent clause was not also its subject, it had to be 'duplicated' with a resumptive pronoun within the clause: o čhavo kaj dinjom les i čhuri 'the boy to whom I gave the knife'. [96]

A typical Balkan Sprachbund syntactic feature of Early Romani was the contrast between two complementisers meaning 'that': a factual one kaj džala 'that he goes' and a non-factual one te džal 'that he go', 'to go'. [97]

Lexicon

Approximately 1000 lexical roots can be reconstructed as having been part of the Early Romani lexicon. Most of these, about 700, were inherited from the Indo-Aryan predecessor of Romani, around 200 were loanwords from Byzantine Greek, and around 100 were loanwords that had been acquired during the migration from India to the Byzantine Empire - approximately 70 from Iranian languages and 40 from Armenian. [98] According to a different estimate, the Iranian and Armenian loans were as many as 200-250. It is likely that Early Romani freely used Greek words when necessary, much as its descendant Romani dialects resort, when needed, to the lexis of the majority languages in the areas where they are spoken. [99]

Correspondences between Early Romani and selected Romani dialects

The following are some examples of sound correspondences showing changes that have taken place in different dialects. [100] [101]

Early RomaniexampleSofia ErliArliDrindariKalderašEast SlovakSintiRuska RomaWelsh Romani
*ařo 'flour'rr[ɽr] [102] [ʁ]rrrr
*ndř*mandřo 'bread'[ɽr] > rr[ɽr] [102] [nʁ]r, ndrrrr
*čh*čhavo 'boy'čhčhčhś [ɕ]čhččč
*dž*džan- 'know'dž > žžź [ʑ]dž, ž
*ti, *di*dives 'day'ti, di (but:

tj, dj > kj, gj)

ti, di (but:

tj > kj, č)

ci, ziki, gi (či, dži)ti, di (či, dži)ti, diti, diti, di
*ki, *gi*vogi 'soul'ki, giki, gici, (d)ziki, giki, giki, giki, giki, gi
*ni*pani 'water'ninij(i)j(i)/ninininini
*li*bokoli 'cake'lilij(i)lilililili
*lj*giljav- 'to sing'ljlj, jljljljjlj/jj
*-st/št*grast 'horse'-s/š-s/š-s/š-st/št-st/št-st/št-st/št-st/št
*VsV in endings*keresa 'you do'VsVVhVVsVVsV (VhV)VhVVhVVsVVsV
3rd sing. *-as*kerdjas 'he did'-as-a-as-a-as/a-as-a-as

The following are some notable grammatical differences between dialects in comparison with the Early Romani condition. [100]

Early RomaniexampleSofia ErliArliDrindariKalderašEast SlovakSintiRuska RomaWelsh Romani
foreign masc. nom.sg.*for-os,

*kokal-o

-Vs-V-Vs-V-Vs-V(s)-V-Vs
foreign masc. nom.pl.*for-i-ovja-(j)a-uja-uri-a-i-i (-ja)-i
def. article nom.pl.*olooo/ule, əloinoneo
def. article fem.obl.*(o)laeee [103] (o)la, le(o)lainonei
3rd sing. masc. pronoun*ovovovovoujovjovjovjov
present indicative*kerel-a-∅-a-a-∅-∅-a/∅-a/∅-a/∅
subjunctive*kerel-∅= pres.ind.-∅-∅= pres.ind.= pres.ind.-∅-∅-∅
future*kerel-aka ...ka ...mə ...-a, kame...-anonel- te, av- tedža- te
1st sg. past*kerdj-om-om-om-im-em-om-om/um-om-om
2nd sg. past*kerdj-al (-an)-an-an-(e)an-an-al-al-an-an
2nd pl. past*kerdj-an-en-en-(e)an-an-an-an-(n)e-an
3rd sg. past intransitive*gel-o-o-o-o-o= trans.= trans.= trans.= trans.
infinitivenonenonenonenonenone= 3rd sg.= 3rd sg.none / 2,3 sg.none
foreign verb (pres.trans.)*-(V)z/n-ker/ar--in--in--iz--isar/i--in--ev/∅--in--as-, -in-
negation*na ...na ...na ...na ...či ...na ...... garna ...na ...
comparative* -eder-eder > po-po-po-maj--eder-eder/ester-ydyr-eder
superlative* -eder-eder > naj-naj-naj-maj-jekh ... -eder-ester-ydyrbuteder ...

References

  1. Matras (2002: passim)
  2. Beníšek, Michael. 2020. The Historical Origins of Romani. P. 18. In: Matras & Tenser (2020)
  3. Matras (2002: 19)
  4. Matras (2002: 58-62)
  5. Matras (2002: 49-56)
  6. Elšík & Matras (2006: 70-71)
  7. Elšík & Matras (2006: 71)
  8. Cf. Matras (2002: 254)
  9. Matras (2002: 62-64)
  10. Matras (2002: 68-71)
  11. Matras (2002: 68)
  12. Matras (2002: 78-80)
  13. Matras (2002: 80-85); except for the vocative.
  14. The vocative of native-stratum words is given mostly as claimed in Boretzky & Igla (2004: 2.2.1.1.D). For the foreign-stratum words, forms found in Vlax, Central and/or Balkan Romani are given.
  15. Elšík & Matras (2006: 72), Matras (2002: 83)
  16. Matras (2002: 85)
  17. Matras (2002: 70-71)
  18. Matras (2002: 81)
  19. Matras (2002: 87-91)
  20. Boretzky & Igla (2004: 22A)
  21. Matras (2002: 91-92)
  22. Elšík & Matras (2006: 83, 241)
  23. Matras (2002: 85-87)
  24. Matras (2002: 89)
  25. Matras (2002: 88)
  26. Matras (2002: 94)
  27. Boretzky & Igla (2004: 2.2.1.2.A, B)
  28. Matras (2002: 94-96)
  29. Elšík & Matras (2006: 218-219)
  30. Elšík & Matras (2006: 170)
  31. Matras (2002: 98-103, 110-111)
  32. Elšík & Matras (2006: 76-77)
  33. Matras (2002: 110)
  34. Elšík & Matras (2006: 76)
  35. Matras (2002: 103-105)
  36. Matras (2002: 110)
  37. Elšík & Matras (2006: 74)
  38. Matras (2002: 107-108)
  39. Matras (2004: 79)
  40. Elšík & Matras (2006: 75)
  41. Matras (2002: 66-67)
  42. Elšík & Matras (2006: 74)
  43. Elšík & Matras (2006: 77)
  44. Elšík & Matras (2006: 84)
  45. Matras (2002: 112-116)
  46. Elšík & Matras (2006: 77-78)
  47. Elšík & Matras (2006: 74, 76)
  48. Matras (2002: 110-112)
  49. Matras (2002: 28)
  50. Matras (2002: 196)
  51. Elšík & Matras (2006: 164-165)
  52. Elšík & Matras (2006: 171)
  53. Elšík & Matras (2006: 168), Matras (2002: 22, 28)
  54. Bakker (2001: 100-101)
  55. Elšík & Matras (2006: 166)
  56. Elšík & Matras (2006: 163, 429)
  57. Elšík & Matras (2006: 164, 429)
  58. Elšík & Matras (2006: 170)
  59. Matras (2002: 117)
  60. Elšík & Matras (2006: 80)
  61. Elšík & Matras (2006: 80)
  62. Matras (2002: 138-143)
  63. Matras (2002: 143)
  64. Boretzky & Igla (2004: 2.2.2.1.A)
  65. Matras (2002: 137-138)
  66. Elšík & Matras (2006: 316)
  67. The use of -i rather than -el in foreign-stratum verbs might have been optional (Elšík & Matras 2006: 81).
  68. Matras (2002: 143-145)
  69. Elšík & Matras (2006: 81)
  70. Elšík & Matras (2006: 82)
  71. Matras (2004: 78)
  72. Matras (2002: 151-155)
  73. Elšík & Matras (2006: 327)
  74. Matras (2002: 159-162)
  75. Elšík & Matras (2006: 127-128)
  76. Matras (2002: 162-164)
  77. Boretzky & Igla (2004: 2.2.2.2.1.2)
  78. Elšík & Matras (2006: 126)
  79. Matras (2002: 189)
  80. Elšík & Matras (2006: 157, 316)
  81. Boretzky & Igla (2004: 2.2.2.1.C)
  82. Matras (2002: 74-77)
  83. Matras (2002: 70, 91, 214)
  84. Elšík & Matras (2006: 242)
  85. Matras (2002: 197)
  86. Elšík & Matras (2006: 186)
  87. Matras (2002: 200)
  88. Matras (2002: 119-128)
  89. Matras (2002: 136-137)
  90. Matras (2002: 39). Elsewhere in that work, Matras usually cites the form as taj when not listing several variants, but here he presents the variant thaj as the one produced regularly from the Old Indo-Aryan etymon by the relevant sound laws.
  91. Matras (2002: 201), Elšík & Matras (2006: 185)
  92. Matras (2002: 196)
  93. Elšík & Matras (2006: 84), Matras (2002: 186-187)
  94. Matras (2002: 166-174)
  95. Matras (2002: 174-176)
  96. Matras (2002: 176-178)
  97. Matras (2002: 190)
  98. Matras (2002: 21)
  99. Elšík & Matras (2006: 69)
  100. Boretzky & Igla (2004)
  101. Matras (2002)
  102. Boretzky, N. (2000). South Balkan II as a Romani dialect branch: Bugurdži, Drindari, and Kalajdži. Romani Studies fifth series, 10, 105-183.
  103. Boretzky, N. (2000). South Balkan II as a Romani dialect branch: Bugurdži, Drindari, and Kalajdži. Romani Studies fifth series, 10, 105-183. p. 124

Sources