Hindustani is the lingua franca of northern India and Pakistan. Through its two standardized registers it also has official status: Hindi is a co-official language of India, and Urdu is the national language and a co-official language of Pakistan. Phonological differences between the two standards are minimal.
Height | Front (long) | Front (short) | Central | Back (short) | Back (long) |
---|---|---|---|---|---|
Close | iː | ɪ | | ʊ | uː |
Close-mid | eː | | | | oː |
Open-mid | ɛː | | ə | | ɔː |
Open | (æː) | | aː | | |
Hindustani natively possesses a symmetrical ten-vowel system. [1] The vowels [ə], [ɪ], [ʊ] are always short in length, while the vowels [aː], [iː], [uː], [eː], [oː], [ɛː], [ɔː] are usually considered long, in addition to an eleventh vowel /æː/ which is found in English loanwords. The distinction between short and long vowels is often described as one of tenseness, with short vowels being lax and long vowels being tense. [2] Vowels are somewhat longer before voiced stops than before voiceless stops. [3] Additionally, [ɛ] and [ɔ] occur as conditioned allophones of /ə/.
/ə/ is often realized as more open than mid [ə], i.e. as near-open [ɐ]. [4] It is subject to schwa deletion word-medially in certain contexts.
The open central vowel is transcribed in IPA by either [aː] or [ɑː].
In Urdu, there is a further short [a] (spelled ہ, as in کمرہ kamra [kəmra]) in word-final position, which contrasts with [aː] (spelled ا, as in لڑکا laṛkā [ləɽkaː]). This contrast is often not realized by Urdu speakers, and is always neutralized in Hindi (where both sounds uniformly correspond to [aː]). [5] [6]
Among the close vowels, what in Sanskrit are thought to have been primarily distinctions of vowel length (that is /i,iː/ and /u,uː/), have become in Hindustani distinctions of quality, or length accompanied by quality (that is, /ɪ,iː/ and /ʊ,uː/). [7] The opposition of length in the close vowels has been neutralized in word-final position, only allowing long close vowels in final position. As a result, Sanskrit loans which originally have a short close vowel are realized with a long close vowel, e.g. śakti (शक्ति – شکتی 'energy') and vastu (वस्तु – وستو 'item') are [ʃəktiː] and [ʋəstuː], not *[ʃəktɪ] and *[ʋəstʊ]. [8]
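The word-final neutralization described above can be sketched as a single rewrite step. This is only an illustration: the function name is hypothetical, and the input is assumed to be a plain IPA string in which a final short close vowel appears as the last character.

```python
def neutralize_final_close_vowel(phonemic: str) -> str:
    """Replace a word-final short close vowel with its long counterpart.

    Illustrative sketch only: assumes the word is a plain IPA string whose
    last character is the final vowel (no trailing stress or boundary marks).
    """
    final_map = {"ɪ": "iː", "ʊ": "uː"}
    if phonemic and phonemic[-1] in final_map:
        return phonemic[:-1] + final_map[phonemic[-1]]
    return phonemic


# The two Sanskrit loans cited in the text:
print(neutralize_final_close_vowel("ʃəktɪ"))
print(neutralize_final_close_vowel("ʋəstʊ"))
```

Words that already end in a long vowel or a consonant pass through unchanged, which matches the claim that only long close vowels are allowed in final position.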
The vowel represented graphically as ऐ – اَے (romanized as ai) has been variously transcribed as [ɛː] or [æː]. [9] Among sources for this article, Ohala (1999) uses [ɛː], while Shapiro (2003:258) and Masica (1991:110) use [æː]. Furthermore, an eleventh vowel /æː/ is found in English loanwords, such as /bæːʈ/ ('bat'). [10] Hereafter, ऐ – اَے will be represented as [ɛː] to distinguish it from the loanword vowel /æː/.
In addition, [ɛ] occurs as a conditioned allophone of /ə/ (schwa) within the sequence /əɦə/ (/əɦ/ before the next syllable or word-finally due to schwa deletion). [8] This change is part of the prestige dialect of Delhi, but may not occur for every speaker. Here are some examples of this process:
Hindi/Urdu | Transliteration | Phonemic | Phonetic |
---|---|---|---|
कहना / کہنا "to say" | kahnā | /kəɦ.nɑː/ | [kɛɦ.nɑː] |
शहर / شہر "city" | śahar | /ʃə.ɦəɾ/ | [ʃɛ.ɦɛɾ] |
ठहरना / ٹھہرنا "to wait" | ṭhaharnā | /ʈʰə.ɦəɾ.nɑː/ | [ʈʰɛ.ɦɛɾ.nɑː] |
However, the fronting of schwa does not occur in words with a schwa only on one side of the /ɦ/, such as kahānī /kəɦaːniː/ (कहानी – کہانی 'a story') or bāhar /baːɦər/ (बाहर – باہر 'outside').
The vowel [ɔ] occurs in proximity to /ɦ/ if the /ɦ/ is flanked on one side by a schwa and on the other side by a round vowel (due to Hindustani phonotactics, this generally only occurs in the sequences /əɦʊ/ or /ʊɦə/). It differs from the vowel [ɔː] in that it is short. For example, in bahut /bəɦʊt/ the /ɦ/ has a schwa on one side and a round vowel on the other. One or both of these vowels may become [ɔ]; when both do, the pronunciation is [bɔɦɔt].
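The schwa allophony around /ɦ/ can be sketched as a few ordered rewrite rules. This is a minimal illustration, not a full phonological model: the function name is hypothetical, the input is assumed to be a phonemic IPA string with syllable breaks written as dots, and the sketch always applies the change to both vowels, although the text notes that not every speaker does so.

```python
import re


def schwa_allophones(phonemic: str) -> str:
    """Apply the /ə/ allophony next to /ɦ/ to a dotted phonemic string."""
    s = phonemic
    # /əɦə/ -> [ɛɦɛ]; a syllable break may fall on either side of /ɦ/.
    s = re.sub(r"ə(\.?)ɦ(\.?)ə", r"ɛ\1ɦ\2ɛ", s)
    # /əɦ/ at the end of a syllable or word (second schwa already deleted).
    s = re.sub(r"əɦ(?=\.|$)", "ɛɦ", s)
    # /əɦʊ/ and /ʊɦə/ -> [ɔɦɔ] (both flanking vowels rounded and lowered).
    s = re.sub(r"ə(\.?)ɦ(\.?)ʊ", r"ɔ\1ɦ\2ɔ", s)
    s = re.sub(r"ʊ(\.?)ɦ(\.?)ə", r"ɔ\1ɦ\2ɔ", s)
    return s


for word in ["kəɦ.nɑː", "ʃə.ɦəɾ", "ʈʰə.ɦəɾ.nɑː", "kəɦaːniː", "baːɦər", "bəɦʊt"]:
    print(word, "->", schwa_allophones(word))
```

The inputs are the phonemic forms cited in the table and examples above. Words such as kahānī and bāhar, with a schwa on only one side of /ɦ/ and no syllable break after it, are left unchanged.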
Some Eastern dialects keep /ɛː,ɔː/ as diphthongs, pronouncing them as [aɪ~əɪ,aʊ~əʊ]. [11]
As in French and Portuguese, there are nasalized vowels in Hindustani. There is disagreement over the nature of nasalization (barring English-loaned /æ/, which is never nasalized [10]). Masica (1991:117) presents four differing viewpoints:
Masica [12] supports this last view.
The principal vowel phonemes may be organised as follows to demonstrate the orthographic conventions for vowels.
IPA | Hindi (initial) | Hindi (combining) | ISO 15919 | Urdu (final) [13] | Urdu (medial) [13] | Urdu (initial) [13] | Approximate English equivalent |
---|---|---|---|---|---|---|---|
ə | अ | [14] | a | ـہ | ـ◌َـ | اَ | about |
aː | आ | ा | ā | ـا | ـا | آ | far |
ɪ | इ | ि | i | ◌ِی | ـ◌ِـ | اِ | still |
iː | ई | ी | ī | ◌ِـیـ | ◌ِـیـ | اِیـ | fee |
ʊ | उ | ु | u | ◌ُو | ـ◌ُـ | اُ | book |
uː | ऊ | ू | ū | ◌ُو | ◌ُو | اُو | moon |
eː | ए | े | ē | ے | ـیـ | ایـ | mate |
ɛː | ऐ | ै | ai | ◌َـے | ◌َـیـ | اَیـ | fairy |
oː | ओ | ो | ō | ◌و | ◌و | او | force |
ɔː | औ | ौ | au | ◌َـو | ◌َـو | اَو | lot (Received Pronunciation) |
ʰ | [15] | [15] | h | ھ [15] | ھ [15] | ھ [15] | aspiration of the preceding consonant, as in cake |
◌̃ | [16] | ँ | m̐ | ں | ـن٘ـ | [17] | heavy nasalisation of the preceding vowel, like can't in rapid GA |
[18] | [19] | ं | ṁ | | | | homorganic nasal before the succeeding consonant, like jungle or branch, and light vowel nasalisation |
Hindustani has a core set of 28 consonants inherited from earlier Indo-Aryan. Supplementing these are two consonants that are internal developments in specific word-medial contexts, [21] and seven consonants originally found in loan words, whose expression is dependent on factors such as status (class, education, etc.) and cultural register (Modern Standard Hindi vs Urdu).
Most native consonants may occur geminate (doubled in length; exceptions are /bʱ,ɽ,ɽʱ,ɦ/). Geminate consonants are always medial and preceded by one of the interior vowels (that is, /ə/, /ɪ/, or /ʊ/). They all occur monomorphemically except [ʃː], which occurs only in a few Sanskrit loans where a morpheme boundary could be posited in between, e.g. /nɪʃ+ʃiːl/ for niśśīl [nɪˈʃːiːl] ('without shame'). [10]
For the English speaker, a notable feature of the Hindustani consonants is that there is a four-way distinction of phonation among plosives, rather than the two-way distinction found in English. The phonations are voiceless unaspirated (e.g. /p/), voiceless aspirated (/pʰ/), voiced (/b/), and murmured (/bʱ/).
The last is commonly called "voiced aspirate", though Shapiro (2003 :260) notes that,
"Evidence from experimental phonetics, however, has demonstrated that the two types of sounds involve two distinct types of voicing and release mechanisms. The series of so-called voice aspirates should now properly be considered to involve the voicing mechanism of murmur, in which the air flow passes through an aperture between the arytenoid cartilages, as opposed to passing between the ligamental vocal bands."
The murmured consonants are believed to be a reflex of murmured consonants in Proto-Indo-European, a phonation that is absent in all branches of the Indo-European family except Indo-Aryan and Armenian.
Manner | Phonation | Labial | Dental/Alveolar | Retroflex | Post-alv./Palatal | Velar | Uvular | Glottal |
---|---|---|---|---|---|---|---|---|
Nasal | | m | n | (ɳ) | (ɲ) | ŋ | | |
Stop/Affricate | voiceless | p | t̪ | ʈ | t͡ʃ | k | (q) | |
Stop/Affricate | voiceless aspirated | pʰ | t̪ʰ | ʈʰ | t͡ʃʰ | kʰ | | |
Stop/Affricate | voiced | b | d̪ | ɖ | d͡ʒ | ɡ | | |
Stop/Affricate | voiced aspirated | bʱ | d̪ʱ | ɖʱ | d͡ʒʱ | ɡʱ | | |
Fricative | voiceless | f | s | (ʂ) | ʃ | (x) | | |
Fricative | voiced | ʋ | z | | (ʒ) | (ɣ) | | ɦ |
Approximant | | | l | | j | | | |
Tap/Trill | unaspirated | | ɾ | ɽ | | | | |
Tap/Trill | aspirated | | | ɽʱ | | | | |
Stops in final position are not released, although they continue to maintain the four-way phonation distinction. /ʋ/ varies freely with [v], and can also be pronounced [w]. /r/ is usually flapped or trilled. [30] In intervocalic position, it may have a single contact and be described as a flap [ɾ], [31] but it may also be a clear trill, especially in word-initial and syllable-final positions, and geminate /rː/ is always a trill in Arabic and Persian loanwords, e.g. zarā [zəɾaː] (ज़रा – ذرا 'little') versus well-trilled zarrā [zəraː] (ज़र्रा – ذرّہ 'particle'). [4] The palatal and velar nasals [ɲ,ŋ] occur only in consonant clusters, where each nasal is followed by a homorganic stop, as an allophone of a nasal vowel followed by a stop, and in Sanskrit loanwords. [21] [4] However, /n/ + velar clusters also occur, e.g. /ʊn.kaː/, which makes /ŋ/ phonemic. There are murmured sonorants, [lʱ,rʱ,mʱ,nʱ], but these are considered to be consonant clusters with /ɦ/ in the analysis adopted by Ohala (1999).
The fricative /ɦ/ in Hindustani is typically voiced (as [ɦ]), especially when surrounded by vowels, but there is no phonemic difference between this voiced fricative and its voiceless counterpart [h].
Hindustani also has a phonemic difference between the dental plosives and the so-called retroflex plosives. The dental plosives in Hindustani are laminal denti-alveolar as in Spanish, and the tongue-tip must be well in contact with the back of the upper front teeth. The retroflex series is not purely retroflex; it actually has an apico-postalveolar (also described as apico-pre-palatal) articulation, and sometimes in words such as ṭūṭā /ʈuːʈaː/ (टूटा – ٹوٹا 'broken') it even becomes alveolar. [32]
In some Indo-Aryan languages, the plosives [ɖ,ɖʱ] and the flaps [ɽ,ɽʱ] are allophones in complementary distribution, with the former occurring in initial, geminate and postnasal positions and the latter occurring in intervocalic and final positions. However, in Standard Hindi they contrast in similar positions, as in nīṛaj (नीड़ज – نیڑج 'bird') vs niḍar (निडर – نڈر 'fearless'). [33]
Hindustani, and Hindi in particular, does not distinguish between [v] and [w]. These are distinct phonemes in English, but conditioned allophones of the phoneme /ʋ/ in Hindustani (written ⟨व⟩ in Hindi or ⟨و⟩ in Urdu), meaning that contextual rules determine when it is pronounced as [v] and when it is pronounced as [w]. /ʋ/ is pronounced [w] in onglide position, i.e. between an onset consonant and a following vowel, as in pakwān (पकवान – پکوان, 'food dish'), and [v] elsewhere, as in vrat (व्रत – ورت, 'vow'). Native Hindi speakers are usually unaware of the allophonic distinctions, though these are apparent to native English speakers. [34]
In most situations, the allophony is not conditioned, i.e. the speaker can choose [v], [w], or an intermediate sound based on personal habit and preference, and still be perfectly intelligible, as long as the meaning is constant. This includes words such as advait (अद्वैत – ادویت) (pronounced [əd̪ˈʋɛːt̪]), which can be pronounced equally correctly as [əd̪ˈwɛːt̪] or [əd̪ˈvɛːt̪]. [34]
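The conditioned part of the /ʋ/ allophony (the onglide rule) can be sketched as follows. This is an illustration under several assumptions: the function name is hypothetical, the dotted transcriptions /pə.kʋaːn/ and /ʋrət̪/ are assumed forms for pakwān and vrat (the text gives only the romanizations), and the free variation described above is not modelled at all.

```python
import unicodedata

VOWELS = set("əɪʊaiueoɛɔæ")
# Contexts that mean /ʋ/ is NOT preceded by a consonant in the same syllable:
# word start (""), syllable break, stress marks, or a long-vowel mark.
NOT_AFTER_CONSONANT = {"", ".", "ˈ", "ˌ", "ː"}


def realize_labial_approximant(phonemic: str) -> str:
    """Rewrite /ʋ/ as [w] in onglide position and as [v] elsewhere."""
    chars = list(phonemic)
    out = []
    for i, ch in enumerate(chars):
        if ch != "ʋ":
            out.append(ch)
            continue
        # Find the preceding base character, skipping combining diacritics
        # such as the dental mark in d̪.
        j = i - 1
        while j >= 0 and unicodedata.combining(chars[j]):
            j -= 1
        prev = chars[j] if j >= 0 else ""
        nxt = chars[i + 1] if i + 1 < len(chars) else ""
        after_consonant = prev not in NOT_AFTER_CONSONANT and prev not in VOWELS
        out.append("w" if after_consonant and nxt in VOWELS else "v")
    return "".join(out)


print(realize_labial_approximant("pə.kʋaːn"))  # onglide after /k/
print(realize_labial_approximant("ʋrət̪"))      # word-initial, before /r/
```

Because the rule depends on syllabification, the result for a word like advait changes with where the syllable break is written, which is consistent with the text's point that either realization is acceptable there.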
Sanskrit borrowing has reintroduced /ɳ/ and /ʂ/ into formal Modern Standard Hindi. They occur primarily in Sanskrit loanwords and proper nouns. In casual speech, they are sometimes replaced with /n/ and /ʃ/. [10] /ɳ/ does not occur word-initially and has a nasalized flap [ɽ̃] as a common allophone. [21]
Loanwords from Persian (including some words which Persian itself borrowed from Arabic or Turkish) introduced six consonants, /f,z,ʒ,q,x,ɣ/. Being Persian in origin, these are seen as a defining feature of Urdu, although these sounds officially exist in Hindi and modified Devanagari characters are available to represent them. [35] [36] Among these, /f,z/, also found in English and Portuguese loanwords, are now considered well-established in Hindi; indeed, /f/ appears to be encroaching upon and replacing /pʰ/ even in native (non-Persian, non-English, non-Portuguese) Hindi words as well as many other Indian languages such as Bengali, Gujarati and Marathi, as happened in Greek with phi. [21] This /pʰ/ to /f/ shift also occasionally occurs in Urdu. [37] While [z] is a foreign sound, it is also natively found as an allophone of /s/ beside voiced consonants.
The other three Persian loans, /q,x,ɣ/, are still considered to fall under the domain of Urdu, and are also used by some Hindi speakers; however, other Hindi speakers may assimilate these sounds to /k,kʰ,g/ respectively. [25] [35] [38] The sibilant /ʃ/ is found in loanwords from all sources (Arabic, English, Portuguese, Persian, Sanskrit) and is well-established. [10] Some Hindi speakers (especially those from rural areas) pronounce the /f,z,ʃ/ sounds as /pʰ,dʒ,s/, though these same speakers, having a Sanskritic education, may hyperformally uphold /ɳ/ and /ʂ/. [39] [24] In contrast, for native speakers of Urdu, the maintenance of /f,z,ʃ/ is not commensurate with education and sophistication, but is characteristic of all social levels. [38] The sibilant /ʒ/, found in loanwords from Persian, Portuguese, and English, is very rare and is considered to fall under the domain of Urdu; although it is officially present in Hindi, many speakers of Hindi assimilate it to /z/ or /dʒ/. [27] [24]
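The substitutions described above can be written down as simple lookup tables. The sketch below is only illustrative: the dictionary and function names are hypothetical, the rarer /ʒ/ is left out because the text gives two possible outcomes for it, and real speakers vary by word and register rather than applying a table uniformly.

```python
# Mappings taken from the substitutions described in the text.
UVULAR_VELAR_ASSIMILATION = {"q": "k", "x": "kʰ", "ɣ": "ɡ"}
RURAL_HINDI_SUBSTITUTION = {"f": "pʰ", "z": "dʒ", "ʃ": "s"}


def nativize(phonemic: str, mapping: dict[str, str]) -> str:
    """Replace each loan consonant in one pass, so outputs are never re-mapped."""
    return phonemic.translate(str.maketrans(mapping))


print(nativize("qɪs.mət", UVULAR_VELAR_ASSIMILATION))
print(nativize("zəɾaː", RURAL_HINDI_SUBSTITUTION))
```

Using `str.translate` rather than chained `replace` calls keeps the substitution single-pass, which matters if further mappings such as /ʂ/ to /ʃ/ are added alongside /ʃ/ to /s/.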
Being the main sources from which Hindustani draws its higher, learned terms, English, Sanskrit, Arabic, and to a lesser extent Persian provide loanwords with a rich array of consonant clusters. The introduction of these clusters into the language contravenes a historical tendency within its native core vocabulary to eliminate clusters through processes such as cluster reduction and epenthesis. [40] Schmidt (2003 :293) lists distinctively Sanskrit/Hindi biconsonantal clusters of initial /kr,kʃ,st,sʋ,ʃr,sn,nj/ and final /tʋ,ʃʋ,nj,lj,rʋ,dʒj,rj/, and distinctively Perso-Arabic/Urdu biconsonantal clusters of final /ft,rf,mt,mr,ms,kl,tl,bl,sl,tm,lm,ɦm,ɦr/.
Hindustani has a stress accent, but it is not as important as in English. To predict stress placement, the concept of syllable weight is needed:
Stress is on the heaviest syllable of the word, and in the event of a tie, on the last such syllable. If all syllables are light, the penultimate is stressed. However, the final mora of the word is ignored when making this assignment (Hussein 1997) [or, equivalently, the final syllable is stressed either if it is extra-heavy, and there is no other extra-heavy syllable in the word or if it is heavy, and there is no other heavy or extra-heavy syllable in the word]. For example, with the ignored mora in parentheses: [41]
Hindi spelling | Urdu spelling | Romanization | Pronunciation | Gloss |
---|---|---|---|---|
रेज़गारी | ریزگاری | rezgārī | [ˈreːz.ɡaː.ri(ː)] | small change, coins |
समिति | سَمِتی | samiti | [sə.ˈmɪ.t(ɪ)] | committee |
क़िस्मत | قسمت | qismat | [ˈqɪs.mə(t)] | fate |
रोज़ाना | روزانہ | rozānā | [roː.ˈzaː.na(ː)] | daily |
किधर | کِدھر | kidhar | [kɪ.ˈdʱə(r)] | where, where to |
जनाब | جناب | janāb | [dʒə.ˈnaː(b)] | sir, mister |
असबाब | اسباب | asbāb | [əs.ˈbaː(b)] | goods, property |
मुसलमान | مسلمان | musalmān | [mʊ.səl.ˈmaː(n)] | Muslim |
परवरदिगार | پروردگار | parvardigār | [pər.ʋər.dɪ.ˈɡaː(r)] | epithet of God |
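The stress rule can be checked against the table with a short sketch. Several things here are assumptions rather than statements from the text: the function names are hypothetical; syllable weight is counted in moras, with a short vowel worth one, a long vowel two, and each coda consonant one, capped at three (so light, heavy and extra-heavy syllables have one, two and three moras); and each coda consonant is assumed to be a single character in the input.

```python
VOWELS = set("əɪʊaiueoɛɔæ")
LONG_MARK = "ː"


def moras(syllable: str) -> int:
    """Assumed weight: short V = 1, long V = 2, +1 per coda consonant, max 3."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            break
    else:
        raise ValueError(f"no vowel in syllable {syllable!r}")
    rest = syllable[i + 1:]
    is_long = rest.startswith(LONG_MARK)
    coda = rest[1:] if is_long else rest
    return min(3, (2 if is_long else 1) + len(coda))


def stressed_index(word: str) -> int:
    """Index of the stressed syllable in a dot-syllabified IPA word."""
    weights = [moras(s) for s in word.split(".")]
    weights[-1] -= 1  # the final mora of the word is ignored
    heaviest = max(weights)
    # In a tie, the last of the heaviest syllables is stressed.
    return max(i for i, w in enumerate(weights) if w == heaviest)


for word in ["reːz.ɡaː.riː", "sə.mɪ.tɪ", "qɪs.mət", "kɪ.dʱər", "mʊ.səl.maːn"]:
    print(word, stressed_index(word))
```

Under these assumptions the rule for all-light words needs no special case: once the final mora is ignored, the final syllable is lighter than the rest, so the last of the remaining syllables, the penultimate, is selected.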
Content words in Hindustani normally begin on a low pitch, followed by a rise in pitch. [42] [43] Strictly speaking, Hindustani, like most other Indian languages, is rather a syllable-timed language. The schwa /ə/ has a strong tendency to be syncopated (deleted) when its syllable is unstressed.
A few sounds, borrowed from the other languages like Persian and Arabic, are written with a dot (bindu or nukta) as shown in Table 2.2. …those who come from rural backgrounds and do not speak really good Khariboli, pronounce these sounds as the nearest equivalents in Hindi.
These letters—q, kh, gh, z, f—occur in words of Arabic or Persian origin. Many speakers maintain these sounds in their speech, but others often pronounce them as k, kh, g, j and ph, respectively.
Hindi has a nasal sound roughly equivalent to the n in the English sang, transliterated here as ṅ or ṁ, and has two slightly differing sh sounds, transliterated as ś and ṣ. ... A few words contain consonants…from Arabic, Persian, Portuguese, and English: क़ (ق) is transliterated as q, ख़ (خ) as kh, ग़ (غ) as g, ज़ (ظ ,ز, or ض) as z, झ़ (ژ) as zh, and फ़ (ف) as f.
Only in Hindi 10 Phonemes व /v/ क़ /q/ ञ /ɲ/ य /j/ ष /ʂ/ ख़ /x/ ग़ /ɣ/ ज़ /z/ झ़ /ʒ/ फ़ /f/
... showed extremely regular patterns. As is not uncommon in a study of subphonemic detail, the objective data patterned much more cleanly than intuitive judgments ... [w] occurs when /वو/ is in onglide position ... [v] occurs otherwise ...
In addition to the basic consonantal sounds discussed in sections 3.1 and 3.2, many speakers use any or all five additional consonants (क़ ḳ, ख़ ḳh,ग़ ġ, ज़ z, फ़ f) in words of foreign origin (primarily from Persian, Arabic, English, and Portuguese). The last two of these, ज़ z and फ़ f, are the initial sounds in English zig and fig respectively. The consonant क़ ḳ is a voiceless uvular stop, somewhat like k, but pronounced further back in the mouth. ख़ ḳh is a voiceless fricative similar in pronunciation to the final sound of the German ach. ग़ ġ is generally pronounced as a voiceless uvular fricative, although it is occasionally heard as a stop rather than a fricative. In devanāgari each of these five sounds is represented by the use of a subscript dot under one of the basic consonant signs.