Schwa deletion, or schwa syncope, is a phenomenon that sometimes occurs in Assamese, Hindi, Urdu, Bengali, Kashmiri, Punjabi, Gujarati, and several other Indo-Aryan languages with schwas that are implicit in their written scripts. Languages such as Marathi and Maithili, which have come under increased influence from other languages through contact, also show a similar phenomenon. Some schwas are obligatorily deleted in pronunciation even if the script suggests otherwise. [1] [2] Here, schwa refers to the inherent vowel of the respective abugida scripts, which is not necessarily pronounced as a schwa (mid central vowel).
Schwa deletion is important for intelligibility and unaccented speech. It also presents a challenge to non-native speakers and speech synthesis software because the scripts, including Devanagari, do not indicate when schwas should be deleted. [3]
For example, the Sanskrit word "Rāma" (IPA: [raːmɐ] , राम) is pronounced "Rām" (IPA: [raːm] , राम्) in Hindi. The schwa (ə) sound at the end of the word is deleted in Hindi. [4] However, in both cases, the word is written राम.
The schwa is not deleted in ancient languages such as Sanskrit. The schwa is also retained in all the modern registers of the Dravidian languages Tamil, Telugu, Kannada, and Malayalam as well as the Indo-Aryan languages Odia and Sinhala. According to Masica (1993), there has not been "any attempt to deal with it [schwa deletion] (and medial vowel loss in general) in systematic fashion either descriptively or historically across all NIA [New Indo-Aryan] languages." [5]
Language | Medial schwa deletion | Final schwa deletion |
---|---|---|
Assamese | Retains | Both |
Bengali | Both | Both |
Gujarati | Both | Deletes |
Hindi | Both | Deletes |
Kannada | Retains | Retains |
Malayalam | Retains | Retains |
Marathi | Both | Both |
Odia | Retains | Retains |
Punjabi | Both | Deletes |
Sanskrit | Retains | Retains |
Tamil | Retains | Retains |
Telugu | Retains | Retains |
Urdu | Deletes | Deletes |
Although the Devanagari script is used as a standard to write Modern Hindi, the schwa ('ə') implicit in each consonant of the script is "obligatorily deleted" at the end of words and in certain other contexts, unlike in Sanskrit. [1] That phenomenon has been termed the "schwa syncope rule" or the "schwa deletion rule" of Hindi. [1] [3] One formalisation of this rule has been summarised as ə → ∅ /VC_CV. In other words, when a schwa-succeeded consonant (itself preceded by another vowel) is followed by a vowel-succeeded consonant, the schwa inherent in the first consonant is deleted. [3] [6] However, this rule sometimes deletes a schwa that should remain and sometimes fails to delete a schwa when it should be deleted. The rule is reported to result in correct predictions on schwa deletion 89% of the time. [6]
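The rule ə → ∅ / VC_CV can be illustrated with a minimal Python sketch. It is only one possible reading of the rule and not a reference implementation: it assumes the word is already given as space-separated romanised segments, uses a small assumed vowel inventory, treats word-final deletion as a separate first step, and scans right to left (a choice made for this sketch). The names `delete_schwas` and `pronounce` are hypothetical. As noted above, a rule of this kind is not always correct, so the sketch will mispredict some words.

```python
SCHWA = "ə"
# Assumed vowel inventory for the romanisation used in this sketch.
VOWELS = {"ə", "a", "ā", "i", "ī", "u", "ū", "e", "ai", "o", "au"}


def is_vowel(segment):
    return segment in VOWELS


def delete_schwas(segments):
    """Return a copy of `segments` with schwas deleted by the simplified rule."""
    segs = list(segments)

    # Step 1 (assumption: handled separately from VC_CV): drop a word-final
    # schwa after a consonant, provided the word has another vowel.
    if (len(segs) >= 2 and segs[-1] == SCHWA and not is_vowel(segs[-2])
            and any(is_vowel(s) for s in segs[:-1])):
        segs.pop()

    # Step 2: ə → ∅ / VC_CV, scanning from right to left.
    i = len(segs) - 3
    while i >= 2:
        if (segs[i] == SCHWA
                and is_vowel(segs[i - 2]) and not is_vowel(segs[i - 1])
                and not is_vowel(segs[i + 1]) and is_vowel(segs[i + 2])):
            del segs[i]
        i -= 1
    return segs


def pronounce(spaced_word):
    """Convenience wrapper: 'r ə c ə n ā' -> 'rəcnā'."""
    return "".join(delete_schwas(spaced_word.split()))


print(pronounce("r ā m ə"))               # rām
print(pronounce("r ə c ə n ā"))           # rəcnā
print(pronounce("d e v ə n ā g ə r ī"))   # devnāgrī
```

The right-to-left scan matters when a word contains several candidate schwas, because deleting one schwa changes the context of its neighbours.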
Schwa deletion is computationally important because it is essential to building text-to-speech software for Hindi. [6] [7]
As a result of schwa syncope, the Hindi pronunciation of many words differs from that expected from a literal Sanskrit-style reading of Devanagari. For instance, राम is pronounced Rām (not Rāma, as in Sanskrit), रचना is pronounced Rachnā (not Rachanā), वेद is pronounced Ved (not Veda) and नमकीन is pronounced Namkīn (not Namakīna). [6] [7] The name of the script itself is pronounced Devnāgrī, not Devanāgarī. [8]
Correct schwa deletion is also critical because the same letter sequence is pronounced two different ways in Hindi depending on the context. Failure to delete the appropriate schwas can then change the meaning. [9] For instance, the letter sequence 'रक' is pronounced differently in हरकत (har.kat, meaning movement or activity) and सरकना (sarak.na, meaning to slide). Similarly, the sequence धड़कने in दिल धड़कने लगा (the heart started beating) and in दिल की धड़कनें (beats of the heart) is identical prior to the nasalisation in the second usage. However, it is pronounced dhaṛak.ne in the first and dhaṛ.kanẽ in the second. [9]
While native speakers pronounce the sequences differently in different contexts, non-native speakers and voice-synthesis software can make them "sound very unnatural", making it "extremely difficult for the listener" to grasp the intended meaning. [9]
Different Indian languages can differ in how they apply schwa deletion. For instance, medial schwas from Sanskrit-origin words are often retained in Bengali even if they are deleted in Hindi. [10] An example of this is रचना/রচনা which is pronounced racanā (/rɐtɕɐnaː/) in Sanskrit, racnā (/rətʃnɑː/) in Hindi and rôcona (/rɔtʃona/) in Bengali. While the medial schwa is deleted in Hindi (because of the ə → ∅ / VC_CV rule), it is retained in Bengali. [6]
On the other hand, the final schwa in वेद /বেদ is deleted in both Hindi and Bengali (Sanskrit: /veːd̪ə/, Hindi: /veːd̪/, Bengali: /bed̪/). [6]
The Assamese equivalent of the schwa is the open-mid back rounded vowel, [ɔ]. Assamese deleted this vowel at the end of consonant-final words, with a few exceptions such as numerals. In clusters, it is deleted in words like কান্ধ (/kandʱ-/, shoulder) and বান্ধ (/bandʱ-/, bond), while it is optional in the word গোন্ধ (/ɡʊnˈdʱ(ɔ)/, smell). Modern Standard Assamese developed the schwa in words like কাছ (/kaˈsɒ/, turtle), পাৰ (/paˈɹɒ/, pigeon) and তই কৰ (/tɔi kɔɹɔ/, you do), which appear with different vowels in some other dialects, such as কাছু /ˈkasu/, পাৰা /ˈpaɾa/ and কৰাহ /ˈkɔɾaʱ/ in some Kamrupi dialects. Eastern Assamese (including its sub-dialect, Standard Assamese) and Central Assamese retained the schwa in medial positions, as in নিজৰা (/niˈzɔɹa/, stream), বিচনি (/biˈsɔni/, handfan), বতৰা (/bɔˈtɔɹa/, news), পাহৰে (/paˈɦɔɹe/, forgets) and নকৰে (/nɔˈkɔɹe/, doesn't do); these schwas were deleted in some Kamrupi dialects, while others kept them as /a/. Conjuncts in Sanskrit loanwords always have the schwa, and in consonant-final words (in which the final consonant is followed by a schwa), the schwa is optionally present in words ending with suffixes. For example, শিক্ষিত, from Sanskrit शिक्षित (śikṣita, "educated"), is pronounced both as /ˌxikˈkʰitɔ/ and as /ˌxikˈkʰit/.
The Bengali equivalent of the schwa is the open-mid back rounded vowel, [ɔ]. Bengali deletes this vowel at the end of a word when the word does not end in a consonant cluster, but sometimes retains it in medial position. A consonant cluster at the end of a word is usually followed by a close-mid back rounded vowel, [o]. For example, the Sanskrit word पथ (/pɐt̪ʰɐ/, way) corresponds to the Bengali word পথ /pɔt̪ʰ/ (পথ্). But the Sanskrit word अन्त (/ɐnt̪ɐ/, end) retains the end vowel and becomes অন্তো (/ɔnt̪o/) in Bengali, as it ends with a consonant cluster.
However, tatsama borrowings from Sanskrit generally retain the 'ɔ' except in word-final positions and except in very informal speech.
The vowel is not always retained in medial position. For instance, 'কলকাতা' is pronounced as কোল্কাতা (/kolkat̪a/), and not /kolɔkat̪a/ (although different pronunciations exist across dialects, none pronounce it this way).
Gujarati has a strong schwa deletion phenomenon, affecting both medial and final schwas. From an evolutionary perspective, the final schwas appear to have been lost prior to the medial ones. [2] According to Cardona, the word-final schwa deletion occurred during the transition from Middle Gujarati to Modern Gujarati. [11]
In the Dardic subbranch of Indo-Aryan, Kashmiri similarly demonstrates schwa deletion. For instance, drākṣa (द्राक्ष) is the Sanskrit word for grape, but the final schwa is dropped in the Kashmiri version, which is dach (दछ् or دَچھ).
Schwa deletion in Maithili differs from that in neighbouring languages. Maithili does not delete the schwa but shortens it, i.e. the rule ə → ə̆ / VC_CV applies to the language. Under increased influence from other languages through contact, however, Maithili sometimes shows schwa deletion in words that are traditionally pronounced with schwas. For instance, हमरो is həməro (even ours) with schwas but is pronounced həmᵊro. [12] That is akin to the neighbouring Bhojpuri, in which हमरा (meaning mine) is pronounced həmrā rather than həmərā through the deletion of a medial schwa. [13]
Marathi exhibits extensive schwa deletion. [14] : 95–111 The schwa at the end of a word is almost always deleted, except in the case of a few tatsama words from Sanskrit [15] as well as when the word ends in a conjunct. [14] : 95–111 Schwas essentially get deleted when there is an opportunity for a consonant with a schwa to turn into a coda consonant for the previous syllable, though the actual rules are more complicated and have exceptions. [14] : 95–111
However, in places where the schwa occurs in the middle of words, Marathi does exhibit a propensity to pronounce it far more regularly than Hindi. Words like प्रेरणा, मानसी, केतकी retain the schwa sound in the र, न, and त respectively, often leading to their transliteration by native Marathi speakers in the Roman script as Prerana, Manasi and Ketaki rather than Prerna, Mansi or Ketki.
Sometimes, to avoid schwa deletion, an anusvara is placed at the end of the word. For example, the word खर (khar, "roughness") is often read without the schwa. When the schwa needs to be made explicit, it is written as खरं (khara, "true"). This often happens in the case of pluralization, e.g. फूल (phūl, "flower") can be written as having the plural फुलं (phula, "flowers"). This arises from the original plural marker -एं (as in फुलें phulẽ, "flowers") having degraded to a schwa in modern speech, and the anusvara serves a purpose as a non-deleted vowel even though it is not realized as a nasal. [14] : 114
Unlike in other Indo-Aryan languages such as Hindi, comprehension of Marathi is not impeded if all schwas are retained. However, such speech will be interpreted as a formal register called 'Ati Shuddha Marathi', which is used only for certain plays and poetry recitals.
Nepali orthography is comparatively more phonetic than Hindi when it comes to schwa retention. Schwas are often retained within words unless deletion is signalled by the use of a halanta (्). सुलोचना (a name) is pronounced sulocnā by Hindi speakers but sulocanā by Nepali speakers. Some exceptions exist, such as सरकार (government), pronounced sarkār, not sarakār.
The following rules can be followed to figure out whether or not Nepali words retain the final schwa in a word.
Note that schwas are often retained in music and poetry to facilitate singing and recitation.
Odia in its standardised form retains the schwa in its pronunciation as an open-mid back rounded vowel. Both medial and final schwas are retained: in the medial case ଝରଣା jharaṇā is pronounced /dʒʱɔɾɔɳā/ (waterfall) and in the final case ଟଗର ṭagara is pronounced /ʈɔgɔɾɔ/ (crepe jasmine flower).
Sanskrit loanwords or 'tatsama' words, being more formal, always have the schwa pronounced.
However, deletion is more common in a number of non-standard dialects, as well as increasingly in the speech of urban areas as a result of exposure to English and Hindi. For example, the name of the city Bhubaneshwar can be pronounced either informally as /bʰubɔneswɔɾ/, or more formally /bʰubɔneswɔɾɔ/.
Punjabi has broad schwa deletion rules: several base word forms (ਕਾਗ਼ਜ਼, کاغز, kāġəz/paper) drop schwas in the plural form (ਕਾਗ਼ਜ਼ਾਂ, کاغزاں, kāġzāṁ/papers) as well as with ablative (ਕਾਗ਼ਜ਼ੋਂ, کاغزوں, kāġzōṁ/from the paper) and locative (ਕਾਗ਼ਜ਼ੇ, کاغزے, kāġzé/on the paper) suffixes. [16]
Since Devanagari does not provide indications of where schwas should be deleted, it is common for non-native learners/speakers of Hindi, who are otherwise familiar with Devanagari and Sanskrit, to make incorrect pronunciations of words in Hindustani and other modern North Indian languages. [17] Similarly, systems that automate transliteration from Devanagari to Latin script by hardcoding implicit schwas in every consonant often indicate the written form rather than the pronunciation. That becomes evident when English words are transliterated into Devanagari by Hindi-speakers and then transliterated back into English by manual or automated processes that do not account for Hindi's schwa deletion rules. For instance, the word English may be written by Hindi speakers as इंगलिश (rather than इंग्लिश्) which may be transliterated back to Ingalisha by automated systems, but schwa deletion would result in इंगलिश being correctly pronounced as Inglish by native Hindi-speakers. [18]
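The transliteration problem described above can be made concrete with a small Python sketch of a naive transliterator that hardcodes an inherent "a" after every consonant not followed by a vowel sign or a halanta. Everything here is an assumption made for illustration: the function name `naive_romanise` is hypothetical, the character tables cover only the letters needed for two words from this article, and rendering the anusvara as a plain "n" is a simplification.

```python
# Toy character tables (assumed, deliberately incomplete).
INDEPENDENT_VOWELS = {"इ": "i"}
CONSONANTS = {"ग": "g", "ल": "l", "श": "sh", "र": "r", "म": "m"}
VOWEL_SIGNS = {"ि": "i", "ा": "ā"}
ANUSVARA = "ं"
HALANTA = "्"


def naive_romanise(word):
    """Romanise `word`, writing the inherent vowel after every bare consonant."""
    out = []
    chars = list(word)
    for idx, ch in enumerate(chars):
        nxt = chars[idx + 1] if idx + 1 < len(chars) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # No vowel sign and no halanta follows: emit the inherent vowel,
            # whether or not a Hindi speaker would pronounce it.
            if nxt not in VOWEL_SIGNS and nxt != HALANTA:
                out.append("a")
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        elif ch == ANUSVARA:
            out.append("n")  # simplification for this sketch
        elif ch == HALANTA:
            continue  # the halanta only suppresses the inherent vowel
        else:
            raise ValueError(f"character not covered by this sketch: {ch!r}")
    return "".join(out)


print(naive_romanise("इंगलिश"))    # ingalisha
print(naive_romanise("इंग्लिश्"))  # inglish
print(naive_romanise("राम"))       # rāma
```

The first spelling carries no halanta, so the naive output keeps two schwas that Hindi speakers delete; only the explicitly marked spelling yields the form that matches the pronunciation.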
Some examples are shown below:
Word in Devanagari and meaning | Pronunciation in Hindi (with schwa syncope) | Pronunciation without schwa syncope | Comments |
---|---|---|---|
लपट (flame) | ləpəṭ | ləpəṭə | The final schwa is deleted [19] |
लपटें (flames) | ləpṭeṅ | ləpəṭeṅ | The medial schwa, ləpəṭ, which was retained in लपट, is deleted in लपटें [19] |
समझ (understanding) | səməjh | səməjhə | The final schwa is deleted [20] |
समझा (understood, verb masc.) | səmjhā | səməjhā | The medial vowel also is deleted here, which it wasn't in समझ [20] |
भारत (India) | bhārət | bhārətə | Final schwa is deleted |
भारतीय (Indian) | bhārtīy | bhārətīyə | Both the medial and final schwa are deleted, although the final schwa is sometimes faintly pronounced due to the 'y' glide; when pronounced without this, the word sounds close to 'bhārtī' |
देवनागरी (Devanagari, the script) | devnāgrī | devənāgərī | Two medial schwas (after व and after ग) are deleted |
इंगलिश (English, the language) | inglish | ingəlishə | Medial and final schwas (after ग and after श) are deleted |
विमला (Vimla, a proper name) | vimlā | viməlā | Medial schwa is deleted [21] |
सुलोचना (Sulochna, a proper name) | sulochnā | sulochənā | Medial schwa is deleted [21] |
With some words that contain /n/ or /m/ consonants separated from succeeding consonants by schwas, the schwa deletion process has the effect of nasalising any preceding vowels. [22] Here are some examples in Hindustani:
...The implicit /a/ is not read when the symbol appears in word-final position or in certain other contexts where it is obligatorily deleted (via the so-called schwa-deletion rule which plays a crucial role in Hindi word phonology)...
...the history of the schwa deletion rule in Gujarati has been examined. The historical perspective brings out the fact that schwa deletion is not an isolated phenomenon; the loss of final -a has preceded the loss of medial -a-;...
...Hindi literature fails as a reliable indicator of the actual pronunciation because it is written in the Devanagari script... the schwa syncope rule which operates in Hindi....
...schwa deletion is an important issue for grapheme-to-phoneme conversion of IAL, which in turn is required for a good Text-to-Speech synthesizer.... Sanskrit rəcəna, Hindi rəcna, Bengali rɔcona....
... Without the appropriate deletion of schwas, any speech output would sound unnatural. Since the orthographical representation of Devanagari gives little indication of deletion sites, modern TTS systems for Hindi implemented schwa deletion rules based on the segmental context where schwa appears ...
...The Devnāgrī (Devanāgarī) script is syllabic and all consonants carry the inherent vowel a unless otherwise indicated. The principal difference between modern Hindi and the classical Sanskrit forms is the omission in Hindi of this inherent a when in final position (e.g. rāga in Sanskrit and rāg in Hindi) and frequently in medial position (e.g. Māravā in Sanskrit and Mārvā in Hindi).
... Without any schwa deletion, not only the two words will sound very unnatural, but it will also be extremely difficult for the listener to distinguish between the two, the only difference being nasalisation of the e at the end of the former. However, a native speaker would pronounce the former as dha.D-kan-eM and the later as dha.Dak-ne, which are clearly distinguishable ...
...The compound words derived from native words of Bengali show greater tendency towards {a} deletion than those derived from Sanskrit....
...The two morphophonemic alternations that are very productive and regular in Maithili are schwa deletion and replacement of a by schwa. (a) Schwa deletion:... VCəCV → VC0CV.... Schwa deletion in Maithili occurs....
...The paradigm in Bhojpuri... hamaar in isolation is genitive and has an oblique form in -aa, which according to the general principle of vowel attenuation (schwa deletion) in this language yields the form hamraa before postpositions....
...nazar 'glance' - nazar të - nazrë. Postposition incorporation is quite productive. The stem-final schwa undergoes deletion before the vocalic postpositional elements....
...in the Devanagari script, the schwa vowel is not indicated in consonant-initial syllables. This is a well-known problem for those learning to read Hindi....
... sapokana Iṅgalisha ...
... For a pair of words eg ləpəṭ ~ ləpəṭen 'flame', one has to apply the following phonomorphological interface rules on the abstract ...
... on the suffixation: H. samajhna 'to understand' > samjha 'understood'. This too produces clusters, albeit unstable ones. As noted in Chapter 6, the most recent treatment (synchronic) of this "schwa-deletion" phenomenon in Hindi ...
... [sulochna] ~ [sulochəna] ... schwa is conditionally deleted ...
... nasalized vowels are derived from underlying sequences of vowel plus nasal consonant ...