Broken plural

Last updated February 06, 2026

In linguistics, a broken plural (or internal plural) is an irregular plural form of a noun or adjective found in the Semitic languages and other Afroasiatic languages such as the Berber languages. Broken plurals are formed by changing the pattern of consonants and vowels inside the singular form. They contrast with sound plurals (or external plurals), which are formed by adding a suffix, but are also formally distinct from phenomena like the Germanic umlaut, a form of vowel mutation used in plural forms in Germanic languages.

There have been a variety of theoretical approaches to understanding these processes, and varied attempts to formulate rules that systematize these plural forms.^[1] However, the origin of broken plurals in the languages that exhibit them is not settled, although specific plural forms tend to be associated with specific singular patterns. Because these alternations go far beyond the mutations produced by the Germanic umlaut, which is demonstrably conditioned by inflectional suffixes, the sheer multiplicity of shapes is matched by a multiplicity of attempted historical explanations. These range from proposals of transphonologization and multiple accentual changes to switches between the categories of collectives, abstract nouns and plurals, or switches of noun class.^[2]

Arabic

While the phenomenon is known from several Semitic languages, it is most productive in Arabic.^{[ citation needed ]}

In Arabic, the regular way of making a plural for a masculine noun is to add the suffix -ūn[a] (for the nominative) or -īn[a] (for the accusative and genitive) at the end. For feminine nouns, the regular way is to add the suffix -āt. However, not all plurals follow these simple rules. One class of nouns in both spoken and written Arabic produces plurals by changing the pattern of vowels inside the word, sometimes also with the addition of a prefix or suffix. This system is not fully regular, and it is used mainly for masculine non-human nouns; human nouns may take either a regular (sound) or an irregular (broken) plural.^{[ citation needed ]}
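The regular (sound) suffixes described above can be sketched in a few lines of Python operating on transliterated forms. This is only an illustration: the function name `sound_plural` and the example word muʿallim ("teacher") are assumptions not taken from the text, the optional final -a of -ūn[a] and -īn[a] is omitted, and the feminine branch assumes that a final -ah (tāʾ marbūṭa) is dropped before -āt.

```python
def sound_plural(singular: str, gender: str, case: str = "nominative") -> str:
    """Return a sound (suffixed) plural for a transliterated singular.

    Illustrative only: real Arabic nouns must be marked in the lexicon
    for whether they take a sound plural at all.
    """
    if gender == "masculine":
        # -ūn for the nominative, -īn for the accusative and genitive.
        suffix = "ūn" if case == "nominative" else "īn"
        return singular + suffix
    if gender == "feminine":
        # Assumption: a final -ah is removed before adding -āt.
        stem = singular[:-2] if singular.endswith("ah") else singular
        return stem + "āt"
    raise ValueError(f"unknown gender: {gender!r}")


print(sound_plural("muʿallim", "masculine"))               # nominative
print(sound_plural("muʿallim", "masculine", "genitive"))   # oblique
print(sound_plural("muʿallimah", "feminine"))
```

The sketch deliberately says nothing about which nouns take a sound plural; that choice is lexical and is the subject of the rest of this section.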

Broken plurals are known as jamʿu taksīr (جَمْعُ تَكْسِيرٍ, literally "plural of breaking") in Arabic grammar. These plurals constitute one of the most unusual aspects of the language, given the very strong and highly detailed grammar and derivation rules that govern the written language. Broken plurals can also be found in languages that have borrowed words from Arabic, for instance Persian, Pashto, Turkish, Azerbaijani, Sindhi, and Urdu. Sometimes in these languages the same noun has both a broken plural Arabic form and a local plural.^{[ citation needed ]}

In Persian this kind of plural is known by its Arabic term jamʿ-e mokassar (جَمِع مُکَسَّر, literally "broken plural"). However, the Persian Academy of Literature (Farhangestan) recommends against using such Arabic plural forms and favors the native Persian plural suffix -hā instead.^{[ citation needed ]}

Full knowledge of these plurals can come only with extended exposure to the Arabic language, though a few rules can be noted. One study computed the probability that the pattern of vowels in the singular would predict the pattern in the broken plural (or vice versa) and found values ranging from 20% to 100% for different patterns.^[3]

A statistical analysis of a list of the 3000 most frequent Arabic words shows that 978 (59%) of the 1670 most frequent nominal forms take a sound plural, while the remaining 692 (41%) take a broken plural.^[4] Another estimate of all existing nominal forms gives over 90,000 forms with a sound plural and just 9540 with a broken one.^[4] This is due to the almost boundless number of participles and derived nominals in "-ī", most of which take a sound plural.^{[ citation needed ]}
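The percentages quoted above can be rechecked with a short calculation. The sketch below uses only the counts given in the text; treating "over 90,000" as exactly 90,000 is an assumption, so the second share is an approximation.

```python
# Counts quoted in the text for the most frequent nominal forms.
frequent_sound = 978
frequent_broken = 692
frequent_total = frequent_sound + frequent_broken  # 1670 nominal forms

# Estimate over all nominal forms; 90,000 is used as a lower bound
# for "over 90,000" (an assumption made for this calculation).
all_sound = 90_000
all_broken = 9_540


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


frequent_sound_share = share(frequent_sound, frequent_total)
frequent_broken_share = share(frequent_broken, frequent_total)
overall_broken_share = share(all_broken, all_sound + all_broken)

print(frequent_sound_share, frequent_broken_share, overall_broken_share)
```

Under these assumptions the broken plural accounts for roughly 41% of the frequent nominal forms but under 10% of the full inventory, which matches the contrast drawn in the text.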

Example

Semitic languages typically utilize triconsonantal roots, forming a "grid" into which vowels may be inserted without affecting the basic root.

Here are a few examples; note that the commonality is in the root consonants (capitalized), not the vowels.

KiTāB كِتَاب "book" → KuTuB كُتُب "books"
KāTiB كَاتِب "writer, scribe" → KuTTāB كُتَّاب "writers, scribes"
maKTūB مَكْتُوب "letter" → maKāTīB مَكَاتِيب "letters"
maKTaB مَكْتَب "desk, office" → maKāTiB مَكَاتِب "offices"

Note: these four words all share a common root, K-T-B (ك – ت – ب), "to write".
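The idea of a consonantal root slotted into a vowel "grid" can be sketched as a small template-filling function. The digit notation used here (1, 2, 3 for the first, second and third root consonant) is an assumption introduced for the illustration, chosen so that a doubled consonant, as in kuttāb, can be written explicitly as "22"; the function name `apply_template` is likewise hypothetical.

```python
def apply_template(root: str, template: str) -> str:
    """Insert root consonants into a vowel template.

    `root` is hyphen-separated, e.g. "k-t-b". In `template`, the digits
    1-4 stand for root consonants by position; every other character
    (vowels, prefixes such as "ma") is copied through unchanged.
    """
    consonants = root.split("-")
    pieces = []
    for ch in template:
        if ch in "1234":
            pieces.append(consonants[int(ch) - 1])
        else:
            pieces.append(ch)
    return "".join(pieces)


# Singular and plural templates for the four examples above.
examples = [
    ("1i2ā3", "1u2u3"),       # kitāb   -> kutub
    ("1ā2i3", "1u22ā3"),      # kātib   -> kuttāb
    ("ma12ū3", "ma1ā2ī3"),    # maktūb  -> makātīb
    ("ma12a3", "ma1ā2i3"),    # maktab  -> makātib
]

for singular, plural in examples:
    print(apply_template("k-t-b", singular), "->", apply_template("k-t-b", plural))
```

Only the templates differ between the singular and plural of each pair; the root k-t-b is passed through untouched, which is the point the examples above make.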

In Persian, which is not a Semitic language, it is common to use the native plural suffix -hā instead:

KiTāB کِتَاب "book" → KiTāBhā کِتَاب‌هَا "books"
KāTiB كَاتِب "writer, scribe" → KāTiBhā كَاتِب‌هَا "writers, scribes"

Patterns in Arabic

Singular pattern	Plural pattern	Singular (Arabic)	Singular (transliteration)	Singular (gloss)	Plural (Arabic)	Plural (transliteration)	Plural (gloss)	Other examples	Notes
CiCāC	CuCuC	‏ كِتَاب ‎	kitāb	'book'	كُتُب	kutub	'books'
CaCīCah		‏ سَفِينَة ‎	safīnah	'ship'	سُفُن	sufun	'ships'	juzur (islands), mudun (cities)
CaCv̄C		‏ أَسَاس ‎	ʾasās	'foundation'	أُسُس	ʾusus	'foundations'
		‏ سَبِيل ‎	sabīl	'path'	سُبُل	subul	'paths'	turuq (paths)
		‏ رَسُول ‎	rasūl	'messenger'	رُسُل	rusul	'messengers'
CvCCah	CuCaC	‏ شَقَّة ‎	šaqqah	'apartment'	شُقَق	šuqaq	'apartments'
	CiCaC	‏ قِطّة ‎	qiṭṭah	'cat'	قِطَط	qiṭaṭ	'cats'
	CuCaC	‏ غُرْفَة ‎	ġurfah	'room'	غُرَف	ġuraf	'rooms'	sunan (habits)
CiCC	CiCaCah	‏ هِرّ ‎	hirr	'cat'	هِرَرَة	hirarah	'cats'	fiyalah (elephants) qiradah (apes)
CuCC	CiCaCah	‏ دُبّ ‎	dubb	'bear'	دِبَبَة	dibabah	'bears'
CvCC	CuCūC	‏ قَلْب ‎	qalb	'heart'	قُلُوب	qulūb	'hearts'	funūn (arts), buyūt (houses) judūd (grandfathers)
		‏ عِلْم ‎	ʿilm	'science'	عُلُوم	ʿulūm	'sciences'
		‏ جُحْر ‎	juḥr	'hole'	جُحُور	juḥūr	'holes'
	CiCāC	‏ كَلْب ‎	kalb	'dog'	كِلَاب	kilāb	'dogs'
		‏ ظِلّ ‎	ẓill	'shadow'	ظِلَال	ẓilāl	'shadows'
		‏ رُمْح ‎	rumḥ	'spear'	رِمَاح	rimāḥ	'spears'
CaCaC		‏ جَمَل ‎	jamal	'camel'	جِمَال	jimāl	'camels'
CaCuC		‏ رَجُل ‎	rajul	'man'	رِجَال	rijāl	'men'
CvCC	ʾaCCāC	‏ يَوْم ‎	yawm	'day'	أَيَّام	ʾayyām	'days'	ʾarbāb (masters) ʾajdād (grandfathers)
		‏ جِنْس ‎	jins	'kind, type'	أَجْنَاس	ʾajnās	'kinds, types'
		‏ لُغْز ‎	luġz	'mystery'	أَلْغَاز	ʾalġāz	'mysteries'	ʾaʿmāq (depths)
CaCaC		‏ سَبَب ‎	sabab	'cause'	أَسْبَاب	ʾasbāb	'causes'	ʾawlād (boys), ʾaqlām (pens)
CuCuC		‏ عُمُر ‎	ʿumur	'lifespan'	أَعْمَار	ʾaʿmār	'lifespans'	ʾarbāʿ (quarters)
CaCūC	ʾaCCiCah	‏ عَمُود ‎	ʿamūd	'pole'	أَعْمِدَة	ʾaʿmidah	'poles'		Ends with taʾ marbuta
CaCīC	ʾaCCiCāʾ	‏ صَدِيق ‎	ṣadīq	'friend'	أَصْدِقَاء	ʾaṣdiqāʾ	'friends'
CaCīC	CuCaCāʾ	‏ سَعِيد ‎	saʿīd	'happy'	سُعَدَاء	suʿadāʾ	'happy'	wuzarāʾ (ministers) bukhalāʾ (cheapskates)	mostly for adjectives and occupational nouns
CāCiC	CuCCāC	‏ كَاتِب ‎	kātib	'writer'	كُتَّاب	kuttāb	'writers'	ṭullāb (students) sukkān (residents)	Gemination of the second root; mostly for the active participle of Form I verbs
	CaCaCah	‏ جَاهِل ‎	jāhil	'ignorant'	جَهَلَة	jahalah	'ignorant'
	CuCCaC	‏ سَاجِد ‎	sājid	'prostrated'	سُجَّد	sujjad
CāCiCah	CuCCaC	‏ سَاجِدَة ‎	sājidah	'prostrated' (Fem.)	سُجَّد	sujjad
CāCiCah	CawāCiC	‏ قَائِمَة ‎	qāʾimah	'list'	قَوَائِم	qawāʾim	'lists'	bawārij (battleships)
CāCūC	CawāCīC	‏ صَارُوخ ‎	ṣārūḫ	'rocket'	صَوَارِيخ	ṣawārīḫ	'rockets'	ḥawāsīb (computers), ṭawāwīs (peacocks)
CiCāCah	CaCāʾiC	‏ رِسَالَة ‎	risālah	'message'	رَسَائِل	rasāʾil	'messages'	baṭāʾiq (cards)
CaCīCah	CaCāʾiC	‏ جَزِيرَة ‎	jazīrah	'island'	جَزَائِر	jazāʾir	'islands'	ḥaqāʾib (suitcases), daqāʾiq (minutes)
CaCCaC	CaCāCiC	‏ دَفْتَر ‎	daftar	'notebook'	دَفَاتِر	dafātir	'notebooks'		applies to all four-literal nouns with short second vowel
CuCCuC	CaCāCiC	‏ فُنْدُق ‎	funduq	'hotel'	فَنَادِق	fanādiq	'hotels'		applies to all four-literal nouns with short second vowel
maCCaC	maCāCiC	‏ مَلْبَس ‎	malbas	'apparel'	مَلَابِس	malābis	'apparels'	makātib (offices)	Subcase of previous, with m as first literal
maCCiC		‏ مَسْجِد ‎	masjid	'mosque'	مَسَاجِد	masājid	'mosques'	manāzil (houses)	Subcase of previous, with m as first literal
miCCaCah		‏ مِنْطَقَة ‎	minṭaqah	'area'	مَنَاطِق	manāṭiq	'areas'
CvCCv̄C	CaCāCīC	‏ صَنْدُوق ‎	ṣandūq	'box'	صَنَادِيق	ṣanādīq	'boxes'		applies to all four-literal nouns with long second vowel
miCCāC	maCāCīC	‏ مِفْتَاح ‎	miftāḥ	'key'	مَفَاتِيح	mafātīḥ	'keys'		Subcase of previous, with m as first literal
maCCūC	maCāCīC	‏ مَكْتُوب ‎	maktūb	'message'	مَكَاتِيب	makātīb	'messages'		Subcase of previous, with m as first literal
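A few rows of the table can be turned into a lookup that recovers the root from a transliterated singular and fills the corresponding plural pattern. This is a sketch under stated assumptions, not a general pluralizer: the dictionary `PLURAL_TEMPLATES` and the function names are hypothetical, the plural side uses digits for root positions so that gemination can be written as "22", and only singular patterns with a single plural in this excerpt are included. As the table shows, a pattern such as CvCC maps to several plural shapes (qalb → qulūb but kalb → kilāb), so in practice the plural has to be listed for each word.

```python
import unicodedata

# Hypothetical excerpt of the table: singular pattern -> plural template.
# "C" marks a root consonant slot; digits index root consonants (1-based).
PLURAL_TEMPLATES = {
    "CiCāC": "1u2u3",       # kitāb  -> kutub
    "CāCiC": "1u22ā3",      # kātib  -> kuttāb
    "CaCCaC": "1a2ā3i4",    # daftar -> dafātir
    "miCCāC": "ma1ā2ī3",    # miftāḥ -> mafātīḥ
}


def nfc(text: str) -> str:
    """Normalize so that letters like ā and ḥ are single code points."""
    return unicodedata.normalize("NFC", text)


def extract_root(word: str, pattern: str):
    """Return the root consonants if `word` fits `pattern`, else None."""
    word, pattern = nfc(word), nfc(pattern)
    if len(word) != len(pattern):
        return None
    root = []
    for letter, slot in zip(word, pattern):
        if slot == "C":
            root.append(letter)
        elif letter != slot:
            return None
    return root


def broken_plural(singular: str):
    """Look up a plural for `singular`, or return None if no pattern fits."""
    for pattern, template in PLURAL_TEMPLATES.items():
        root = extract_root(singular, pattern)
        if root is not None:
            return "".join(
                root[int(ch) - 1] if ch in "1234" else ch for ch in nfc(template)
            )
    return None


for word in ["kitāb", "kātib", "daftar", "miftāḥ", "qalb"]:
    print(word, "->", broken_plural(word))
```

The last word, qalb, is left unresolved on purpose: its pattern is not in the excerpt because the table gives more than one plural shape for it.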

Hebrew

In Hebrew, though all plurals must take either the -īm ־ים (generally masculine) or -ōt ־ות (generally feminine) plural suffixes, the historical stem alternations of the so-called segolate or consonant-cluster nouns between CVCC in the singular and CVCaC in the plural have often been compared to broken plural forms in other Semitic languages. Thus the form malkī מַלְכִּי "my king" in the singular is opposed to məlāxīm מְלָכִים "kings" in the plural.^[5]

In addition, there are many other cases where historical sound changes have resulted in stem allomorphy between singular and plural forms in Hebrew (or between absolute state and construct state, or between forms with pronominal suffixes and unsuffixed forms etc.), though such alternations do not operate according to general templates accommodating root consonants, and so are not usually considered to be true broken plurals by linguists.^[6]

Geʽez & Amharic

Broken plurals were formerly used in some Ethiopic nouns. Examples include ˁanbässa "lion" with ˁanabəst "lions", kokäb "star" with kwakəbt "stars", ganen "demon" with aganənt "demons", and hagar "region" with ˀahgur "regions".^[7] Some of these broken plurals are still used in Amharic today, but they are generally seen as archaic.^[7] A generic word for God in both languages is ˁamlak (አምላክ), which is a broken plural of malik, the Proto-Semitic word for "king".

References

[1] Ratcliffe, Robert R. (1998). The "Broken" Plural Problem in Arabic and Comparative Semitic. Current Issues in Linguistic Theory 168. Amsterdam/Philadelphia: John Benjamins. ISBN 978-90-272-3673-9.
[2] An overview of the theories is given by Ratcliffe, Robert R. (1998). The "Broken" Plural Problem in Arabic and Comparative Semitic. Current Issues in Linguistic Theory 168. Amsterdam/Philadelphia: John Benjamins. pp. 117 seqq. ISBN 978-90-272-3673-9.
[3] Ratcliffe, Robert R. (1998). The "Broken" Plural Problem in Arabic and Comparative Semitic. Current Issues in Linguistic Theory 168. Amsterdam/Philadelphia: John Benjamins. pp. 72–79. ISBN 978-90-272-3673-9.
[4] Boudelaa, Sami; Gaskell, M. Gareth (21 September 2010). "A re-examination of the default system for Arabic plurals". Language and Cognitive Processes. 17 (3): 321–343. doi:10.1080/01690960143000245. S2CID 145307357.
[5] "Ge'ez (Axum)" by Gene Gragg in The Cambridge Encyclopedia of the World's Ancient Languages edited by Roger D. Woodard (2004) ISBN 0-521-56256-2, p. 440.
[6] "Hebrew" by P. Kyle McCarter Jr. in The Cambridge Encyclopedia of the World's Ancient Languages edited by Roger D. Woodard (2004) ISBN 0-521-56256-2, p. 342.
[7] Leslau, Wolf (1991). Comparative Dictionary of Geʿez (Classical Ethiopic). Wiesbaden: Harrassowitz, pp. 64, 280, 198, 216.

Relevant literature

Castagna, Giuliano. 2017. Towards a systematisation of the broken plural patterns in the Mehri language of Oman and Yemen. Quaderni di Vicino Oriente XII: 115–122.

