Cyrillization of Arabic

Cyrillization of Arabic is the conversion of text written in Arabic script into Cyrillic script. Because the Arabic script is an abjad (a writing system that normally leaves short vowels unwritten), even an accurate letter-for-letter transliteration into Cyrillic, which is an alphabet, could not be read without prior knowledge of the language. For this reason, systems of transcription have normally been used instead.

Russian system

Basic alphabet

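Letter | Name | Scientific: IPA | Scientific: Cyrillic | Scientific: DIN | Practical: lat. | Practical: ru. | Other
ا | alif | /aː/ | а̄ | ā | a, aa | а, я [1] | ә
ء | hamza | /ʔ/ | ’ | ʾ | ʼ, — | | ъ, Ӏ
ب | bā’ | /b/ | б | b | b | б |
ت | tā’ | /t/ | т | t | t | т |
ث | thā’ | /θ/ | с̱ | ṯ | th, s, (t) [2] [3] | с, (т) [2] [3] | ҫ
ج | jīm | /d͡ʒ~ɡ/ | дж | ǧ | j, dj, (g) [2] [4] | дж, (г, ж) | җ, ж, ҹ, ӂ, џ
ح | ḥā’ (pharyngeal) | /ħ/ | х̣ | ḥ | h | х | ҳ, хӀ, хь
خ | khā’ | /x~χ/ | х̱, х̮ | ḫ | kh, x | х | хь
د | dāl | /d/ | д | d | d | д |
ذ | zāl | /ð/ | з̱ | ḏ | dh, (z, d) [2] [3] | з, (д) [2] [3] | ҙ
ر | rā’ | /r/ | р | r | r | р |
ز | zayn | /z/ | з | z | z | з |
س | sīn | /s/ | с | s | s | с |
ش | shīn | /ʃ/ | ш | š | sh, ch | ш |
ص | ṣād | /sˤ/ | с̣ | ṣ | s | с | сӀ
ض | ḍād | /dˤ/ | д̣ | ḍ | d, dh | д | дӀ
ط | ṭā’ | /tˤ/ | т̣ | ṭ | t | т | тӀ, тъ
ظ | ẓā’ | /ðˤ~zˤ/ | з̣ | ẓ | z, (zh, dh) [2] [3] | з, (д) [3] | зӀ
ع | ‘ayn | /ʕ/ | | ʿ | ʻ | —, ъ, vowel doubling [5] | Ӏ, гӀ
غ | ghayn | /ɣ/ | г, г̣ | ġ | g, gh | г | ғ, ҕ, гӀ, гъ
ف | fā’ | /f/ | ф | f | f | ф |
ق | qāf | /q/ | к̣ | q | q, k, (g, gh) [2] [6] | к, (г) [2] [6] | қ, ҡ, ҟ, къ, кь, хъ
ك | kāf | /k/ | к | k | k, c | к |
ل | lām | /l/ | л | l | l | л, ль [7] |
م | mīm | /m/ | м | m | m | м |
ن | nūn | /n/ | н | n | n | н |
ه | hā’ | /h/ | х, һ | h | h | х | һ, ҳ, хӀ, гь
و | wāw | /w, uː/ | в, ў, ӯ | w, ū | w, u, oo, ou | в, у, ю [1] |
ي | yā’ | /j, iː/ | й, ӣ | y, ī | y, i, ee | й [1], и, ы [8] | ј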
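The practical Russian column above can be read as a simple lookup. The following is a minimal sketch, not a complete transcriber: the dictionary name `PRACTICAL_RU` and the function `cyrillize_skeleton` are hypothetical, the Arabic code points are supplied here as an assumption (only the letter names come from the table), and only the primary value of each consonant is kept. Dialect variants, hamza, ‘ayn and the vowel uses of alif, wāw and yā’ are left out.

```python
# Primary practical Russian value of each consonant, keyed by Arabic code point.
# Hypothetical lookup for illustration; dialect variants in parentheses are ignored.
PRACTICAL_RU = {
    "\u0628": "б",   # bā’
    "\u062A": "т",   # tā’
    "\u062B": "с",   # thā’
    "\u062C": "дж",  # jīm
    "\u062D": "х",   # ḥā’
    "\u062E": "х",   # khā’
    "\u062F": "д",   # dāl
    "\u0630": "з",   # zāl
    "\u0631": "р",   # rā’
    "\u0632": "з",   # zayn
    "\u0633": "с",   # sīn
    "\u0634": "ш",   # shīn
    "\u0635": "с",   # ṣād
    "\u0636": "д",   # ḍād
    "\u0637": "т",   # ṭā’
    "\u0638": "з",   # ẓā’
    "\u063A": "г",   # ghayn
    "\u0641": "ф",   # fā’
    "\u0642": "к",   # qāf
    "\u0643": "к",   # kāf
    "\u0644": "л",   # lām
    "\u0645": "м",   # mīm
    "\u0646": "н",   # nūn
    "\u0647": "х",   # hā’
}


def cyrillize_skeleton(word: str) -> str:
    """Replace each known consonant letter; leave every other character unchanged."""
    return "".join(PRACTICAL_RU.get(ch, ch) for ch in word)


# The consonant skeleton k-t-b (kāf, tā’, bā’).
print(cyrillize_skeleton("\u0643\u062A\u0628"))
```

Because several Arabic consonants share one Russian letter in the practical column (for example sīn and ṣād both give с), this mapping is not reversible.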

Vowels

Note: The following tables use the letter hamza (ء) as a carrier to illustrate the use of diacritics. It is not part of these signs.

To record a short vowel after a consonant, an optional sign is written above the consonant (fatḥah, ḍammah) or below it (kasrah). A long vowel is written with the same sign followed by the corresponding consonant letter (alif, wāw or yā’).

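Letter | Name | Scientific: IPA | Scientific: Cyrillic | Scientific: DIN | Practical: lat. | Practical: ru.
Short vowels
ءَ | fatḥah | /a/ | а | a | a, e | а, е, э
ءُ | ḍammah | /u/ | у | u | u, o, ou | у, о
ءِ | kasrah | /i/ | и | i | i, e | и, е, э
Long vowels
ءَا | fatḥah-alif | /aː/ | а̄ | ā | a, aa | а
ءٰ | alif-khanjariyah | /aː/ | а̄ | ā | a, aa | а
ءَى | fatḥah-alif-maqsurah | /aː/ | а̄ | ā | a, aa | а
ءَىٰ | | /aː/ | а̄ | ā | a, aa | а
ءُو | ḍammah-waw | /uː/ | ӯ | ū | u, oo, ou | у
ءِي | kasrah-ya’ | /iː/ | ӣ | ī | i, ee | и
Diphthongs
ءَو | fatḥah-waw | /au/ | аў | aw | aw, au, aou | ау
ءَي | fatḥah-ya’ | /ai/ | ай | ay | ay, ey, ei | ай, ей, эй
Combinations
ءُوّ | | /uːw, -uː/ | ӯв | uww | uww | ув
ءِيّ | | /iːj, -iː/ | ӣй | iyy | iyy | ий [1]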
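The short, long and diphthong rows above can be sketched as a small scanner over a fully vocalized word, using the scientific Cyrillic column. This is an illustrative sketch under stated assumptions: the names `cyrillize_vocalized` and `DEMO_CONSONANTS` are hypothetical, the consonant map is a tiny demo subset, and the shaddah combinations in the last two rows are not handled.

```python
FATHA, DAMMA, KASRA, SUKUN = "\u064E", "\u064F", "\u0650", "\u0652"
ALIF, WAW, YA = "\u0627", "\u0648", "\u064A"
VOWEL_SIGNS = {FATHA, DAMMA, KASRA}

# Scientific Cyrillic values taken from the vowel table.
SHORT = {FATHA: "а", DAMMA: "у", KASRA: "и"}
LONG = {(FATHA, ALIF): "а\u0304", (DAMMA, WAW): "\u04EF", (KASRA, YA): "\u04E3"}
DIPHTHONG = {(FATHA, WAW): "а\u045E", (FATHA, YA): "ай"}

# Assumed demo subset of consonants (kāf, tā’, bā’, nūn), for illustration only.
DEMO_CONSONANTS = {"\u0643": "к", "\u062A": "т", "\u0628": "б", "\u0646": "н"}


def cyrillize_vocalized(word: str, consonants=DEMO_CONSONANTS) -> str:
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in VOWEL_SIGNS:
            nxt = word[i + 1] if i + 1 < len(word) else ""
            after = word[i + 2] if i + 2 < len(word) else ""
            pair = (ch, nxt)
            # The next letter lengthens the vowel or forms a diphthong
            # only if it does not carry a vowel sign of its own.
            if after not in VOWEL_SIGNS and pair in LONG:
                out.append(LONG[pair])
                i += 2
            elif after not in VOWEL_SIGNS and pair in DIPHTHONG:
                out.append(DIPHTHONG[pair])
                i += 2
            else:
                out.append(SHORT[ch])
                i += 1
        elif ch == SUKUN:
            i += 1  # sukūn marks "no vowel" and produces no output
        else:
            out.append(consonants.get(ch, ch))
            i += 1
    return "".join(out)


# kitāb: kāf + kasrah, tā’ + fatḥah, alif, bā’
print(cyrillize_vocalized("\u0643\u0650\u062A\u064E\u0627\u0628"))
```

The lookahead at the letter after the candidate vowel carrier is what separates a long vowel (fatḥah + alif) from a consonantal wāw or yā’ that carries its own vowel.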

Hamza

The glottal stop (Arabic hamza) has complex spelling rules. It can be written as an independent character on the line ⟨ء⟩, like any other consonant, but far more often it is written above or below one of three carrier letters: alif, wāw or yā’. Like any consonant, hamza can stand either before or after a vowel. At the beginning of a word, hamza is written only above or below alif. The combination "alif-hamza + fatḥah + alif" (that is, أَا) is written in a special way, with alif-maddah (آ). The absence of a hamza is occasionally marked with a waṣlah sign (usually only over an initial alif). In unvocalized texts, the hamza above or below the carrier may be omitted entirely, so that hamza is indistinguishable from wāw and yā’, which further complicates transcription. Hamza is almost never written over the alif of the definite article.

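Letter | Name | Phonetic: IPA | Phonetic: Cyrillic | Phonetic: DIN | Practical: lat. | Practical: ru.
Hamza before a vowel
أَ | alif-hamza-fatḥah | /ʔa/ | ’а | ʾa | ’a | а
أُ | alif-hamza-ḍammah | /ʔu/ | ’у | ʾu | ’u | у
إِ | alif-hamza-kasrah | /ʔi/ | ’и | ʾi | ’i | и
آ | alif-maddah | /ʔaː/ | ’а̄ | ʾā | ’a | а
ؤَ | waw-hamza-fatḥah | /ʔa/ | ’а | ʾa | ’a | а
ؤُ | waw-hamza-ḍammah | /ʔu/ | ’у | ʾu | ’u | у
ؤِ | waw-hamza-kasrah | /ʔi/ | ’и | ʾi | ’i | и
ئَ | yā’-hamza-fatḥah | /ʔa/ | ’а | ʾa | ’a | а
ئُ | yā’-hamza-ḍammah | /ʔu/ | ’у | ʾu | ’u | у
ئِ | yā’-hamza-kasrah | /ʔi/ | ’и | ʾi | ’i | и
Hamza after a vowel
ءَأ | alif-hamza-fatḥah | /aʔ/ | а’ | aʾ | a’ | а
ءُأ | alif-hamza-ḍammah | /uʔ/ | у’ | uʾ | u’ | у
ءِأ | alif-hamza-kasrah | /iʔ/ | и’ | iʾ | i’ | и
ءَؤ | waw-hamza-fatḥah | /aʔ/ | а’ | aʾ | a’ | а
ءُؤ | waw-hamza-ḍammah | /uʔ/ | у’ | uʾ | u’ | у
ءِؤ | waw-hamza-kasrah | /iʔ/ | и’ | iʾ | i’ | и
ءَئ | yā’-hamza-fatḥah | /aʔ/ | а’ | aʾ | a’ | а
ءُئ | yā’-hamza-ḍammah | /uʔ/ | у’ | uʾ | u’ | у
ءِئ | yā’-hamza-kasrah | /iʔ/ | и’ | iʾ | i’ | и
Absence of hamza
ٱ | hamzatu-l-waṣl | /∅/ | | | |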
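One way to read the hamza table is that every carrier form yields the same consonant, which the scientific Cyrillic column writes as ’ and the practical Russian column omits. The sketch below illustrates that reading only; the function name `hamza_value` and the choice to return `None` for non-hamza characters are assumptions made for this example.

```python
# Precomposed Unicode forms of hamza and its carriers.
HAMZA_FORMS = {
    "\u0621",  # hamza on the line
    "\u0623",  # alif with hamza above
    "\u0625",  # alif with hamza below
    "\u0624",  # wāw with hamza above
    "\u0626",  # yā’ with hamza above
}
ALIF_MADDAH = "\u0622"  # hamza followed by long ā
ALIF_WASLAH = "\u0671"  # explicit absence of hamza


def hamza_value(ch: str, scientific: bool = True):
    """Return the Cyrillic rendering of a hamza-related character, or None if it is not one."""
    if ch in HAMZA_FORMS:
        # Scientific transcription keeps the glottal stop; practical Russian drops it.
        return "\u2019" if scientific else ""
    if ch == ALIF_MADDAH:
        return "\u2019а\u0304" if scientific else "а"
    if ch == ALIF_WASLAH:
        return ""
    return None


print(repr(hamza_value("\u0623")))                    # hamza on alif, scientific
print(repr(hamza_value("\u0623", scientific=False)))  # same letter, practical
```

The vowel that follows or precedes the hamza is transcribed separately from its own vowel sign, so it is not part of this lookup.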

Definite Article

As with hamza, rendering the Arabic definite article presents some difficulties. In neutral position it is read and transcribed as al- (el-). In certain positions the l assimilates to the following consonant (see Sun and moon letters). After a vowel, the initial a- (e-) is usually dropped (so-called waṣl).

Letter | Name | Phonetic: IPA | Phonetic: Cyrillic | Phonetic: DIN | Practical: lat. | Practical: ru.
ال | alif-lam | /ʔal-, ʔaCː-/ | аль- | al- | al-, el- | аль-, эль-
(assimilation of al- before sun consonants) | | + | + | + | − / + | − / +
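The assimilation row can be illustrated with a short sketch. It assumes the standard list of fourteen sun letters (not listed in this article), takes the noun already transcribed into Cyrillic, and uses the hypothetical names `SUN_LETTERS` and `with_article`. The `assimilate` flag reflects the "− / +" entries, where practical transcription may or may not show assimilation. Dropping the initial vowel after a preceding vowel is not handled.

```python
# Sun letters: tā’, thā’, dāl, zāl, rā’, zayn, sīn, shīn, ṣād, ḍād, ṭā’, ẓā’, lām, nūn.
SUN_LETTERS = set(
    "\u062A\u062B\u062F\u0630\u0631\u0632\u0633\u0634"
    "\u0635\u0636\u0637\u0638\u0644\u0646"
)


def with_article(first_letter: str, noun_cyr: str, assimilate: bool = True) -> str:
    """Prefix a Cyrillic-transcribed noun with the definite article.

    first_letter is the noun's first Arabic letter; noun_cyr is its transcription.
    """
    if assimilate and first_letter in SUN_LETTERS:
        # The l of the article is replaced by a copy of the noun's first consonant.
        # Every sun letter corresponds to a single Cyrillic letter, so noun_cyr[0] suffices.
        return "а" + noun_cyr[0] + "-" + noun_cyr
    return "аль-" + noun_cyr


print(with_article("\u0634", "шамс"))   # shīn is a sun letter
print(with_article("\u0642", "камар"))  # qāf is a moon letter
```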

Nunation

The term nunation (tanwīn) in Arabic grammar refers to the case endings of the "indefinite state", which corresponds to the indefinite article in other languages. For historical reasons, the nunation sign is sometimes followed in writing by a silent alif maqṣūrah or a plain alif.

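Letter | Name | Phonetic: IPA | Phonetic: Cyrillic | Phonetic: DIN | Practical: lat. | Practical: ru.
ءً | tanwīn-fatḥah | /an/ | ан | an | an | ан
ءًى | | /an/ | ан | an | an | ан
ءًا | | /an/ | ан | an | an | ан
ءٌ | tanwīn-ḍammah | /un/ | ун | un | un | ун
ءٍ | tanwīn-kasrah | /in/ | ин | in | in | ин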
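The nunation rows reduce to three endings plus a rule that a silent alif or alif maqṣūrah next to tanwīn-fatḥah is not transcribed. A minimal sketch, assuming the hypothetical name `cyrillize_nunation` and accepting the silent letter on either side of the sign, since both orders occur in typed text:

```python
FATHATAN, DAMMATAN, KASRATAN = "\u064B", "\u064C", "\u064D"
TANWIN = {FATHATAN: "ан", DAMMATAN: "ун", KASRATAN: "ин"}
SILENT = {"\u0627", "\u0649"}  # alif, alif maqṣūrah


def cyrillize_nunation(ending: str) -> str:
    """Transcribe tanwīn signs in a word ending; other characters pass through."""
    out = []
    for i, ch in enumerate(ending):
        prev_ch = ending[i - 1] if i > 0 else ""
        next_ch = ending[i + 1] if i + 1 < len(ending) else ""
        if ch in TANWIN:
            out.append(TANWIN[ch])
        elif ch in SILENT and FATHATAN in (prev_ch, next_ch):
            continue  # silent carrier written with tanwīn-fatḥah
        else:
            out.append(ch)
    return "".join(out)


print(cyrillize_nunation("\u064B\u0627"))  # tanwīn-fatḥah followed by silent alif
```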

Other signs


Letter | Name | Phonetic: IPA | Phonetic: Cyrillic | Phonetic: DIN | Practical: lat. | Practical: ru.
ءْ | sukūn | /∅/ | | | |
ءّ | shaddah | consonant doubling
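The two signs above behave differently in transcription: sukūn produces nothing, while shaddah repeats the consonant before it. The sketch below works on a list of tokens so that a digraph such as дж is doubled as a unit; the token-list input and the name `expand_marks` are assumptions made for this example.

```python
SUKUN, SHADDAH = "\u0652", "\u0651"


def expand_marks(tokens):
    """Join transcription tokens, dropping sukūn and doubling the unit before shaddah."""
    out = []
    for tok in tokens:
        if tok == SUKUN:
            continue  # absence of a vowel: nothing to write
        if tok == SHADDAH:
            if out:
                out.append(out[-1])  # repeat the preceding consonant unit
            continue
        out.append(tok)
    return "".join(out)


print(expand_marks(["м", "у", "х", "а", "м", SHADDAH, "а", "д"]))
```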

Special Letters and Combinations

Letter | Name | Phonetic: IPA | Phonetic: Cyrillic | Phonetic: DIN | Practical: lat. | Practical: ru.
ة | tāʾ marbūṭah | /a, at/ | а, ат | h, t | a, ah / at | а, ат/ет
ى | alif maqsura | /aː/ | а̄ | ā | a | а
لا | lam-alif | /laː/ | ла̄ | lā | la | ля

Sample Text

Universal Declaration of Human Rights, Article 1

Arabic text with harakat [9]:
يُولَدُ جَمِيعُ النَّاسِ أحْرَاراً مُتَسَاوِينَ فِي الْكَرَامَةِ وَالحُقُوقِ. وَقَدْ وُهِبُوا عَقْلاً وَ ضَمِيراً وَ عَلَيهِمْ أنْ يُعَامِلَ بَعْضُهُمْ بَعْضاً بِرُوحِ اَلإخَاء.

Romanization:
Yūladu jamī‘u n-nāsi aḥrāran mutasāwīna fī l-karāmati wa-l-ḥuqūq. Wa-qad wuhibū ‘aqlan wa-ḍamīran wa-‘alayhim an yu‘āmila ba‘ḍuhum ba‘ḍan bi-rūḥi l-ikhā’.

Russian Cyrillization:
Йӯладу джамӣ‘у н-на̄си ах̣ра̄ран мутаса̄вӣна фӣ л-кара̄мати ва-л-х̣ук̣ӯк̣. Ва-к̣ад вухибӯ ‘ак̣лан ва-д̣амӣран ва-‘алайхим ан йу‘а̄мила ба‘д̣ухум ба‘д̣ан би-рӯх̣и л-их̱а̄’.

Translation:
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.


References

  1. After k, l, d. The combinations of y with a and y with u are transcribed as я and ю.
  2. Dialect variants are marked in parentheses.
  3. The consonants sa, zal, za have different pronunciations: in literary Arabic as interdental fricatives /θ, ð, ðˤ/, in some dialects (primarily in Egypt and the Maghreb) as dental stops /t, d, dˤ/, and in other dialects (Levant) as dental fricatives /s, z, zˤ/.
  4. In the dialects of Egypt, Yemen and Oman, jim is pronounced as a stop consonant /ɡ/. In the dialects of the Maghreb and the Levant, jim is pronounced as a fricative /ʒ/.
  5. In practical transcription, ayin is either ignored or lengthens the vowel following it (علي Ali, جماعة jamāt, بعث bās).
  6. In many dialects, qaf is pronounced as /ɡ/.
  7. Before consonants.
  8. After emphatic consonants, it can be rendered as "ы".
  9. "Omniglot". 2023-09-01.
