Cyrillic digraphs


The Cyrillic script family contains many specially treated two-letter combinations, or digraphs, but few of these are used in Slavic languages. In a few alphabets, trigraphs and even the occasional tetragraph or pentagraph are used.


In early Cyrillic, the digraphs оу and оѵ were used for /u/. As with the equivalent digraph in Greek, they were reduced to a typographic ligature, ꙋ, and are now written у. The modern letters ы and ю started out as digraphs, ъі and іо. In Church Slavonic printing practice, both historical and modern, оу (which counts as a single letter of the alphabet) is mostly treated as two individual characters, whereas ы is a single letter. For example, letter-spacing affects оу as if it were two individual letters, but never separates the components of ы. In Old Slavonic texts, шт is a digraph that can replace the letter щ, and vice versa.
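The contrast between оу as two characters and ы as one also shows up in how the text is encoded. The following is a minimal sketch, assuming plain Unicode strings; it illustrates code points only and says nothing about how any particular typesetting system handles letter-spacing. The helper name `describe` is hypothetical.

```python
import unicodedata

# Escapes are used so the exact code points are unambiguous.
uk_digraph = "\u043e\u0443"   # оу, written as two letters
yeru = "\u044b"               # ы, historically ъ + і
monograph_uk = "\ua64b"       # ꙋ, the ligature form of оу


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


print(len(uk_digraph), describe(uk_digraph))
print(len(yeru), describe(yeru))
print(len(monograph_uk), describe(monograph_uk))
```

In this encoding оу occupies two code points, so any per-character operation (such as spacing) sees two units, while ы and ꙋ are each a single unit.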

Modern Slavic languages written in the Cyrillic alphabet make little or no use of digraphs. There are only two true digraphs: дж for /d͡ʒ/ and дз for /d͡z/ (Belarusian, Bulgarian, Ukrainian). These digraphs are sometimes even considered special letters of their respective alphabets. In standard Russian, however, the letters in дж and дз are always pronounced separately. Digraph-like letter pairs include combinations of consonants with the soft sign ь (the Serbian/Macedonian letters љ and њ are derived from ль and нь), and жж or зж for the uncommon and optional Russian phoneme /ʑː/. Native descriptions of the Cyrillic writing system often apply the term "digraph" to the combinations ьо and йо (Bulgarian, Ukrainian), as both correspond to the single letter ё of the Russian and Belarusian alphabets (ьо is used for /ʲo/, and йо for /jo/).

Cyrillic uses large numbers of digraphs only when used to write non-Slavic languages; in some languages such as Avar, these are completely regular in formation.

Many Caucasian languages use ә (Abkhaz), у (Kabardian and Adyghe), or в (Avar) for labialization, just as many of them, like Russian, use ь for palatalization. Since such sequences are decomposable, regular forms will not be listed below. (In Abkhaz, ә with sibilants is equivalent to ьә, for instance ж /ʐ/, жь /ʒ/~/ʐʲ/, жә /ʒʷ/~/ʐʲʷ/, but this is predictable phonetic detail.) Similarly, long vowels written double in some languages, such as аа for Abkhaz /aː/ or аюу for Kirghiz /ajuː/ "bear", or vowels written with a glottal stop, as in Tajik аъ [aʔ~aː], are not included.
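Because these labialized sequences are formed regularly, they can be taken apart mechanically. The sketch below is a minimal illustration, assuming the labialization marker is always the last letter of the sequence; the names `LABIALIZATION_MARKER` and `split_labialization` are hypothetical. It works on spelling only and does not predict pronunciation.

```python
# Labialization markers per language, as stated in the paragraph above.
LABIALIZATION_MARKER = {
    "Abkhaz": "ә",
    "Kabardian": "у",
    "Adyghe": "у",
    "Avar": "в",
}


def split_labialization(sequence, language):
    """Split a consonant spelling into (base, is_labialized).

    Assumes the marker, if present, is the final letter and that
    the sequence contains at least one letter before it.
    """
    marker = LABIALIZATION_MARKER[language]
    if len(sequence) > 1 and sequence.endswith(marker):
        return sequence[:-len(marker)], True
    return sequence, False


print(split_labialization("жә", "Abkhaz"))  # base ж plus the marker ә
print(split_labialization("жь", "Abkhaz"))  # no labialization marker
```

The split is purely orthographic: the base letter on its own may have a different phonetic value from the one it has inside the labialized sequence.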

Archi

Archi: а́а [áː], аӏ [aˤ], а́ӏ [áˤ], ааӏ [aːˤ], гв [ɡʷ], гь [h], гъ [ʁ], гъв [ʁʷ], гъӏ [ʁˤ], гъӏв [ʁʷˤ], гӏ [ʕ], е́е [éː], еӏ [eˤ], е́ӏ [éˤ], жв [ʒʷ], зв [zʷ], и́и [íː], иӏ [iˤ], кк [kː], кв [kʷ], ккв [kːʷ], кӏ [kʼ], кӏв [kʷʼ], къ [qʼ], къв [qʼʷ], ккъ [qːʼ], къӏ [qˤʼ], ккъӏ [qːˤʼ], къӏв [qʷˤʼ], ккъӏв [qːʷˤʼ], кь [kʟ̥ʼ], кьв [kʟ̥ʷʼ], лъ [ɬ], ллъ [ɬː], лъв [ɬʷ], ллъв [ɬːʷ], лӏ [kʟ̥], лӏв [kʟ̥ʷ], о́о [óː], оӏ [oˤ], о́ӏ [óˤ], ооӏ [oːˤ], пп [pː], пӏ [pʼ], сс [sː], св [sʷ], тт [tː], тӏ [tʼ], тв [tʷ], твӏ [tʼʷ], у́у [úː], уӏ [uˤ], у́ӏ [úˤ], хх [χː], хв [χʷ], ххв [χːʷ], хӏ [ħ], хьӏ [χˤ], ххьӏ [χːˤ], хьӏв [χʷˤ], ххьӏв [χːʷˤ], хъ [q], хъв [qʷ], хъӏ [qˤ], хъӏв [qʷˤ], цв [t͡sʷ], цӏ [t͡sʼ], ццӏ [t͡sː], чв [t͡ʃʷ], чӏ [t͡ʃʼ], чӏв [t͡ʃʼʷ], шв [ʃʷ], щв [ʃːʷ], ээ [əː], эӏ [əˤ]
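Since many of these multigraphs share prefixes (кк, кв and ккв, for example), splitting a spelling into units needs a longest-match rule. The following is a minimal sketch using greedy longest-match segmentation over a small subset of the list above; the names `ARCHI_MULTIGRAPHS` and `segment` are hypothetical, and the input strings are made-up letter sequences rather than attested Archi words.

```python
# A small subset of the Archi multigraphs listed above, with their IPA values.
ARCHI_MULTIGRAPHS = {
    "кк": "kː",
    "кв": "kʷ",
    "ккв": "kːʷ",
    "хъ": "q",
    "хъв": "qʷ",
    "лъ": "ɬ",
    "ллъ": "ɬː",
    "лъв": "ɬʷ",
}


def segment(text, inventory=ARCHI_MULTIGRAPHS):
    """Split text into units, preferring the longest multigraph at each position."""
    longest = max(len(key) for key in inventory)
    units = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to two letters.
        for size in range(longest, 1, -1):
            chunk = text[i:i + size]
            if chunk in inventory:
                units.append(chunk)
                i += len(chunk)
                break
        else:
            # No multigraph starts here: emit a single letter.
            units.append(text[i])
            i += 1
    return units


print(segment("акква"))  # ккв is kept whole rather than split as кк + в
```

A greedy rule like this is only an approximation: it cannot tell a true multigraph from two letters that happen to be adjacent across a morpheme boundary.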

Avar

Avar uses в for labialization, as in хьв /xʷ/. Other digraphs are:

The ь digraphs are spelled this way even before vowels, as in гьабуна /habuna/ "made", not *гябуна.

Note that three of these are tetragraphs. However, gemination for the 'strong' consonants in Avar orthography is sporadic, and the simple letters or digraphs are frequently used in their place.

Belarusian

The Belarusian language has the following digraphs:

Chechen and Ingush

Chechen uses the following digraphs:

The vowel digraphs are also used for front vowels in other Dagestanian languages and in the local Turkic languages Kumyk and Nogay. Ӏ digraphs for ejectives are common across the North Caucasus, as is гӏ for /ɣ~ʁ~ʕ/.

Kabardian and Adyghe

Kabardian and Adyghe both use у for labialization, as in ӏу /ʔʷ/. гу is /ɡʷ/, even though г is /ɣ/; ку is /kʷ/, despite the fact that к is not used outside loan words. [note 1]

Other digraphs are:

Labialized, the trigraph becomes the unusual tetragraph кхъу /qʷ/.

Tabasaran

Tabasaran uses gemination for its 'strong' consonants, but this has a different value with г.

It uses в for labialization of its postalveolar consonants: шв /ʃʷ/, жв /ʒʷ/, чв /tʃʰʷ/, джв /dʒʷ/, чӏв /tʃʼʷ/, ччв /tʃʷʰː/.

Tatar

Tatar has a number of vowels which are written with ambiguous letters that are normally resolved by context, but which are resolved by discontinuous digraphs when context is not sufficient. These ambiguous vowel letters are е, front /je/ or back /jɤ/; ю, front /jy/ or back /ju/; and я, front /jæ/ or back /ja/. They interact with the ambiguous consonant letters к, velar /k/ or uvular /q/, and г, velar /ɡ/ or uvular /ʁ/.

In general, velar consonants occur before front vowels and uvular consonants before back vowels, so it is frequently not necessary to specify these values in the orthography. However, this is not always the case. A uvular followed by a front vowel, as in /qærdæʃ/ "kinsman", for example, is written with the corresponding back vowel to specify the uvular value: кардәш. The front value of а is required by vowel harmony with the following front vowel ә, so this spelling is unambiguous.

If, however, the proper value of the vowel is not recoverable through vowel harmony, then the letter ь /ʔ/ is added at the end of the syllable, as in шагыйрь /ʃaʁir/ "poet". That is, /i/ is written with ы rather than и to show that the г is pronounced /ʁ/ rather than /ɡ/, and the ь is then added to show that the ы is pronounced as if it were и, so the discontinuous digraph ы...ь is used here to write the vowel /i/. This strategy is also followed with the ambiguous letters е, ю, and я in final syllables, for instance in юнь /jyn/ "cheap". That is, the discontinuous digraphs е...ь, ю...ь, я...ь are used for /j/ plus the front vowels /e, y, æ/.
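The role of ь with the ambiguous letters е, ю, and я can be sketched as a small lookup. This is a simplified illustration, not a full model of Tatar orthography: it assumes the final syllable's vowel is the last ambiguous vowel letter in the word, it treats a word-final ь as the front marker, and it ignores vowel harmony and the ы...ь case. The names `AMBIGUOUS_VOWELS` and `final_vowel_readings` are hypothetical.

```python
# Front and back readings of the ambiguous Tatar vowel letters,
# as given in the description above.
AMBIGUOUS_VOWELS = {
    "е": ("je", "jɤ"),
    "ю": ("jy", "ju"),
    "я": ("jæ", "ja"),
}


def final_vowel_readings(word):
    """Return the candidate readings of the last ambiguous vowel letter.

    A word-final ь narrows the choice to the front reading; without it,
    both readings are returned, since context would have to decide.
    """
    marked_front = word.endswith("ь")
    for letter in reversed(word):
        if letter in AMBIGUOUS_VOWELS:
            front, back = AMBIGUOUS_VOWELS[letter]
            return [front] if marked_front else [front, back]
    return []  # no ambiguous vowel letter in the word


print(final_vowel_readings("юнь"))  # only the front reading remains
```

Returning both candidates for unmarked words reflects that the spelling alone does not settle the value in that case.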

Exceptional final-syllable velars and uvulars, however, are written with simple digraphs, with ь for velars and ъ for uvulars: пакь /pak/ "pure", вәгъдә /wæʁdæ/ "promise".

Ukrainian

The Ukrainian language has the following digraphs:

Other alphabets

Dungan
Mandarin Chinese

In the Cyrillization of Mandarin, there are digraphs цз and чж, which correspond to Pinyin z/j and zh. Final n is нь, while н stands for final ng. юй is yu, whereas ю is you, ю- is yu-, and -уй is -ui.
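The correspondences above can be written as a small lookup. The sketch below covers only the initials and final nasals mentioned in this section, so it is not a complete Pinyin-to-Cyrillic converter; the names `INITIALS`, `cyrillic_initial`, and `final_nasal` are hypothetical.

```python
# Pinyin initials mentioned above and their Cyrillic digraphs.
INITIALS = {"z": "цз", "j": "цз", "zh": "чж"}


def cyrillic_initial(pinyin_syllable):
    """Return the Cyrillic digraph for the syllable's initial, or None if not covered."""
    # Check longer initials first so that zh is not read as z.
    for initial in sorted(INITIALS, key=len, reverse=True):
        if pinyin_syllable.startswith(initial):
            return INITIALS[initial]
    return None


def final_nasal(pinyin_syllable):
    """Return the Cyrillic spelling of a final n or ng, or an empty string."""
    if pinyin_syllable.endswith("ng"):
        return "н"
    if pinyin_syllable.endswith("n"):
        return "нь"
    return ""


print(cyrillic_initial("zhang"), final_nasal("zhang"))
print(cyrillic_initial("zan"), final_nasal("zan"))
```

Note the inversion relative to Pinyin: the longer Pinyin ending ng maps to the bare letter н, while the shorter n maps to the digraph нь.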

Karachay-Balkar
Khanty
Lezgian
Ossetian
Komi
Turkmen (now using Latin alphabet)
Yakut


Notes

  1. The rest of this section only focuses on Kabardian.
