Sorbian alphabet

The Sorbian alphabet is based on the ISO basic Latin alphabet but uses diacritics such as the acute accent and the caron, making it similar to the Czech and Polish alphabets. (This mixture is also found in the Belarusian Latin alphabet.) The standard character encoding for the Sorbian alphabet is ISO 8859-2 (Latin-2).

The alphabet is used for the Sorbian languages, although some letters are used in only one of the two languages (Upper Sorbian and Lower Sorbian).
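The statement that ISO 8859-2 covers the Sorbian alphabet can be checked with a short Python sketch. It relies on Python's built-in `iso8859_2` codec; the constant `SORBIAN_LETTERS` and the helper `unencodable` are illustrative names introduced here, not part of any standard.

```python
# Letters with diacritics from the alphabet table (both Sorbian languages).
SORBIAN_LETTERS = "ČčĆćĚěŁłŃńÓóŘřŔŕŠšŚśŽžŹź"


def unencodable(text: str, encoding: str = "iso8859_2") -> list[str]:
    """Return the characters of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # Every letter should map to exactly one byte in Latin-2.
    print(unencodable(SORBIAN_LETTERS))
    print(SORBIAN_LETTERS.encode("iso8859_2").hex(" "))
    # The obsolete palatalized labials are outside Latin-2.
    print(unencodable("ṕḿẃ"))
```

Because ISO 8859-2 is a single-byte encoding, the encoded length equals the number of letters; the older letters ṕ, ḿ and ẃ fall outside it and need Unicode.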

Alphabet table

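| Upper case | Lower case | Name | Usual phonetic value | Other phonetic values | Comments |
|---|---|---|---|---|---|
| A | a | a | [a] | | |
| B | b | bej | [b] | [p] | |
| C | c | cej | [ts] | | |
| Č | č | čej | [tʃ] | | |
| Ć | ć | ćet | [tɕ] | | |
| D | d | dej | [d] | [t] | |
| E | e | ej | [ɛ] | | |
| Ě | ě | jět/ět (Upper), ět (Lower) | [iɪ] | | |
| F | f | ef | [f] | | |
| G | g | gej | [ɡ] | | |
| H | h | ha | Upper [ʔ͡h]; Lower [ʔ] or silent | | |
| I | i | i | [i] | | |
| J | j | jot/jót (Upper), jot (Lower) | [j] | | |
| K | k | ka | [k] | | |
| Ł | ł | | [w] | | |
| L | l | el | [l] | | |
| M | m | em | [m] | | |
| N | n | en | [n] | | |
| Ń | ń | eń (Upper), ejn (Lower) | [ɲ] | | |
| O | o | o | [ɔ] | | |
| Ó | ó | ót | Upper [uʊ]; Lower [ɛ] or [ɨ] | | |
| P | p | pej | [p] | [b] | |
| R | r | er | [r] | | |
| Ř | ř | | [ʃ] | | Upper Sorbian only. Represents older r after a voiceless consonant and before a front vowel, e.g. přećel /ˈpʃɛt͡ɕɛl/ 'friend' from Proto-Slavic *prijateljь. |
| Ŕ | ŕ | ejŕ | [rʲ] | | Lower Sorbian only |
| S | s | es | [s] | [z], [ɕ] | |
| Š | š | | [ʃ] | | |
| Ś | ś | śej | [ɕ] | [ʑ] | Lower Sorbian only |
| T | t | tej | [t] | [d] | |
| U | u | u | [u] | | |
| W | w | wej | Upper [v]; Lower [w] or silent | Upper Sorbian [f] | |
| Y | y | y | [ɨ] | | |
| Z | z | zet | [z] | [s], [ʑ] | |
| Ž | ž | žet | [ʒ] | | |
| Ź | ź | źej | [ʑ] | | Lower Sorbian only (except as part of digraph dź) |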
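In HTML, the letters with diacritics can be written as numeric character references derived from their Unicode code points. The sketch below shows one way to compute them in Python; the function name `html_numeric_reference` is hypothetical, and decimal references are only one of several valid HTML spellings.

```python
import html
import unicodedata

# Letters with diacritics taken from the alphabet table.
SORBIAN_DIACRITIC_LETTERS = "ČčĆćĚěŁłŃńÓóŘřŔŕŠšŚśŽžŹź"


def html_numeric_reference(letter: str) -> str:
    """Return the decimal HTML character reference for a single letter."""
    # Normalize first so that "base letter + combining mark" input is
    # folded into the single precomposed code point.
    letter = unicodedata.normalize("NFC", letter)
    if len(letter) != 1:
        raise ValueError("expected exactly one precomposed character")
    return f"&#{ord(letter)};"


if __name__ == "__main__":
    for letter in SORBIAN_DIACRITIC_LETTERS:
        reference = html_numeric_reference(letter)
        # html.unescape turns the reference back into the letter.
        print(letter, reference, html.unescape(reference) == letter)
```

In a page served as UTF-8 the letters can also be typed directly, so such references are mainly useful for legacy or ASCII-only markup.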

An earlier version of the Lower Sorbian alphabet included the letters b́ (or b'), ṕ, ḿ, ẃ and, rarely, f́ (or f') to indicate palatalized labials. These have been replaced by bj, pj, mj, wj, and fj. [1] [2]
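The replacement of the older palatalized labials can be sketched as a small text conversion. The following Python example is a minimal illustration, not an established tool: the function name `modernize_labials` is hypothetical, and treating every apostrophe after a labial as the b'/f' spelling is an assumption that would misfire on ordinary quotation marks.

```python
import re
import unicodedata

# A labial followed by a combining acute (U+0301) or an apostrophe.
# Assumption: an apostrophe directly after b, p, m, w or f always marks
# palatalization, which is not true of arbitrary text.
_PALATALIZED_LABIAL = re.compile("([bpmwfBPMWF])[\u0301']")


def modernize_labials(text: str) -> str:
    """Rewrite b́, ṕ, ḿ, ẃ, f́ (and b', f') as bj, pj, mj, wj, fj."""
    # NFD splits precomposed letters such as ṕ into "p" + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    replaced = _PALATALIZED_LABIAL.sub(r"\1j", decomposed)
    # NFC recomposes untouched letters such as ć, ń and ó.
    return unicodedata.normalize("NFC", replaced)


if __name__ == "__main__":
    print(modernize_labials("ṕ ḿ ẃ b\u0301 f'"))
```

Working on the decomposed form means that precomposed letters (ṕ, ḿ, ẃ) and combining sequences (b́, f́) are handled by the same rule, while other accented letters pass through unchanged.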

Sorbian orthography also includes two digraphs:

| Upper case | Lower case | Name | Usual phonetic value | Other phonetic values |
|---|---|---|---|---|
| Ch | ch | cha | Upper [kʰ] word/root initially, [x] otherwise; Lower [x] | |
| Dź | dź | dźej/dźet | [dʑ] | [tɕ] |

The digraph ch follows h in alphabetical order. The following letters are used in foreign words and names:

| Upper case | Lower case | Name | Usual phonetic value | Other phonetic values | Substitute letter(s) |
|---|---|---|---|---|---|
| Q | q | ku | [k] | | KW |
| V | v | fau | [v] | [f] | W |
| X | x | iks | [ks] | [ɡz] | KS |
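The alphabetical order given by the alphabet table, with ch following h, can be expressed as a sort key. The Python sketch below is illustrative only: the names `ORDER` and `sorbian_sort_key` are hypothetical, the list merges the letters of both languages in the order of the table, dź is treated as d followed by ź, and letters outside the table (such as q, v and x) are assumed to sort last. None of these three choices is specified by the text above.

```python
import unicodedata

# Letter order as listed in the alphabet table, with "ch" placed after "h".
ORDER = [
    "a", "b", "c", "č", "ć", "d", "e", "ě", "f", "g", "h", "ch",
    "i", "j", "k", "ł", "l", "m", "n", "ń", "o", "ó", "p", "r",
    "ř", "ŕ", "s", "š", "ś", "t", "u", "w", "y", "z", "ž", "ź",
]
RANK = {letter: index for index, letter in enumerate(ORDER)}
UNKNOWN = len(ORDER)  # assumption: unlisted characters sort after ź


def sorbian_sort_key(word: str) -> list[int]:
    """Return a list of ranks usable as a key for sorted()."""
    text = unicodedata.normalize("NFC", word).lower()
    key = []
    i = 0
    while i < len(text):
        if text.startswith("ch", i):
            # The digraph counts as one letter ranked after "h".
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK.get(text[i], UNKNOWN))
            i += 1
    return key


if __name__ == "__main__":
    print(sorted(["i", "ch", "h", "č", "c", "l", "ł"], key=sorbian_sort_key))
```

A plain code-point sort would place ch between c and d and push č, ć and ł to the end, which is why an explicit rank table is used here.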

References

  1. Sebastian Kempgen. "Unicode 4.1 and Slavic Philology - Problems and Perspectives (I)" (PDF).
  2. Sebastian Kempgen. "Unicode 4.1 and Slavic Philology - Problems and Perspectives (II)" (PDF).