The Sorbian alphabet is based on the ISO basic Latin alphabet but uses diacritics such as the acute accent and the caron, making it similar to the Czech and Polish alphabets. (This mixture is also found in the Belarusian Latin alphabet.) The standard character encoding for the Sorbian alphabet is ISO 8859-2 (Latin-2).
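As a small illustration of the encoding claim, the following sketch checks that each accented Sorbian letter from the table below fits in a single ISO 8859-2 byte. The constant name `SORBIAN_DIACRITICS` is chosen here for illustration only; `iso8859_2` is the name of Python's built-in codec for Latin-2.

```python
# Accented letters of the Sorbian alphabet (upper and lower case pairs).
SORBIAN_DIACRITICS = "ČčĆćĚěŁłŃńÓóŘřŔŕŠšŚśŽžŹź"

# Encode with Python's built-in Latin-2 codec; this raises
# UnicodeEncodeError if a character is missing from the code page.
encoded = SORBIAN_DIACRITICS.encode("iso8859_2")

# ISO 8859-2 is a single-byte encoding, so one byte per letter.
assert len(encoded) == len(SORBIAN_DIACRITICS)

# Decoding restores the original text.
assert encoded.decode("iso8859_2") == SORBIAN_DIACRITICS

print(encoded.hex())
```

The same text encoded as UTF-8 would take two bytes per accented letter.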
The alphabet is used for the Sorbian languages, although some letters are used in only one of the two languages (Upper Sorbian and Lower Sorbian).
Upper case | HTML code | Lower case | HTML code | Name | Usual phonetic value | Other phonetic values | Comments |
---|---|---|---|---|---|---|---|
A | | a | | a | [a] | | |
B | | b | | bej | [b] | [p] | |
C | | c | | cej | [ts] | | |
Č | `&#268;` | č | `&#269;` | čej | [tʃ] | | |
Ć | `&#262;` | ć | `&#263;` | ćet | [tɕ] | [dʑ] | |
D | | d | | dej | [d] | [t] | |
E | | e | | ej | [ɛ] | | |
Ě | `&#282;` | ě | `&#283;` | jět/ět (Upper), ět (Lower) | [iɪ] | | |
F | | f | | ef | [f] | | |
G | | g | | gej | [ɡ] | | |
H | | h | | ha | Upper: [ʔ͡h]; Lower: [ʔ] or silent | | |
I | | i | | i | [i] | | |
J | | j | | jot/jót (Upper), jot (Lower) | [j] | | |
K | | k | | ka | [k] | | |
Ł | `&#321;` | ł | `&#322;` | eł | [w] | | |
L | | l | | el | [l] | | |
M | | m | | em | [m] | | |
N | | n | | en | [n] | | |
Ń | `&#323;` | ń | `&#324;` | eń (Upper), ejn (Lower) | [ɲ] | | |
O | | o | | o | [ɔ] | | |
Ó | `&#211;` | ó | `&#243;` | ót | Upper: [uʊ]; Lower: [ɛ] or [ɨ] | | |
P | | p | | pej | [p] | [b] | |
R | | r | | er | [ʁ] | | |
Ř | `&#344;` | ř | `&#345;` | eř | [ʃ] | | Upper Sorbian only. Represents older r after a voiceless consonant and before a front vowel, e.g. přećel /ˈpʃɛt͡ɕɛl/ 'friend' from Proto-Slavic *prijateljь. |
Ŕ | `&#340;` | ŕ | `&#341;` | ejŕ | [ʁʲ] | | Lower Sorbian only |
S | | s | | es | [s] | [z], [ɕ] | |
Š | `&#352;` | š | `&#353;` | eš | [ʃ] | | |
Ś | `&#346;` | ś | `&#347;` | śej | [ɕ] | [ʑ] | Lower Sorbian only |
T | | t | | tej | [t] | [d] | |
U | | u | | u | [u] | | |
W | | w | | wej | Upper: [v]; Lower: [w] or silent | Upper Sorbian: [f] | |
Y | | y | | y | [ɨ] | | |
Z | | z | | zet | [z] | [s], [ʑ] | |
Ž | `&#381;` | ž | `&#382;` | žet | [ʒ] | | |
Ź | `&#377;` | ź | `&#378;` | źej | [ʑ] | [ɕ] | Lower Sorbian only (except as part of digraph dź) |
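The HTML code columns refer to character references for the accented letters. As a rough sketch, a decimal numeric character reference can be derived directly from a letter's Unicode code point. The helper name `numeric_reference` is hypothetical, and named entities (which exist for some of these letters) are not covered here.

```python
import html


def numeric_reference(ch: str) -> str:
    """Return the decimal HTML numeric character reference for one character."""
    return f"&#{ord(ch)};"


for letter in "ČčŁłŹź":
    ref = numeric_reference(letter)
    # html.unescape turns the reference back into the original letter.
    assert html.unescape(ref) == letter
    print(letter, ref)
```

This assumes each letter is a single precomposed code point (NFC form); a base letter followed by a combining accent would need two references.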
An earlier version of the Lower Sorbian alphabet included the use of the letters b́ (or b’), ṕ (or p’), ḿ (or m’), ẃ (or w’) and rarely f́ (or f’) to indicate palatalized labials. These have been replaced by bj, pj, mj, wj, and fj. [1] [2]
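The replacement described above can be sketched as a simple text transformation. This is only an illustration under stated assumptions: the function name `modernize_labials` is hypothetical, every apostrophe directly after a labial is treated as a palatalization mark (which can misfire on ordinary quotation marks), and any other differences between older and current spelling are ignored.

```python
import re
import unicodedata

# Labial consonants that carried the palatalization mark in older spelling.
LABIALS = "bpmwfBPMWF"

# A labial followed by a combining acute (U+0301), a right single
# quotation mark (U+2019), or an ASCII apostrophe.
_PALATALIZED = re.compile(rf"([{LABIALS}])[\u0301\u2019']")


def modernize_labials(text: str) -> str:
    """Rewrite older palatalized labials (e.g. ṕ or p’) as labial + j."""
    # NFD splits precomposed letters such as ṕ into p + combining acute.
    decomposed = unicodedata.normalize("NFD", text)
    replaced = _PALATALIZED.sub(lambda m: m.group(1) + "j", decomposed)
    # NFC recomposes untouched letters such as ć or ś.
    return unicodedata.normalize("NFC", replaced)


print(modernize_labials("ṕ ḿ ẃ b\u0301 f’"))
```

Decomposing first matters because b́ and f́ have no precomposed Unicode characters, while ṕ, ḿ and ẃ do; working on the decomposed form handles both cases with one pattern.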
Sorbian orthography also includes two digraphs:
Upper case | HTML code | Lower case | HTML code | Name | Usual phonetic value | Other phonetic values |
---|---|---|---|---|---|---|
Ch | | ch | | cha | Upper: [kʰ] word- or root-initially, [x] otherwise; Lower: [x] | |
Dź | `D&#378;` | dź | `d&#378;` | dźej/dźet | [dʑ] | [tɕ] |
The digraph ch follows h in alphabetical order.
The following letters are used in foreign words and names:
Upper case | Lower case | Name | Usual phonetic value | Other phonetic values | Substitute letter(s) |
---|---|---|---|---|---|
Q | q | ku | [k] | [ɡ] | KW |
V | v | fau | [v] | [f] | W |
X | x | iks | [ks] | [ɡz] | KS |
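The ordering rule stated earlier, that the digraph ch follows h, can be sketched as a sort key. This is a minimal illustration, not a full collation: it assumes the row order of the main letter table is the alphabetical order (the text does not say so explicitly), it does not treat dź as a unit because its position is not given, and it places any other character (including q, v and x) after the listed letters. The names `ORDER`, `RANK` and `sort_key` are hypothetical, and the sample strings are made-up letter sequences rather than real words.

```python
# Letters in the row order of the main table, with "ch" placed after "h".
ORDER = [
    "a", "b", "c", "č", "ć", "d", "e", "ě", "f", "g", "h", "ch",
    "i", "j", "k", "ł", "l", "m", "n", "ń", "o", "ó", "p", "r",
    "ř", "ŕ", "s", "š", "ś", "t", "u", "w", "y", "z", "ž", "ź",
]
RANK = {letter: index for index, letter in enumerate(ORDER)}


def sort_key(word: str) -> list[int]:
    """Map a word to a list of ranks, reading "ch" as a single unit."""
    lowered = word.lower()
    key = []
    i = 0
    while i < len(lowered):
        if lowered.startswith("ch", i):
            unit = "ch"
            i += 2
        else:
            unit = lowered[i]
            i += 1
        # Unlisted characters sort after all listed letters, by code point.
        key.append(RANK.get(unit, len(ORDER) + ord(unit[0])))
    return key


samples = ["ia", "cha", "ha", "ca"]
print(sorted(samples, key=sort_key))
```

A plain code point sort would put "cha" between "ca" and "ha"; the key above moves it after "ha" and before "ia". Input is assumed to be in NFC form so that accented letters are single characters.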