Ș

From Wikipedia, the free encyclopedia
Appearance of comma (upper row) and cedilla (lower row) in the Times New Roman font. Note that the cedilla is placed higher than the comma.

S-comma (majuscule: Ș, minuscule: ș) is a letter which is part of the Romanian alphabet, used to represent the sound /ʃ/, the voiceless postalveolar fricative (like sh in shoe). S-comma consists of an s with a diacritical comma underneath it, and is distinct from s-cedilla.

History

S with "half moon" beneath ("s subnotamus signo mediae lunulae") proposed as a letter in the Buda Lexicon. Note that the form is reversed from the modern version, resembling a small C.
S-cedilla, T-cedilla and a cedilla illustrated with a comma in Ortografia limbei române published by the Romanian Academy in 1895.

The letter was proposed in the Buda Lexicon, a book published in 1825, which included two texts by Petru Maior, Orthographia romana sive latino-valachica una cum clavi and Dialogu pentru inceputul limbei române, introducing ș for /ʃ/ and ț for /ts/. [1]

Unicode support

S-comma was not supported in early versions of Unicode, nor in predecessors such as ISO/IEC 8859-2 and Windows-1250. Instead, Ş (S-cedilla), a character available since Unicode 1.1.0 (1993), was used for digital texts written in Romanian. In some contexts, such as low-resolution screens and printouts, the visual distinction between ș and ş is minimal. In 1999, at the request of the Romanian Standardization Association [citation needed], S-comma was introduced in Unicode 3.0. Nevertheless, encoding for S-comma was not supported in retail versions of Microsoft Windows XP, although a later European Union Expansion Font Update provided the feature. While digital support for S-comma has since improved, both characters continue to be used interchangeably in various contexts, such as publishing.
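Because older Romanian texts often contain S-cedilla where S-comma is meant, a cleanup step is sometimes applied. The following is a minimal sketch of such a step, not a standard routine: the names `CEDILLA_TO_COMMA` and `to_comma_below` are hypothetical, and the sketch assumes the input is known to be Romanian, since Ş and ş are legitimate letters in other languages such as Turkish.

```python
# Hypothetical translation table: legacy S-cedilla code points mapped to
# the S-comma code points introduced in Unicode 3.0.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # Ş (S-cedilla, capital) -> Ș (S-comma, capital)
    "\u015F": "\u0219",  # ş (s-cedilla, small)   -> ș (s-comma, small)
})


def to_comma_below(text: str) -> str:
    """Replace S-cedilla with S-comma; every other character is unchanged."""
    return text.translate(CEDILLA_TO_COMMA)


if __name__ == "__main__":
    legacy = "\u015Ei \u015Fapte"  # text typed with the cedilla forms
    print(to_comma_below(legacy))
```

Applying the function to text that already uses S-comma leaves it unchanged, so it can be run more than once without harm.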

The letter is part of Unicode's Latin Extended-B range, under "Additions for Romanian", with the names "Latin capital letter S with comma below" (U+0218) and "Latin small letter s with comma below" (U+0219). [2] In HTML, these can be encoded with the numeric character references &#x218; and &#x219; (or, in decimal, &#536; and &#537;), respectively.
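The names and code points can be checked with Python's standard `unicodedata` module. The sketch below is illustrative only, and the helper names `describe` and `decompose` are hypothetical. It also shows one way the distinction from S-cedilla appears in the data: under canonical decomposition (NFD), S-comma becomes S followed by U+0326 (combining comma below), while S-cedilla becomes S followed by U+0327 (combining cedilla).

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def decompose(ch: str) -> list[str]:
    """Return the code points of the canonical decomposition (NFD) of ch."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


if __name__ == "__main__":
    print(describe("\u0218"))   # capital S-comma
    print(describe("\u0219"))   # small s-comma
    print(decompose("\u0218"))  # S-comma: base letter + combining comma below
    print(decompose("\u015E"))  # S-cedilla: base letter + combining cedilla
```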

Use of the comma with the letter S

Character encoding

Character information

                             Ș                                         ș
Unicode name                 LATIN CAPITAL LETTER S WITH COMMA BELOW   LATIN SMALL LETTER S WITH COMMA BELOW
Encodings                    decimal     hex                           decimal     hex
Unicode                      536         U+0218                        537         U+0219
UTF-8                        200 152     C8 98                         200 153     C8 99
Numeric character reference  &#536;      &#x218;                       &#537;      &#x219;
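The values in the table above can be reproduced from the characters themselves. The following sketch is one way to do so, assuming only the Python standard library; the function name `encoding_row` and its dictionary keys are made up for this illustration.

```python
import html


def encoding_row(ch: str) -> dict:
    """Compute the table's encoding values for a single character."""
    cp = ord(ch)               # Unicode code point as an integer
    utf8 = ch.encode("utf-8")  # UTF-8 byte sequence
    return {
        "decimal": cp,
        "hex": f"U+{cp:04X}",
        "utf8_decimal": " ".join(str(b) for b in utf8),
        "utf8_hex": " ".join(f"{b:02X}" for b in utf8),
        "ncr_decimal": f"&#{cp};",
        "ncr_hex": f"&#x{cp:X};",
    }


if __name__ == "__main__":
    for ch in ("\u0218", "\u0219"):
        row = encoding_row(ch)
        print(ch, row)
        # Both numeric character references decode back to the same character.
        assert html.unescape(row["ncr_decimal"]) == ch
        assert html.unescape(row["ncr_hex"]) == ch
```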

See also

Related Research Articles

S: 19th letter in the Latin alphabet

S, or for lowercase, s, is the nineteenth letter of the Latin alphabet, used in the English alphabet, the alphabets of other western European languages and other Latin alphabets worldwide. Its name in English is ess, plural esses.

A cedilla, or cedille, is a hook or tail added under certain letters as a diacritical mark to modify their pronunciation. In Catalan, French, and Portuguese it is used only under the letter c, and the entire letter is called, respectively, c trencada, c cédille, and c cedilhado. It is used to mark vowel nasalization in many languages of Sub-Saharan Africa, including Vute from Cameroon.

Ç: Latin letter C with cedilla

Ç or ç (C-cedilla) is a Latin script letter used in the Albanian, Azerbaijani, Manx, Tatar, Turkish, Turkmen, Kurdish, Kazakh, and Romance alphabets. Romance languages that use this letter include Catalan, French, Portuguese, and Occitan, as a variant of the letter C with a cedilla. It is also occasionally used in Crimean Tatar and in Tajik to represent the sound. It is rarely used in Balinese, usually only in the word "Çaka" during Nyepi, one of the Balinese Hinduism holidays. It is often retained in the spelling of loanwords from any of these languages in English, Basque, Dutch, Spanish and other languages using the Latin alphabet.

A caron is a diacritic mark placed over certain letters in the orthography of some languages, to indicate a change of the related letter's pronunciation.

Š: Latin letter S with caron

The grapheme Š, š is used in various contexts representing the sh sound, as in the word show, usually denoting the voiceless postalveolar fricative /ʃ/ or the similar voiceless retroflex fricative /ʂ/. In the International Phonetic Alphabet this sound is denoted with ʃ or ʂ, but the lowercase š is used in the Americanist phonetic notation, as well as in the Uralic Phonetic Alphabet. It represents the same sound as the Turkic letter Ş and the Romanian letter Ș (S-comma), the Hebrew and Yiddish letter ש, the Ge'ez (Ethiopic) letter ሠ, the Cyrillic letter Ш, the Arabic letter ش, and the Armenian letter Շ (շ).

ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from code page 852 which is also referred to as "Latin-2" in Czech and Slovak regions. Almost half the use of the encoding is for Polish, and it's the main legacy encoding for Polish, while virtually all use of it has been replaced by UTF-8.

Ş: Latin letter S with cedilla; used in some Turkic languages

S-cedilla is a letter used in some of the Turkic languages. It occurs in the Azerbaijani, Gagauz, Turkish, and Turkmen alphabets. It is also planned to be in the Latin-based Kazakh alphabet. It is used in Brahui, Chechen, Crimean Tatar, Kurdish, and Tatar as well, when they are written in the Latin alphabet.

Ll: Digraph

Ll/ll is a digraph that occurs in several languages.

Digraph (orthography): Pair of characters used to write one phoneme

A digraph or digram is a pair of characters used in the orthography of a language to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined.

Romanian alphabet: Variant of the Latin alphabet

The Romanian alphabet is a variant of the Latin alphabet used for writing the Romanian language. It is a modification of the classical Latin alphabet and consists of 31 letters, five of which have been modified from their Latin originals for the phonetic requirements of the language.

Sha (Cyrillic): Cyrillic letter

Sha, She or Shu, alternatively transliterated Ša, is a letter of the Glagolitic and Cyrillic scripts. It commonly represents the voiceless postalveolar fricative, like the pronunciation of sh in "ship". More precisely, the sound in Russian denoted by ш is commonly transcribed as a palatoalveolar fricative but is actually a voiceless retroflex fricative. It is used in every variation of the Cyrillic alphabet for Slavic and non-Slavic languages.

Esh (letter): Character and IPA symbol (Ʃ, ʃ)

Esh is a character used in phonology to represent the voiceless postalveolar fricative.

Ț: Latin letter T with comma

T-comma is a letter which consists of a t with a diacritical comma underneath it, and is distinct from t-cedilla. It is part of the Romanian alphabet, used to represent the sound /ts/, the voiceless alveolar affricate. The letter is also part of the alphabet of Livonian, a Finno-Ugric language.

Ś: Latin letter S with acute accent

Ś is a letter of the Latin alphabet, formed from S with the addition of an acute accent. It is used in Polish and Montenegrin alphabets, and in certain other languages or romanizations.

Ch (digraph): Latin-script digraph

Ch is a digraph in the Latin script. It is treated as a letter of its own in the Chamorro, Old Spanish, Czech, Slovak, Igbo, Uzbek, Quechua, Ladino, Guarani, Welsh, Cornish, Breton, Ukrainian, Japanese, Latynka, and Belarusian Łacinka alphabets. Formerly ch was also considered a separate letter for collation purposes in Modern Spanish, Vietnamese, and sometimes in Polish; now the digraph ch in these languages continues to be used, but it is considered as a sequence of letters and sorted as such.

Ţ: Latin letter T with cedilla

T-cedilla is a letter which is part of the Gagauz alphabet, used to represent the sound /ts/, the voiceless alveolar affricate. It is written as the letter T with a cedilla below and has both lower-case (U+0163) and upper-case (U+0162) variants. It is also used in the Manjak and Mankanya languages.

J: 10th letter of the Latin alphabet

J, or j, is the tenth letter of the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its usual name in English is jay, with a now-uncommon variant jy.

C: 3rd letter of the Latin alphabet

C, or c, is the third letter of the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its name in English is cee, plural cees.

Unicode supports several phonetic scripts and notation systems through its existing scripts and the addition of extra blocks with phonetic characters. These phonetic characters are derived from an existing script, usually Latin, Greek or Cyrillic. Apart from the International Phonetic Alphabet (IPA), extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.

Rafe: Diacritical mark used in Hebrew

In Hebrew orthography the rafe or raphe is a diacritic, a subtle horizontal overbar placed above certain letters to indicate that they are to be pronounced as fricatives.

References

  1. Marinella Lörinczi Angioni, "Coscienza nazionale romanza e ortografia: il romeno tra alfabeto cirillico e alfabeto latino", La Ricerca Folklorica, No. 5, La scrittura: funzioni e ideologie (Apr. 1982), pp. 75–85.
  2. Unicode code charts. Latin Extended-B: Range 0180–024F