Latin-script alphabet

A Latin-script alphabet (Latin alphabet or Roman alphabet) is an alphabet that uses letters of the Latin script. The 21-letter archaic Latin alphabet and the 23-letter classical Latin alphabet are among the oldest of this group. [1] The 26-letter modern Latin alphabet is the newest.

Encoding

The 26-letter ISO basic Latin alphabet (adopted from the earlier ASCII) contains the 26 letters of the English alphabet. To handle the many other alphabets also derived from the classical Latin one, ISO and other telecommunications groups "extended" the ISO basic Latin alphabet multiple times in the late 20th century. More recent international standards (e.g. Unicode) include the extensions that achieved ISO adoption.
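The distinction between the basic alphabet and its extensions can be illustrated with a short Python sketch that uses only the standard `string` and `unicodedata` modules. The helper names `is_basic_latin_letter` and `is_latin_letter` are hypothetical, and testing whether a Unicode character name begins with "LATIN" is a rough heuristic assumed here for illustration, not an official script lookup.

```python
import string
import unicodedata


def is_basic_latin_letter(ch: str) -> bool:
    """True for the 26 letters of the ISO basic Latin alphabet, in either case."""
    return len(ch) == 1 and ch in string.ascii_letters


def is_latin_letter(ch: str) -> bool:
    """Heuristic (an assumption of this sketch): a letter whose Unicode
    character name starts with 'LATIN' is treated as a Latin-script letter."""
    if len(ch) != 1 or not ch.isalpha():
        return False
    return unicodedata.name(ch, "").startswith("LATIN")


if __name__ == "__main__":
    for ch in ["a", "Z", "ñ", "þ", "æ", "α", "1"]:
        print(repr(ch), is_basic_latin_letter(ch), is_latin_letter(ch))
```

Under this heuristic, letters such as ñ, þ, and æ count as Latin-script letters without belonging to the basic 26-letter set, whereas the Greek α is rejected by both tests.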

Key types of differences

Apart from alphabets for modern spoken languages, there exist phonetic alphabets and spelling alphabets in use derived from Latin script letters. Historical languages may also have used (or are now studied using) alphabets that are derived but still distinct from those of classical Latin and their modern forms (if any).

The Latin script was typically slightly altered to function as an alphabet for each different language (or other use), although the main letters are largely the same. A few general classes of alteration cover many particular cases:

The altered letters were often given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary between languages. Some of the results, especially those produced only by adding diacritics, were not considered distinct letters for this purpose; for example, the French é and the German ö are not listed separately in their respective alphabet sequences. In some alphabets, some altered letters are considered distinct while others are not; for instance, in Spanish, ñ (which indicates a unique phoneme) is listed separately, while á, é, í, ó, ú, and ü are not (the first five indicate a nonstandard stress-accent placement, while the last forces the pronunciation of a normally silent letter). Digraphs in some languages may be separately included in the collation sequence (e.g. Hungarian CS, Welsh RH). New letters must be separately included unless collation is not practised.
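The Spanish case described above can be sketched in Python as a sort key. This is a simplified illustration, not a full collation algorithm: the names `SPANISH_ALPHABET`, `fold_accents`, and `spanish_key` are hypothetical, and the sketch assumes that accented vowels and ü can simply be folded to their base letters, whereas real collation rules typically still use accents as a tie-breaker.

```python
import unicodedata

# Modern Spanish order, with ñ as a separate letter after n.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
_RANK = {letter: i for i, letter in enumerate(SPANISH_ALPHABET)}


def fold_accents(word: str) -> str:
    """Lowercase the word and drop diacritics, except that ñ is kept as is."""
    folded = []
    for ch in unicodedata.normalize("NFC", word).lower():
        if ch == "ñ":
            folded.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            folded.append(
                "".join(c for c in decomposed if not unicodedata.combining(c))
            )
    return "".join(folded)


def spanish_key(word: str) -> list[int]:
    """Sort key: rank of each folded letter; unknown characters sort last."""
    return [_RANK.get(ch, len(SPANISH_ALPHABET)) for ch in fold_accents(word)]


if __name__ == "__main__":
    words = ["ñu", "nube", "nácar", "nada", "oso"]
    print(sorted(words, key=spanish_key))
```

With this key, "nácar" sorts among the other words beginning with "na", while "ñu" sorts after every word beginning with n and before those beginning with o.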

Properties

Letter inventory

Coverage of the letters of the ISO basic Latin alphabet can be

and additional letters can be

Grapheme order

Most alphabets have the letters of the ISO basic Latin alphabet in the same order as that alphabet.

Multigraphs

Some alphabets regard digraphs as distinct letters, e.g. the Spanish alphabet from 1803 to 1994 had CH and LL sorted apart from C and L.
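Treating digraphs as letters means that a word must first be split into collation units before it can be compared. The following Python sketch shows one way to do this for the older Spanish convention; the list `TRADITIONAL_SPANISH`, the helper names, and the greedy two-character matching are assumptions made for illustration, and accents are ignored for brevity.

```python
# Assumed traditional order, with CH after C and LL after L.
TRADITIONAL_SPANISH = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
_RANK = {unit: i for i, unit in enumerate(TRADITIONAL_SPANISH)}
_DIGRAPHS = {unit for unit in TRADITIONAL_SPANISH if len(unit) == 2}


def collation_units(word: str) -> list[str]:
    """Split a word into letters, treating 'ch' and 'll' as single units."""
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in _DIGRAPHS:
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def traditional_key(word: str) -> list[int]:
    """Sort key: rank of each unit; unknown characters sort last."""
    return [_RANK.get(u, len(TRADITIONAL_SPANISH)) for u in collation_units(word)]


if __name__ == "__main__":
    words = ["chico", "cuna", "dama", "lluvia", "luz", "cosa"]
    print(sorted(words, key=traditional_key))
```

Under this key, every word beginning with CH sorts after all other words beginning with C, and every word beginning with LL sorts after all other words beginning with L, which differs from a plain letter-by-letter sort.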

Diacritics and ligatures

Some alphabets sort letters that have diacritics or are ligatures at the end of the alphabet. Examples are the Danish and Norwegian alphabets (which end in æ, ø, å) and the Swedish and Finnish alphabets (which end in å, ä, ö).

New letter forms

Icelandic sorts a new letter form (þ) and a ligature (æ) at the end, together with one letter with a diacritic (ö), while other letters with diacritics are sorted directly after the corresponding letter without a diacritic.
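The Icelandic ordering can be expressed as a simple rank table. The sketch below is illustrative only: the full letter sequence in `ICELANDIC_ALPHABET` is supplied here as an assumption (the text above only describes its general shape), and the helper name `icelandic_key` is hypothetical.

```python
import unicodedata

# Assumed Icelandic order: accented vowels directly follow their base letter,
# and þ, æ, ö come at the end.
ICELANDIC_ALPHABET = "aábdðeéfghiíjklmnoóprstuúvxyýþæö"
_RANK = {letter: i for i, letter in enumerate(ICELANDIC_ALPHABET)}


def icelandic_key(word: str) -> list[int]:
    """Sort key: rank of each letter; letters outside the alphabet sort last."""
    normalized = unicodedata.normalize("NFC", word).lower()
    return [_RANK.get(ch, len(ICELANDIC_ALPHABET)) for ch in normalized]


if __name__ == "__main__":
    words = ["öl", "þak", "æður", "ár", "afi", "bók", "ýsa", "yfir"]
    print(sorted(words, key=icelandic_key))
```

Here "ár" sorts after "afi" because á is ranked as its own letter immediately after a, while words beginning with þ, æ, and ö follow everything else.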

Grapheme–sound correspondence

The phonetic values of graphemes can differ between alphabets.

Sound values of letters of the ISO basic Latin alphabet in IPA and various Latin-script languages
Lowercase letter to Latin Alphabet IPA IPA for Classical Latin Alphabet IPA for English Alphabet IPA for French Alphabet [lower-alpha 1] IPA for Spanish Alphabet [lower-alpha 2] IPA for Malay Orthography IPA for Turkish Alphabet [lower-alpha 3]
a a , eɪ, æ , ɑː a
b b
c k k , s k , θ t͡ʃ d͡ʒ
d d
e e , , ɛ ə , ɛ e e , ə e
f f
g g g , d͡ʒ g , ʒ g , x g g , ɟ
h h , h (silent) (silent) h
i i , , j aɪ, ɪ i
j (not used) d͡ʒ ʒ x d͡ʒ ʒ
k k k , k , ʔ k , c
l l l , ɫ
m m
n n | ŋ
o o , oʊ, ɒ ɔ , o o
p p
q k (not used)
r r ɹ ʁ r ɾ
s s s , z s
t t
u (not used) juː, ʌ , ʊ , y u
v u , , w v b v
w (not used) w w , v w , b w (not used)
x ks ks, z ks ks, s , x ks (not used)
y y , aɪ, , ɪ , j i , j j
z z θ ~ s z
  1. The French alphabet also uses letters with five diacritics (à, â, ç, é, è, ê, ë, î, ï, ô, ù, û, ü, and ÿ) and two ligatures (æ and œ).
  2. The Spanish alphabet uses letters with diacritics: á, é, í, ó, ú, ü, and ñ (only ñ is considered a separate letter).
  3. The Turkish alphabet has additional letters: ç, ğ, ı, ö, ş, ü (all are separate letters).
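Differences of this kind can be modelled as a per-language lookup. The Python sketch below encodes only two letters, j and c, with simplified values that are assumptions of this sketch rather than a complete transcription of the table; the names `SOUND_VALUES` and `languages_with_value` are hypothetical.

```python
# Simplified IPA values for two letters; an empty list means "not used".
SOUND_VALUES = {
    "j": {
        "Classical Latin": [],
        "English": ["d͡ʒ"],
        "French": ["ʒ"],
        "Spanish": ["x"],
        "Malay": ["d͡ʒ"],
        "Turkish": ["ʒ"],
    },
    "c": {
        "Classical Latin": ["k"],
        "English": ["k", "s"],
        "French": ["k", "s"],
        "Spanish": ["k", "θ"],
        "Malay": ["t͡ʃ"],
        "Turkish": ["d͡ʒ"],
    },
}


def languages_with_value(letter: str, ipa: str) -> list[str]:
    """Return the languages (sorted by name) in which `letter` can have the value `ipa`."""
    values = SOUND_VALUES.get(letter, {})
    return sorted(lang for lang, sounds in values.items() if ipa in sounds)


if __name__ == "__main__":
    print(languages_with_value("j", "ʒ"))
    print(languages_with_value("c", "k"))
```

The same grapheme therefore maps to different sounds depending on the alphabet: in this simplified data, j is a fricative in French and Turkish but an affricate in English and Malay.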

Names of letters

Names of letters of the ISO basic Latin alphabet in various Latin-script languages
Lowercase Latin alphabet: a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
Classical Latin Written (majus)áéefíelemenóqeresixí graecazéta
Written (modern)āēefīelemenōeresūixī Graecazēta
Pronunciation (IPA)beːkeːdeːɛfɡeːhaːkaːɛlɛmɛnpeːkuːɛrɛsteːiks ˈɡraɪkaˈdzeːta
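English, written: a | bee | cee | dee | e | ef, eff | gee | aitch, haitch | i | jay | kay | el | em | en | o | pee | cue | ar | ess | tee | u | vee | double-u | ex | wye | zed, zee
English, pronunciation (IPA): /eɪ/ | /bi/ | /siː/ | /diː/ | /iː/ | /ɛf/ | /dʒiː/ | /eɪtʃ/, /heɪtʃ/ | /aɪ/ | /dʒeɪ/ | /keɪ/ | /el/ | /em/ | /en/ | /oʊ/ | /piː/ | /kjuː/ | /ɑːr/ | /ɛs/ | /tiː/ | /juː/ | /viː/ | /ˈdʌbəl.juː/ | /ɛks/ | /waɪ/ | /zɛd/, /ziː/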
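French, written: a | bé | cé | dé | e | effe | gé | ache | i | ji | ka | elle | emme | enne | o | pé | qu | erre | esse | té | u | vé | double vé | ixe | i grec | zède
French, pronunciation (IPA): /a/ | /be/ | /se/ | /de/ | /ə/ | /ɛf/ | /ʒe/ | /aʃ/ | /i/ | /ʒi/ | /ka/ | /ɛl/ | /ɛm/ | /ɛn/ | /o/ | /pe/ | /ky/ | /ɛʁ/ | /ɛs/ | /te/ | /y/ | /ve/ | /dubləve/ | /iks/ | /iɡʁɛk/ | /zed/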
Spanish [2]: a | be, be larga, be alta | ce | de | e | efe | ge | hache | i | jota | ka | ele | eme | ene | o | pe | cu | erre | ese | te | u | uve, ve, ve corta, ve baja | uve doble, ve doble, doble ve, doble u | equis | ye, i griega | zeta
Malay (Indonesia)Writtenaééfhaikaéléménokiérésuékszét
Malay (Indonesia), pronunciation (IPA): /a/ | /be/ | /t͡ʃe/ | /de/ | /e/ | /ef/ | /ge/ | /ha/ | /i/ | /d͡ʒe/ | /ka/ | /el/ | /em/ | /en/ | /o/ | /pe/ | /ki/ | /er/ | /es/ | /te/ | /u/ | /ve/, /fe/ | /we/ | /eks/ | /je/ | /zet/
Malay (Malaysia, Brunei and Singapore)Writtenebisidiiéfjihécayéléménopikiuaréstiyuvidabel yuékswayzet
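Malay (Malaysia, Brunei and Singapore), pronunciation (IPA): /e/ | /bi/ | /si/ | /di/ | /i/ | /ef/ | /d͡ʒi/ | /het͡ʃ/ | /i/ | /d͡ʒe/ | /ke/ | /el/ | /em/ | /en/ | /o/ | /pe/ | /qiu/, /qju/ | /ar/, /aː/ | /es/ | /ti/ | /ju/ | /vi/ | /dabəlˈju/ | /eks/ | /wai̯/ | /zed/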
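Turkish, written: a | be | ce | de | e | fe | ge | he, ha | i | je | ke, ka | le | me | ne | o | pe | kû, kü | re | se | te | u | ve | çift ve | iks | ye | ze
Turkish, pronunciation (IPA): /aː/ | /beː/ | /d͡ʒeː/ | /deː/ | /eː/ | /feː/ | /ɟeː/ | /heː/, /haː/ | /iː/ | /ʒeː/ | /ceː/, /kaː/ | /leː/ | /meː/ | /neː/ | /oː/ | /peː/ | /cuː/, /cyː/ | /ɾeː/ | /seː/ | /teː/ | /uː/ | /veː/ | /t͡ʃiftveː/ | /ics/ | /jeː/ | /zeː/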

References

  1. "Latin alphabet | Definition, Description, History, & Facts". Encyclopedia Britannica. Retrieved January 8, 2021.
  2. Ortografía de la lengua española (2010). Real Academia Española y Asociación de Academias de la Lengua Española. p. 63.