List of Latin-script alphabets

The distribution of the Latin scripts.

The lists and tables below summarize and compare the letter inventories of some of the Latin-script alphabets. In this article, the scope of the word "alphabet" is broadened to include letters with tone marks, and other diacritics used to represent a wide range of orthographic traditions, without regard to whether or how they are sequenced in their alphabet or the table.

Parentheses indicate characters not used in modern standard orthographies of the languages, but used in obsolete and/or dialectal forms.

Letters contained in the ISO basic Latin alphabet

Alphabets that contain only ISO basic Latin letters

Among alphabets for natural languages, the English [36], Indonesian, and Malay alphabets use only the 26 letters, in both cases.

Among alphabets for constructed languages, the Ido and Interlingua alphabets use only the 26 letters, while Toki Pona uses a 14-letter subset.

Extended by ligatures

  • German (ß), Scandinavian (æ), French (œ)

Extended by diacritical marks

  • Spanish (ñ), German (ä, ö, and ü), Dutch (ij and ë)

Extended by multigraphs

  • Filipino (ng)

Alphabets that contain all ISO basic Latin letters

Among alphabets for natural languages, the Afrikaans [54], Aromanian, Azerbaijani (some dialects) [53], Basque [4], Celtic British, Catalan [6], Cornish, Czech [8], Danish [9], Dutch [10], Emilian-Romagnol, Filipino [11], Finnish, French [12], German [13], Greenlandic, Hungarian [15], Javanese, Karakalpak [23], Kurdish, Modern Latin, Luxembourgish, Norwegian [9], Oromo [65], Papiamento [63], Polish [22], Portuguese, Quechua, Rhaeto-Romance, Romanian, Slovak [24], Spanish [25], Sundanese, Swedish, Tswana [52], Uyghur, Venda [51], Võro, Walloon [27], West Frisian, Xhosa, Zhuang, and Zulu alphabets include all 26 letters, at least in their largest version.

Among alphabets for constructed languages the Interglossa and Occidental alphabets include all 26 letters.

The International Phonetic Alphabet (IPA) includes all 26 letters in their lowercase forms, although g is always single-storey (ɡ) in the IPA and never double-storey (looptail g).
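In encoded text, the single-storey IPA letter is a separate Unicode character from the ordinary lowercase g. The following is a minimal Python sketch, using only the standard `unicodedata` module, that prints both code points; the variable names are illustrative and not taken from any standard.

```python
import unicodedata

ascii_g = "g"      # U+0067; fonts may draw it single- or double-storey
ipa_g = "\u0261"   # U+0261; the single-storey form used by the IPA

# Print the code point and official Unicode name of each character.
for ch in (ascii_g, ipa_g):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Because the two characters differ, a plain string comparison or search for "g" will not match the IPA letter.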

Alphabets that do not contain all ISO basic Latin letters

This list is based on official definitions of each alphabet. However, excluded letters might occur in non-integrated loan words and place names.

Reduced usage of the letters of the ISO basic Latin alphabet [1] (A–Z) in various alphabets:
Alphabet A B C D E F G H I J K L M N O P Q R S T U V W X Y Z #
Classical Latin [2] ABCDEFGHIKLMNOPQRSTVXYZ 23
Albanian [3] ABCDEFGHIJKLMNOPQRSTUVXYZ 25
Anglo-Saxon ABCDEFGHIKLMNOPQRSTUXYZ 23
Arbëresh ABCDEFGHIJKLMNOPQRSTUVXZ 24
Asturian ABCDEFGHILMNOPQRSTUVXYZ 23
Azeri [53] ABCDEFGHIJKLMNOPQRSTUVXYZ 25
Bambara [39] ABCDEFGHIJKLMNOPRSTUWYZ 23
Belarusian [5] ABCDEFGHIJKLMNOPRSTUVYZ 23
Berber ABCDEFGHIJKLMNOQRSTUWXYZ 24
Bislama [45] ABDEFGHIJKLMNOPRSTUVWY 22
Breton ABCDEFGHIJKLMNOPRSTUVWYZ 24
Chamorro [43] ABCDEFGHIKLMNOPRSTUY 20
Chewa ABCDEFGHIJKLMNOPRSTUVWYZ 24
Corsican [31] ABCDEFGHIJLMNOPQRSTUVZ 22
Crimean Tatar ABCDEFGHIJKLMNOPQRSTUVYZ 24
Croatian [7] ABCDEFGHIJKLMNOPRSTUVZ 22
Cypriot Arabic [59] ABCDEFGIJKLMNOPRSTUVWXYZ 24
Dakelh [61] ABCDEFGHIJKLMNOPRSTUVWYZ 22
Dakota ABCDEFGHIJKMNOPSTUWYZ 20
Dalecarlian ABDEFGHIJKLMNOPRSTUVWY 22
Dinka [40] ABCDEGHIJKLMNOPRTUWY 20
Esperanto ABCDEFGHIJKLMNOPRSTUVZ 22
Estonian ABDEFGHIJKLMNOPRSTUVZ 21
Extremaduran ABCDEFGHIJLMNOPQRSTUVXYZ 24
Fala ABCDEFGHIJLMNOPQRSTUVXZ 23
Faroese ABDEFGHIJKLMNOPRSTUVY 21
Filipino Abakada [11] ABDEGHIKLMNOPRSTUWY 19
Friulian ABCDEFGHIJLMNOPQRSTUVZ 22
Fula [41] ABCDEFGHIJKLMNOPRSTUWXY 23
Gagauz ABCDEFGHIJKLMNOPRSTUVYZ 23
Galician [33] ABCDEFGHILMNOPQRSTUVXZ 22
Gilbertese ABEIKMNORTUW 12
Glosa ABCDEFGHIJKLMNOPQRSTUVWXZ 25
Traditional Greenlandic AEFGIJKLMNOPQRSTUV 18
Guaraní [14] ABCDEGHIJKLMNOPRSTUVY 21
Gwich'in [67] ABCDEFGHIJKLMNORSTUVWYZ 23
Haitian ABCDEFGHIJKLMNOPRSTUVWXYZ 25
Hän ABCDEGHIJKLMNOPRSTUWYZ 22
Hausa [30] ABCDEFGHIJKLMNOPRSTUWYZ 23
Hawaiian AEHIKLMNOPUW 12
Icelandic ABDEFGHIJKLMNOPRSTUVXY 22
Igbo [42] ABCDEFGHIJKLMNOPRSTUVWYZ 24
Inari Sami ABCDEFGHIJKLMNOPRSTUVYZ 23
Irish [16] ABCDEFGHILMNOPRSTUVZ 20
Italian [17] ABCDEFGHILMNOPQRSTUVZ 21
Karelian ABCDEFGHIJKLMNOPRSTUVYZ 23
Kashubian ABCDEFGHIJKLMNOPRSTUWYZ 23
Kazakh [38] ABCDEFGHIJKLMNOPQRSTUVYZ 24
Khasi ABDEGHIJKLMNOPRSTUWY 20
Latvian [18] ABCDEFGHIJKLMNOPRSTUV(Y)Z 23
Lithuanian [19] ABCDEFGHIJKLMNOPRSTUVYZ 23
Livonian [46] ABDEFGHIJKLMNOPRSTUV(Y)Z 23
Lojban ABCDEFGIJKLMNOPRSTUVXYZ 23
Lule Sami [60] ABDEFGHIJKLMNOPRSTUV 20
Malagasy ABDEFGHIJKLMNOPRSTVYZ 21
Maltese [20] ABDEFGHIJKLMNOPQRSTUVWXZ 24
Manx Gaelic ABCDEFGHIJKLMNOPQRSTUVWY 24
Māori [34] AEGHIKMNOPRTUW 14
Marshallese [47] ABDEIJKLMNOPRTUWY 17
Massachusett [62] ACEHKMNPQSTUWY 14
Mirandese ABCDEFGHIJLMNOPQRSTUXYZ 23
Mohawk [35] AEHIKNORSTWY 12
Na'vi [57] [2] AEFGHIKLMNOPRSTUVWXYZ 21
Navajo ABCDEGHIJKLMNOSTWXYZ 20
Northern Sami ABCDEFGHIJKLMNOPRSTUVZ 22
Nuxalk ACHIKLMNPQSTUWXY 16
Occitan ABCDEFGHIJLMNOPQRSTUVXZ 23
Pan-Nigerian ABCDEFGHIJKLMNOPRSTUVWYZ 24
Piedmontese ABCDEFGHIJLMNOPQRSTUVZ 22
Pinyin [32] ABCDEFGHIJKLMNOPQRSTUWXYZ 25
Romani [29] ABCDEFGHIJKLMNOPRSTUVXZ 23
Rotokas AEGIKOPRSTUV 12
Samoan AEFGHIKLMNOPRSTUV 17
Sardinian ABCDEFGHIJLMNOPQRSTUVXYZ 24
Scottish Gaelic ABCDEFGHILMNOPRSTU 18
Serbian [7] ABCDEFGHIJKLMNOPRSTUVZ 22
Shona ABCDEFGHIJKLMNOPRSTUVWYZ 24
Sicilian ABCDEFGHIJLMNOPQRSTUVZ 22
Skolt Sami ABCDEFGHIJKLMNOPRSTUVZ 22
Slovenian ABCDEFGHIJKLMNOPRSTUVZ 22
Somali ABCDEFGHIJKLMNOQRSTUWXY 23
Sorbian [64] ABCDEFGHIJKLMNOPRSTUWYZ 23
Southern Sami ABDEFGHIJKLMNOPRSTUVY 21
Swahili ABCDEFGHIJKLMNOPRSTUVWYZ 24
Tahitian AEFHIMNOPRTUV 13
Tetum ABDEFGHIJKLMNOPRSTUVWXZ 23
Toki Pona AEIJKLMNOPSTUW 14
Tongan AEFGHIKLMNOPSTUV 16
Turkish ABCDEFGHIJKLMNOPRSTUVYZ 23
Turkmen [55] ABDEFGHIJKLMNOPRSTUWYZ 22
Ulithian [49] ABCDEFGHIKLMNOPRSTUWY 21
Ume Sami ABDEFGHIJKLMNOPRSTUVY 21
Uzbek [25] ABCDEFGHIJKLMNOPQRSTUVXYZ 25
Veps ABCDEFGHIJKLMNOPRSTUVZ 22
Vietnamese [26] ABCDEGHIKLMNOPQRSTUVXY 22
Volapük ABCDEFGHIJKLMNOPRSTUVXYZ 24
Welsh [28] ABCDEFGHIJLMNOPRSTUWY 21
Wolof ABCDEFGHIJKLMNOPQRSTUWXY 24
Yapese [50] ABCDEFGHIJKLMNOPQRSTUWY 23
Yoruba [44] ABDEFGHIJKLMNOPRSTUWY 21
Zuni [66] ABCDEHIKLMNOPSTUWY 18
Count 100 89 72 88 99 84 92 94 99 77 84 93 98 99 98 94 32 93 94 99 96 68 45 31 68 65
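Each row of the table pairs a letter inventory with a count in the last column, so the count and the set of absent letters can be checked mechanically. The sketch below is illustrative only: it copies four rows from the table above (with the trailing count split from the letters), and the names `rows` and `missing_letters` are hypothetical helpers rather than part of any source.

```python
from string import ascii_uppercase

# Rows copied from the table above: alphabet -> (letters used, stated count).
rows = {
    "Hawaiian": ("AEHIKLMNOPUW", 12),
    "Rotokas": ("AEGIKOPRSTUV", 12),
    "Toki Pona": ("AEIJKLMNOPSTUW", 14),
    "Italian": ("ABCDEFGHILMNOPQRSTUVZ", 21),
}

def missing_letters(letters: str) -> str:
    """Return the ISO basic Latin letters absent from `letters`, in A-Z order."""
    used = set(letters)
    return "".join(c for c in ascii_uppercase if c not in used)

for name, (letters, stated) in rows.items():
    # The stated count should equal the number of distinct letters listed.
    assert len(set(letters)) == stated
    print(name, stated, "missing:", missing_letters(letters))
```

Rows that carry parenthesized letters, such as Latvian, would need the parentheses stripped and a decision on whether those letters count before the same check applies.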

The letter I is used in two distinct versions in Turkic languages: dotless (I ı) and dotted (İ i). They are considered different letters, and case conversion must take care to preserve the distinction. Irish traditionally does not write the dot, or tittle, over the small letter i, but the language makes no distinction if a dot is displayed, so no specific encoding or special case-conversion rule is needed, unlike for the Turkic alphabets.
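The case-conversion problem can be made concrete with a short Python sketch. Python's built-in `str.lower()` and `str.upper()` are locale-independent, so they pair I with i; the helper functions `turkic_lower` and `turkic_upper` below are hypothetical names introduced here for illustration, not a standard API, and they cover only the two I letters.

```python
DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı

def turkic_lower(text: str) -> str:
    """Lowercase with Turkic pairing: I -> ı and İ -> i."""
    text = text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return text.lower()

def turkic_upper(text: str) -> str:
    """Uppercase with Turkic pairing: i -> İ and ı -> I."""
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()

word = "DIYARBAKIR"
print(word.lower())        # default mapping turns every I into dotted i
print(turkic_lower(word))  # Turkic mapping keeps the letters dotless
print(turkic_upper("izmir"))
```

The replacements run before the built-in conversion so that the default I/i pairing never sees the Turkic letters. Production code would more likely rely on a locale-aware library; this sketch only demonstrates why the distinction has to be handled explicitly.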

Statistics

The chart above lists a variety of alphabets that do not officially contain all 26 letters of the ISO basic Latin alphabet. Only one letter, A, is used by all of them. For each of the 26 ISO basic Latin letters, the number of alphabets in the list above that use it is as follows:

Letter A E I N T M O U P S H L R G B D F K J C V Y Z W Q X
Alphabets 100 99 99 99 99 98 98 96 94 94 94 93 93 92 89 88 84 84 77 72 68 68 65 45 32 31
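A tally of this kind can be reproduced by counting, for each letter, how many alphabets include it. The Python sketch below does so for a four-row sample copied from the table above; the `sample` dictionary and variable names are illustrative assumptions, and the sample's result naturally differs from the full-table figures.

```python
from collections import Counter

# Four rows copied from the table above; a full tally would use every row.
sample = {
    "Hawaiian": "AEHIKLMNOPUW",
    "Mohawk": "AEHIKNORSTWY",
    "Rotokas": "AEGIKOPRSTUV",
    "Toki Pona": "AEIJKLMNOPSTUW",
}

usage = Counter()
for letters in sample.values():
    usage.update(set(letters))  # count each letter once per alphabet

# Rank by descending count, breaking ties alphabetically.
ranked = sorted(usage.items(), key=lambda item: (-item[1], item[0]))
universal = "".join(letter for letter, n in ranked if n == len(sample))
print(ranked)
print("used by every sampled alphabet:", universal)
```

In this small sample five letters are shared by all four alphabets, whereas across the whole table only A is.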

Letters not contained in the ISO basic Latin alphabet

Some languages have extended the Latin alphabet with ligatures, modified letters, or digraphs. These symbols are listed below.

Additional letters by type

Independent letters and ligatures

Additional base letters Æ Ð Ǝ Ə Ɛ Ɣ I Ɩ Ŋ Œ Ɔ Ʊ K' Ʃ Þ Ʋ Ƿ Ȝ Ʒ ʔ
æ ɑ ð ǝ ə ɛ ɣ ı ɩ ŋ œ ɔ ʊ ĸ ß ʃ þ ʋ ƿ ȝ ʒ ʔ
Anglo-Saxon ÆÐÞǷȜ
Azeri [53] ƏI
Bambara [39] ƐŊƆ
Northern Berber ƐƔ
Southern Berber ƎƔŊ
Crimean Tatar I
Dalecarlian Ð
Danish [9], Norwegian [9], Southern Sami (Norway) Æ
Dinka ƐƔŊƆ
Faroese ÆÐ
Greenlandic Æ(ĸ)
German [13] ß
Icelandic, Norn ÆÐÞ
Celtic British, English [36], French [12], Latin [2] ƌ
Inari Sami, Northern Sami, Lule Sami [60], Fula [41], Alphabet of Mauritania, Alphabet of Senegal Ŋ
Skolt Sami ŊƷ
Pan-Nigerian Ǝ
Turkish, Kazakh [38] I
Alphabet of Cameroon ÆƏƐŊŒƆ
Alphabet of Benin ƎƐƔŊƆƱƲ
Alphabet of Burkina Faso ƎƐƖŊƆƲ
Alphabet of Chad [68] ƏƐŊƆ
Alphabet of Côte d'Ivoire ƐƖŊƆƱɁ
Scientific Alphabet of Gabon ÐƎƐƔŊƆƩƷɁ
Alphabet of Mali ƎƐƔŊƆɁ
Alphabet of Niger ƎƔŊ
Alphabet of Zaïre ƐƆ
African reference alphabet ƎƐƔƖŊƆƩƲƷɁ
Count 721583128331421112112231134

Letter–diacritic combinations: connected or overlaid

Modified letters Ą Ą̊ Ɓ Ƈ Ç Đ Ɗ Ɖ Ę Ȩ Ə̧ Ɛ̧ Ƒ Ǥ Ɠ Ħ Į Ɨ Ɨ̧ Ƙ Ł Ɲ Ǫ Ø Ơ Ɔ̧ Ƥ Ɍ Ş Ƭ Ţ Ŧ Ų Ư Ʉ Ƴ
ą ą̊ ɓ ƈ ç đ ɗ ɖ ę ȩ ə̧ ɛ̧ ƒ ǥ ɠ ħ ɦ į ɨ ɨ̧ ƙ ł ɲ ǫ ø ơ ɔ̧ ƥ ɍ ş ƭ ţ ŧ ų ư ʉ ƴ
Albanian [3], Arbëresh, Catalan [6], English [36], Extremaduran, Fala, French [12], Friulian, German [13], Manx, Mirandese, Norwegian [9], Occitan, Portuguese [23], (Spanish) [25], Walloon [27] Ç
Azeri [53], Crimean Tatar, Kurdish, Turkish, Turkmen [55] ÇŞ
Bambara [39], Dinka [40] Ɲ
Belarusian [5], Sorbian [64] Ł
Croatian [7], Inari Sami Đ
Cypriot Arabic [59], Kazakh [38] Ş
Danish [9], Faroese, Greenlandic, Norn, Norwegian [9], Southern Sami (Norway) Ø
Dalecarlian ĄĄ̊ĘĮŲ
Fula [41] ƁƊƓƝƳ
Gagauz ÇŞŢ
Hän, Navajo ĄĘĮŁǪ
Hausa [30] ƁƊƘƳ
Gwich'in ĄĘĮŁǪŲ
Kashubian, Polish [22] ĄĘŁ
Lithuanian [19] ĄĘĮŲ
Pan-Nigerian ƁƊƘ
Maltese [20] Ħ
Marshallese [47]
Romanian (nonstandard) [10] ŞŢ
Northern Sami, Ume Sami ĐŦ
Skolt Sami ĐǤ
Vietnamese [26] ĐƠƯ
Zuni [66] Ł
Alphabet of Benin ƉƑ
Alphabet of Burkina Faso ƁÇƊƳ
Alphabet of Chad [68] ƁƊƗƳ
Alphabet of Cameroon ƁƊȨƏ̧Ɛ̧ƗƗ̧ØƆ̧Ƴ
Scientific Alphabet of Gabon ƉɌ
Alphabet of Mali ƁƊƝƳ
Alphabet of Mauritania ƁƊƳ
Alphabet of Niger ƁƊƘƝɌƳ
Alphabet of Senegal ƁƇƊŞƬŢŦƳ

Other letters in collation order

For the order in which the characters are sorted in each alphabet, see collating sequence.
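Alphabet-specific ordering can be sketched as a custom sort key. The example below is a simplified illustration based on the notes' description of Czech, where ch sorts between h and i and č, ř, š, ž sort as separate letters; the `ORDER` list and `collation_key` function are hypothetical, ignore accented vowels, and are not a complete collation algorithm.

```python
import re

# Simplified, assumed ordering for illustration: "ch" between h and i,
# and č, ř, š, ž as separate letters after c, r, s, z.
ORDER = ["a", "b", "c", "č", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
         "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t", "u", "v",
         "w", "x", "y", "z", "ž"]
RANK = {unit: i for i, unit in enumerate(ORDER)}

# Try the digraph first so that "ch" becomes a single collation unit.
TOKEN = re.compile(r"ch|.")

def collation_key(word: str) -> list[int]:
    """Map a word to a list of ranks; unknown characters sort last."""
    return [RANK.get(unit, len(ORDER)) for unit in TOKEN.findall(word.lower())]

words = ["chata", "cena", "hora", "čaj", "jaro", "had"]
print(sorted(words))                     # plain code-point order
print(sorted(words, key=collation_key))  # alphabet-aware order
```

With the custom key, "chata" lands after "hora" rather than after "cena", and "čaj" follows "cena" instead of falling at the end of the list.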

Letters derived from A–H

Letter-diacritic combinations (detached) in various Latin alphabets (A–H)
Alphabet Á À Ȧ Â Ä Ǟ Ǎ Ă Ā Ã Å Ǻ Ǽ Ǣ Ć Ċ Ĉ Č Ď É È Ė Ê Ë Ě Ĕ Ē Ǵ Ġ Ĝ Ǧ Ğ Ģ Ĥ
á à ȧ â ä ǟ ǎ ă ā ã å ǻ ǽ ǣ ć ċ ĉ č ď é è ė ê ë ě ĕ ē ǵ ġ ĝ ǧ ğ ģ ĥ
Latin [2] ĂĀËĔĒ
Afrikaans [54] ÁÄÉÈÊË
Albanian [3] (Â)(Ê)Ë
Alemannic ÁÀÂÄÅÉÈÊ
Anglo-Saxon ĀǢĒ
Arbëresh ÁÉË
Aromanian Ã
Asturian ÁÉ
Austro-Bavarian ÁÀÂÄÅÉÈÊ
Azeri [53] (Ä)Ğ
Belarusian [5] ĆČ
Northern Berber ČǦ
Southern Berber Ă
Bislama [45] É
Breton ÂÉÊ
Catalan [6] ÀÉÈ
Celtic British ĂĀĔĒ
Chamorro [43] ÁÅǺÉ
Corsican [31] ÀÈ
Crimean Tatar ÂĞ
Croatian [7] ĆČ
Cypriot Arabic [59] ĊĠ
Czech [8] ÁČĎÉĚ
Dalecarlian ÄÅ
Danish [9] ÁÅǺǼÉ
Dutch [10] ÁÀÂÄÉÈÊË
Emilian-Romagnol ÀÂÄÅÉÈÊËĒ
English [36] ÀÂÄÅÉÈÊË
Esperanto ĈĜĤ
Estonian Ä
Extremaduran ÁÉ
Fala ÁÃÉ
Faroese Á
Filipino [11] ÁÀÂÉÈÊ
Finnish ÄÅ
French [12] ÀÂÉÈÊË
Friulian ÀÂÈÊ
Gagauz ÄÊ
Galician [33] ÁÉ
German [13] ÁÀÂÄÉÈÊ
Greenlandic (Á)(Â)(Ã)Å(É)(Ê)
Guaraní [14] ÁÃÉ
Gwich'in ÀÈ
Haitian ÀÈ
Hän ÀÂÄǍÈÊËĚ
Hawaiian ĀĒ
Hungarian [15] ÁÉ
Icelandic ÁÉ
Igbo ÁÀÉÈ
Inari Sami ÁÂÄÅČ
Irish [16] ÁÉ
Italian [17] ÁÀÉÈ
Javanese ÉÈ
Karakalpak [23] ÁǴ
Karelian ÄČ
Kashubian ÃĆÉË
Kazakh [38] ÄĞ
Kurdish Ê
Latvian [18] ĀČĒĢ
Lithuanian [19] ČĖ
Livonian [46] ÄǞĀĒ
Lule Sami [60] ÁÄÅ
Luxembourgish ÂÄÉÈÊË
Malagasy ÁÀÂÈÊ
Maltese [20] ÀĊÈĠ
Māori ĀĒ
Marshallese [47] Ā
Massachusett [62] Â
Mirandese ÁÉÊ
Mohawk ÁÀÉÈ
Na'vi [57] Ä
Navajo ÁÉ
Norn ÁÅÉ
Northern Sami ÁČ
Norwegian [9] ÀÄÅÉÈÊ
Occitan ÁÀÉÈ
Pan-Nigerian
Papiamento [63] ÁÉÈ
Piedmontese [37] ÀÉÈË
Pinyin [32] ÁÀǍĀÉÈĚĒ
Polish [22] Ć(É)
Portuguese [23] ÁÀÂÃÉ(È)Ê
Rhaeto-Romance ÀÉÈ
Romani [29] Č
Romanian ÂĂ
Samoan ÁĀÉĒ
Sardinian ÁÀÉÈ
Scottish Gaelic (Á)À(É)È
Alphabet of Senegal ËĤ
Serbian [7] ĆČ
Sicilian ÀÂÈÊ
Skolt Sami ÂÄÅČǦ
Slovak [24] ÁÄČĎÉ
Slovenian ÁÀÄĆČÉÈÊ
Sorbian [64] ĆČĚ
Southern Sami (Norway) Å
Southern Sami (Sweden) ÄÅ
Spanish [25] ÁÉ
Sundanese É
Swedish [21] ÁÀÄÅÉÈË
Tahitian ĀĒ
Tetum ÁÉ
Tongan ÁĀÉĒ
Tswana [52] Ê
Turkish ÂĞ
Turkmen [55] Ä
Ulithian [49] ȦĖ
Ume Sami ÁÄÅ
Uyghur Ë
Venda [51] ÁÉ
Veps ÄČ
Vietnamese [26] ÁÀÂĂÃÉÈÊ
Volapük Ä
Võro Ä
Walloon [27] ÀÂÅÉÈÊË
Welsh [28] ÁÀÂÄÉÈÊË
West Frisian ÂÄÉÊË
Wolof, Alphabet of Mauritania ÀÉË
Xhosa ÁÀÂÄÉÈÊË
Yapese [50] ÄË
Yoruba [56] ÁÀÂǍÃÉÈÊĚ

Letters derived from I–O

Letter-diacritic combinations (detached) in various Latin alphabets (I–O)
Alphabet Í Ì İ Î Ï Ǐ Ĭ Ī Ĩ Ĵ Ķ Ǩ Ĺ Ļ Ľ Ŀ ʼN Ń Ň Ñ Ņ Ó Ò Ȯ Ȱ Ô Ö Ȫ Ǒ Ŏ Ō Õ Ȭ Ő Ǿ Ơ
í ì i î ï ǐ ĭ ī ĩ ĵ ķ ǩ ĺ ļ ľ ŀ ʼn ń ň ñ ņ ó ò ȯ ȱ ô ö ȫ ǒ ŏ ō õ ȭ ő ǿ ơ
Latin [2] ĬĪŎŌ
Afrikaans [54] ÍÎÏʼnÓÔÖ
Albanian [3] (Î)(Ô)
Alemannic ÍÌÎÓÒÔÖ
Anglo-Saxon ĪŌ
Arbëresh ÍÓÒÔ
Asturian ÍÑÓ
Austro-Bavarian ÍÌÎÓÒÔÖ
Azeri [53] İÖ
Basque [4] Ñ
Belarusian [5] Ń
Northern Berber
Southern Berber
Bislama [45] Ï(M̄)
Breton ÎÑÔ
Catalan [6] ÍÏĿÓÒ
Celtic British ĬĪŎŌ
Chamorro [43] ÍÑÓ
Corsican [31] ÌÏÒ
Crimean Tatar İÑÖ
Czech [8] ÍŇÓ
Dalecarlian Ö
Danish [9] ÍÓǾ
Dutch [10] ÍÌÎÏÓÒÔÖ
Emilian-Romagnol ÌÎÓÒÔÖŌ
English [36] ÎÏÓÔÖ
Esperanto Ĵ
Estonian ÖÕ
Extremaduran ÍÑÓ
Fala ÍĨÓÕ
Faroese ÍÓ
Filipino [11] ÍÌÎÑÓÒ
Finnish Ö
French [12] ÎÏÔ
Friulian ÌÎÒÔ
Fula [41] Ñ
Gagauz İÑÖ
Galician [33] ÍÏÑÓ
German [13] ÑÖ
Greenlandic (Í)(Î)(Ĩ)(Ô)
Guaraní [14] ÍĨÑÓÕ
Gwich'in ÌÒ
Haitian Ò
Hän ÌÎǏÒÔǑ
Hawaiian ĪŌ
Hungarian [15] ÍÓÖŐ
Icelandic ÍÓÖ
Igbo ÍÌÓÒ
Inari Sami Ö
Irish [16] ÍÓ
Italian [17] ÍÌÎÏÓÒ
Karakalpak [23] ÍŃÓ
Karelian Ö
Kashubian ŃÓÒÔ
Kazakh [38] İÑÖ
Khasi ÏÑ
Kurdish Î
Latvian [18] ĪĶĻŅ(Ō)
Livonian [46] ĪĻŅȮȰ(Ö)(Ȫ)ŌÕȬ
Luxembourgish Î(M̂)(N̂)ÔÖ
Malagasy ÌÑÒÔ
Maltese [20] ÌÎÒ
Māori ĪŌ
Marshallese [47] ĻŅŌ
Massachusett [62] Ô
Mirandese ÍÓÔ
Mohawk ÍÌÓÒ
Na'vi [57] Ì
Navajo ÍÓ
Norn ÍÓ
Norwegian [9] ÎÓÒÔ
Occitan ÍÓÒ
Pan-Nigerian
Papiamento [63] ÍÑÓÒ
Piedmontese [37] ÌÒ
Pinyin [32] ÍÌǏĪÓÒǑŌ
Polish [22] ŃÓ
Portuguese [23] Í(Ì)ÓÒÔÕ
Quechua Ñ
Rhaeto-Romance ÌÎÒ
Romanian Î
Samoan ÍĪÓŌ
Sardinian ÍÌÓÒ
Scottish Gaelic Ì(Ó)Ò
Alphabet of Senegal Ñ
Sicilian ÌÎÒÔ
Skolt Sami ǨÖÕ
Slovak [24] ÍĹĽŇÓÔÖ
Slovenian ÍÌÓÒÔÖ
Sorbian [64] ŃÓ
Southern Sami (Norway) Ï
Southern Sami (Sweden) ÏÖ
Spanish [25] ÍÏÑÓ
Swedish [21] Ö
Tahitian ĪŌ
Tetum ÍÑÓ
Tongan ÍĪÓŌ
Tswana [52] Ô
Turkish İÎÖ
Turkmen [55] ŇÖ
Ulithian [49] Ȯ
Ume Sami ÏÖ
Uyghur Ö
Venda [51] ÍÓ
Veps Ö
Vietnamese [26] ÍÌĨÓÒÔÕƠ
Volapük Ö
Võro ÖÕ
Walloon [27] ÌÎÔÖ
Welsh [28] ÍÌÎÏÓÒÔÖ
West Frisian ÏÔÖ
Wolof, Alphabet of Mauritania ÑÓ
Xhosa ÍÌÎÏÓÒÔÖ
Yapese [50] Ö
Yoruba [56] ÍÌÎǏĨŃÓÒÔǑÕ

Letters derived from P–Z

Letter-diacritic combinations (detached) in various Latin alphabets (P–Z)
Alphabet Ŕ Ř Ŗ Ś Ŝ Š Ș Ť Ț Ú Ù Û Ü Ǔ Ŭ Ū Ũ Ű Ů Ŵ Ý Ŷ Ÿ Ȳ Ź Ż Ž Ǯ
ŕ ř ŗ ś ŝ š ș ť ț ú ù û ü ǔ ŭ ū ũ ű ů ŵ ý ŷ ÿ ȳ ź ż ž ǯ
Latin [2] ŬŪ
Afrikaans [54] ÚÛÜÝ
Albanian [3] (Û)(Ŷ)
Alemannic ÚÙÛÜ
Anglo-Saxon ŪȲ
Arbëresh ÚÙÛ
Asturian ÚÜ
Austro-Bavarian ÚÙÛÜ
Azeri [53] Ü
Basque [4] Ü
Belarusian [5] ŚŠŬŹŽ
Northern Berber Ř
Southern Berber ŠŽ
Bislama [45] (P̄)Ü
Breton ÙÛÜ
Alphabet of Burkina Faso Ü
Catalan [6] ÚÜ
Celtic British ŬŪ
Chamorro [43] Ú
Chewa Ŵ
Corsican [31] Ù
Crimean Tatar Ü
Croatian [7] ŠŽ
Czech [8] ŘŠŤÚÜŮÝŽ
Danish [9] ÚÝ
Dutch [10] ÚÙÛÜ
Emilian-Romagnol ÙÛÜŻ
English [36] ÛÜ
Esperanto ŜŬ
Estonian ŠÜŽ
Extremaduran ÚÜ
Fala ÚÜŨ
Faroese ÚÝ
Filipino [11] ÚÙÛ
Finnish ŠŽ
French [12] ÙÛÜŸ
Friulian ÙÛ
Gagauz Ü
Galician [33] ÚÜ
German [13] Ü
Greenlandic (Ú)(Û)(Ũ)
Guaraní [14] ÚŨÝ
Gwich'in Ù
Hän ÙÛǓ
Hawaiian Ū
Hungarian [15] ÚÜŰ
Icelandic ÚÝ
Igbo ÚÙ
Inari Sami ŠŽ
Irish [16] Ú
Italian [17] ÚÙ
Karakalpak [23] Ú
Karelian Š(Ü)Ž(Ǯ)
Kashubian ŚÙŹŻ
Kazakh [38] ÜŪ
Kurdish Û
Latvian [18] (Ŗ)ŠŪŽ
Lithuanian [19] ŠŪŽ
Livonian [46] ŖŠȚŪ(Ȳ)Ž
Luxembourgish ÛÜ
Malagasy
Maltese [20] ÙŻ
Māori Ū
Marshallese [47] Ū
Mirandese ÚŨ
Norn ÚÝ
Northern Sami ŠŽ
Norwegian [9] ÙÜ
Occitan Ú
Pan-Nigerian
Papiamento [63] ÚÙÜ
Piedmontese [37] Ù
Pinyin [32] ÚÙÜǓŪ
Polish [22] ŚŹŻ
Portuguese [23] Ú(Ù)(Ü)
Rhaeto-Romance Ù
Romani [29] ŠŽ
Romanian ȘȚ
Samoan ÚŪ
Sardinian ÚÙ
Scottish Gaelic Ù
Alphabet of Senegal ŜŴŶŻ
Serbian [7] ŠŽ
Sicilian ÙÛ
Skolt Sami ŠŽǮ
Slovak [24] ŔŠŤÚÜÝŽ
Slovenian ŠÚÙÜŽ
Sorbian [64] ŔŘŚŠŹŽ
Spanish [25] ÚÜ
Swedish Ü
Tahitian Ū
Tetum Ú
Tongan ÚŪ
Tswana [52] Š
Turkish ÛÜ
Turkmen [55] ÜÝŽ
Ume Sami Ü
Uyghur Ü
Venda [51] Ú
Veps ŠÜŽ
Vietnamese [26] ÚÙŨÝ
Volapük Ü
Võro ŠÜŽ
Walloon [27] ÙÛ
Welsh [28] ÚÙÛÜŴÝŶŸ
West Frisian ÚÛÜ
Xhosa ÚÙÛÜ
Yoruba [56] ÚÙÛǓŨ

Notes

  1. In classical Latin, the digraphs ch, ph, rh, th were used in loanwords from Greek, but they were not included in the alphabet. The ligatures æ, œ and w, as well as lowercase letters, were added to the alphabet only in the Middle Ages. The letters j and u were used as typographical variants of i and v, respectively, roughly until the Enlightenment.
  2. In Afrikaans, c and q are only (and x and z almost only) used in loanwords.
  3. Albanian officially has the digraphs dh, gj, ll, nj, rr, sh, th, xh, zh, which is sufficient to represent the Tosk dialect. The Gheg dialect supplements the official alphabet with 6 nasal vowels, namely â, ê, î, ô, û, ŷ.
  4. Arbëresh officially has the digraphs dh, gj, hj, ll, nj, rr, sh, th, xh, zh. Arbëresh has the distinctive hj, which is considered as a letter in its own right.
  5. Achomi also has the digraph a'.
  6. Azeri uses the letter ä only as a substitute for ə when the latter cannot be used (it was replaced by the schwa one year later because it is the most common letter), and such substitution should be avoided. The letters w, đ, ŋ, , ć (or the digraph ts), and the digraph dz are only used in certain dialects.
  7. Bambara also has the digraphs: kh (only present in loanwords), sh (also written as ʃ; only present in some dialects). Historically, è was used instead of ɛ, ny was used instead of ɲ, and ò was used instead of ɔ in Mali.
  8. Basque has several digraphs: dd, ll, rr, ts, tt, tx, tz. The ü, which represents /ø/, is required for various words in its Zuberoan dialect. c, q, v, w, y are used in foreign words, but are officially considered part of the alphabet.
  9. Belarusian also has several digraphs: ch, dz, dź, dž.
  10. Bislama also has the digraph ng.
  11. Breton also has the digraphs ch, c'h, zh. c, q, x are used in foreign words or digraphs only.
  12. Catalan also has a large number of digraphs: dj, gu, gü, ig, ix, ll, l·l, ny, qu, qü, rr, ss, tg, tj, ts, tx, tz. The letters k, q, w, y are only used in loanwords or the digraphs mentioned.
  13. The Alphabet of Chad also uses the unique letters and .
  14. Chamorro also has the digraphs ch, ng. c is used only in digraphs.
  15. Corsican has the trigraphs: chj, ghj.
  16. Croatian Gaj's alphabet also has the digraphs: dž, lj, nj. There are also four tone markers that are sometimes used on vowels to avoid ambiguity in homophones, but this is generally uncommon. Gaj's alphabet has been adopted by the Serbian and Bosnian standards and has complete one-to-one congruence with Serbian Cyrillic, where the three digraphs map to the Cyrillic letters џ, љ and њ, respectively. Rarely and non-standardly, the digraph dj is used instead of đ (Cyrillic ђ), as it was previously. The Montenegrin variant additionally uses ś and ź to indicate dialectal pronunciation.
  17. Cypriot Arabic also has the letters Θ and Δ.
  18. Czech also has the digraph ch, which is considered a separate letter and is sorted between h and i. While á, ď, é, ě, í, ň, ó, ť, ú, ů, ý are considered separate letters, in collation they are treated merely as letters with diacritics. However, č, ř, š, ž are sorted as separate letters. q, w, x occur only in loanwords.
  19. Dakelh also contains the letter ', which represents the glottal stop. The letters f, p, r, v are only used in loanwords.
  20. The Norwegian alphabet is currently identical with the Danish alphabet. c is part of both alphabets and is not used in native Danish or Norwegian words (except some proper names), but occurs quite frequently in well-established loanwords in Danish. Norwegian and Danish use é in some words such as én, although é is considered a diacritic mark, while å, æ, ø are letters. q, w, x, z are not used except for names and some foreign words.
  21. Dinka also has the digraphs: dh, nh, ny, th. h is only present in these digraphs. Dinka also uses the letters ä, ë, ï, ö, ɛ̈, ɔ̈ (the last two of which do not exist as precomposed characters in Unicode).
  22. The status of ij as a letter, ligature or digraph in Dutch is disputed. c (outside the digraph ch), q, x, and y occur mostly in foreign words. Letters with grave and letters with circumflex occur only in loanwords.
  23. English generally now uses extended Latin letters only in loan words, such as fiancé, fiancée, and résumé. Rare publication guides may still use the dieresis on words, such as "coöperate", rather than the now-more-common "co-operate" (UK) or "cooperate" (US). For a fuller discussion, see articles branching from Lists of English words of international origin, which was used to determine the diacritics needed for more unambiguous English. However, an é or è is sometimes used in poetry to show that a normally silent vowel is to be pronounced, as in "blessèd".
  24. Filipino, also known as Tagalog, also uses the digraph ng, originally even with a large tilde that spanned both n and g (as in n͠g) when a vowel followed the digraph. (The use of the tilde over the two letters is now rare.) Only ñ is required for everyday use (only in loanwords). The accented vowels are used in dictionaries to indicate pronunciation, and g with tilde is only present in older works.
  25. Uppercase diacritics in French are often (incorrectly) thought of as being optional, but the official rules of French orthography designate accents on uppercase letters as obligatory in most cases. Many pairs or triplets are read as digraphs or trigraphs depending on context, but are not treated as such lexicographically: consonants ph, (ng), th, gu/gü, qu, ce, ch/(sh/sch), rh; vocal vowels (ee), ai/ay, ei/ey, eu, au/eau, ou; nasal vowels ain/aim, in/im/ein, un/um/eun, an/am, en/em, om/on; the half-consonant -(i)ll-; half-consonant and vowel pairs oi, oin/ouin, ien, ion. When rules that govern the French orthography are not observed, they are read as separate letters, or using an approximating phonology of a foreign language for loan words, and there are many exceptions. In addition, most final consonants are mute (including those consonants that are part of feminine, plural, and conjugation endings). ÿ and ü are only used in certain geographical names and proper names plus their derivatives, or, in the case of ü with diaeresis, newly proposed reforms, e.g. capharnaüm 'shambles' is derived from the proper name Capharnaüm. æ occurs only in Latin or Greek loanwords.
  26. Fula has x as part of the alphabet in all countries except Guinea, Guinea-Bissau, Liberia, and Sierra Leone (used only in loanwords in these countries). ɠ, which is used only in loanwords (but still part of the alphabet), is used in Guinea only. Fula also uses the digraphs mb (In Guinea spelled mb), nd, ng, nj. aa, ee, ii, oo, uu are part of the alphabet in all countries except Guinea, Guinea-Bissau, Liberia, and Sierra Leone. ƴ is used in all countries except for Nigeria, where it is written 'y. ŋ is used in all countries except for Nigeria. ɲ is used in Guinea, Mali, and Burkina Faso, ñ is used in Senegal, Gambia, Mauritania, Guinea-Bissau, Liberia, and Sierra Leone, and the digraph ny is used in Niger, Cameroon, Chad, Central African Republic, and Nigeria. The apostrophe is a letter (representing the glottal stop) in Guinea-Bissau, Liberia, and Sierra Leone. q, v, z are only used in loanwords, and are not part of the alphabet.
  27. Galician. The standard of 1982 set also the digraphs gu, qu (both always before e and i), ch, ll, nh and rr. In addition, the standard of 2003 added the grapheme ao as an alternative writing of ó. Although not marked (or forgotten) in the list of digraphs, they are used to represent the same sound, so the sequence ao should be considered as a digraph. The sequence nh represents a velar nasal (not a palatal as in Portuguese) and is restricted only to three feminine words, being either demonstrative or pronoun: unha ('a' and 'one'), algunha ('some') and ningunha ('not one'). The Galician reintegracionismo movement uses it as in Portuguese. j (outside of the Limia Baixa region), k, w, and y are only used in loanwords, and are not part of the alphabet.
  28. German also retains most original letters in French loanwords. Swiss German no longer uses ß. The long s (ſ) was in use until the mid-20th century. sch is usually not considered a separate letter, and neither are the digraphs ch, ck, st, sp, th, (ph, rh), qu. q only appears in the sequence qu and in loanwords, while x and y are found almost only in loanwords. The capital ß (ẞ) is almost never used. The accented letters (other than the letters ä, ö, ü, and ß) are used only in loanwords.
  29. Guaraní also uses digraphs ch, mb, nd, ng, nt, rr and the glottal stop '. b, c, d are only used in these digraphs.
  30. Gwich'in also contains the letter ', which represents the glottal stop. Gwich'in also uses the letters ą̀, ę̀, į̀, ǫ̀, ų̀, which are not available as precomposed characters in Unicode. Gwich'in also uses the digraphs and trigraphs: aa, ąą, àà, ą̀ą̀, ch, ch', ddh, dh, dl, dr, dz, ee, ęę, èè, ę̀ę̀, gh, ghw, gw, ii, įį, ìì, į̀į̀, kh, kw, k', nd, nh, nj, oo, ǫǫ, òò, ǫ̀ǫ̀, rh, sh, shr, th, tl, tl', tr, tr', ts, ts', tth, tth', t', uu, ųų, ùù, ų̀ų̀, zh, zhr. The letter c is only used in the digraphs above. b, f, m are only used in loanwords.
  31. Hausa has the digraphs: sh, ts. Vowel length and tone are usually not marked. Textbooks usually use macron or doubled vowel to mark the length, grave to mark the low tone and circumflex to mark the falling tone. Therefore, in some systems, it is possible that macron is used in combination with grave or circumflex over a, e, i, o or u. The letter p is only used in loanwords.
  32. Hungarian also has the digraphs: cs, dz, gy, ly, ny, sz, ty, zs; and the trigraph: dzs. á, é, í, ó, ő, ú, ű are considered separate letters, but are collated as variants of a, e, i, o, ö, u, ü.
  33. Irish traditionally used the dot diacritic (Irish : ponc séimhithe) to mark lenition, forming the dotted letters (litreacha buailte "struck letters") ḃ, ċ, ḋ, ḟ, ġ, ṁ, ṗ, ṡ, ṫ. These have largely been replaced by the digraphs: bh, ch, dh, fh, gh, mh, ph, sh, th except for in decorative or self-consciously traditional contexts. v occurs in a small number of (mainly onomatopoeic) native words (e.g. vácarnach "to quack") and colloquialisms (vís for bís "screw"). j, k, q, w, x, y, z only occur in loanwords and scientific terminology.
  34. Igbo writes alternatively as . Igbo has the digraphs: ch, gb, gh, gw, kp, kw, nw, ny, sh. c is only used in the digraph before. Also, vowels take a grave accent, an acute accent, or no accent, depending on tone.
  35. Italian also has the digraphs: ch, gh, gn, gl, sc. j, k, w, x, y are used in foreign words, and are not part of the alphabet. x is also used for native words derived from Latin and Greek; j is also used for just a few native words, mainly names of persons (as in Jacopo) or of places (as in Jesolo and Jesi), in which it represents /i/. While it does not occur in ordinary running texts, geographical names on maps are often written only with acute accents. The circumflex is used on an -i ending that was anciently written -ii (or -ji, -ij, -j, etc.) to distinguish homograph plurals and verb forms: e.g. principî from principi, genî from geni.
  36. Karakalpak also has the digraphs: ch, sh. c, f, v are used in foreign words.
  37. Kazakh also has the digraphs: ia, io, iu. f, h, v and the digraph io are used in foreign words.
  38. Latvian also has the digraphs: dz, dž, ie. Dz and Dž are occasionally considered separate letters of the alphabet in more archaic examples, which have been published as recently as the 1950s; however, modern alphabets and teachings discourage this due to an ongoing effort to set decisive rules for Latvian and eliminate barbaric words accumulated during the Soviet occupation. The digraph "ie" is never considered a separate letter. Ō, Ŗ, and the digraphs CH (only used in loanwords) and UO are no longer part of the alphabet, but are still used in certain dialects and newspapers that use the old orthography. Y is used only in certain dialects and not in the standard language. F and H are only used in loanwords.
  39. A nearby language, Pite Sami, uses Lule Sami orthography but also includes the letters đ and ŧ, which are not in Lule Sami.
  40. Lithuanian also has the digraphs: ch, dz, dž, ie, uo. However, these are not considered separate letters of the alphabet. F, H, and the digraph CH are only used in loanwords. Demanding publications such as dictionaries, maps, and schoolbooks need additional diacritical marks to differentiate homographs: the grave accent on A, E, I, O, U, the acute accent on all vowels, and the tilde on all vowels and on L, M, N and R. Small E and I (also with ogonek) must retain the dot when an additional accent mark is added to the character; the use of ì and í (with missing dot) is considered unacceptable.
  41. In Livonian, the letters Ö, Ȫ, Y, Ȳ were used by the older generation, but the younger generation merged these sounds. Around the late 1990s, these letters were removed from the alphabet.
  42. Maltese also has the digraphs: ie, għ.
  43. Māori only uses g in ng digraph. wh is also a digraph.
  44. Marshallese often uses the old orthography (because people did not approve of the new orthography), which writes ļ as l, m̧ as m, ņ as n, p as b, o̧ as o at the ends of words or in the word yokwe (also spelled iakwe under the old orthography; under the new orthography, spelled io̧kwe), but a at other places, and d as dr before vowels, or r after vowels. The old orthography writes ā as e in some words, but ā in others; it also writes ū as i between consonants. The old orthography writes geminates and long vowels as two letters instead. Allophones of /ɘ/, written as only e o ō in the new orthography, are also written as i u and very rarely, ū. The letter Y only occurs in the words yokwe or the phrase yokwe yuk (also spelled iakwe iuk in the old orthography or io̧kwe eok in the new orthography).
  45. Massachusett also uses the digraphs ch, ee, sh, ty and the letter 8 (which was previously written oo). c is only used in the digraph ch.
  46. Some Mohawk speakers use orthographic i in place of the consonant y. The glottal stop is indicated with an apostrophe ' and long vowels are written with a colon :.
  47. Na'vi uses the letter ʼ and the digraphs aw, ay, ew, ey, kx, ll, ng (sometimes written g), px, rr, ts (sometimes written c), tx. g (in standard orthography) and x are used only in digraphs.
  48. Oromo uses the following digraphs: ch, dh, ny, ph, sh. p is only used in the digraph ph and loanwords. v and z are only used in loanwords.
  49. Papiamento also has the digraphs: ch, dj, sh, zj. q, x are only used in loanwords and proper names. j is only used in digraphs, loanwords, and proper names. Papiamentu in Bonaire and Curaçao is different from Papiamento in Aruba in the following ways: Papiamento in Aruba uses a more etymological spelling, so Papiamento uses c in native words outside of the digraph ch, but Papiamentu in Bonaire and Curaçao does not. Papiamentu in Bonaire and Curaçao uses è, ò, ù, and ü for various sounds and á, é, í, ó, ú for stress, but Papiamento in Aruba does not use these letters.
  50. Piedmontese also uses the letter n-, which usually precedes a vowel, as in lun-a "moon".
  51. Pinyin has four tone markers that can go on top of any of the six vowels (a, e, i, o, u, ü); e.g.: macron (ā, ē, ī, ō, ū, ǖ), acute accent (á, é, í, ó, ú, ǘ), caron (ǎ, ě, ǐ, ǒ, ǔ, ǚ), grave accent (à, è, ì, ò, ù, ǜ). It also uses the digraphs: ch, sh, zh.
  52. Polish also has the digraphs: ch, cz, dz, dż, dź, sz, rz. q, v, x occur only in loanwords, and are sometimes not considered as part of the alphabet.
  53. Portuguese uses the digraphs ch, lh, nh, rr, ss. The trema on ü was used in Brazilian Portuguese from 1943 to 2009. European Portuguese in that case used the grave accent (ù) from 1911 to 1945, then abolished. The grave accent was used on e, i, o, u, until 1973. è, ò are used in geographical names outside Europe and not part of the language proper. The now abandoned practice was to indicate underlying stress in words ending in -mente, e.g. sòmente, ùltimamente etc. Neither the digraphs nor accented letters are considered part of the alphabet. k, w, y occur only in loanwords, and were not letters of the alphabet until 2009, but these letters were used before 1911.
  54. Romani has the digraphs: čh, dž, kh, ph, th.
  55. Romanian normally uses the letters ș, ț (s, t with a comma diacritic below) but they are frequently replaced by ş, ţ (s, t with a cedilla) due to past lack of standardization. k, q, w, x, y occur only in loanwords.
  56. Slovak also has the digraphs dz, dž, ch, which are considered separate letters. While á, ä, ď, é, í, ĺ, ň, ó, ô, ŕ, ť, ú, ý are considered separate letters, in collation they are treated merely as letters with diacritics. However, č, ľ, š, ž, as well as the digraphs, are actually sorted as separate letters. q, w, x, ö, ü occur only in loanwords.
  57. Sorbian also uses the digraphs ch, dź. ř is only used in Upper Sorbian, and ŕ, ś, and ź (outside the digraph dź) are only used in Lower Sorbian.
  58. Spanish uses several digraphs to represent single sounds: ch, gu (preceding e or i), ll, qu, rr. Of these, ch and ll were traditionally considered individual letters with their own names (che, elle) and places in the alphabet (after c and l, respectively), but to facilitate international compatibility the Royal Spanish Academy decided to cease this practice in 1994, and all digraphs are now collated as combinations of two separate characters. While cedilla is etymologically the Spanish diminutive of ceda (z), and Sancho Pança is the original form in Cervantes' books, c with cedilla (ç) is now completely displaced by z in the contemporary language. In poetry, the diaeresis may be used to break a diphthong into separate vowels. Regarding that usage, Ortografía de la lengua española states that "diaeresis is usually placed over the closed vowel [i.e. i or u] and, when both are closed, generally over the first" [citation needed]. In this context, the use of ï is rare, but part of the normative orthography.
  59. Swedish uses é in well-integrated loanwords like idé and armé, although é is considered a modified e, while å, ä, ö are letters. á and à occur only in rarely used words. w and z are used in some integrated words like webb and zon. q, ü, è, and ë are used for names only, but exist in Swedish names. For foreign names ó, ç, ñ and more are sometimes used, but usually not. Swedish has many digraphs and some trigraphs: ch, dj, lj, rl, rn, rs, sj, sk, si, ti, sch, skj, stj and others are usually pronounced as one sound.
  60. Tswana also has the digraphs and trigraphs: kg, kh, ng, ph, th, tl, tlh, ts, tsh, tš, tšh. The letters c, q, and x only appear in onomatopoeic words and loanwords. The letters v and z only appear in loanwords.
  61. Turkmen had a slightly different alphabet in 1993–1995, which used some rare letters: ý was written as ÿ (capital ¥), ň as ñ, ş as ¢ (capital $), and ž as (capital £), so that all characters were available in Code page 437. In the new alphabet, all characters are available in ISO/IEC 8859-2.
  62. Ulithian also has the digraphs: ch, l', mw, ng. c is used only in digraphs.
  63. Uzbek also has the digraphs ch, ng, sh, which are considered letters. c is used only in digraphs. g', o' and the apostrophe ' are considered letters. These letters have preferred typographical variants: gʻ, oʻ, and ʼ respectively.
  64. Venda also has the digraphs and trigraphs: bv, bw, dz, dzh, dzw, fh, hw, kh, khw, ng, ny, nz, ṅw, ph, pf, pfh, sh, sw, th, ts, tsh, tsw, ty, ṱh, vh, zh, zw. c, j, q are used in foreign words.
  65. Vietnamese has seven additional base letters: ă â đ ê ô ơ ư. It uses five tone markers that can go above (or below) any of the 12 vowels (a, ă, â, e, ê, i, o, ô, ơ, u, ư, y): grave accent (à, ằ, ầ, è, ề, ì, ò, ồ, ờ, ù, ừ, ỳ), hook above (ả, ẳ, ẩ, ẻ, ể, ỉ, ỏ, ổ, ở, ủ, ử, ỷ), tilde (ã, ẵ, ẫ, ẽ, ễ, ĩ, õ, ỗ, ỡ, ũ, ữ, ỹ), acute accent (á, ắ, ấ, é, ế, í, ó, ố, ớ, ú, ứ, ý), and dot below (ạ, ặ, ậ, ẹ, ệ, ị, ọ, ộ, ợ, ụ, ự, ỵ). It also uses several digraphs and trigraphs (ch, gh, gi, kh, ng, ngh, nh, ph, th, tr), but they are no longer considered letters.
  66. Walloon has the digraphs and trigraphs: ae, ch, dj, ea, jh, oe, oen, oi, sch, sh, tch, xh. The letter x outside the digraph xh is in some orthographies, but not the default two. The letter q is in some orthographies (including one default orthography), but not in the other default orthography. Also in some orthographies are à, ì, ù, ö, and even (which is not available as a precomposed character in Unicode, so ë is used as a substitute)
  67. Welsh has the digraphs ch, dd, ff, ng, ll, ph, rh, th. Each of these digraphs is collated as a separate letter, and ng comes immediately after g in the alphabet. It also frequently uses circumflexes, and occasionally uses diaereses, acute accents and grave accents, on its seven vowels (a, e, i, o, u, w, y), but accented characters are not regarded as separate letters of the alphabet.
  68. Xhosa uses a large number of digraphs, trigraphs, and even one tetragraph to represent various phonemes: bh, ch, dl, dy, dz, gc, gq, gr, gx, hh, hl, kh, kr, lh, mb, mf, mh, nc, ndl, ndz, ng, ng', ngc, ngh, ngq, ngx, nh, nkc, nkq, nkx, nq, nx, ntl, ny, nyh, ph, qh, rh, sh, th, ths, thsh, ts, tsh, ty, tyh, wh, xh, yh, zh. It also occasionally uses acute accents, grave accents, circumflexes, and diaereses on its five vowels (a, e, i, o, u), but accented characters are not regarded as separate letters of the alphabet.
  69. Yapese has the digraphs and trigraphs: aa, ae, ch, ea, ee, ii, k', l', m', n', ng, ng', oe, oo, p', t', th, th', uu, w', y'. q, representing the glottal stop, is not always used. Often an apostrophe is used to represent the glottal stop instead. c is used only in digraphs. h is used only in digraphs and loanwords. q is used only in loanwords.
  70. Yoruba uses the digraph gb. Also, vowels take a grave accent, an acute accent, or no accent, depending on tone. Although the "dot below" diacritic is widely used, purists prefer a short vertical underbar (Unicode COMBINING VERTICAL LINE BELOW, U+0329), which resembles the IPA notation for a syllabic consonant, attached to the base of the letter (e, o or s). The seven Yoruba vowels (a, e, ẹ, i, o, ọ, u) can be uttered in three different tones: high (acute accent), middle (no accent) and low (grave accent). The letters m and n, when written without diacritics, indicate nasalisation of the preceding vowel. m and n also occur as syllabics; in these circumstances, they take acute or grave tonal diacritics, like the vowels. Middle tone is marked with a macron to differentiate it from the unmarked nasalising consonants. A tilde was used in older orthography (still occasionally used) to indicate a double vowel. This is tonally ambiguous, and has now been replaced by showing the paired vowels, each marked with the appropriate tones. However, where a double vowel has the tonal sequence high-low or low-high, it may optionally be replaced by a single vowel with a circumflex (high-low) or caron (low-high), e.g. á + à = â; à + á = ǎ.
  71. Zuni contains the glottal stop ' and the digraph: ch; c is only used in that digraph. The other digraphs kw, sh, and ts are not part of the alphabet.
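
As an illustration of the Pinyin note above, the tone-marked vowels can be built by appending a Unicode combining mark to a base vowel and composing the pair with NFC normalization. This is only a sketch: the names TONE_MARKS and add_tone are hypothetical, and the numbering of tones 1 to 4 follows the order in which the note lists the marks.

```python
import unicodedata

# Combining marks for the four pinyin tone markers, in the order
# the note lists them (macron, acute, caron, grave).
TONE_MARKS = {
    1: "\u0304",  # COMBINING MACRON
    2: "\u0301",  # COMBINING ACUTE ACCENT
    3: "\u030C",  # COMBINING CARON
    4: "\u0300",  # COMBINING GRAVE ACCENT
}

def add_tone(vowel: str, tone: int) -> str:
    """Attach a tone mark to a vowel and compose the result with NFC."""
    return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])

# Print one row per tone marker for the six vowels.
for tone in TONE_MARKS:
    print(" ".join(add_tone(v, tone) for v in "aeiouü"))
```

Each of the 24 combinations has a precomposed code point, so every result is a single character; that includes ü, where the tone mark stacks on top of the diaeresis.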

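The collation change described in the Spanish note can be sketched with a sort key that splits each word into collation units. In the traditional ordering ch and ll are single units ranked after c and l; in the modern ordering every character is its own unit. The names TRADITIONAL, MODERN, and sort_key are hypothetical, the sample words are unaccented to keep the sketch short, and the placement of ñ after n is added here from the standard Spanish alphabet rather than from the note.

```python
import re

# Traditional order: ch follows c and ll follows l (ñ follows n in both orders).
TRADITIONAL = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
               "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
               "v", "w", "x", "y", "z"]
# Modern order: the same list without the two digraph letters.
MODERN = [unit for unit in TRADITIONAL if len(unit) == 1]

def sort_key(word, alphabet):
    """Map a word to the list of alphabet positions of its collation units."""
    # Try longer units first so that "ch" wins over "c" followed by "h".
    units = sorted(alphabet, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, units)) + "|.")
    rank = {unit: i for i, unit in enumerate(alphabet)}
    # Characters outside the alphabet sort after every known unit.
    return [rank.get(tok, len(alphabet)) for tok in pattern.findall(word.lower())]

words = ["cuna", "chico", "dado", "luz", "llave", "cama"]
print(sorted(words, key=lambda w: sort_key(w, TRADITIONAL)))
print(sorted(words, key=lambda w: sort_key(w, MODERN)))
```

Under the traditional key, chico sorts after cuna and llave after luz; under the modern key, both come earlier because c+h and l+l are compared letter by letter.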
Footnotes

  1. As defined in ISO/IEC 646 based on ASCII, which was based on the 26 letters of the English alphabet and previous telecommunications standards, and used in later ISO standards, see Latin characters in Unicode.
  2. "Nav script".
