This is a list of precomposed Latin characters in Unicode. [1] [2] Unicode typefaces may be needed for these to display correctly.
Diacritic | Aa | Ææ | Bb | Cc | Dd | Ee | Ff | Gg | Hh | Ii | Jj | Kk | Ll | Mm | Nn | Oo | Øø | Pp | Rr | Ssſ | Tt | Uu | Vv | Ww | Xx | Yy | Zz | Ʒʒ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
acute | Áá | Ǽǽ | Ćć | Éé | Ǵǵ | Íí | Ḱḱ | Ĺĺ | Ḿḿ | Ńń | Óó | Ǿǿ | Ṕṕ | Ŕŕ | Śś | Úú | Ẃẃ | Ýý | Źź | ||||||||||
acute and dot above | Ṥṥ | ||||||||||||||||||||||||||||
breve | Ăă | Ĕĕ | Ğğ | Ĭĭ | Ŏŏ | Ŭŭ | |||||||||||||||||||||||
breve and acute | Ắắ | ||||||||||||||||||||||||||||
breve and dot below | Ặặ | ||||||||||||||||||||||||||||
breve and grave | Ằằ | ||||||||||||||||||||||||||||
breve and hook above | Ẳẳ | ||||||||||||||||||||||||||||
breve and tilde | Ẵẵ | ||||||||||||||||||||||||||||
breve below | Ḫḫ | ||||||||||||||||||||||||||||
caron | Ǎǎ | Čč | Ďď | Ěě | Ǧǧ | Ȟȟ | Ǐǐ | ǰ | Ǩǩ | Ľľ | Ňň | Ǒǒ | Řř | Šš | Ťť | Ǔǔ | Žž | Ǯǯ | |||||||||||
caron and dot above | Ṧṧ | ||||||||||||||||||||||||||||
cedilla | Çç | Ḑḑ | Ȩȩ | Ģģ | Ḩḩ | Ķķ | Ļļ | Ņņ | Ŗŗ | Şş | Ţţ | ||||||||||||||||||
cedilla and acute | Ḉḉ | ||||||||||||||||||||||||||||
cedilla and breve | Ḝḝ | ||||||||||||||||||||||||||||
circumflex | Ââ | Ĉĉ | Êê | Ĝĝ | Ĥĥ | Îî | Ĵĵ | Ôô | Ŝŝ | Ûû | Ŵŵ | Ŷŷ | Ẑẑ | ||||||||||||||||
circumflex and acute | Ấấ | Ếế | Ốố | ||||||||||||||||||||||||||
circumflex and dot below | Ậậ | Ệệ | Ộộ | ||||||||||||||||||||||||||
circumflex and grave | Ầầ | Ềề | Ồồ | ||||||||||||||||||||||||||
circumflex and hook above | Ẩẩ | Ểể | Ổổ | ||||||||||||||||||||||||||
circumflex and tilde | Ẫẫ | Ễễ | Ỗỗ | ||||||||||||||||||||||||||
circumflex below | Ḓḓ | Ḙḙ | Ḽḽ | Ṋṋ | Ṱṱ | Ṷṷ | |||||||||||||||||||||||
comma below | Șș | Țț | |||||||||||||||||||||||||||
diaeresis | Ää | Ëë | Ḧḧ | Ïï | Öö | ẗ | Üü | Ẅẅ | Ẍẍ | Ÿÿ | |||||||||||||||||||
diaeresis and acute | Ḯḯ | Ǘǘ | |||||||||||||||||||||||||||
diaeresis and caron | Ǚǚ | ||||||||||||||||||||||||||||
diaeresis and grave | Ǜǜ | ||||||||||||||||||||||||||||
diaeresis and macron | Ǟǟ | Ȫȫ | Ǖǖ | ||||||||||||||||||||||||||
diaeresis below | Ṳṳ | ||||||||||||||||||||||||||||
dot above | Ȧȧ | Ḃḃ | Ċċ | Ḋḋ | Ėė | Ḟḟ | Ġġ | Ḣḣ | İ | Ṁṁ | Ṅṅ | Ȯȯ | Ṗṗ | Ṙṙ | Ṡṡẛ | Ṫṫ | Ẇẇ | Ẋẋ | Ẏẏ | Żż | |||||||||
dot above and macron | Ǡǡ | Ȱȱ | |||||||||||||||||||||||||||
dot below | Ạạ | Ḅḅ | Ḍḍ | Ẹẹ | Ḥḥ | Ịị | Ḳḳ | Ḷḷ | Ṃṃ | Ṇṇ | Ọọ | Ṛṛ | Ṣṣ | Ṭṭ | Ụụ | Ṿṿ | Ẉẉ | Ỵỵ | Ẓẓ | ||||||||||
dot below and dot above | Ṩṩ | ||||||||||||||||||||||||||||
dot below and macron | Ḹḹ | Ṝṝ | |||||||||||||||||||||||||||
double acute | Őő | Űű | |||||||||||||||||||||||||||
double grave | Ȁȁ | Ȅȅ | Ȉȉ | Ȍȍ | Ȑȑ | Ȕȕ | |||||||||||||||||||||||
grave | Àà | Èè | Ìì | Ǹǹ | Òò | Ùù | Ẁẁ | Ỳỳ | |||||||||||||||||||||
hook above | Ảả | Ẻẻ | Ỉỉ | Ỏỏ | Ủủ | Ỷỷ | |||||||||||||||||||||||
horn | Ơơ | Ưư | |||||||||||||||||||||||||||
horn and acute | Ớớ | Ứứ | |||||||||||||||||||||||||||
horn and dot below | Ợợ | Ựự | |||||||||||||||||||||||||||
horn and grave | Ờờ | Ừừ | |||||||||||||||||||||||||||
horn and hook above | Ởở | Ửử | |||||||||||||||||||||||||||
horn and tilde | Ỡỡ | Ữữ | |||||||||||||||||||||||||||
inverted breve | Ȃȃ | Ȇȇ | Ȋȋ | Ȏȏ | Ȓȓ | Ȗȗ | |||||||||||||||||||||||
macron | Āā | Ǣǣ | Ēē | Ḡḡ | Īī | Ōō | Ūū | Ȳȳ | |||||||||||||||||||||
macron and acute | Ḗḗ | Ṓṓ | |||||||||||||||||||||||||||
macron and diaeresis | Ṻṻ | ||||||||||||||||||||||||||||
macron and grave | Ḕḕ | Ṑṑ | |||||||||||||||||||||||||||
macron below | Ḇḇ | Ḏḏ | ẖ | Ḵḵ | Ḻḻ | Ṉṉ | Ṟṟ | Ṯṯ | Ẕẕ | ||||||||||||||||||||
ogonek | Ąą | Ęę | Įį | Ǫǫ | Ųų | ||||||||||||||||||||||||
ogonek and macron | Ǭǭ | ||||||||||||||||||||||||||||
ring above | Åå | Ůů | ẘ | ẙ | |||||||||||||||||||||||||
ring above and acute | Ǻǻ | ||||||||||||||||||||||||||||
ring below | Ḁḁ | ||||||||||||||||||||||||||||
tilde | Ãã | Ẽẽ | Ĩĩ | Ññ | Õõ | Ũũ | Ṽṽ | Ỹỹ | |||||||||||||||||||||
tilde and acute | Ṍṍ | Ṹṹ | |||||||||||||||||||||||||||
tilde and diaeresis | Ṏṏ | ||||||||||||||||||||||||||||
tilde and macron | Ȭȭ | ||||||||||||||||||||||||||||
tilde below | Ḛḛ | Ḭḭ | Ṵṵ | ||||||||||||||||||||||||||
Name | Uppercase | Lowercase |
---|---|---|
angstrom sign | Å (U+212B) | |
a with right half ring | | ẚ (U+1E9A) |
kelvin sign | K (U+212A) | |
l with interpunct | Ŀ (U+013F) | ŀ (U+0140) |
apostrophe n | | ŉ (U+0149) |
long s | | ſ (U+017F) |
A collection of precomposed Latin characters (mostly abbreviations of units of measurement) is also included in the CJK Compatibility and Enclosed CJK Letters and Months blocks of Unicode, as is a set of precomposed Roman numerals; these characters are intended for use with East Asian languages and are not meant to be mixed with ordinary Latin-script text. Several enclosed alphanumerics are also encoded in Unicode.
Some characters in the Letterlike Symbols block can be substituted with characters in the ASCII range.
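One way to see this substitution is through compatibility normalization. The sketch below uses Python's standard `unicodedata` module and a hand-picked set of Letterlike Symbols; the helper name `ascii_fallback` and the sample list are illustrative assumptions, and not every character in the block has such a mapping.

```python
import unicodedata

# A few Letterlike Symbols (block U+2100-U+214F), chosen for illustration.
samples = ["\u2102", "\u210E", "\u2115", "\u2116", "\u2122"]

def ascii_fallback(ch):
    """Return the NFKC form of ch (compatibility decomposition, then composition).

    Hypothetical helper: for the samples above the result is plain ASCII,
    but that is not guaranteed for every Letterlike Symbol.
    """
    return unicodedata.normalize("NFKC", ch)

for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {ascii_fallback(ch)!r}")
```

Compatibility normalization discards the styling distinction (double-struck, superscript and so on), so it suits searching and matching better than text that must round-trip unchanged.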
ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series of standards consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12. The ISO working group maintaining this series of standards has been disbanded.
Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium and designed to support the use of text written in all of the world's major writing systems. Version 15.1 of the standard defines 149,813 characters and 161 scripts used in various ordinary, literary, academic, and technical contexts.
A digraph or digram is a pair of characters used in the orthography of a language to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined.
A precomposed character is a Unicode entity that can also be defined as a sequence of one or more other characters. A precomposed character may typically represent a letter with a diacritical mark, such as é. Technically, é (U+00E9) is a character that can be decomposed into an equivalent string of the base letter e (U+0065) and combining acute accent (U+0301). Similarly, ligatures are precompositions of their constituent letters or graphemes.
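The decomposition of é described above can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `code_points` is an assumption introduced for illustration.

```python
import unicodedata

def code_points(text):
    """Format each character of text as a U+XXXX code point label."""
    return [f"U+{ord(ch):04X}" for ch in text]

precomposed = "\u00E9"  # é as a single precomposed character

# NFD (canonical decomposition) splits it into base letter + combining accent.
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC (canonical composition) reassembles the precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points(precomposed))  # one code point
print(code_points(decomposed))   # base letter followed by combining acute
print(recomposed == precomposed)
```

Both strings render identically, but they compare unequal unless they are first normalized to the same form.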
Gaj's Latin alphabet, also known as abeceda or gajica, is the form of the Latin script used for writing Serbo-Croatian and all of its standard varieties: Bosnian, Croatian, Montenegrin, and Serbian.
Dž is the seventh letter of Gaj's Latin alphabet for Serbo-Croatian, after D and before Đ. It is pronounced [d͡ʒ]. Dž is a digraph that corresponds to the letter Dzhe (Џ/џ) of the Serbian Cyrillic alphabet. It is also the tenth letter of the Slovak alphabet. Although several other languages also use the letter combination DŽ, they treat it as a pair of the letters D and Ž, not as a single distinct letter.
YUSCII is an informal name for several JUS standards for 7-bit character encoding.
New Gulim (새굴림/SaeGulRim) is a sans-serif Unicode font created by HanYang System Co., Limited, especially for the Korean-language script. It is an expanded version of Hanyang Gulrim.
Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with combining diacritics, as well as some ligatures and distinct letters, used for example in the orthographies of various African languages and the Vietnamese alphabet. Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista). Latin Extended-F and -G contain characters for phonetic transcription.
In computing, a Unicode symbol is a Unicode character which is not part of a script used to write a natural language, but is nonetheless available for use as part of a text.
Unicode equivalence is the specification by the Unicode character encoding standard that some sequences of code points represent essentially the same character. This feature was introduced in the standard to allow compatibility with preexisting standard character sets, which often included similar or identical characters.
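The Unicode Standard distinguishes two kinds of equivalence, canonical and compatibility; that distinction comes from the standard itself rather than from the summary above. The sketch below tests both with Python's `unicodedata` module. The function names are assumptions introduced for illustration.

```python
import unicodedata

def canonically_equivalent(a, b):
    """True if a and b have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

def compatibility_equivalent(a, b):
    """True if a and b have the same compatibility decomposition (NFKD)."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)

angstrom_sign = "\u212B"   # ANGSTROM SIGN
a_with_ring = "\u00C5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
a_plus_ring = "A\u030A"    # A followed by COMBINING RING ABOVE
long_s = "\u017F"          # LATIN SMALL LETTER LONG S

print(canonically_equivalent(angstrom_sign, a_with_ring))
print(canonically_equivalent(a_with_ring, a_plus_ring))
print(canonically_equivalent(long_s, "s"))
print(compatibility_equivalent(long_s, "s"))
```

The angstrom sign is canonically equivalent to Å, whereas the long s matches the ordinary s only under the looser compatibility equivalence.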
In Unicode and the UCS, a compatibility character is a character that is encoded solely to maintain round-trip convertibility with other, often older, standards. The Unicode Glossary defines it as "a character that would not have been encoded except for compatibility and round-trip convertibility with other standards".
Enclosed Alphanumerics is a Unicode block of typographical symbols, each consisting of an alphanumeric character within a circle, within brackets or another not-closed enclosure, or followed by a full stop.
CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation used for writing the Chinese, Japanese and Korean languages. It also contains one Chinese character.
Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane.
Hangul Jamo is a Unicode block containing positional forms of the Hangul consonant and vowel clusters. While the Hangul Syllables Unicode block contains precomposed syllables used in standard modern Korean, the Hangul Jamo block can be used to compose arbitrary syllables dynamically including those not included in the Hangul Syllables block.
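For modern Korean, the relationship between conjoining jamo and precomposed syllables is purely arithmetic. The sketch below follows the Hangul syllable composition formula from the Unicode Standard and compares the result with NFC normalization; the function name `compose_syllable` is an assumption, and the code performs no range checking, so it is only valid for modern leading consonants (U+1100 to U+1112), vowels (U+1161 to U+1175) and trailing consonants (U+11A8 to U+11C2).

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE = 0xAC00   # first precomposed syllable
L_BASE = 0x1100   # first leading consonant (choseong)
V_BASE = 0x1161   # first vowel (jungseong)
T_BASE = 0x11A7   # one before the first trailing consonant (jongseong)
V_COUNT = 21
T_COUNT = 28      # 27 trailing consonants plus "no trailing consonant"

def compose_syllable(lead, vowel, trail=None):
    """Compose modern jamo into one precomposed syllable (no validation)."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

jamo = "\u1112\u1161\u11AB"   # HIEUH + A + NIEUN
syllable = compose_syllable(*jamo)

print(f"U+{ord(syllable):04X}")
print(unicodedata.normalize("NFC", jamo) == syllable)
```

Jamo sequences that fall outside these modern ranges have no precomposed counterpart and remain as sequences after normalization.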
Enclosed CJK Letters and Months is a Unicode block containing circled and parenthesized Katakana, Hangul, and CJK ideographs. Also included in the block are miscellaneous glyphs that would more likely fit in CJK Compatibility or Enclosed Alphanumerics: a few unit abbreviations, circled numbers from 21 to 50, and circled multiples of 10 from 10 to 80 enclosed in black squares.
CJK Compatibility is a Unicode block containing square symbols encoded for compatibility with East Asian character sets. In Unicode 1.0, it was divided into two blocks, named CJK Squared Words (U+3300–U+337F) and CJK Squared Abbreviations (U+3380–U+33FF).
Noto is a font family comprising over 100 individual computer fonts, which are together designed to cover all the scripts encoded in the Unicode standard. As of October 2016, Noto fonts cover all 93 scripts defined in Unicode version 6.1, although fewer than 30,000 of the nearly 75,000 CJK unified ideographs in version 6.0 are covered. In total, Noto fonts cover over 77,000 characters, which is around half of the 149,186 characters defined in Unicode 15.0.