Unicode supports several phonetic scripts and notations through its existing scripts and the addition of extra blocks with phonetic characters. These phonetic characters are derived from an existing script, usually Latin, Greek or Cyrillic. Apart from the International Phonetic Alphabet (IPA), extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.
The International Phonetic Alphabet (IPA), like most phonetic scripts, borrows letters from other writing systems, notably Latin, Greek and Cyrillic. Combining diacritics add further meaning to the phonetic text. Finally, these phonetic alphabets make use of modifier letters, which are specially constructed for phonetic use. A "modifier letter" is strictly intended not as an independent grapheme but as a modification of the preceding character [1] resulting in a distinct grapheme, notably in the context of the International Phonetic Alphabet. For example, ʰ should not occur on its own but modifies the preceding or following symbol. Thus, tʰ is a single IPA symbol, distinct from t. In practice, however, several of these "modifier letters" are also used as full graphemes, e.g. ʿ transliterating Semitic ayin or Hawaiian ʻokina, or ˚ transliterating Abkhaz ә.
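As a rough illustration of the distinction drawn above, the following Python sketch uses the standard-library `unicodedata` module to compare a modifier letter with a combining diacritic. The helper name `describe` is an assumption made for this example; the general categories it reports (`Lm` for modifier letters, `Mn` for non-spacing marks) come from the Unicode Character Database bundled with Python.

```python
import unicodedata


def describe(ch):
    """Return (code point, general category, character name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))


# A modifier letter: a spacing character that follows its base letter.
print(describe("\u02B0"))  # ('U+02B0', 'Lm', 'MODIFIER LETTER SMALL H')

# A combining diacritic: a non-spacing mark drawn on the preceding base letter.
print(describe("\u0325"))  # ('U+0325', 'Mn', 'COMBINING RING BELOW')

# A modifier letter that is also used as a full grapheme in transliteration.
print(describe("\u02BF"))  # ('U+02BF', 'Lm', 'MODIFIER LETTER LEFT HALF RING')

# Combining marks have a non-zero canonical combining class; modifier letters do not.
print(unicodedata.combining("\u0325") != 0)  # True
print(unicodedata.combining("\u02B0") != 0)  # False
```

Both tʰ and n̥ are stored as two code points; only the second relies on the renderer to stack the mark onto its base letter.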
The following tables indicate the Unicode code point sequences for phonemes as used in the International Phonetic Alphabet. A bold code point indicates that the Unicode chart provides an application note such as "voiced retroflex lateral" for U+026D ɭ LATIN SMALL LETTER L WITH RETROFLEX HOOK. An entry in bold italics indicates that the character name itself refers to a phoneme, such as U+0298 ʘ LATIN LETTER BILABIAL CLICK.
Legend: Basic Latin/Greek • Latin Extended • IPA Extensions
Bilabial | Labiodental | Dental | Alveolar | Postalveolar | Retroflex | Labialized palatal | Postalveolar-velar | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plosive | p
| b
| p̪ U+0070 U+032A | b̪ U+0062 U+032A | t̪ U+0074 U+032A | d̪ U+0064 U+032A | t
| d
| ʈ
| ɖ
| ||||||
Implosive | ɓ̥ U+0253 U+0325 | ɓ
| ɗ̪ U+0257 U+032A | ɗ
| ᶑ U+1D91 | |||||||||||
Ejective | pʼ U+0070 U+02BC | t̪ʼ U+0074 U+032A U+02BC | tʼ U+0074 U+02BC | ʈʼ U+0288 U+02BC | ||||||||||||
Nasal | m̥ U+006D U+0325 | m
| ɱ̊ U+0271 U+030A | ɱ
| n̪̊ U+006E U+032A U+030A | n̪ U+006E U+032A | n̥ U+006E U+0325 | n
| ɳ̊ U+0273 U+030A | ɳ
| ||||||
Trill | ʙ
| r̥ U+0072 U+0325 | r
| * | ||||||||||||
Tap or Flap | ⱱ̟ U+2C71 U+031F | ⱱ
| ɾ
| ɽ
| ||||||||||||
Lateral flap | ɺ
| 𝼈 U+1DF08 | ||||||||||||||
Fricative | ɸ
| β
| f
| v
| θ
| ð
| s
| z
| ʃ
| ʒ
| ʂ
| ʐ
| ɧ
| |||
Lateral fricative | ɬ
| ɮ
| ꞎ
| |||||||||||||
Ejective fricative | sʼ U+0073 U+02BC | ʃʼ U+0283 U+02BC | ||||||||||||||
Ejective lateral fricative | ɬʼ U+026C U+02BC | |||||||||||||||
Percussive | ʬ
| ʭ
| ||||||||||||||
Approximant | β̞̊ U+03B2 U+031E U+030A | β̞ U+03B2 U+031E | ʋ̥ U+028B U+0325 | ʋ
| ð̞ U+00F0 U+031E | ɹ̥ U+0279 U+0325 | ɹ
| ɻ̊ U+027B U+030A | ɻ
| ɥ̊ U+0265 U+030A | ɥ
| |||||
Lateral approximant | l̥ U+006C U+0325 | l
| ɭ
| |||||||||||||
Click consonant | ʘ
| ǀ
| ǃ
| ǃ / ǂ /
| 𝼊
| |||||||||||
Lateral click | * | ǁ
|
Alveolo-palatal | Palatal | Labial-velar | Velar | Uvular | Pharyngeal | Epiglottal | Glottal | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plosive | ȶ
| ȡ
| c
| ɟ
| k͡p U+006B U+0361 U+0070 | ɡ͡b U+0261 U+0361 U+0062 | k
| ɡ
| q
| ɢ
| ʡ
| ʔ
| ||||
Implosive | ʄ
| ɠ
| ʛ
| |||||||||||||
Ejective | cʼ U+0063 U+02BC | kʼ U+006B U+02BC | qʼ U+0071 U+02BC | |||||||||||||
Nasal | ȵ
| ɲ
| ŋ͡m U+014B U+0361 U+006D | ŋ
| ɴ
| |||||||||||
Trill | ʀ
| * | ||||||||||||||
Tap or Flap | * | |||||||||||||||
Lateral flap | * | * | ||||||||||||||
Fricative | ɕ
| ʑ
| ç U+00E7 | ʝ
| x
| ɣ
| χ
| ʁ
| ħ
| ʕ
| ʜ
| ʢ
| h
| ɦ
| ||
Approximant | j
| ʍ
| w
| ɰ
| ||||||||||||
Lateral approximant | ȴ
| ʎ
| ʟ
|
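The code point sequences listed in the consonant tables above can be reproduced from the symbols themselves. The following is a minimal Python sketch; the function name `code_points` is an assumption made for this example, and the sample strings are taken from the table cells.

```python
def code_points(text):
    """Format each character of text as a U+XXXX code point, space-separated."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Dental ejective: base letter, combining bridge below, modifier apostrophe.
print(code_points("t\u032A\u02BC"))   # U+0074 U+032A U+02BC

# Labial-velar plosive: two base letters joined by a combining double inverted breve.
print(code_points("k\u0361p"))        # U+006B U+0361 U+0070

# A character outside the Basic Multilingual Plane needs five hex digits.
print(code_points("\U0001DF08"))      # U+1DF08
```

Because several table entries are sequences rather than single characters, the length of such a string in code points is often greater than the number of symbols a reader perceives.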
IPA vowel chart (legend: unrounded • rounded)
The following table lists the phonetic vowels and their Unicode / UCS code points, arranged to reflect the phonetic vowel trapezium. Vowels appearing in pairs are the unrounded and rounded variants respectively, separated by a bullet (unrounded • rounded). Again, characters with Unicode names referring to phonemes are indicated by bold text. Those with explicit application notes are indicated by bold italic text. Those borrowed unchanged from another script (Latin, Greek or Cyrillic) are indicated by italics.
Front | Central | Back | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Close | i • y U+0069 | ɨ • ʉ U+0268 | ɯ • u U+026F | ||||||||||||
Near-close | ɪ • ʏ U+026A | ɪ̈ • ʊ̈ U+026A U+0308 | • ʊ
| ||||||||||||
Close-mid | e • ø U+0065 | ɘ • ɵ U+0258 | ɤ • o U+0264 | ||||||||||||
Mid | e̞ • ø̞ U+0065 U+031E | ə
| ɤ̞ • o̞ U+0264 U+031E | ||||||||||||
Open-mid | ɛ • œ U+025B | ɜ • ɞ U+025C | ʌ • ɔ U+028C | ||||||||||||
Near-open | æ • U+00E6 | ɐ | |||||||||||||
Open | a • ɶ U+0061 | ä • U+0061 U+0308 | ɑ • ɒ U+0251 |
Diacritics may be encoded as either modifier (e.g. ˳) or combining (e.g. ◌̥) characters.
Voiceless | Breathy Voiced | Dental | Syllabic | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
˳ • ◌̥ U+02F3 • U+0325 | ◌̤ U+0324 | ◌̪ U+032A | ˌ • ◌̩ U+02CC • U+0329 | ||||||||
Voiced | Creaky Voiced | Apical | Non-syllabic | ||||||||
ˬ • ◌̬ U+02EC • U+032C | ˷ • ◌̰ U+02F7 • U+0330 | ˽ • ◌̺ U+02FD • U+033A | ◌̯ U+032F | ||||||||
Aspirated | Linguolabial | Laminal | More Rounded | ||||||||
ʰ U+02B0 | ◌̼ U+033C | ◌̻ U+033B | ˒ • ◌̹ U+02D2 • U+0339 | ||||||||
Labialized | Nasalized | Palatalized | Less Rounded | ||||||||
ʷ U+02B7 | ◌̃ U+0303 | ʲ U+02B2 | ˓ • ◌̜ U+02D3 • U+031C | ||||||||
Advanced | Nasal release | Centralized | Velarized | ||||||||
˖ • ◌̟ U+02D6 • U+031F | ⁿ U+207F | ¨ • ◌̈ U+00A8 [1] • U+0308 | ˠ U+02E0 | ||||||||
Retracted | Lateral release | Mid-Centralized | Pharyngealized | ||||||||
ˍ • ◌̠ U+02CD • U+0320 | ˡ U+02E1 | ˟ • ◌̽ U+02DF • U+033D | ˤ U+02E4 | ||||||||
Advanced Tongue Root | No audible release | Raised | Velarized or Pharyngealized | ||||||||
꭪ • ◌̘ U+AB6A • U+0318 | ˺ • ◌̚ U+02FA • U+031A | ˔ • ◌̝ U+02D4 • U+031D | ◌̴ U+0334 | ||||||||
Retracted Tongue Root | Rhoticity | Lowered | Lengthened | ||||||||
꭫ • ◌̙ U+AB6B • U+0319 | ˞ U+02DE | ˕ • ◌̞ U+02D5 • U+031E | ː U+02D0 | ||||||||
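The two encodings of a diacritic listed in the table above behave differently in text: the spacing form stands on its own, while the combining form attaches to the preceding base letter. The Python sketch below illustrates this under stated assumptions: the dictionary `DIACRITICS` and the helper `apply_diacritic` are hypothetical names for this example, and only a handful of pairs from the table are included.

```python
import unicodedata

# (spacing modifier form, combining form) pairs copied from the table above.
DIACRITICS = {
    "voiceless": ("\u02F3", "\u0325"),
    "voiced":    ("\u02EC", "\u032C"),
    "syllabic":  ("\u02CC", "\u0329"),
    "advanced":  ("\u02D6", "\u031F"),
    "retracted": ("\u02CD", "\u0320"),
}


def apply_diacritic(base, feature):
    """Append the combining form of the named diacritic to a base letter."""
    _spacing, combining = DIACRITICS[feature]
    return base + combining


for feature, (spacing, combining) in DIACRITICS.items():
    # The combining form is a non-spacing mark (category Mn) ...
    assert unicodedata.category(combining) == "Mn"
    # ... while the spacing form sits in the Spacing Modifier Letters block.
    assert 0x02B0 <= ord(spacing) <= 0x02FF

voiceless_n = apply_diacritic("n", "voiceless")
print([f"U+{ord(ch):04X}" for ch in voiceless_n])  # ['U+006E', 'U+0325']
```

The result matches the sequence U+006E U+0325 given for the voiceless alveolar nasal in the consonant table.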
Six Unicode blocks contain many phonetic symbols:
IPA Extensions [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+025x | ɐ | ɑ | ɒ | ɓ | ɔ | ɕ | ɖ | ɗ | ɘ | ə | ɚ | ɛ | ɜ | ɝ | ɞ | ɟ |
U+026x | ɠ | ɡ | ɢ | ɣ | ɤ | ɥ | ɦ | ɧ | ɨ | ɩ | ɪ | ɫ | ɬ | ɭ | ɮ | ɯ |
U+027x | ɰ | ɱ | ɲ | ɳ | ɴ | ɵ | ɶ | ɷ | ɸ | ɹ | ɺ | ɻ | ɼ | ɽ | ɾ | ɿ |
U+028x | ʀ | ʁ | ʂ | ʃ | ʄ | ʅ | ʆ | ʇ | ʈ | ʉ | ʊ | ʋ | ʌ | ʍ | ʎ | ʏ |
U+029x | ʐ | ʑ | ʒ | ʓ | ʔ | ʕ | ʖ | ʗ | ʘ | ʙ | ʚ | ʛ | ʜ | ʝ | ʞ | ʟ |
U+02Ax | ʠ | ʡ | ʢ | ʣ | ʤ | ʥ | ʦ | ʧ | ʨ | ʩ | ʪ | ʫ | ʬ | ʭ | ʮ | ʯ |
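A grid like the IPA Extensions chart above can be regenerated from the code point range alone, since each row label stands for sixteen consecutive code points. The following Python sketch shows one way to do so; the function name `block_rows` and the row-prefix arguments are assumptions made for this example.

```python
def block_rows(first_row, last_row):
    """Map row labels such as 'U+025x' to the 16 characters in that row.

    first_row and last_row are the code point prefixes without the final
    hex digit, e.g. 0x025 for the row U+0250..U+025F.
    """
    rows = {}
    for row in range(first_row, last_row + 1):
        label = f"U+{row:03X}x"
        rows[label] = [chr(row * 16 + column) for column in range(16)]
    return rows


# IPA Extensions spans the rows U+025x through U+02Ax in the chart above.
ipa_extensions = block_rows(0x025, 0x02A)

for label, characters in ipa_extensions.items():
    print(label, "|", " | ".join(characters), "|")
```

This only works directly for blocks with no unassigned positions; a chart with gaps would need a check for each code point before printing it.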
The characters in the "Spacing Modifier Letters" block are intended to form a unit with the preceding letter (which they "modify"). For example, the character U+02B0 ʰ MODIFIER LETTER SMALL H is not intended simply as a superscript h, but as the mark of aspiration placed after the letter being aspirated, as in pʰ "aspirated voiceless bilabial plosive". The block contains:
Spacing Modifier Letters [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+02Bx | ʰ | ʱ | ʲ | ʳ | ʴ | ʵ | ʶ | ʷ | ʸ | ʹ | ʺ | ʻ | ʼ | ʽ | ʾ | ʿ |
U+02Cx | ˀ | ˁ | ˂ | ˃ | ˄ | ˅ | ˆ | ˇ | ˈ | ˉ | ˊ | ˋ | ˌ | ˍ | ˎ | ˏ |
U+02Dx | ː | ˑ | ˒ | ˓ | ˔ | ˕ | ˖ | ˗ | ˘ | ˙ | ˚ | ˛ | ˜ | ˝ | ˞ | ˟ |
U+02Ex | ˠ | ˡ | ˢ | ˣ | ˤ | ˥ | ˦ | ˧ | ˨ | ˩ | ˪ | ˫ | ˬ | ˭ | ˮ | ˯ |
U+02Fx | ˰ | ˱ | ˲ | ˳ | ˴ | ˵ | ˶ | ˷ | ˸ | ˹ | ˺ | ˻ | ˼ | ˽ | ˾ | ˿ |
The Phonetic Extensions block, together with the Phonetic Extensions Supplement block below, contains:
Phonetic Extensions [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1D0x | ᴀ | ᴁ | ᴂ | ᴃ | ᴄ | ᴅ | ᴆ | ᴇ | ᴈ | ᴉ | ᴊ | ᴋ | ᴌ | ᴍ | ᴎ | ᴏ |
U+1D1x | ᴐ | ᴑ | ᴒ | ᴓ | ᴔ | ᴕ | ᴖ | ᴗ | ᴘ | ᴙ | ᴚ | ᴛ | ᴜ | ᴝ | ᴞ | ᴟ |
U+1D2x | ᴠ | ᴡ | ᴢ | ᴣ | ᴤ | ᴥ | ᴦ | ᴧ | ᴨ | ᴩ | ᴪ | ᴫ | ᴬ | ᴭ | ᴮ | ᴯ |
U+1D3x | ᴰ | ᴱ | ᴲ | ᴳ | ᴴ | ᴵ | ᴶ | ᴷ | ᴸ | ᴹ | ᴺ | ᴻ | ᴼ | ᴽ | ᴾ | ᴿ |
U+1D4x | ᵀ | ᵁ | ᵂ | ᵃ | ᵄ | ᵅ | ᵆ | ᵇ | ᵈ | ᵉ | ᵊ | ᵋ | ᵌ | ᵍ | ᵎ | ᵏ |
U+1D5x | ᵐ | ᵑ | ᵒ | ᵓ | ᵔ | ᵕ | ᵖ | ᵗ | ᵘ | ᵙ | ᵚ | ᵛ | ᵜ | ᵝ | ᵞ | ᵟ |
U+1D6x | ᵠ | ᵡ | ᵢ | ᵣ | ᵤ | ᵥ | ᵦ | ᵧ | ᵨ | ᵩ | ᵪ | ᵫ | ᵬ | ᵭ | ᵮ | ᵯ |
U+1D7x | ᵰ | ᵱ | ᵲ | ᵳ | ᵴ | ᵵ | ᵶ | ᵷ | ᵸ | ᵹ | ᵺ | ᵻ | ᵼ | ᵽ | ᵾ | ᵿ |
Phonetic Extensions Supplement [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1D8x | ᶀ | ᶁ | ᶂ | ᶃ | ᶄ | ᶅ | ᶆ | ᶇ | ᶈ | ᶉ | ᶊ | ᶋ | ᶌ | ᶍ | ᶎ | ᶏ |
U+1D9x | ᶐ | ᶑ | ᶒ | ᶓ | ᶔ | ᶕ | ᶖ | ᶗ | ᶘ | ᶙ | ᶚ | ᶛ | ᶜ | ᶝ | ᶞ | ᶟ |
U+1DAx | ᶠ | ᶡ | ᶢ | ᶣ | ᶤ | ᶥ | ᶦ | ᶧ | ᶨ | ᶩ | ᶪ | ᶫ | ᶬ | ᶭ | ᶮ | ᶯ |
U+1DBx | ᶰ | ᶱ | ᶲ | ᶳ | ᶴ | ᶵ | ᶶ | ᶷ | ᶸ | ᶹ | ᶺ | ᶻ | ᶼ | ᶽ | ᶾ | ᶿ |
Modifier Tone Letters [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+A70x | ꜀ | ꜁ | ꜂ | ꜃ | ꜄ | ꜅ | ꜆ | ꜇ | ꜈ | ꜉ | ꜊ | ꜋ | ꜌ | ꜍ | ꜎ | ꜏ |
U+A71x | ꜐ | ꜑ | ꜒ | ꜓ | ꜔ | ꜕ | ꜖ | ꜗ | ꜘ | ꜙ | ꜚ | ꜛ | ꜜ | ꜝ | ꜞ | ꜟ |
Superscripts and Subscripts [1] [2] [3] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+207x | ⁰ | ⁱ | | | ⁴ | ⁵ | ⁶ | ⁷ | ⁸ | ⁹ | ⁺ | ⁻ | ⁼ | ⁽ | ⁾ | ⁿ |
U+208x | ₀ | ₁ | ₂ | ₃ | ₄ | ₅ | ₆ | ₇ | ₈ | ₉ | ₊ | ₋ | ₌ | ₍ | ₎ | |
U+209x | ₐ | ₑ | ₒ | ₓ | ₔ | ₕ | ₖ | ₗ | ₘ | ₙ | ₚ | ₛ | ₜ | | | |
Many systems provide a way to select Unicode characters visually. ISO/IEC 14755 refers to this as a screen-selection entry method.
Microsoft Windows has provided a Unicode version of the Character Map program since version NT 4.0, appearing in the consumer edition since XP (to open it, press ⊞ Win+R, type charmap, then press ↵ Enter). This is limited to characters in the Basic Multilingual Plane (BMP). Characters are searchable by Unicode character name, and the table can be limited to a particular code block. More advanced third-party tools of the same type are also available (a notable freeware example is BabelMap).
macOS provides a "character palette" with much the same functionality, along with searching by related characters, glyph tables in a font, etc. It can be enabled in the input menu in the menu bar under System Preferences → International → Input Menu (or System Preferences → Language and Text → Input Sources) or can be viewed under Edit → Emoji & Symbols in many programs.
Equivalent tools – such as gucharmap (GNOME) or kcharselect (KDE) – exist on most Linux desktop environments.
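Searching by character name and checking whether a character lies in the BMP, as described for these tools, can also be approximated in a script. The Python sketch below uses the standard-library `unicodedata` module; the helper names `find_by_name` and `in_bmp` are assumptions made for this example, and the names available depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def find_by_name(fragment, start, end):
    """Return characters in [start, end] whose Unicode name contains fragment."""
    matches = []
    for code_point in range(start, end + 1):
        ch = chr(code_point)
        # Unassigned code points have no name; fall back to an empty string.
        if fragment.upper() in unicodedata.name(ch, ""):
            matches.append(ch)
    return matches


def in_bmp(ch):
    """True if the character is in the Basic Multilingual Plane (U+0000..U+FFFF)."""
    return ord(ch) <= 0xFFFF


# Exact lookup by full character name.
click = unicodedata.lookup("LATIN LETTER BILABIAL CLICK")
print(f"U+{ord(click):04X}")  # U+0298

# Substring search restricted to one block (IPA Extensions, U+0250..U+02AF).
print(find_by_name("click", 0x0250, 0x02AF))

# The lateral flap U+1DF08 from the consonant table lies outside the BMP.
print(in_bmp(click), in_bmp("\U0001DF08"))  # True False
```

A tool limited to the BMP would therefore not be able to show characters such as U+1DF08.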