Unicode supports several phonetic scripts and notation systems through its existing scripts and the addition of extra blocks with phonetic characters. These phonetic characters are derived from an existing script, usually Latin, Greek or Cyrillic. Apart from the International Phonetic Alphabet (IPA), extensions to the IPA and obsolete and nonstandard IPA symbols, these blocks also contain characters from the Uralic Phonetic Alphabet and the Americanist Phonetic Alphabet.
The International Phonetic Alphabet (IPA), like most phonetic scripts, makes use of letters from other writing systems, notably Latin, Greek and Cyrillic. Combining diacritics add further meaning to phonetic text. Finally, these phonetic alphabets make use of modifier letters, which are specially constructed to carry phonetic meaning. A "modifier letter" is strictly intended not as an independent grapheme but as a modification of the preceding character [1], resulting in a distinct grapheme, notably in the context of the International Phonetic Alphabet. For example, ʰ should not occur on its own but modifies the preceding or following symbol. Thus, tʰ is a single IPA symbol, distinct from t. In practice, however, several of these "modifier letters" are also used as full graphemes, e.g. ʿ as a transliteration of Semitic ayin or Hawaiian ʻokina, or ˚ as a transliteration of Abkhaz ә.
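The distinction between base letters, modifier letters and combining diacritics is recorded in the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is an illustrative choice, not part of any standard API.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return code point, character name and general category for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} [{unicodedata.category(ch)}]"


# "tʰ": a base letter followed by a modifier letter (general category Lm).
for ch in "t\u02B0":
    print(describe(ch))

# "n̥": a base letter followed by a combining diacritic (general category Mn).
for ch in "n\u0325":
    print(describe(ch))
```

Both sequences consist of two code points, although each is read as a single phonetic symbol.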
The following tables indicate the Unicode code point sequences for phonemes as used in the International Phonetic Alphabet. A bold code point indicates that the Unicode chart provides an application note such as "voiced retroflex lateral" for U+026D ɭ LATIN SMALL LETTER L WITH RETROFLEX HOOK. An entry in bold italics indicates that the character name itself refers to a phoneme, such as U+0298 ʘ LATIN LETTER BILABIAL CLICK.
Basic Latin/Greek Latin extended IPA extension
Bilabial | Labiodental | Dental | Alveolar | Postalveolar | Retroflex | Labialized palatal | Postalveolar-velar | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plosive | p
| b
| p̪ U+0070 U+032A | b̪ U+0062 U+032A | t̪ U+0074 U+032A | d̪ U+0064 U+032A | t
| d
| ʈ
| ɖ
| ||||||
Implosive | ɓ̥U+0253U+0325 | ɓ
| ɗ̪ U+0257U+032A | ɗ
| ᶑ U+1D91 | |||||||||||
Ejective | pʼ U+0070 U+02BC | t̪ʼ U+0074 U+032AU+02BC | tʼ U+0074 U+02BC | ʈʼ U+0288U+02BC | ||||||||||||
Nasal | m̥ U+006D U+0325 | m
| ɱ̊U+0271 U+030A | ɱ
| n̪̊U+006E U+032A U+030A | n̪ U+006E U+032A | n̥ U+006E U+0325 | n
| ɳ̊ U+0273 U+030A | ɳ
| ||||||
Trill | ʙ
| r̥ U+0072 U+0325 | r
| * | ||||||||||||
Tap or Flap | ⱱ̟ U+2C71 U+031F | ⱱ
| ɾ
| ɽ
| ||||||||||||
Lateral flap | ɺ
| 𝼈 U+1DF08 | ||||||||||||||
Fricative | ɸ
| β
| f
| v
| θ
| ð
| s
| z
| ʃ
| ʒ
| ʂ
| ʐ
| ɧ
| |||
Lateral fricative | ɬ
| ɮ
| ꞎ
| |||||||||||||
Ejective fricative | sʼ U+0073 U+02BC | ʃʼU+0283U+02BC | ||||||||||||||
Ejective lateral fricative | ɬʼ U+026CU+02BC | |||||||||||||||
Percussive | ʬ
| ʭ
| ||||||||||||||
Approximant | β̞̊U+03B2 U+031E U+030A | β̞ U+03B2 U+031E | ʋ̥U+028B U+0325 | ʋ
| ð̞ U+00F0 U+031E | ɹ̥U+0279 U+0325 | ɹ
| ɻ̊U+027B U+030A | ɻ
| ɥ̊U+0265 U+030A | ɥ
| |||||
Lateral approximant | l̥ U+006C U+0325 | l
| ɭ
| |||||||||||||
Click consonant | ʘ
| ǀ
| ǃ
| ǃ / ǂ /
| 𝼊
| |||||||||||
Lateral click | * | ǁ
|
Alveolo-palatal | Palatal | Labial-velar | Velar | Uvular | Pharyngeal | Epiglottal | Glottal | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plosive | ȶ
| ȡ
| c
| ɟ
| k͡p U+006B U+0361 U+0070 | ɡ͡b U+0261 U+0361 U+0062 | k
| ɡ
| q
| ɢ
| ʡ
| ʔ
| ||||
Implosive | ʄ
| ɠ
| ʛ
| |||||||||||||
Ejective | cʼ U+0063 U+02BC | kʼ U+006B U+02BC | qʼ U+0071 U+02BC | |||||||||||||
Nasal | ȵ
| ɲ
| ŋ͡m U+014B U+0361 U+006D | ŋ
| ɴ
| |||||||||||
Trill | ʀ
| * | ||||||||||||||
Tap or Flap | * | |||||||||||||||
Lateral flap | * | * | ||||||||||||||
Fricative | ɕ
| ʑ
| ç U+00E7 | ʝ
| x
| ɣ
| χ
| ʁ
| ħ
| ʕ
| ʜ
| ʢ
| h
| ɦ
| ||
Approximant | j
| ʍ
| w
| ɰ
| ||||||||||||
Lateral approximant | ȴ
| ʎ
| ʟ
|
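The code point sequences listed in the consonant tables can be reproduced, or turned back into text, with a few lines of Python. This is a sketch only; the function names `code_points` and `from_code_points` are hypothetical helpers introduced for illustration.

```python
def code_points(text: str) -> str:
    """Format a string as a space-separated 'U+XXXX' code point sequence."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def from_code_points(sequence: str) -> str:
    """Build a string from a 'U+XXXX U+YYYY ...' code point sequence."""
    return "".join(chr(int(token[2:], 16)) for token in sequence.split())


# Dental plosive: base letter p plus the combining bridge below.
print(code_points("p\u032A"))
# Labial-velar plosive: k, combining double inverted breve (tie bar), p.
print(code_points("k\u0361p"))
# Round trip from the notation used in the tables back to text.
print(from_code_points("U+0074 U+02BC"))
```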
IPA: Vowels (legend: unrounded • rounded)
The following figures depict the phonetic vowels and their Unicode / UCS code points, arranged to represent the phonetic vowel trapezium. Vowels appearing in pairs are the unrounded and rounded variants respectively: the unrounded vowel comes before the bullet and the rounded vowel after it (unrounded • rounded). Again, characters with Unicode names referring to phonemes are indicated by bold text. Those with explicit application notes are indicated by bold italic text. Those borrowed unchanged from another script (Latin, Greek or Cyrillic) are indicated by italics.
Front | Central | Back | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Close | i • y U+0069 | ɨ • ʉ U+0268 | ɯ • u U+026F | ||||||||||||
Near-close | ɪ • ʏ U+026A | ɪ̈ • ʊ̈ U+026A U+0308 | • ʊ
| ||||||||||||
Close-mid | e • ø U+0065 | ɘ • ɵ U+0258 | ɤ • o U+0264 | ||||||||||||
Mid | e̞ • ø̞ U+0065 U+031E | ə
| ɤ̞ • o̞ U+0264 U+031E | ||||||||||||
Open-mid | ɛ • œ U+025B | ɜ • ɞ U+025C | ʌ • ɔ U+028C | ||||||||||||
Near-open | æ • U+00E6 | ɐ | |||||||||||||
Open | a • ɶ U+0061 | ä • U+0061 U+0308 | ɑ • ɒ U+0251 |
Diacritics may be encoded as either modifier (e.g. ˳) or combining (e.g. ◌̥) characters.
Voiceless | Breathy Voiced | Dental | Syllabic | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
˳ • ◌̥U+02F3 • U+0325 | ◌̤U+0324 | ◌͏̪U+032A | ˌ • ◌̩U+02CC • U+0329 | ||||||||
Voiced | Creaky Voiced | Apical | Non-syllabic | ||||||||
ˬ • ◌̬U+02EC • U+032C | ˷ • ◌̰U+02F7 • U+0330 | ˽ • ◌̺U+02FD • U+033A | ◌͏̯U+032F | ||||||||
Aspirated | Linguolabial | Laminal | More Rounded | ||||||||
ʰU+02B0 | ◌͏̼U+033C | ◌͏̻U+033B | ˒ • ◌̹U+02D2 • U+0339 | ||||||||
Labialized | Nasalized | Palatalized | Less Rounded | ||||||||
ʷU+02B7 | ◌̃U+0303 | ʲU+02B2 | ˓ • ◌̜U+02D3 • U+031C | ||||||||
Advanced | Nasal release | Centralized | Velarized | ||||||||
˖ • ◌̟U+02D6 • U+031F | ⁿU+207F | ¨ • ◌̈U+00A8 [1] • U+0308 | ˠU+02E0 | ||||||||
Retracted | Lateral release | Mid-Centralized | Pharyngealized | ||||||||
ˍ • ◌̠U+02CD • U+0320 | ˡU+02E1 | ˟ • ◌̽U+02DF • U+033D | ˤU+02E4 | ||||||||
Advanced Tongue Root | No audible release | Raised | Velarized or Pharyngealized | ||||||||
꭪ • ◌̘U+AB6A • U+0318 | ˺ • ◌̚U+02FA • U+031A | ˔ • ◌̝U+02D4 • U+031D | ◌̴U+0334 | ||||||||
Retracted Tongue Root | Rhoticity | Lowered | Lengthened | ||||||||
꭫ • ◌̙U+AB6B • U+0319 | ˞U+02DE | ˕ • ◌̞U+02D5 • U+031E | ː U+02D0 | ||||||||
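Software that processes transcriptions often needs to tell the two encodings of a diacritic apart. The sketch below, which assumes only Python's standard `unicodedata` module, classifies a character by its general category and removes combining marks from a string; `is_combining_mark` and `strip_combining` are hypothetical names chosen for illustration.

```python
import unicodedata


def is_combining_mark(ch: str) -> bool:
    """True for characters whose general category is a mark (Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


def strip_combining(text: str) -> str:
    """Drop combining marks, leaving base letters and spacing modifiers intact."""
    return "".join(ch for ch in text if not is_combining_mark(ch))


print(is_combining_mark("\u0325"))       # COMBINING RING BELOW
print(is_combining_mark("\u02F3"))       # MODIFIER LETTER LOW RING (spacing)
print(strip_combining("n\u032A\u030A"))  # voiceless dental nasal, marks removed
print(strip_combining("\u00E7"))         # precomposed c with cedilla is left alone
```

The function deliberately does not normalize its input first: decomposing the text would split the precomposed ç (U+00E7), which the tables list as a phoneme in its own right, into c plus a combining cedilla.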
Six Unicode blocks contain many phonetic symbols:
IPA Extensions [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+025x | ɐ | ɑ | ɒ | ɓ | ɔ | ɕ | ɖ | ɗ | ɘ | ə | ɚ | ɛ | ɜ | ɝ | ɞ | ɟ |
U+026x | ɠ | ɡ | ɢ | ɣ | ɤ | ɥ | ɦ | ɧ | ɨ | ɩ | ɪ | ɫ | ɬ | ɭ | ɮ | ɯ |
U+027x | ɰ | ɱ | ɲ | ɳ | ɴ | ɵ | ɶ | ɷ | ɸ | ɹ | ɺ | ɻ | ɼ | ɽ | ɾ | ɿ |
U+028x | ʀ | ʁ | ʂ | ʃ | ʄ | ʅ | ʆ | ʇ | ʈ | ʉ | ʊ | ʋ | ʌ | ʍ | ʎ | ʏ |
U+029x | ʐ | ʑ | ʒ | ʓ | ʔ | ʕ | ʖ | ʗ | ʘ | ʙ | ʚ | ʛ | ʜ | ʝ | ʞ | ʟ |
U+02Ax | ʠ | ʡ | ʢ | ʣ | ʤ | ʥ | ʦ | ʧ | ʨ | ʩ | ʪ | ʫ | ʬ | ʭ | ʮ | ʯ |
The characters in the "Spacing Modifier Letters" block are intended to form a unit with the preceding letter (which they "modify"). For example, the character U+02B0 ʰ MODIFIER LETTER SMALL H is not intended simply as a superscript h, but as the mark of aspiration placed after the letter being aspirated, as in pʰ "aspirated voiceless bilabial plosive". The block contains:
Spacing Modifier Letters [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+02Bx | ʰ | ʱ | ʲ | ʳ | ʴ | ʵ | ʶ | ʷ | ʸ | ʹ | ʺ | ʻ | ʼ | ʽ | ʾ | ʿ |
U+02Cx | ˀ | ˁ | ˂ | ˃ | ˄ | ˅ | ˆ | ˇ | ˈ | ˉ | ˊ | ˋ | ˌ | ˍ | ˎ | ˏ |
U+02Dx | ː | ˑ | ˒ | ˓ | ˔ | ˕ | ˖ | ˗ | ˘ | ˙ | ˚ | ˛ | ˜ | ˝ | ˞ | ˟ |
U+02Ex | ˠ | ˡ | ˢ | ˣ | ˤ | ˥ | ˦ | ˧ | ˨ | ˩ | ˪ | ˫ | ˬ | ˭ | ˮ | ˯ |
U+02Fx | ˰ | ˱ | ˲ | ˳ | ˴ | ˵ | ˶ | ˷ | ˸ | ˹ | ˺ | ˻ | ˼ | ˽ | ˾ | ˿ |
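One practical consequence of treating ʰ as more than a superscript h can be seen with Unicode normalization. The following sketch, assuming Python's standard `unicodedata` module, shows that compatibility normalization folds the aspiration mark into a plain h, while canonical normalization preserves it.

```python
import unicodedata

aspirated = "p\u02B0"  # pʰ, aspirated voiceless bilabial plosive

# U+02B0 carries a compatibility decomposition to a superscript form of h.
print(unicodedata.decomposition("\u02B0"))

# Canonical normalization (NFC) keeps the modifier letter.
nfc = unicodedata.normalize("NFC", aspirated)

# Compatibility normalization (NFKC) replaces it with an ordinary h,
# so the transcription would now read as the two letters "ph".
nfkc = unicodedata.normalize("NFKC", aspirated)

print(nfc == aspirated, nfkc)
```

Pipelines that apply NFKC or NFKD to phonetic text therefore risk losing the distinction between pʰ and the sequence ph.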
The Phonetic Extensions block, together with the Phonetic Extensions Supplement block below, contains:
Phonetic Extensions [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1D0x | ᴀ | ᴁ | ᴂ | ᴃ | ᴄ | ᴅ | ᴆ | ᴇ | ᴈ | ᴉ | ᴊ | ᴋ | ᴌ | ᴍ | ᴎ | ᴏ |
U+1D1x | ᴐ | ᴑ | ᴒ | ᴓ | ᴔ | ᴕ | ᴖ | ᴗ | ᴘ | ᴙ | ᴚ | ᴛ | ᴜ | ᴝ | ᴞ | ᴟ |
U+1D2x | ᴠ | ᴡ | ᴢ | ᴣ | ᴤ | ᴥ | ᴦ | ᴧ | ᴨ | ᴩ | ᴪ | ᴫ | ᴬ | ᴭ | ᴮ | ᴯ |
U+1D3x | ᴰ | ᴱ | ᴲ | ᴳ | ᴴ | ᴵ | ᴶ | ᴷ | ᴸ | ᴹ | ᴺ | ᴻ | ᴼ | ᴽ | ᴾ | ᴿ |
U+1D4x | ᵀ | ᵁ | ᵂ | ᵃ | ᵄ | ᵅ | ᵆ | ᵇ | ᵈ | ᵉ | ᵊ | ᵋ | ᵌ | ᵍ | ᵎ | ᵏ |
U+1D5x | ᵐ | ᵑ | ᵒ | ᵓ | ᵔ | ᵕ | ᵖ | ᵗ | ᵘ | ᵙ | ᵚ | ᵛ | ᵜ | ᵝ | ᵞ | ᵟ |
U+1D6x | ᵠ | ᵡ | ᵢ | ᵣ | ᵤ | ᵥ | ᵦ | ᵧ | ᵨ | ᵩ | ᵪ | ᵫ | ᵬ | ᵭ | ᵮ | ᵯ |
U+1D7x | ᵰ | ᵱ | ᵲ | ᵳ | ᵴ | ᵵ | ᵶ | ᵷ | ᵸ | ᵹ | ᵺ | ᵻ | ᵼ | ᵽ | ᵾ | ᵿ |
Phonetic Extensions Supplement [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+1D8x | ᶀ | ᶁ | ᶂ | ᶃ | ᶄ | ᶅ | ᶆ | ᶇ | ᶈ | ᶉ | ᶊ | ᶋ | ᶌ | ᶍ | ᶎ | ᶏ |
U+1D9x | ᶐ | ᶑ | ᶒ | ᶓ | ᶔ | ᶕ | ᶖ | ᶗ | ᶘ | ᶙ | ᶚ | ᶛ | ᶜ | ᶝ | ᶞ | ᶟ |
U+1DAx | ᶠ | ᶡ | ᶢ | ᶣ | ᶤ | ᶥ | ᶦ | ᶧ | ᶨ | ᶩ | ᶪ | ᶫ | ᶬ | ᶭ | ᶮ | ᶯ |
U+1DBx | ᶰ | ᶱ | ᶲ | ᶳ | ᶴ | ᶵ | ᶶ | ᶷ | ᶸ | ᶹ | ᶺ | ᶻ | ᶼ | ᶽ | ᶾ | ᶿ |
Modifier Tone Letters [1] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+A70x | ꜀ | ꜁ | ꜂ | ꜃ | ꜄ | ꜅ | ꜆ | ꜇ | ꜈ | ꜉ | ꜊ | ꜋ | ꜌ | ꜍ | ꜎ | ꜏ |
U+A71x | ꜐ | ꜑ | ꜒ | ꜓ | ꜔ | ꜕ | ꜖ | ꜗ | ꜘ | ꜙ | ꜚ | ꜛ | ꜜ | ꜝ | ꜞ | ꜟ |
Superscripts and Subscripts [1] [2] [3] Official Unicode Consortium code chart (PDF) | ||||||||||||||||
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
U+207x | ⁰ | ⁱ | ⁴ | ⁵ | ⁶ | ⁷ | ⁸ | ⁹ | ⁺ | ⁻ | ⁼ | ⁽ | ⁾ | ⁿ | ||
U+208x | ₀ | ₁ | ₂ | ₃ | ₄ | ₅ | ₆ | ₇ | ₈ | ₉ | ₊ | ₋ | ₌ | ₍ | ₎ | |
U+209x | ₐ | ₑ | ₒ | ₓ | ₔ | ₕ | ₖ | ₗ | ₘ | ₙ | ₚ | ₛ | ₜ | |||
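The six blocks charted above can be used to classify characters in a transcription. This is a minimal sketch: the ranges are read off the row labels of the charts (for example U+025x through U+02Ax for IPA Extensions), and the names `PHONETIC_BLOCKS` and `block_of` are hypothetical, introduced here for illustration.

```python
from typing import Optional

# (first code point, last code point, block name), taken from the charts above.
PHONETIC_BLOCKS = [
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1D00, 0x1D7F, "Phonetic Extensions"),
    (0x1D80, 0x1DBF, "Phonetic Extensions Supplement"),
    (0x2070, 0x209F, "Superscripts and Subscripts"),
    (0xA700, 0xA71F, "Modifier Tone Letters"),
]


def block_of(ch: str) -> Optional[str]:
    """Return the name of the phonetic block containing ch, or None."""
    cp = ord(ch)
    for first, last, name in PHONETIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


for ch in "\u0250\u02B0\u1D91\u207F" + "a":
    print(f"U+{ord(ch):04X}", block_of(ch))
```

Characters borrowed unchanged from Basic Latin, such as a, fall outside all six blocks and return `None`.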
Many systems provide a way to select Unicode characters visually. ISO/IEC 14755 refers to this as a screen-selection entry method.
Microsoft Windows has provided a Unicode version of the Character Map program since version NT 4.0, and it has appeared in the consumer edition since XP. To open it, press ⊞ Win+R, type charmap, then press ↵ Enter. The program is limited to characters in the Basic Multilingual Plane (BMP). Characters are searchable by Unicode character name, and the table can be limited to a particular code block. More advanced third-party tools of the same type are also available (a notable freeware example is BabelMap).
macOS provides a "character palette" with much the same functionality, along with searching by related characters, glyph tables in a font, etc. It can be enabled in the input menu in the menu bar under System Preferences → International → Input Menu (or System Preferences → Language and Text → Input Sources) or can be viewed under Edit → Emoji & Symbols in many programs.
Equivalent tools – such as gucharmap (GNOME) or kcharselect (KDE) – exist on most Linux desktop environments.
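Because some selection tools are limited to the Basic Multilingual Plane, it can be useful to check in advance whether a phonetic symbol lies outside it. The sketch below uses only built-in Python string operations; `in_bmp` and `utf16_units` are hypothetical helper names.

```python
def in_bmp(ch: str) -> bool:
    """True if the character lies in the Basic Multilingual Plane (U+0000 to U+FFFF)."""
    return ord(ch) <= 0xFFFF


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode the character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# U+026D (retroflex lateral approximant) is in the BMP;
# U+1DF08 (retroflex lateral flap) is a supplementary-plane character.
for ch in ("\u026D", "\U0001DF08"):
    print(f"U+{ord(ch):04X}", in_bmp(ch), utf16_units(ch))
```

Supplementary-plane characters need a surrogate pair in UTF-16, which is one reason older BMP-only tools cannot display or select them.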