The Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA) is a variant of SAMPA developed in 1995 by John C. Wells, professor of phonetics at University College London. [1] It was designed to unify the individual language-specific SAMPA alphabets and to extend SAMPA to cover the entire range of characters in the 1993 version of the International Phonetic Alphabet (IPA). The result is a SAMPA-inspired remapping of the IPA into 7-bit ASCII.
SAMPA was devised as a hack to work around the inability of text encodings to represent IPA symbols. Later, as Unicode support for IPA symbols became more widespread, the necessity for a separate, computer-readable system for representing the IPA in ASCII decreased. However, X-SAMPA is still useful as the basis for an input method for true IPA.
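To illustrate how such an input method might work, the following Python sketch converts an X-SAMPA string to IPA by greedy longest-match lookup. The names `XSAMPA_TO_IPA` and `xsampa_to_ipa` are hypothetical and introduced only for this example, and the dictionary holds a small subset of the correspondences listed in the tables of this article. It is a minimal sketch, not a complete or official converter.

```python
# Illustrative subset of the X-SAMPA -> IPA correspondences (not the full table).
# Plain lower-case letters keep their IPA value.
XSAMPA_TO_IPA = {c: c for c in "abcdefhijklmnopqrstuvwxyz"}
XSAMPA_TO_IPA.update({
    "g": "ɡ",
    "r\\": "ɹ",    # r followed by a backslash
    "r`": "ɽ",
    "r\\`": "ɻ",   # r, backslash, backtick
    "E": "ɛ", "I": "ɪ", "N": "ŋ", "T": "θ", "D": "ð", "S": "ʃ", "Z": "ʒ",
    "@": "ə", "{": "æ",
    '"': "ˈ", "%": "ˌ", ":": "ː",
})

# Try longer codes first so that "r\`" wins over "r\" and plain "r".
_CODES_LONGEST_FIRST = sorted(XSAMPA_TO_IPA, key=len, reverse=True)


def xsampa_to_ipa(text: str) -> str:
    """Convert X-SAMPA to IPA, passing unknown characters through unchanged."""
    out = []
    i = 0
    while i < len(text):
        for code in _CODES_LONGEST_FIRST:
            if text.startswith(code, i):
                out.append(XSAMPA_TO_IPA[code])
                i += len(code)
                break
        else:
            # No code matched at this position: keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


print(xsampa_to_ipa(r"r\Ed"))    # English "red"
print(xsampa_to_ipa('"vIZ@n'))   # English "vision"
```

Longest-match lookup matters because many X-SAMPA codes share a prefix: without it, `r\` would be read as `r` followed by a stray backslash.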
X-SAMPA uses the backslash as a modifying suffix to create new symbols. For example, O is a distinct sound from O\, to which it bears no relation. Such use of the backslash character can be a problem, since many programs interpret it as an escape character for the character following it. For example, such X-SAMPA symbols do not work in EMU, so backslashes must be replaced with some other symbol (e.g., an asterisk: '*') when adding phonemic transcription to an EMU speech database. The backslash has no fixed meaning.
Diacritics follow the symbol they modify. Except for ~ for nasalization, = for syllabicity, and ` for retroflexion and rhotacization, diacritics are joined to the character with the underscore character _.
The underscore character is also used to encode the IPA tie bar: k_p codes for /k͡p/.
The diacritics _1 to _6 are reserved as shorthand for language-specific tone numbers.
The IETF language subtag registry lists fonxsamp as the subtag for text transcribed in X-SAMPA. [2]
X-SAMPA | IPA | Description | Examples |
---|---|---|---|
a | a | open front unrounded vowel | French dame[dam] | |
b | b | voiced bilabial plosive | English bed[bEd] , French bon[bO~] | |
b_< | ɓ | voiced bilabial implosive | Sindhi ɓarʊ [b_<arU ] | |
c | c | voiceless palatal plosive | Hungarian latyak["lQcQk] | |
d | d | voiced alveolar plosive | English dig[dIg] , French doigt[dwa] | |
d` | ɖ | voiced retroflex plosive | Swedish hord[hu:d`] | |
d_< | ɗ | voiced alveolar implosive | Sindhi ɗarʊ [d_<arU ] | |
e | e | close-mid front unrounded vowel | French blé[ble] | |
f | f | voiceless labiodental fricative | English five[faIv] , French femme[fam] | |
g | ɡ | voiced velar plosive | English game[geIm] , French longue[lO~g] | |
g_< | ɠ | voiced velar implosive | Sindhi ɠəro [g_<@ro ] | |
h | h | voiceless glottal fricative | English house[haUs] | |
h\ | ɦ | voiced glottal fricative | Czech hrad[h\rat] | |
i | i | close front unrounded vowel | English be[bi:] , French oui[wi] , Spanish si[si] | |
j | j | palatal approximant | English yes[jEs] , French yeux[j2] | |
j\ | ʝ | voiced palatal fricative | Greek γειά[j\a] | |
k | k | voiceless velar plosive | English skip[skIp] , Spanish carro["karo] | |
l | l | alveolar lateral approximant | English lay[leI] , French mal[mal] | |
l` | ɭ | retroflex lateral approximant | Svealand Swedish sorl[so:l`] | |
l\ | ɺ | alveolar lateral flap | Wayuu püülükü[pM:l\MkM] | |
m | m | bilabial nasal | English mouse[maUs] , French homme[Om] | |
n | n | alveolar nasal | English nap[n{p] , French non[nO~] | |
n` | ɳ | retroflex nasal | Swedish hörn[h2:n`] | |
o | o | close-mid back rounded vowel | French veau[vo] | |
p | p | voiceless bilabial plosive | English speak[spik] , French pose[poz] , Spanish perro["pero] | |
p\ | ɸ | voiceless bilabial fricative | Japanese fuku[p\M_0kM] | |
q | q | voiceless uvular plosive | Arabic qasbah["qQs_Gba] | |
r | r | alveolar trill | Spanish perro["pero] | |
r` | ɽ | retroflex flap | Bengali gari[gar`i:] | |
r\ | ɹ | alveolar approximant | English red[r\Ed] | |
r\` | ɻ | retroflex approximant | Malayalam വഴി["v@r\`i] | |
s | s | voiceless alveolar fricative | English seem[si:m] , French session[sE"sjO~] | |
s` | ʂ | voiceless retroflex fricative | Swedish mars[mas`] | |
s\ | ɕ | voiceless alveolo-palatal fricative | Polish świerszcz[s\v'ers`ts`] | |
t | t | voiceless alveolar plosive | English stew[stju:] , French raté[Ra"te] | |
t` | ʈ | voiceless retroflex plosive | Swedish mört[m2t`] | |
u | u | close back rounded vowel | English boom[bu:m] , Spanish su[su] | |
v | v | voiced labiodental fricative | English vest[vEst] , French voix[vwa] | |
v\ (or P ) | ʋ | labiodental approximant | Dutch west[v\Est] /[PEst] | |
w | w | labial-velar approximant | English west[wEst] , French oui[wi] | |
x | x | voiceless velar fricative | Scots loch[lOx] or [5Ox] ; German Buch, Dach; Spanish caja, gestión | |
x\ | ɧ | voiceless palatal-velar fricative | Swedish sjal[x\A:l] | |
y | y | close front rounded vowel | French tu[ty] , German über["y:b6] | |
z | z | voiced alveolar fricative | English zoo[zu:] , French azote[a"zOt] | |
z` | ʐ | voiced retroflex fricative | Mandarin Chinese rang[z`aN] | |
z\ | ʑ | voiced alveolo-palatal fricative | Polish źrebak["z\rEbak] |
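Many of the symbols above end in a backslash, which most programming languages also treat as an escape character. The Python sketch below shows the pitfall and one way to work around it: raw string literals keep the backslash intact, and a simple substitution swaps it for an asterisk, the kind of replacement described for EMU databases. The helper names `to_emu_safe` and `from_emu_safe` are hypothetical and not part of EMU or any library, and the round trip assumes the transcription contains no asterisk of its own.

```python
# In an ordinary string literal, "\a" is the BEL control character,
# so the X-SAMPA transcription j\a silently loses its backslash.
broken = "j\a"     # 2 characters: "j" + BEL
intact = r"j\a"    # 3 characters: "j", backslash, "a"
print(len(broken), len(intact))


def to_emu_safe(xsampa: str, substitute: str = "*") -> str:
    """Replace every backslash with a substitute character (asterisk by default)."""
    return xsampa.replace("\\", substitute)


def from_emu_safe(label: str, substitute: str = "*") -> str:
    """Undo to_emu_safe; only valid if the original text had no substitute characters."""
    return label.replace(substitute, "\\")


label = to_emu_safe(r"h\rat")   # Czech "hrad"
print(label)                    # h*rat
print(from_emu_safe(label))     # h\rat
```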
X-SAMPA | IPA | Description | Example |
---|---|---|---|
A | ɑ | open back unrounded vowel | English father ["fA:D@ (r\ )] (RP and Gen.Am.) | |
B | β | voiced bilabial fricative | Spanish lavar[la"Ba4] | |
B\ | ʙ | bilabial trill | Reminiscent of shivering ("brrr") | |
C | ç | voiceless palatal fricative | German ich[IC] , English human["Cjum@n] (broad transcription uses [hj -]) | |
D | ð | voiced dental fricative | English then[DEn] | |
E | ɛ | open-mid front unrounded vowel | French même[mE:m] , English met[mEt] (RP and Gen.Am.) | |
F | ɱ | labiodental nasal | English emphasis["EFf@sIs] (spoken quickly, otherwise uses [Emf -]) | |
G | ɣ | voiced velar fricative | Greek γωνία[Go"nia] | |
G\ | ɢ | voiced uvular plosive | Inuktitut nirivvik[niG\ivvik] | |
G\_< | ʛ | voiced uvular implosive | Mam ʛa [G\_<a ] | |
H | ɥ | labial-palatal approximant | French huit[Hit] | |
H\ | ʜ | voiceless epiglottal fricative | Agul мехӀ[mEH\] | |
I | ɪ | near-close front unrounded vowel | English kit[kIt] | |
I\ | ᵻ | near-close central unrounded vowel (non-IPA) | Polish ryba[rI\bA] | |
J | ɲ | palatal nasal | Spanish año["aJo] , English canyon["k{J@n] (broad transcription uses [-nj -]) | |
J\ | ɟ | voiced palatal plosive | Hungarian egy[EJ\] | |
J\_< | ʄ | voiced palatal implosive | Sindhi ʄaro [J\_<aro ] | |
K | ɬ | voiceless alveolar lateral fricative | Welsh llaw[KaU] | |
K\ | ɮ | voiced alveolar lateral fricative | Mongolian долоо[tOK\O:] | |
L | ʎ | palatal lateral approximant | Italian famiglia[fa"miLLa] , Castilian: llamar[La"mar] | |
L\ | ʟ | velar lateral approximant | Korean 달구지[t6L\gudz\i] | |
M | ɯ | close back unrounded vowel | Korean 음식[M:ms\_hik_}] | |
M\ | ɰ | velar approximant | Spanish fuego["fweM\o] | |
N | ŋ | velar nasal | English thing[TIN] | |
N\ | ɴ | uvular nasal | Japanese さんsan[saN\] | |
O | ɔ | open-mid back rounded vowel | American English off[O:f] | |
O\ | ʘ | bilabial click | ||
P (or v\ ) | ʋ | labiodental approximant | Dutch west[PEst] /[v\Est] , allophone of English phoneme /r\/ | |
Q | ɒ | open back rounded vowel | RP lot[lQt] | |
R | ʁ | voiced uvular fricative | German rein[RaIn] | |
R\ | ʀ | uvular trill | French roi[R\wa] | |
S | ʃ | voiceless postalveolar fricative | English ship[SIp] | |
T | θ | voiceless dental fricative | English thin[TIn] | |
U | ʊ | near-close back rounded vowel | English foot[fUt] | |
U\ | ᵿ | near-close central rounded vowel (non-IPA) | English euphoria[jU\"fO@r\i@] | |
V | ʌ | open-mid back unrounded vowel | Scottish English strut[str\Vt] | |
W | ʍ | voiceless labial-velar fricative | Scots when[WEn] | |
X | χ | voiceless uvular fricative | Klallam sχaʔqʷaʔ[sXa?q_wa?] | |
X\ | ħ | voiceless pharyngeal fricative | Arabic حḥāʾ[X\A:] | |
Y | ʏ | near-close front rounded vowel | German hübsch[hYpS] | |
Z | ʒ | voiced postalveolar fricative | English vision["vIZ@n] |
X-SAMPA | IPA | Description | Example |
---|---|---|---|
. | . | syllable break | ||
" | ˈ | primary stress | ||
% | ˌ | secondary stress | American English pronunciation[pr\@%nVn.si."eI.S@n] | |
' (or _j ) | ʲ | palatalized | Russian Земля (Earth) [z'I"ml'a] or [z_jI"ml_ja] | |
: | ː | long | ||
:\ | ˑ | half long | Estonian differentiates three vowel lengths | |
- | | separator | Polish trzy[t-S1] vs. czy[tS1] (affricate) | |
@ | ə | schwa | English arena[@"r\i:n@] | |
@\ | ɘ | close-mid central unrounded vowel | Paicĩ kɘ̄ɾɘ[k@\_M4@\_M] | |
@` | ɚ | r-coloured schwa | American English color["kVl@`] | |
{ | æ | near-open front unrounded vowel | English trap[tr\{p] | |
} | ʉ | close central rounded vowel | Swedish sju[x\}:] ; AuE/NZE boot[b}:t] | |
1 | ɨ | close central unrounded vowel | Welsh tu[t1] , American English rose's["r\oUz1z] | |
2 | ø | close-mid front rounded vowel | Danish købe["k2:b@] , French deux[d2] | |
3 | ɜ | open-mid central unrounded vowel | English nurse[n3:s] (RP) or [n3`s] (Gen.Am.) | |
3\ | ɞ | open-mid central rounded vowel | Irish tomhail[t3\:l'] | |
4 | ɾ | alveolar flap | Spanish pero["pe4o] , American English better["bE4@`] | |
5 | ɫ | velarized alveolar lateral approximant; also see _e | English milk[mI5k] , Portuguese livro["5iv4u] | |
6 | ɐ | near-open central vowel | German besser["bEs6] , Australian English mud[m6d] | |
7 | ɤ | close-mid back unrounded vowel | Estonian kõik[k7ik] , Vietnamese mơ[m7_M] | |
8 | ɵ | close-mid central rounded vowel | Swedish buss[b8s] | |
9 | œ | open-mid front rounded vowel | French neuf[n9f] , Danish drømme[dR9m@] | |
& | ɶ | open front rounded vowel | Swedish skörd[x\&d`] | |
? | ʔ | glottal stop | Cockney English bottle["bQ?o] | |
?\ | ʕ | voiced pharyngeal fricative | Arabic عʿayn[?\Ajn] | |
* | | undefined escape character, SAMPA's "conjunctor" | | |
/ | / | (a) French vowel archiphonemes or indeterminacies (b) delimiter of phonemic transcriptions | maison/mE/zO~/ | |
< | ⟨ | begin nonsegmental notation, e.g., SAMPROSA [3] | ||
<\ | ʢ | voiced epiglottal fricative | Siwi arˤbˤəʢa (four) [ar_?\b_?\@<\a] | |
> | ⟩ | end nonsegmental notation | ||
>\ | ʡ | epiglottal plosive | Archi гӀарз (complaint) [>\arz] | |
^ | ꜛ | upstep | ||
! | ꜜ | downstep | ||
!\ | ǃ | postalveolar click | Zulu iqaqa (polecat) [i:!\a:!\a] | |
| | | | minor (foot) group | ||
|\ | ǀ | dental click | Zulu icici (earring) [i:|\i:|\i] | |
|| | ‖ | major (intonation) group | ||
|\|\ | ǁ | alveolar lateral click | Zulu xoxa (to converse) [|\|\O:|\|\a] | |
=\ | ǂ | palatal click | ||
-\ | ‿ | linking mark |
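The syllable break and stress marks in the table above make it possible to segment a transcription mechanically. The following Python sketch splits a transcription into syllables and labels each with its stress level. The function name `split_syllables` is hypothetical, and the sketch assumes that a stress mark always opens a new syllable and that `.`, `"` and `%` are the only boundary markers present; other prosodic symbols are left untouched.

```python
import re

# A syllable chunk: an optional stress mark followed by everything up to the
# next syllable break (.) or stress mark (" or %).
_CHUNK = re.compile(r'["%]?[^."%]+')

_STRESS = {'"': "primary", "%": "secondary"}


def split_syllables(transcription: str) -> list:
    """Return (syllable, stress) pairs for an X-SAMPA transcription."""
    result = []
    for chunk in _CHUNK.findall(transcription):
        stress = _STRESS.get(chunk[0], "unstressed")
        if chunk[0] in _STRESS:
            chunk = chunk[1:]   # drop the stress mark itself
        result.append((chunk, stress))
    return result


# American English "pronunciation", as transcribed in the table above.
word = r'pr\@%nVn.si."eI.S@n'
for syllable, stress in split_syllables(word):
    print(f"{syllable:6} {stress}")
```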
X-SAMPA | IPA | Description |
---|---|---|
_" | ̈ | centralized | |
_+ | ̟ | advanced | |
_- | ̠ | retracted | |
_/ | ̌ | rising tone | |
_0 | ̥ | voiceless | |
_< | | implosive (IPA uses separate symbols for implosives) | |
= (or _= ) | ̩ | syllabic | |
_> | ʼ | ejective | |
_?\ | ˤ | pharyngealized | |
_\ | ̂ | falling tone | |
_^ | ̯ | non-syllabic | |
_} | ̚ | no audible release | |
` | ˞ | rhotacization in vowels, retroflexion in consonants (IPA uses separate symbols for consonants, see t` for an example) | |
~ (or _~ ) | ̃ | nasalization | |
_A | ̘ | advanced tongue root | |
_a | ̺ | apical | |
_B | ̏ | extra low tone | |
_B_L | ᷅ | low rising tone | |
_c | ̜ | less rounded | |
_d | ̪ | dental | |
_e | ̴ | velarized or pharyngealized; also see 5 | |
<F> | ↘ | global fall | |
_F | ̂ | falling tone | |
_G | ˠ | velarized | |
_H | ́ | high tone | |
_H_T | ᷄ | high rising tone | |
_h | ʰ | aspirated | |
_j (or ' ) | ʲ | palatalized | |
_k | ̰ | creaky voice | |
_L | ̀ | low tone | |
_l | ˡ | lateral release | |
_M | ̄ | mid tone | |
_m | ̻ | laminal | |
_N | ̼ | linguolabial | |
_n | ⁿ | nasal release | |
_O | ̹ | more rounded | |
_o | ̞ | lowered | |
_q | ̙ | retracted tongue root | |
<R> | ↗ | global rise | |
_R | ̌ | rising tone | |
_R_F | ᷈ | rising falling tone | |
_r | ̝ | raised | |
_T | ̋ | extra high tone | |
_t | ̤ | breathy voice | |
_v | ̬ | voiced | |
_w | ʷ | labialized | |
_X | ̆ | extra-short | |
_x | ̽ | mid-centralized |
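Because X-SAMPA diacritics follow the symbol they modify, just as combining characters follow their base in Unicode, converting them is largely a matter of appending the right code point. The Python sketch below does this for a few entries of the table above. The names `DIACRITIC_TO_UNICODE` and `attach_diacritics` are hypothetical, and the mapping is a partial illustration rather than a full implementation of the diacritic table.

```python
import unicodedata
from typing import Iterable

# Subset of the diacritic table: X-SAMPA code -> Unicode character.
DIACRITIC_TO_UNICODE = {
    "_h": "\u02B0",                    # aspirated (modifier letter)
    "_w": "\u02B7",                    # labialized (modifier letter)
    "_j": "\u02B2", "'": "\u02B2",     # palatalized, two spellings
    "_>": "\u02BC",                    # ejective
    "_0": "\u0325",                    # voiceless (combining ring below)
    "_d": "\u032A",                    # dental (combining bridge below)
    "_}": "\u031A",                    # no audible release
    "~": "\u0303", "_~": "\u0303",     # nasalization, two spellings
    "=": "\u0329", "_=": "\u0329",     # syllabic, two spellings
}


def attach_diacritics(ipa_base: str, xsampa_diacritics: Iterable[str]) -> str:
    """Append the Unicode form of each X-SAMPA diacritic after the base symbol."""
    return ipa_base + "".join(DIACRITIC_TO_UNICODE[d] for d in xsampa_diacritics)


nasal_o = attach_diacritics("ɔ", ["~"])        # the vowel of French bon [bO~]
aspirated_t = attach_diacritics("t", ["_h"])

for ch in nasal_o:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

A lookup of an unknown diacritic raises `KeyError` here, which seems preferable in a sketch to silently dropping phonetic detail.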
Consonants (pulmonic) | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Place of articulation → | Labial | | Coronal | | | | Dorsal | | | Laryngeal | | |
Manner of articulation ↓ | Bilabial | Labiodental | Dental | Alveolar | Postalveolar | Retroflex | Palatal | Velar | Uvular | Pharyngeal | Epiglottal | Glottal |
Nasal | m | F | | n | | n` | J | N | N\ | | | |
Plosive | p b | p_d b_d | | t d | | t` d` | c J\ | k g | q G\ | | >\ | ? |
Fricative | p\ B | f v | T D | s z | S Z | s` z` | C j\ | x G | X R | X\ ?\ | H\ <\ | h h\ |
Approximant | B_o | v\ | | r\ | | r\` | j | M\ | | | | |
Trill | B\ | r | * | R\ | * | ||||||||||||
Tap or Flap | * † | * † | 4 | r` | * | ||||||||||||
Lateral Fricative | K K\ | * | * | * | |||||||||||||
Lateral Approximant | | | | l | | l` | L | L\ | | | | |
Lateral Flap | l\ | * | * | * |
Coarticulated | |
---|---|
W | Voiceless labialized velar approximant |
w | Voiced labialized velar approximant |
H | Voiced labialized palatal approximant |
s\ | Voiceless palatalized postalveolar (alveolo-palatal) fricative |
z\ | Voiced palatalized postalveolar (alveolo-palatal) fricative |
x\ | Voiceless "palatal-velar" fricative |
Affricates and double articulation | |
---|---|
ts | voiceless alveolar affricate |
dz | voiced alveolar affricate |
tS | voiceless postalveolar affricate |
dZ | voiced postalveolar affricate |
ts\ | voiceless alveolo-palatal affricate |
dz\ | voiced alveolo-palatal affricate |
tK | voiceless alveolar lateral affricate |
dK\ | voiced alveolar lateral affricate |
kp | voiceless labial-velar plosive |
gb | voiced labial-velar plosive |
Nm | labial-velar nasal stop |
Consonants (non-pulmonic) | |||||
---|---|---|---|---|---|
Clicks | | Implosives | | Ejectives | |
O\ | Bilabial | b_< | Bilabial | _> | For example: |
|\ | Laminal alveolar ("dental") | d_< | Alveolar | p_> | Bilabial |
!\ | Apical (post-) alveolar ("retroflex") | J\_< | Palatal | t_> | Alveolar |
=\ | Laminal postalveolar ("palatal") | g_< | Velar | k_> | Velar |
|\|\ | Lateral coronal ("lateral") | G\_< | Uvular | s_> | Alveolar fricative |