ARPABET

ARPABET (also spelled ARPAbet) is a set of phonetic transcription codes developed by the Advanced Research Projects Agency (ARPA) as part of its Speech Understanding Research project in the 1970s. It represents phonemes and allophones of General American English with distinct sequences of ASCII characters. Two systems were devised: one represents each segment with a single character (using both upper- and lower-case letters), and the other uses one or two case-insensitive characters per segment. The latter was far more widely adopted. [1]

ARPABET has been used in several speech synthesizers, including Computalker for the S-100 system, SAM for the Commodore 64, SAY for the Amiga, TextAssist for the PC, and Speakeasy from Intelligent Artefacts, which used the Votrax SC-01 speech synthesizer IC. It is also used in the CMU Pronouncing Dictionary. A revised version of ARPABET is used in the TIMIT corpus. [1]

Symbols

Stress is indicated by a digit immediately following a vowel. Auxiliary symbols are identical in 1- and 2-letter codes. In 2-letter notation, segments are separated by a space.
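As a minimal sketch of the conventions just described, the following Python function splits a 2-letter transcription on spaces and separates each code from its optional trailing stress digit. The function name `parse_arpabet` and the sample transcription are illustrative assumptions, not part of any standard tool, and the sketch handles only letter codes, not the auxiliary symbols.

```python
import re

# One segment: a run of letters, optionally followed by stress digit(s).
_SEGMENT = re.compile(r"([A-Za-z]+)(\d*)")


def parse_arpabet(transcription):
    """Split a 2-letter ARPABET string into (code, stress) pairs.

    `stress` is an int when a digit follows the code, otherwise None.
    Hypothetical helper for illustration only.
    """
    segments = []
    for token in transcription.split():
        match = _SEGMENT.fullmatch(token)
        if match is None:
            raise ValueError(f"not a plain ARPABET segment: {token!r}")
        code, digits = match.groups()
        stress = int(digits) if digits else None
        # 2-letter codes are case-insensitive, so normalize to upper case.
        segments.append((code.upper(), stress))
    return segments


print(parse_arpabet("HH AH0 L OW1"))
```

Returning `None` rather than `0` for segments without a digit keeps "no stress marked" distinct from the explicit "no stress" digit 0.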

Vowels [2]
ARPABET (1-letter)  ARPABET (2-letter)  IPA    Example(s)
a                   AA                  ɑ ~ ɒ  balm, bot (with father–bother merger)
@                   AE                  æ      bat
A                   AH                  ʌ      butt
c                   AO                  ɔ      caught, story
W                   AW                  aʊ     bout
x                   AX                  ə      comma
                    AXR [3]             ɚ      letter, forward
Y                   AY                  aɪ     bite
E                   EH                  ɛ      bet
R                   ER                  ɝ      bird, foreword
e                   EY                  eɪ     bait
I                   IH                  ɪ      bit
X                   IX                  ɨ      roses, rabbit
i                   IY                  i      beat
o                   OW                  oʊ     boat
O                   OY                  ɔɪ     boy
U                   UH                  ʊ      book
u                   UW                  u      boot
                    UX [3]              ʉ      dude
Consonants [2]
ARPABET (1-letter)  ARPABET (2-letter)  IPA  Example
b                   B                   b    buy
C                   CH                  tʃ   China
d                   D                   d    die
D                   DH                  ð    thy
F                   DX                  ɾ    butter
L                   EL                  l̩    bottle
M                   EM                  m̩    rhythm
N                   EN                  n̩    button
f                   F                   f    fight
g                   G                   ɡ    guy
h                   HH or H [3]         h    high
J                   JH                  dʒ   jive
k                   K                   k    kite
l                   L                   l    lie
m                   M                   m    my
n                   N                   n    nigh
G                   NX or NG [3]        ŋ    sing
                    NX [3]              ɾ̃    winter
p                   P                   p    pie
Q                   Q                   ʔ    uh-oh
r                   R                   ɹ    rye
s                   S                   s    sigh
S                   SH                  ʃ    shy
t                   T                   t    tie
T                   TH                  θ    thigh
v                   V                   v    vie
w                   W                   w    wise
H                   WH                  ʍ    why (without wine–whine merger)
y                   Y                   j    yacht
z                   Z                   z    zoo
Z                   ZH                  ʒ    pleasure
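The vowel and consonant tables can be read as a lookup from 2-letter codes to IPA. The sketch below assumes a deliberately partial mapping copied from those tables, and it simply drops stress digits instead of rendering IPA stress marks. The names `ARPABET_TO_IPA` and `to_ipa` are hypothetical.

```python
# Partial 2-letter ARPABET -> IPA mapping, taken from the tables above.
# Codes with alternate spellings or variant realizations are left out.
ARPABET_TO_IPA = {
    "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AX": "ə", "EH": "ɛ", "ER": "ɝ",
    "IH": "ɪ", "IY": "i", "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "D": "d", "DH": "ð", "F": "f", "G": "ɡ", "HH": "h",
    "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ", "P": "p",
    "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t", "TH": "θ", "V": "v",
    "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}


def to_ipa(transcription):
    """Convert a space-separated 2-letter transcription to an IPA string.

    Stress digits are discarded; unknown codes raise ValueError.
    """
    ipa = []
    for token in transcription.split():
        # Strip any trailing stress digit and normalize case.
        code = token.rstrip("0123456789").upper()
        if code not in ARPABET_TO_IPA:
            raise ValueError(f"no IPA mapping for {token!r}")
        ipa.append(ARPABET_TO_IPA[code])
    return "".join(ipa)


print(to_ipa("K AE1 T"))  # kæt
```

A fuller converter would also need the remaining codes and a policy for placing IPA stress marks, which this sketch leaves out.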
Stress and auxiliary symbols [2]
Symbol   Description
0        No stress
1        Primary stress
2        Secondary stress
3...     Tertiary and further stress
-        Silence
!        Non-speech segment
+        Morpheme boundary
/        Word boundary
#        Utterance boundary
:        Tone group boundary
:1 or .  Falling or declining juncture
:2 or ?  Rising or internal juncture
:3 or .  Fall-rise or non-terminal juncture

TIMIT

In TIMIT, the following symbols are used in addition to the ones listed above: [4]

Symbol  IPA  Example     Description
AX-H    ə̥    suspect     Devoiced /ə/
BCL     b̚    obtain      [b] closure
DCL     d̚    width       [d] closure
ENG     ŋ̍    Washington  Syllabic [ŋ]
GCL     ɡ̚    dogtooth    [ɡ] closure
HV      ɦ    ahead       Voiced /h/
KCL     k̚    doctor      [k] closure
PCL     p̚    accept      [p] closure
TCL     t̚    catnip      [t] closure
PAU                      Pause
EPI                      Epenthetic silence
H#                       Begin/end marker
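As a small illustration of how the additional symbols might be handled in code, the sketch below stores the TIMIT-only symbols with the descriptions from the table and picks out the stop that each closure symbol belongs to. The names `TIMIT_EXTRA` and `closure_of` are hypothetical, and relying on the `CL` suffix is an assumption drawn from the table's naming pattern.

```python
# TIMIT-only symbols and their descriptions, copied from the table above.
TIMIT_EXTRA = {
    "AX-H": "Devoiced /ə/",
    "BCL": "[b] closure",
    "DCL": "[d] closure",
    "ENG": "Syllabic [ŋ]",
    "GCL": "[ɡ] closure",
    "HV": "Voiced /h/",
    "KCL": "[k] closure",
    "PCL": "[p] closure",
    "TCL": "[t] closure",
    "PAU": "Pause",
    "EPI": "Epenthetic silence",
    "H#": "Begin/end marker",
}


def closure_of(symbol):
    """Return the stop code for a closure symbol (e.g. "BCL" -> "B").

    Returns None for any symbol that is not one of the listed closures.
    """
    symbol = symbol.upper()
    if symbol in TIMIT_EXTRA and symbol.endswith("CL"):
        return symbol[:-2]
    return None


print(closure_of("BCL"), closure_of("PAU"))
```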

References

  1. Klautau, Aldebaro (2001). "ARPABET and the TIMIT alphabet" (PDF). Archived from the original (PDF) on June 3, 2016. Retrieved September 8, 2017.
  2. Rice, Lloyd (April 1976). "Hardware & software for speech synthesis". Dr. Dobb's Journal of Computer Calisthenics & Orthodontia. 1 (4): 6–8.
  3. Jurafsky, Daniel; Martin, James H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall. pp. 94–5. ISBN 0-1309-5069-6.
  4. "Table of all the phonemic and phonetic symbols used in the TIMIT lexicon". Linguistic Data Consortium. October 12, 1990. Retrieved September 8, 2017.