Comparison of ASCII encodings of the International Phonetic Alphabet

The International Phonetic Alphabet (IPA) consists of more than 100 letters and diacritics. Before Unicode became widely available, several ASCII-based encoding systems for the IPA were proposed. The alphabet underwent a major revision at the Kiel Convention of 1989, and the vowel symbols were revised again in 1993. [1] Systems devised before these revisions inevitably lack support for the additions they introduced.

Only language-neutral systems are discussed below because language-dependent ones (such as ARPABET) do not allow for a systematic comparison.

General information

| System | Author(s) | Created | Last updated | Note | Ref |
|---|---|---|---|---|---|
| Branner (unnamed) | David Prager Branner at the University of Washington | 1994 | ? | | [2] |
| Millar & Oasa (unnamed) | J. Bruce Millar and Hiroaki Oasa at Australian National University | 1981 | 1981 | | [3] |
| PHONASCII | George D. Allen at Purdue University | 1988 | 1988 | Not a direct mapping of the IPA. Segments are separated by spaces, and diacritics by commas. | [4] |
| Praat | Paul Boersma and David Weenink at the University of Amsterdam | 1991 | 2021 | | [5] |
| IPA (SIL) Keyboard | SIL International | 1994 | 2021 | | [6] |
| UCLA Phonological Segment Inventory Database (UPSID) | Ian Maddieson at the University of California, Los Angeles | 1984 | ? | Presented here is the scheme used for representing phonemes in the database of phonological inventories. Consequently, it is not designed for transcription of multiple segments and does not have symbols for values not found phonemically in the languages sampled. | [7] |
| Usenet IPA/ASCII transcription | Participants in sci.lang and alt.usage.english newsgroups (later maintained by Evan Kirshenbaum at HP Labs) | 1991 | 2011 | Also known variously as "ASCII-IPA", "Kirshenbaum", etc. [8] IETF language subtags register fonkirsh to identify text in this convention. [9] | [10] |
| Worldbet | James L. Hieronymus at AT&T Bell Laboratories | 1994 | 1994 | Segments are separated by spaces. | [11] |
| X-SAMPA | John C. Wells at University College London | 1995 | 2000 | IETF language subtags register fonxsamp to identify text in this convention. [9] | [12] |
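
The delimiter convention noted for PHONASCII (segments separated by spaces, diacritics by commas) can be illustrated with a minimal sketch. It assumes that each space-delimited token starts with a base symbol followed by zero or more comma-separated diacritics. The function name and the sample symbols are hypothetical placeholders, not actual PHONASCII codes.

```python
def split_transcription(text):
    """Split a space-delimited transcription into (base, diacritics) pairs.

    Assumption: within each token, the first comma-separated field is the
    base symbol and any remaining fields are diacritics.
    """
    segments = []
    for token in text.split():
        base, *diacritics = token.split(",")
        segments.append((base, diacritics))
    return segments


# Placeholder symbols only; they are not taken from the PHONASCII documentation.
print(split_transcription("t,h a n,~"))
```

Because segment boundaries are explicit, a scheme of this kind can use multi-character codes for a single segment without ambiguity, at the cost of longer transcriptions.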

Symbols

Only the symbols in the latest IPA chart are included. The numbers in the leftmost column, by which the symbols are sorted, are the IPA Numbers. An IPA symbol for which a system lacks a corresponding symbol may still be representable in that system by use of a modifier (diacritic), but such combinations are not included unless the documentation explicitly assigns one for the value.

Coverage

| Scope | Branner | Millar & Oasa | PHONASCII | Praat | SIL | UPSID | Usenet | Worldbet | X-SAMPA |
|---|---|---|---|---|---|---|---|---|---|
| Consonants (80) | 79 | 69 | 67 | 79 | 80 | 75 | 73 | 73 | 79 |
| Vowels (29) | 29 | 27 | 26 | 29 | 28 | 28 | 28 | 26 | 29 |
| Diacritics (35) | 34 | 15 | 25 | 30 | 34 | 12 | 17 | 25 | 26 |
| Suprasegmentals (28) | 28 | 20 | 21 | 14 | 28 | 2 | 4 | 11 | 28 |
| Total (172) | 170 | 131 | 139 | 152 | 170 | 117 | 122 | 135 | 162 |
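
The counts above can be turned into coverage percentages by dividing each count by the number of symbols in its scope. The sketch below does this arithmetic with the figures from the table. The names `TOTALS`, `COUNTS`, `scope_percentages`, and `overall_percent` are illustrative choices, and rounding to one decimal place is an assumption.

```python
# Number of IPA symbols in each scope, as given in the table.
TOTALS = {"Consonants": 80, "Vowels": 29, "Diacritics": 35, "Suprasegmentals": 28}

# Symbols covered by each system, in the same scope order as TOTALS.
COUNTS = {
    "Branner": (79, 29, 34, 28),
    "Millar & Oasa": (69, 27, 15, 20),
    "PHONASCII": (67, 26, 25, 21),
    "Praat": (79, 29, 30, 14),
    "SIL": (80, 28, 34, 28),
    "UPSID": (75, 28, 12, 2),
    "Usenet": (73, 28, 17, 4),
    "Worldbet": (73, 26, 25, 11),
    "X-SAMPA": (79, 29, 26, 28),
}


def scope_percentages(system):
    """Return the percentage of symbols covered in each scope."""
    return {
        scope: round(100 * count / total, 1)
        for (scope, total), count in zip(TOTALS.items(), COUNTS[system])
    }


def overall_percent(system):
    """Return the percentage of all 172 symbols covered by the system."""
    return round(100 * sum(COUNTS[system]) / sum(TOTALS.values()), 1)


for name in COUNTS:
    print(name, overall_percent(name), scope_percentages(name))
```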

Notes

  1. In Worldbet, these combinations are given as merely proposed for values "for which no machine-readable coding has yet been proposed".
  2. The uvular approximant is represented by R in PHONASCII.
  3. L represents either a voiceless alveolar lateral fricative, a velar approximant, or a velarized alveolar lateral approximant in the Usenet IPA/ASCII transcription.
  4. c! represents either an alveolar or palatal click in the Usenet IPA/ASCII transcription.
  5. - represents either retracted or "velarized or pharyngealized" in Millar & Oasa's system.
  6. ¿ and ¡ are not part of ASCII, but are nonetheless proposed as encoding advanced and retracted tongue root, respectively, in Worldbet.
  7. . represents either raised or palatalized in Millar & Oasa's system.
  8. * represents either non-syllabic or extra-short in Millar & Oasa's system.
  9. )) representing a tie bar is placed after both segments, as in ts)), in Branner's system.
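
The postfix tie bar described in the last note can be sketched as a small grouping routine. This is not Branner's own algorithm. It assumes, for simplicity, that every segment is a single ASCII character, which does not hold for the full system, and the function name `group_tied` is hypothetical.

```python
TIE = "))"  # tie-bar marker, placed after both tied segments (as in "ts))")


def group_tied(text):
    """Group the two segments preceding each tie-bar marker into one unit.

    Simplifying assumption: every segment is a single character.
    """
    units = []
    i = 0
    while i < len(text):
        if text.startswith(TIE, i):
            if len(units) < 2:
                raise ValueError("tie bar needs two preceding segments")
            second = units.pop()
            first = units.pop()
            units.append(first + second)
            i += len(TIE)
        else:
            units.append(text[i])
            i += 1
    return units


print(group_tied("ats))a"))  # → ['a', 'ts', 'a']
```

Because the marker follows the pair rather than sitting between the two segments, a reader of the encoded text has to look back two segments when the marker is reached, which is why the sketch keeps a list of units already seen.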

References

  1. International Phonetic Association (1993). "Council actions on revisions of the IPA". Journal of the International Phonetic Association. 23 (1): 32–34. doi:10.1017/S002510030000476X. S2CID 249420050.
  2. Branner, David Prager (1994). "Proposal for an ASCII Version of the IPA". University of Washington. Archived from the original on 9 February 1999.
  3. Millar, J. B.; Oasa, H. (1981). "Proposal for ASCII coded phonetic script". Journal of the International Phonetic Association. 11 (2): 62–74. doi:10.1017/S0025100300002279. S2CID 146352996.
  4. Allen, George D. (1988). "The PHONASCII system". Journal of the International Phonetic Association. 18 (1): 9–25. doi:10.1017/S0025100300003509. S2CID 143899772.
  5. Boersma, Paul; Weenink, David (4 August 2009). "Phonetic symbols". Praat.
  6. "IPA (SIL) Keyboard Help". Keyman Help. SIL International.
  7. Reetz, Henning (23 May 2018). "Simple UPSID interface". Universität Frankfurt.
  8. Gómez-Vilda, Pedro; Ferrández-Vicente, José Manuel; Rodellar-Biarge, Victoria; Álvarez-Marquina, Agustín; Mazaira-Fernández, Luis Miguel; Martínez-Olalla, Rafael; Muñoz-Mulas, Cristina (2009). "Detection of Speech Dynamics by Neuromorphic Units". In Mira, José; Ferrández, José Manuel; Álvarez, José R.; de la Paz, Félix; Toledo, F. Javier (eds.). Methods and Models in Artificial and Natural Computation: A Homage to Professor Mira's Scientific Legacy – Third International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2009, Santiago de Compostela, Spain, June 22–26, 2009, Proceedings, Part I. Springer. pp. 67–78. doi:10.1007/978-3-642-02264-7_8. ISBN 978-3-642-02263-0. Page 74.
  9. "Language Subtag Registry". IANA. 2021-03-05. Retrieved 30 April 2021.
  10. Kirshenbaum, Evan (6 September 2011). "Representing IPA phonetics in ASCII" (PDF). Archived from the original (PDF) on 19 April 2016.
  11. Hieronymus, James L. (1994). "ASCII Phonetic Symbols for the World's Languages: Worldbet". AT&T Bell Laboratories Technical Memorandum. CiteSeerX 10.1.1.225.9914.
  12. Wells, John (3 May 2000). "Computer-coding the IPA: a proposed extension of SAMPA". University College London.