Voice onset time

A graphical representation of the VOT of voiced, tenuis, and aspirated stops

In phonetics, voice onset time (VOT) is a feature of the production of stop consonants. It is defined as the length of time that passes between the release of a stop consonant and the onset of voicing (the vibration of the vocal folds) or, according to other authors, the onset of periodicity. Some authors allow negative values to mark voicing that begins during the period of articulatory closure for the consonant and continues into the release, for those unaspirated voiced stops in which there is no voicing present at the instant of articulatory closure.
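The definition above amounts to a simple subtraction, which the following minimal sketch illustrates. The function name `voice_onset_time_ms` and the timestamps are hypothetical, chosen only for illustration; they are not measurements from any study, and the sketch assumes that the release and the voicing onset have already been located in the signal.

```python
def voice_onset_time_ms(release_ms: float, voicing_onset_ms: float) -> float:
    """Return VOT in milliseconds: time of voicing onset minus time of release.

    A positive result means voicing starts after the release (voicing lag).
    A negative result means voicing starts during the closure, before the
    release (voicing lead), under the convention that allows negative values.
    """
    return voicing_onset_ms - release_ms


# Hypothetical timestamps in milliseconds from the start of a recording.
lag = voice_onset_time_ms(release_ms=100, voicing_onset_ms=195)
lead = voice_onset_time_ms(release_ms=100, voicing_onset_ms=20)

print(lag)   # voicing begins after the release
print(lead)  # voicing begins before the release
```

In practice, locating the two events is the hard part; the subtraction itself is trivial.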

History

The concept of voice onset time can be traced back as far as the 19th century, when Adjarian (1899: 119) [1] studied the Armenian stops and characterized them by "the relation that exists between two moments: the one when the consonant bursts when the air is released out of the mouth, or explosion, and the one when the larynx starts vibrating". However, the concept became widely known only in the 1960s, in a context described by Lin & Wang (2011: 514) [2]: "At that time, there was an ongoing debate about which phonetic attribute would allow voiced and voiceless stops to be effectively distinguished. For instance, voicing, aspiration, and articulatory force were some of the attributes being studied regularly. In English, 'voicing' can successfully separate /b,d,ɡ/ from /p,t,k/ when stops are at word-medial positions, but this is not always true for word-initial stops. Strictly speaking, word-initial voiced stops /b,d,ɡ/ are only partially voiced, and sometimes are even voiceless." The concept of VOT finally acquired its name in the famous study by Leigh Lisker and Arthur Abramson (Word, 1964), carried out while the two were working together at Haskins Laboratories. [3]

Analytic problems

A number of problems have arisen in defining VOT in some languages, and there has been a call to reconsider whether this speech synthesis parameter should be used in place of articulatory or aerodynamic model parameters, which do not have these problems and which have stronger explanatory significance. [4] As in the discussion below, any explanation of VOT variation invariably leads back to such aerodynamic and articulatory concepts. No reason is presented why VOT adds to an analysis, other than that, as an acoustic parameter, it may sometimes be easier to measure than an aerodynamic parameter (pressure or airflow) or an articulatory parameter (closure interval, or the duration, extent and timing of a vocal fold abductory gesture).

Types

Voice Onset Timing spectrograms for English "die" and "tie". The voiceless gap between release and voicing is highlighted in red. Here the phoneme /t/ has a VOT of 95 ms, and /d/ has one of 25 ms.

Three major phonation types of stops can be analyzed in terms of their voice onset time: voiced, tenuis, and aspirated.

Because neither aspiration nor voicing is absolute, with intermediate degrees of both, the relative terms fortis and lenis are often used to describe a binary opposition between a series of consonants with higher (more positive) VOT, defined as fortis, and a second series with lower (more negative) VOT, defined as lenis. Of course, being relative, what fortis and lenis mean in one language will not in general correspond to what they mean in another.
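The relative nature of the fortis and lenis labels can be illustrated with a minimal sketch. The function `label_fortis_lenis` is a hypothetical helper, not an established procedure: it assumes exactly two stop series per language and simply compares their mean VOT. The English values are the 95 ms and 25 ms figures from the spectrogram caption above; the second language and its values are invented for illustration.

```python
from statistics import mean


def label_fortis_lenis(vot_ms: dict[str, list[float]]) -> dict[str, str]:
    """Label two stop series by their relative mean VOT within one language."""
    if len(vot_ms) != 2:
        raise ValueError("expected exactly two series")
    (name_a, values_a), (name_b, values_b) = vot_ms.items()
    mean_a, mean_b = mean(values_a), mean(values_b)
    if mean_a == mean_b:
        raise ValueError("series have the same mean VOT")
    # The series with the higher (more positive) VOT is labeled fortis.
    if mean_a > mean_b:
        return {name_a: "fortis", name_b: "lenis"}
    return {name_a: "lenis", name_b: "fortis"}


# English values taken from the "tie" / "die" spectrogram caption.
english = label_fortis_lenis({"/t/": [95.0], "/d/": [25.0]})

# Invented values for a hypothetical language with a prevoiced series.
hypothetical = label_fortis_lenis({"tenuis": [10.0], "prevoiced": [-80.0]})

print(english)
print(hypothetical)
```

Under these assumptions, a series with a VOT of 25 ms counts as lenis in the English example, while a series with a VOT of only 10 ms counts as fortis in the hypothetical language, since each label is defined only relative to the other series in the same language.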

Voicing contrast applies to all types of consonants, but aspiration is generally only a feature of stops and affricates.

Transcription

Aspiration may be transcribed ◌ʰ, and long (strong) aspiration ◌ʰʰ. Voicing is most commonly indicated by the choice of consonant letter. For one way of transcribing pre-voicing and other timing variants, see the diacritics of the extensions to the IPA. Other systems include that of Laver (1994) [5], who distinguishes fully devoiced b̥a and ab̥ from initial partial devoicing of the onset of a syllable, written ˳ba, and from final partial devoicing of the coda of a syllable, written ab˳.

Examples in languages

Relative VOT distinctions in various languages [citation needed]
Voice Onset Time: Examples
English Cantonese Tlingit Navajo Korean Japanese Spanish, Russian Thai, Armenian
(fortis) Strong aspiration: Yes, Yes, Yes
Moderate aspiration: Yes, Yes, Yes
Mild aspiration: Yes, Yes, Yes
Tenuis: Yes, Yes, Yes, Yes, Yes
Partially voiced: Yes
(lenis) Fully voiced: Yes, Yes, Yes

References

  1. ADJARIAN, H., Les explosives de l'ancien arménien étudiées dans les dialectes modernes, La Parole. Revue internationale de Rhinologie, Otologie, Laryngologie et Phonétique expérimentale, 119-127 (1899). "... la relation qui existe entre deux moments : celui où la consonne éclate par l'effet de l'expulsion de l'air hors de la bouche, ou explosion, et celui où le larynx entre en vibration."
  2. LIN, C. & WANG, H., Automatic estimation of voice onset time for word-initial stops by applying random forest to onset detection, The Journal of the Acoustical Society of America, 514-525 (2011)
  3. LISKER, L. & ABRAMSON, A.S., A cross-language study of voicing in initial stops: acoustical measurements, Word, Vol. 20, 384-422 (1964)
  4. ROTHENBERG, M., Voice Onset Time vs. Articulatory Modeling for Stop Consonants, The Jan Gauffin Memorial Symposium, October 16, 2008, Royal Institute of Technology, Stockholm (to be published in the proceedings)
  5. LAVER, J., Principles of Phonetics, p. 340 (1994)
