Viseme

Last updated August 28, 2025

This article contains phonetic transcriptions in the International Phonetic Alphabet (IPA). For an introductory guide on IPA symbols, see Help:IPA. For the distinction between [ ], / / and ⟨ ⟩, see IPA § Brackets and transcription delimiters.

A viseme is any of several speech sounds that look the same, for example when lip reading.^[1]

Visemes and phonemes do not share a one-to-one correspondence. Several phonemes often correspond to a single viseme because they look the same on the face when produced, such as /k, ɡ, ŋ/, /t, d, n, l/, and /p, b, m/. Thus words such as pet, bell, and men are difficult for lip-readers to distinguish, as they all look alike. On one account, visemes offer (phonetic) information about place of articulation, while manner of articulation requires auditory input.^[2]
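The many-to-one relationship can be illustrated with a small lookup table. The following is a minimal sketch, not a standard viseme inventory: it uses only the three phoneme groups named above, the class labels (BILABIAL, ALVEOLAR, VELAR) and the function name to_visemes are hypothetical, the broad transcriptions of the three words are assumed for illustration, and vowels are simply passed through unchanged.

```python
# Hypothetical viseme classes built only from the phoneme groups named above.
# The labels are illustrative, not a standard viseme inventory.
VISEME_CLASSES = {
    "BILABIAL": ["p", "b", "m"],
    "ALVEOLAR": ["t", "d", "n", "l"],
    "VELAR": ["k", "ɡ", "ŋ"],
}

# Invert the table: each phoneme points to the one class it belongs to.
PHONEME_TO_VISEME = {
    phoneme: label
    for label, phonemes in VISEME_CLASSES.items()
    for phoneme in phonemes
}


def to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels.

    Phonemes missing from the table (here, vowels) are kept as their own
    single-member class, which is an assumption made for this sketch.
    """
    return [PHONEME_TO_VISEME.get(p, p) for p in phonemes]


# Broad transcriptions, assumed for illustration.
WORDS = {
    "pet": ["p", "ɛ", "t"],
    "bell": ["b", "ɛ", "l"],
    "men": ["m", "ɛ", "n"],
}

viseme_strings = {word: to_visemes(ph) for word, ph in WORDS.items()}

for word, visemes in viseme_strings.items():
    print(word, visemes)
```

Under these assumptions the three words map to the same viseme sequence even though their phoneme sequences differ, which is why they are hard to tell apart by sight alone.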

However, in natural speech the visual "signature" of a given gesture may differ in timing and duration in ways that cannot be captured by simply concatenating still images of each mouth pattern in sequence.^[3] Conversely, some sounds that are hard to distinguish acoustically are clearly distinguished by the face. For example, in spoken English /l/ and /r/ can often sound quite similar (especially in clusters, such as 'grass' vs. 'glass'), yet the visual information can disambiguate them. Some linguists have argued that speech is best understood as bimodal (aural and visual), and that comprehension can be compromised if one of these two domains is absent.^[4]

Viseme confusions can be humorous, as in the phrase "elephant juice", which when lip-read appears identical to "I love you".

Applications for the study of visemes include speech processing, speech recognition, and computer facial animation.

References

[1] Fisher, Cletus G. (1 December 1968). "Confusions Among Visually Perceived Consonants". Journal of Speech and Hearing Research. 11 (4): 796–804. doi:10.1044/jshr.1104.796. PMID 5719234.
[2] Summerfield, Quentin (29 January 1992). "Lipreading and audio-visual speech perception". Philosophical Transactions of the Royal Society B: Biological Sciences. 335 (1273): 71–78. doi:10.1098/rstb.1992.0009. eISSN 1471-2970. ISSN 0962-8436. PMID 1348140.
[3] Calvert, Gemma A.; Campbell, Ruth (1 January 2003). "Reading Speech from Still and Moving Faces: The Neural Substrates of Visible Speech". Journal of Cognitive Neuroscience. 15 (1): 57–70. doi:10.1162/089892903321107828. PMID 12590843. S2CID 14153329.
[4] McGurk, Harry; MacDonald, John (23 December 1976). "Hearing lips and seeing voices". Nature. 264 (5588): 746–748. Bibcode:1976Natur.264..746M. doi:10.1038/264746a0. PMID 1012311.

Further reading

Lucey, Patrick; Martin, Terrence; Sridharan, Sridha (8–10 December 2004). Confusability of Phonemes Grouped According to their Viseme Classes in Noisy Environments (PDF). 10th Australian International Conference on Speech Science & Technology. Sydney: Macquarie University. pp. 265–270. Archived from the original (PDF) on 5 July 2017.


This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
