The motor theory of speech perception is the hypothesis that people perceive spoken words by identifying the vocal tract gestures with which they are pronounced rather than by identifying the sound patterns that speech generates. [1] [2] [3] [4] [5] It originally claimed that speech perception is done through a specialized module that is innate and human-specific. Though the idea of a module has been qualified in more recent versions of the theory, [5] the idea remains that the role of the speech motor system is not only to produce speech articulations but also to detect them.
The hypothesis has gained more interest outside the field of speech perception than inside it, particularly since the discovery of mirror neurons, which link the production and perception of motor movements, including those made by the vocal tract. [5] The theory was initially proposed at Haskins Laboratories in the 1950s by Alvin Liberman and Franklin S. Cooper, and developed further by Donald Shankweiler, Michael Studdert-Kennedy, Ignatius Mattingly, Carol Fowler and Douglas Whalen.
The hypothesis has its origins in research using pattern playback to create reading machines for the blind that would substitute sounds for orthographic letters. [6] This led to a close examination of how spoken sounds correspond to their acoustic spectrogram when treated as a sequence of auditory units. The examination found that successive consonants and vowels overlap in time with one another (a phenomenon known as coarticulation). [7] [8] [9] This suggested that speech is heard not as an acoustic "alphabet" or "cipher," but as a "code" of overlapping speech gestures.
Initially, the theory was associationist: infants mimic the speech they hear, and this leads to behavioristic associations between articulation and its sensory consequences. Later, this overt mimicry would be short-circuited and become speech perception. [8] This aspect of the theory was dropped, however, with the discovery that prelinguistic infants can already detect most of the phonetic contrasts used to separate different speech sounds. [1]
The behavioristic approach was replaced by a cognitivist one in which there was a speech module. [1] The module detected speech in terms of hidden distal objects rather than at the proximal or immediate level of their input. The evidence for this was research findings that speech processing is special, such as duplex perception. [10]
Initially, speech perception was assumed to link to speech objects that were the invariant motor commands underlying articulation. This was later revised to refer to phonetic gestures rather than motor commands, [1] and then to the gestures intended by the speaker at a prevocal, linguistic level, rather than actual movements. [12]
The "speech is special" claim has been dropped, [5] as it was found that speech perception could occur for nonspeech sounds (for example, slamming doors for duplex perception). [13]
The discovery of mirror neurons has led to renewed interest in the motor theory of speech perception, and the theory still has its advocates, [5] although there are also critics. [14]
If speech is identified in terms of how it is physically made, then nonauditory information should be incorporated into speech percepts even if it is still subjectively heard as "sounds". This is, in fact, the case.
Using a speech synthesizer, speech sounds can be varied in place of articulation along a continuum from /bɑ/ to /dɑ/ to /ɡɑ/, or in voice onset time on a continuum from /dɑ/ to /tɑ/ (for example). When listeners are asked to discriminate between two different sounds, they perceive sounds as belonging to discrete categories, even though the sounds vary continuously. In other words, 10 sounds (with the sound on one extreme being /dɑ/ and the sound on the other extreme being /tɑ/, and the ones in the middle varying on a scale) may all be acoustically different from one another, but the listener will hear all of them as either /dɑ/ or /tɑ/. Likewise, the English consonant /d/ may vary in its acoustic details across different phonetic contexts (the /d/ in /du/ does not technically sound the same as the one in /di/, for example), but all /d/'s as perceived by a listener fall within one category (voiced alveolar plosive) and that is because "linguistic representations are abstract, canonical, phonetic segments or the gestures that underlie these segments." [17] This suggests that humans identify speech using categorical perception, and thus that a specialized module, such as that proposed by the motor theory of speech perception, may be on the right track. [18]
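To make the categorical pattern concrete, the following sketch is an illustration added here, not a model from the cited studies; the boundary and slope values are assumptions. It labels a hypothetical 10-step /dɑ/ to /tɑ/ voice onset time (VOT) continuum with a logistic identification function and shows that predicted discriminability of adjacent steps is near zero within a category and peaks at the category boundary.

```python
# Toy model of categorical perception along a hypothetical /da/-/ta/ VOT continuum.
# The acoustic variable changes continuously, but every step is heard as one of
# two categories, and discriminability peaks at the category boundary.
import math

BOUNDARY_MS = 25.0   # assumed category boundary (hypothetical value)
SLOPE_MS = 3.0       # assumed steepness of the identification function

def p_ta(vot_ms: float) -> float:
    """Probability of labelling a stimulus /ta/ given its VOT."""
    return 1.0 / (1.0 + math.exp(-(vot_ms - BOUNDARY_MS) / SLOPE_MS))

continuum = [i * 50.0 / 9 for i in range(10)]   # 10 equally spaced VOT steps (0-50 ms)

for i, vot in enumerate(continuum):
    p = p_ta(vot)
    label = "/ta/" if p >= 0.5 else "/da/"
    print(f"step {i}: VOT={vot:4.1f} ms  P(/ta/)={p:.2f}  heard as {label}")

# Discriminability of adjacent steps, approximated by the difference in
# identification probabilities: flat within a category, maximal at the boundary.
for i in range(len(continuum) - 1):
    d = abs(p_ta(continuum[i + 1]) - p_ta(continuum[i]))
    print(f"steps {i}-{i + 1}: predicted discriminability {d:.2f}")
```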
If people can hear the gestures in speech, then the imitation of speech should be very fast, as when words heard in headphones are repeated immediately, as in speech shadowing. [19] People can repeat heard syllables more quickly than they would be able to produce them normally. [20]
Evidence exists that perception and production are generally coupled in the motor system. This is supported by the existence of mirror neurons, which are activated both by seeing (or hearing) an action and by carrying that action out. [29] Another source of evidence is the support for common coding theory, which posits shared representations for perception and action. [30]
The motor theory of speech perception is not widely held in the field of speech perception, though it is more popular in other fields, such as theoretical linguistics. As three of its advocates have noted, "it has few proponents within the field of speech perception, and many authors cite it primarily to offer critical commentary". [5] p. 361 Several critiques of it exist. [31] [32]
Speech perception is affected by nonproduction sources of information, such as context. Individual words are hard to understand in isolation but easy when heard in sentence context. It therefore seems that speech perception uses multiple sources that are integrated together in an optimal way. [31]
The motor theory of speech perception would predict that speech motor abilities in infants predict their speech perception abilities, but in actuality it is the other way around. [33] It would also predict that defects in speech production would impair speech perception, but they do not. [34] However, this only affects the first and already superseded behaviorist version of the theory, where infants were supposed to learn all production-perception patterns by imitation early in childhood. This is no longer the mainstream view of motor-speech theorists.
Several sources of evidence for a specialized speech module have failed to be supported. As a result, this part of the theory has been dropped by some researchers. [5]
The evidence provided for the motor theory of speech perception is limited to tasks such as syllable discrimination that use speech units rather than full spoken words or spoken sentences. As a result, "speech perception is sometimes interpreted as referring to the perception of speech at the sublexical level. However, the ultimate goal of these studies is presumably to understand the neural processes supporting the ability to process speech sounds under ecologically valid conditions, that is, situations in which successful speech sound processing ultimately leads to contact with the mental lexicon and auditory comprehension." [35] This, however, creates the problem of "a tenuous connection to their implicit target of investigation, speech recognition". [35]
It has been suggested that birds also hear each other's bird song in terms of vocal gestures. [36]
Phonetics is a branch of linguistics that studies how humans produce and perceive sounds, or in the case of sign languages, the equivalent aspects of sign. Linguists who specialize in studying the physical properties of speech are phoneticians. The field of phonetics is traditionally divided into three sub-disciplines based on the research questions involved: how humans plan and execute movements to produce speech, how various movements affect the properties of the resulting sound, and how humans convert sound waves to linguistic information. Traditionally, the minimal linguistic unit of phonetics is the phone, a speech sound in a language, which differs from the phonological unit, the phoneme; the phoneme is an abstract categorization of phones, defined as the smallest unit that distinguishes meaning between sounds in any given language.
Lip reading, also known as speechreading, is a technique of understanding a limited range of speech by visually interpreting the movements of the lips, face and tongue without sound. Estimates of the range of lip reading vary, with some figures as low as 30% because lip reading relies on context, language knowledge, and any residual hearing. Although lip reading is used most extensively by deaf and hard-of-hearing people, most people with normal hearing process some speech information from sight of the moving mouth.
The McGurk effect is a perceptual phenomenon that demonstrates an interaction between hearing and vision in speech perception. The illusion occurs when the auditory component of one sound is paired with the visual component of another sound, leading to the perception of a third sound. The visual information a person gets from seeing a person speak changes the way they hear the sound. If a person is getting poor-quality auditory information but good-quality visual information, they may be more likely to experience the McGurk effect. Integration abilities for audio and visual information may also influence whether a person will experience the effect. People who are better at sensory integration have been shown to be more susceptible to the effect. Many people are affected differently by the McGurk effect based on many factors, including brain damage and other disorders.
A mirror neuron is a neuron that fires both when an organism acts and when the organism observes the same action performed by another. Thus, the neuron "mirrors" the behavior of the other, as though the observer were itself acting. Mirror neurons are not always physiologically distinct from other types of neurons in the brain; their main differentiating factor is their response patterns. By this definition, such neurons have been directly observed in humans and primate species, and in birds.
In psycholinguistics, language processing refers to the way humans use words to communicate ideas and feelings, and how such communications are processed and understood. Language processing is considered to be a uniquely human ability that is not produced with the same grammatical understanding or systematicity even in humans' closest primate relatives.
Speech is human vocal communication using language. Each language uses phonetic combinations of vowel and consonant sounds to form the sound of its words, and those words are used, in their semantic character as items in the language's lexicon, according to the syntactic constraints that govern how words function in a sentence. In speaking, speakers perform many different intentional speech acts, e.g., informing, declaring, asking, persuading, directing, and can use enunciation, intonation, degrees of loudness, tempo, and other non-representational or paralinguistic aspects of vocalization to convey meaning. In their speech, speakers also unintentionally communicate many aspects of their social position, such as sex, age, place of origin, physical states, psychological states, physico-psychological states, education or experience, and the like.
The two-streams hypothesis is a model of the neural processing of vision as well as hearing. The hypothesis, given its initial characterisation in a paper by David Milner and Melvyn A. Goodale in 1992, argues that humans possess two distinct visual systems. Recently there seems to be evidence of two distinct auditory systems as well. As visual information exits the occipital lobe, and as sound leaves the phonological network, it follows two main pathways, or "streams". The ventral stream leads to the temporal lobe, which is involved with object and visual identification and recognition. The dorsal stream leads to the parietal lobe, which is involved with processing the object's spatial location relative to the viewer and with speech repetition.
Categorical perception is a phenomenon of perception of distinct categories when there is a gradual change in a variable along a continuum. It was originally observed for auditory stimuli but has since been found to apply to other perceptual modalities.
Speech perception is the process by which the sounds of language are heard, interpreted, and understood. The study of speech perception is closely linked to the fields of phonology and phonetics in linguistics, and to cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, and in foreign-language teaching.
Haskins Laboratories, Inc. is an independent 501(c) non-profit corporation, founded in 1935 and located in New Haven, Connecticut, since 1970. Haskins has formal affiliation agreements with both Yale University and the University of Connecticut; it remains fully independent, administratively and financially, of both Yale and UConn. Haskins is a multidisciplinary and international community of researchers that conducts basic research on spoken and written language. A guiding perspective of their research is to view speech and language as emerging from biological processes, including those of adaptation, response to stimuli, and conspecific interaction. Haskins Laboratories has a long history of technological and theoretical innovation, from creating systems of rules for speech synthesis and development of an early working prototype of a reading machine for the blind to developing the landmark concept of phonemic awareness as the critical preparation for learning to read an alphabetic writing system.
Alvin Meyer Liberman, born in St. Joseph, Missouri, was an American psychologist. His ideas set the agenda for fifty years of psychological research in speech perception.
Duplex perception refers to the linguistic phenomenon whereby "part of the acoustic signal is used for both a speech and a nonspeech percept." A listener is presented with two simultaneous, dichotic stimuli. One ear receives an isolated third-formant transition that sounds like a nonspeech chirp. At the same time the other ear receives a base syllable. This base syllable consists of the first two formants, complete with formant transitions, and the third formant without a transition. Normally, there would be peripheral masking in such a binaural listening task but this does not occur. Instead, the listener's percept is duplex, that is, the completed syllable is perceived and the nonspeech chirp is heard at the same time. This is interpreted as being due to the existence of a special speech module.
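As a rough illustration of how such a dichotic stimulus is put together, the sketch below is a hypothetical construction, not the original Haskins stimuli; all formant frequencies, durations, and the use of plain sine sweeps in place of real formant synthesis are assumptions. It writes a stereo file whose left channel carries the base syllable (first two formants with transitions plus a steady third formant) and whose right channel carries only the isolated third-formant transition "chirp".

```python
# Hypothetical duplex-perception style stimulus: base syllable in the left ear,
# isolated third-formant (F3) transition in the right ear. Sine sweeps stand in
# for real formant synthesis; all parameter values are illustrative assumptions.
import numpy as np
import wave

SR = 16000                       # sample rate in Hz
DUR = 0.25                       # total stimulus duration in seconds
TRANS = 0.05                     # formant-transition duration in seconds
t = np.arange(int(SR * DUR)) / SR

def sweep(f_start, f_end, trans_dur):
    """Sine component whose frequency moves linearly from f_start to f_end
    over trans_dur, then stays at f_end (a crude formant stand-in)."""
    f = np.where(t < trans_dur,
                 f_start + (f_end - f_start) * t / trans_dur,
                 f_end)
    phase = 2 * np.pi * np.cumsum(f) / SR
    return np.sin(phase)

# Base syllable: F1 and F2 with transitions, F3 held steady (no transition).
base = (0.5 * sweep(200, 700, TRANS)
        + 0.3 * sweep(900, 1200, TRANS)
        + 0.2 * sweep(2500, 2500, TRANS))
# Isolated F3 transition: the "chirp" presented to the other ear.
chirp = 0.2 * sweep(2000, 2500, TRANS) * (t < TRANS)

stereo = np.stack([base, chirp], axis=1)
stereo = (stereo / np.max(np.abs(stereo)) * 32767).astype(np.int16)

with wave.open("duplex_demo.wav", "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)
    f.setframerate(SR)
    f.writeframes(stereo.tobytes())
```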
Phonological development refers to how children learn to organize sounds into meaning or language (phonology) during their stages of growth.
The concept of motor cognition captures the notion that cognition is embodied in action, and that the motor system participates in what is usually considered mental processing, including the processes involved in social interaction. The fundamental unit of the motor cognition paradigm is action, defined as the movements produced to satisfy an intention towards a specific motor goal, or in reaction to a meaningful event in the physical and social environments. Motor cognition takes into account the preparation and production of actions, as well as the processes involved in recognizing, predicting, mimicking, and understanding the behavior of other people. This paradigm has received a great deal of attention and empirical support in recent years from a variety of research domains including embodied cognition, developmental psychology, cognitive neuroscience, and social psychology.
Speech shadowing is a psycholinguistic experimental technique in which subjects repeat speech with a short delay after the onset of hearing the phrase. The time between hearing the speech and responding is how long the brain takes to process and produce speech. The task instructs participants to shadow speech, which generates the intent to reproduce the phrase while motor regions in the brain unconsciously process the syntax and semantics of the words spoken. Words repeated during the shadowing task also imitate the parlance of the shadowed speech.
The neuroscience of music is the scientific study of brain-based mechanisms involved in the cognitive processes underlying music. These behaviours include music listening, performing, composing, reading, writing, and ancillary activities. It also is increasingly concerned with the brain basis for musical aesthetics and musical emotion. Scientists working in this field may have training in cognitive neuroscience, neurology, neuroanatomy, psychology, music theory, computer science, and other relevant fields.
Speech repetition occurs when individuals speak the sounds that they have heard another person pronounce or say. In other words, it is the saying by one individual of the spoken vocalizations made by another individual. Speech repetition requires the person repeating the utterance to have the ability to map the sounds that they hear from the other person's oral pronunciation to similar places and manners of articulation in their own vocal tract.
Dichotic listening is a psychological test commonly used to investigate selective attention and the lateralization of brain function within the auditory system. It is used within the fields of cognitive psychology and neuroscience.
Sensory-motor coupling is the coupling or integration of the sensory system and motor system. Sensorimotor integration is not a static process. For a given stimulus, there is no one single motor command. "Neural responses at almost every stage of a sensorimotor pathway are modified at short and long timescales by biophysical and synaptic processes, recurrent and feedback connections, and learning, as well as many other internal and external variables".
Neurocomputational speech processing is the computer simulation of speech production and speech perception by reference to the natural neuronal processes of speech production and speech perception, as they occur in the human nervous system. The topic is based on neuroscience and computational neuroscience.