Duplex perception

Duplex perception is a phenomenon in speech perception in which "part of the acoustic signal is used for both a speech and a nonspeech percept." [1] A listener is presented with two simultaneous, dichotic stimuli. One ear receives an isolated third-formant (F3) transition, which on its own sounds like a nonspeech chirp. At the same time, the other ear receives a base syllable consisting of the first two formants, complete with their formant transitions, plus a steady third formant lacking its transition. Peripheral masking would normally be expected in such a dichotic listening task, but it does not occur. Instead, the listener's percept is duplex: the complete syllable is perceived and, at the same time, the nonspeech chirp is heard. This has been interpreted as evidence for the existence of a special speech module.
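The stimulus arrangement described above can be sketched in code. The sketch below is a minimal illustration, not the original Haskins synthesis: the formant frequencies and durations are invented placeholder values, and each "formant" is approximated as a single linear-sweep sinusoid rather than the output of a formant synthesizer.

```python
import numpy as np

FS = 16_000          # sample rate (Hz); illustrative choice
DUR = 0.30           # total syllable duration (s)
TRANS = 0.05         # formant-transition duration (s)

def tone(freq_start, freq_end, dur, fs=FS):
    """Linear-sweep sinusoid: a crude stand-in for a formant track."""
    t = np.arange(int(dur * fs)) / fs
    freq = np.linspace(freq_start, freq_end, t.size)
    phase = 2 * np.pi * np.cumsum(freq) / fs
    return np.sin(phase)

def formant(f_onset, f_steady):
    """Transition (f_onset -> f_steady) followed by a steady portion."""
    return np.concatenate([tone(f_onset, f_steady, TRANS),
                           tone(f_steady, f_steady, DUR - TRANS)])

# Base ear: F1 and F2 with transitions, plus a transitionless steady F3.
# (Frequency values are hypothetical placeholders.)
base = (formant(200, 700)          # F1: rising transition
        + formant(2000, 1200)      # F2: falling transition
        + tone(2500, 2500, DUR))   # F3: steady portion only, no transition

# Chirp ear: the isolated F3 transition, zero-padded to the same length.
chirp_ear = np.zeros_like(base)
chirp = tone(2900, 2500, TRANS)    # heard alone as a nonspeech "chirp"
chirp_ear[:chirp.size] = chirp

stereo = np.stack([base, chirp_ear], axis=1)  # col 0 = base ear, col 1 = chirp ear
```

Playing column 0 to one ear and column 1 to the other approximates the dichotic arrangement: the base ear carries F1, F2, and a transitionless F3, while the other ear carries only the brief F3 transition.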

The phenomenon was discovered in 1974 by Timothy C. Rand at Haskins Laboratories, which is associated with Yale University. [2]

Duplex perception has been argued to provide evidence for the existence of distinct systems for general auditory perception and speech perception. [3] Notably, the same phenomenon can also be obtained with the sound of slamming doors. [4]

Related Research Articles

Absolute pitch (AP), often called perfect pitch, is a rare ability of a person to identify or re-create a given musical note without the benefit of a reference tone. AP may be demonstrated using linguistic labeling, associating mental imagery with the note, or sensorimotor responses. For example, an AP possessor can accurately reproduce a heard tone on a musical instrument without "hunting" for the correct pitch.

The octave illusion is an auditory illusion discovered by Diana Deutsch in 1973. It is produced when two tones that are an octave apart are repeatedly played in alternation ("high-low-high-low") through stereo headphones. The same sequence is played to both ears simultaneously; however when the right ear receives the high tone, the left ear receives the low tone, and conversely. Instead of hearing two alternating pitches, most subjects instead hear a single tone that alternates between ears while at the same time its pitch alternates between high and low.

The McGurk effect is a perceptual phenomenon that demonstrates an interaction between hearing and vision in speech perception. The illusion occurs when the auditory component of one sound is paired with the visual component of another sound, leading to the perception of a third sound. The visual information a person gets from seeing a person speak changes the way they hear the sound. If a person is getting poor-quality auditory information but good-quality visual information, they may be more likely to experience the McGurk effect. Integration abilities for audio and visual information may also influence whether a person will experience the effect: people who are better at sensory integration have been shown to be more susceptible to it. People are affected differently by the McGurk effect depending on many factors, including brain damage and other disorders.

Sound localization is a listener's ability to identify the location or origin of a detected sound in direction and distance.

The cocktail party effect is the phenomenon of the brain's ability to focus one's auditory attention on a particular stimulus while filtering out a range of other stimuli, such as when a partygoer can focus on a single conversation in a noisy room. Listeners have the ability to both segregate different stimuli into different streams, and subsequently decide which streams are most pertinent to them. Thus, it has been proposed that one's sensory memory subconsciously parses all stimuli and identifies discrete pieces of information by classifying them by salience. This effect is what allows most people to "tune into" a single voice and "tune out" all others. This phenomenon is often described in terms of "selective attention" or "selective hearing". It may also describe a similar phenomenon that occurs when one may immediately detect words of importance originating from unattended stimuli, for instance hearing one's name among a wide range of auditory input.

In speech communication, intelligibility is a measure of how comprehensible speech is in given conditions. Intelligibility is affected by the level and quality of the speech signal, the type and level of background noise, reverberation, and, for speech over communication devices, the properties of the communication system. A common standard measurement for the quality of the intelligibility of speech is the Speech Transmission Index (STI). The concept of speech intelligibility is relevant to several fields, including phonetics, human factors, acoustical engineering, and audiometry.

Speech perception is the process by which the sounds of language are heard, interpreted, and understood. The study of speech perception is closely linked to the fields of phonology and phonetics in linguistics and cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, and in foreign-language teaching.

Dichotic pitch is a pitch heard due to binaural processing, when the brain combines two noises presented simultaneously to the ears. In other words, it cannot be heard when the sound stimulus is presented monaurally, but when it is presented binaurally a sensation of pitch arises. The binaural stimulus is presented to both ears through headphones simultaneously and is identical in most respects, except for a narrow frequency band that is manipulated. The most common variant is the Huggins pitch, in which the white noise presented to the two ears differs only in interaural phase over a narrow range of frequencies. For humans, this phenomenon is restricted to fundamental frequencies lower than 330 Hz and extremely low sound pressure levels. Studies have suggested that dichotic pitch evokes activation at the lateral end of Heschl's gyrus.
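A Huggins-pitch stimulus can be sketched as follows, assuming the common variant in which a constant π interaural phase shift is applied over a narrow band; the sample rate, centre frequency, and relative bandwidth are illustrative choices:

```python
import numpy as np

def huggins_pitch(fs=44_100, dur=1.0, f0=300.0, bw=0.16, seed=0):
    """Huggins-pitch stimulus: identical white noise at both ears,
    except that a narrow band around f0 is phase-inverted (pi shift)
    in one ear. Monaurally each channel is plain noise; binaurally
    a faint pitch near f0 is heard."""
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    left = rng.standard_normal(n)

    spec = np.fft.rfft(left)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    band = (freqs > f0 * (1 - bw / 2)) & (freqs < f0 * (1 + bw / 2))

    spec_shifted = spec.copy()
    spec_shifted[band] *= -1          # constant pi interaural phase shift
    right = np.fft.irfft(spec_shifted, n)

    return np.stack([left, right], axis=1)

stereo = huggins_pitch()
```

Because only the phase of the narrow band is flipped, the two channels have identical magnitude spectra, which is why neither channel alone carries any pitch cue.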

Japanese has one liquid phoneme, realized usually as an apico-alveolar tap and sometimes as an alveolar lateral approximant. English has two: rhotic and lateral, with varying phonetic realizations centered on the postalveolar approximant and on the alveolar lateral approximant, respectively. Japanese speakers who learn English as a second language after childhood often have difficulty hearing and producing the /r/ and /l/ of English accurately.

Speech shadowing is a psycholinguistic experimental technique in which subjects repeat speech at a short delay from the onset of hearing the phrase. The time between hearing the speech and responding reflects how long the brain takes to process and produce speech. Participants are instructed to shadow the speech, which generates an intent to reproduce the phrase while motor regions in the brain unconsciously process the syntax and semantics of the words spoken. Words repeated during the shadowing task also tend to imitate the parlance of the shadowed speech.

The motor theory of speech perception is the hypothesis that people perceive spoken words by identifying the vocal tract gestures with which they are pronounced rather than by identifying the sound patterns that speech generates. It originally claimed that speech perception is done through a specialized module that is innate and human-specific. Though the idea of a module has been qualified in more recent versions of the theory, the idea remains that the role of the speech motor system is not only to produce speech articulations but also to detect them.

Psychoacoustics is the branch of psychophysics involving the scientific study of sound perception and audiology—how humans perceive various sounds. More specifically, it is the branch of science studying the psychological responses associated with sound. Psychoacoustics is an interdisciplinary field of many areas, including psychology, acoustics, electronic engineering, physics, biology, physiology, and computer science.

Dichotic listening is a psychological test commonly used to investigate selective attention and the lateralization of brain function within the auditory system. It is used within the fields of cognitive psychology and neuroscience.

The phonemic restoration effect is a perceptual phenomenon in which, under certain conditions, sounds actually missing from a speech signal can be restored by the brain and may appear to be heard. The effect occurs when missing phonemes in an auditory signal are replaced with a noise that would have the physical properties to mask those phonemes, creating an ambiguity. In such ambiguity, the brain tends to fill in the absent phonemes. The effect can be so strong that some listeners may not even notice that phonemes are missing. It is commonly observed in conversation with heavy background noise, which makes it difficult to hear every phoneme being spoken. Different factors can change the strength of the effect, including how rich the context or linguistic cues are in the speech, as well as the listener's state, such as their hearing status or age.

Monita Chatterjee is an auditory scientist and the Director of the Auditory Prostheses & Perception Laboratory at Boys Town National Research Hospital. She investigates the basic mechanisms underlying auditory processing by cochlear implant listeners.

Temporal envelope (ENV) and temporal fine structure (TFS) are changes in the amplitude and frequency of sound perceived by humans over time. These temporal changes are responsible for several aspects of auditory perception, including loudness, pitch and timbre perception and spatial hearing.
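Envelope and fine structure are commonly separated by computing the analytic signal (the Hilbert transform). The sketch below uses an FFT-based analytic signal and an illustrative amplitude-modulated tone, so the slow 4 Hz modulation appears in the envelope and the 1 kHz carrier in the fine structure:

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (equivalent to scipy.signal.hilbert):
    zero out negative frequencies, double positive ones."""
    n = x.size
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    if n % 2 == 0:
        h[n // 2] = 1
        h[1:n // 2] = 2
    else:
        h[1:(n + 1) // 2] = 2
    return np.fft.ifft(spec * h)

fs = 16_000
t = np.arange(fs) / fs
# AM tone: 1 kHz carrier (fine structure) modulated at 4 Hz (envelope).
x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

z = analytic_signal(x)
env = np.abs(z)              # temporal envelope: slow amplitude changes
tfs = np.cos(np.angle(z))    # temporal fine structure: fast oscillation
```

For this signal the decomposition is essentially exact: `env` recovers the 4 Hz modulator and `tfs` recovers the unit-amplitude 1 kHz carrier.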

Brian C. J. Moore FMedSci, FRS is an Emeritus Professor of Auditory Perception at the University of Cambridge and an Emeritus Fellow of Wolfson College, Cambridge. His research focuses on psychoacoustics, audiology, and the development and assessment of hearing aids.

Auditory science or hearing science is a field of research and education concerning the perception of sounds by humans, animals, or machines. It is a heavily interdisciplinary field at the crossroad between acoustics, neuroscience, and psychology. It is often related to one or many of these other fields: psychophysics, psychoacoustics, audiology, physiology, otorhinolaryngology, speech science, automatic speech recognition, music psychology, linguistics, and psycholinguistics.

Binaural unmasking is a phenomenon of auditory perception discovered by Ira Hirsh. In binaural unmasking, the brain combines information from the two ears in order to improve signal detection and identification in noise. The phenomenon is most commonly observed when there is a difference between the interaural phase of the signal and the interaural phase of the noise. When such a difference is present, the masked threshold improves compared to a reference situation in which the interaural phases are the same, or in which the stimulus is presented monaurally; those two cases usually give very similar thresholds. The size of the improvement is known as the "binaural masking level difference" (BMLD), or simply as the "masking level difference".
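The two conditions typically compared can be sketched as stimuli: the conventional NoSo reference (noise and signal identical at both ears) against NoSπ (the signal phase-inverted at one ear), which usually yields the largest BMLD. The tone frequency, level, and duration below are illustrative placeholders:

```python
import numpy as np

def bmld_stimuli(fs=44_100, dur=0.5, f_sig=500.0, snr_db=-10.0, seed=1):
    """Build the classic BMLD conditions:
    NoSo  - same noise and same tone at both ears (reference),
    NoSpi - same noise, but the tone is inverted at one ear."""
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    t = np.arange(n) / fs
    noise = rng.standard_normal(n)
    noise /= np.sqrt(np.mean(noise ** 2))          # unit-RMS masker
    # Peak amplitude chosen so the tone's RMS sits snr_db below the noise RMS.
    tone = 10 ** (snr_db / 20) * np.sqrt(2) * np.sin(2 * np.pi * f_sig * t)

    noso = np.stack([noise + tone, noise + tone], axis=1)   # identical ears
    nospi = np.stack([noise + tone, noise - tone], axis=1)  # signal inverted
    return noso, nospi

noso, nospi = bmld_stimuli()
```

In NoSo the two ears are identical, so there is no interaural cue; in NoSπ the interaural difference signal is twice the tone, which is the cue the binaural system exploits.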

The speech-to-song illusion is an auditory illusion discovered by Diana Deutsch in 1995. A spoken phrase is repeated several times, without altering it in any way, and without providing any context. This repetition causes the phrase to transform perceptually from speech into song.

References

  1. Vorperian, H. K.; Ochs; Grantham, D. W. (1995). "Stimulus intensity and fundamental frequency effects on duplex perception". The Journal of the Acoustical Society of America. 98: 734–744. doi:10.1121/1.413567.
  2. Rand, T. C. (1974). "Letter: Dichotic release from masking for speech". The Journal of the Acoustical Society of America. 55 (3): 678–680. doi:10.1121/1.1914584. PMID 4819869.
  3. Liberman, A. M.; Isenberg, D.; Rakerd, B. (1981). "Duplex perception of cues for stop consonants: Evidence for a phonetic mode". Perception & Psychophysics. 30 (2): 133–143. doi:10.3758/bf03204471. PMID 7301513.
  4. Fowler, C. A.; Rosenblum, L. D. (1990). "Duplex perception: A comparison of monosyllables and slamming doors". Journal of Experimental Psychology: Human Perception and Performance. 16 (4): 742–754. doi:10.1037/0096-1523.16.4.742. PMID 2148589.