Multistable auditory perception

Multistable auditory perception is a cognitive phenomenon in which certain auditory stimuli can be perceived in multiple ways. While multistable perception has been studied most commonly in the visual domain, it has also been observed in the auditory and olfactory modalities. In the olfactory domain, different scents are piped to the two nostrils, [1] while in the auditory domain, researchers often examine the effects of binaural sequences of pure tones. Generally speaking, multistable perception has three main characteristics: exclusivity, meaning that the multiple percepts cannot occur simultaneously; randomness, meaning that the durations of perceptual phases follow a random law; and inevitability, meaning that subjects cannot block out one percept indefinitely. [2]

History

While binocular rivalry has been studied since the 16th century, the study of multistable auditory perception is relatively new. [3] Diana Deutsch was the first to discover multistability in human auditory perception, in the form of auditory illusions involving periodically oscillating tones. [4]

Experimental Findings

Different experimental paradigms have since been used to study multistable perception in the auditory modality. One is auditory stream segregation, in which tones of two different frequencies are presented in a temporal pattern. Listeners experience alternating percepts: one percept is of a single stream fluctuating between the two frequencies, and the alternative percept is of two separate streams, each repeating a single frequency.
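As a rough illustration of such a stimulus, the sketch below builds a sequence of pure tones at two frequencies. It assumes the repeating "A B A silence" triplet layout that is often used in streaming studies; that layout, the function names `tone` and `aba_sequence`, and all default values (500 Hz base tone, 5-semitone gap, 100 ms tones) are illustrative assumptions, not parameters taken from the studies cited here.

```python
import math


def tone(freq_hz, dur_s, sample_rate=44100, ramp_s=0.005):
    """Return a pure tone as a list of samples in [-1, 1] with linear on/off ramps."""
    n = int(round(dur_s * sample_rate))
    n_ramp = min(int(round(ramp_s * sample_rate)), n // 2)
    samples = []
    for i in range(n):
        gain = 1.0
        if n_ramp and i < n_ramp:
            gain = i / n_ramp                 # fade in
        elif n_ramp and i >= n - n_ramp:
            gain = (n - 1 - i) / n_ramp       # fade out
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / sample_rate))
    return samples


def aba_sequence(freq_a_hz=500.0, semitone_gap=5.0, tone_s=0.1,
                 n_triplets=4, sample_rate=44100):
    """Build repeated A-B-A-silence triplets (assumed layout); return (samples, freq_b_hz)."""
    # Tone B sits semitone_gap semitones above tone A.
    freq_b_hz = freq_a_hz * 2 ** (semitone_gap / 12)
    a = tone(freq_a_hz, tone_s, sample_rate)
    b = tone(freq_b_hz, tone_s, sample_rate)
    silence = [0.0] * int(round(tone_s * sample_rate))
    triplet = a + b + a + silence
    return triplet * n_triplets, freq_b_hz


samples, freq_b = aba_sequence()
print(len(samples), round(freq_b, 1))
```

The resulting sample list could be written to an audio file or passed to any playback routine; the frequency gap and tone duration are the parameters one would vary to make one percept or the other more likely.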

Other experimental findings demonstrate the verbal transformation effect. In this paradigm, the input is a speech form repeated rapidly and continuously. The alternating percepts here are words: for example, continuous repetition of the word “life” results in bistability between “life” and “fly.” Prefrontal activation is associated with such fluctuations in percept rather than with changes in the physical stimulus, and there is also a possible inverse relationship between left inferior frontal and cingulate activation during this percept alternation. [5]

Principles of Perceptual Bistability

The temporal dynamics observed in auditory stream segregation are similar to those of bistable visual perception. This suggests that the mechanisms mediating multistable perception (the alternating dominance and suppression of multiple competing interpretations of ambiguous sensory input) might be shared across modalities. Pressnitzer and Hupé analyzed the results of an auditory streaming experiment and demonstrated that the resulting perceptual experience exhibited all three properties of multistable perception found in the visual modality: exclusivity, randomness, and inevitability. [6]

Exclusivity was satisfied, as there was “spontaneous alternation between mutually exclusive percepts,” and very little time was spent in an “indeterminate” experience. Randomness also characterized the phenomenon: although the first perceptual phase lasts longer than subsequent phases, the “steady-state of the temporal dynamics of auditory streaming is purely stochastic with no long-term trend.” Lastly, the percept alternation was inevitable; even though volitional control did reduce suppression of the specified percept, it did not exclude perception of the alternative percept altogether.

These similarities between perceptual bistability in the visual and auditory modalities raise the possibility of a common mechanism governing the phenomenon. In Pressnitzer and Hupé's subjects, the distributions of phase durations in the two modalities were not significantly different, and it has been speculated that the intraparietal sulcus, likely involved in crossmodal integration, could be responsible for bistability in both domains. However, the absence of subject-specific biases across the modalities argues against the notion that a single top-down selection mechanism is the sole determinant of auditory and visual bistability. This observation, along with evidence of neural correlates at different stages of processing, instead suggests that competition is distributed and “based on adaptation and mutual inhibition, at multiple neural processing stages.” [6]
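The exclusivity and first-phase observations described above can be checked on a listener's report log with a few lines of code. The sketch below is only an illustration: the report format, the labels `"integrated"`, `"segregated"` and `"unsure"`, the function names, and the example timestamps are all invented for demonstration and are not data from Pressnitzer and Hupé's experiment.

```python
def phase_durations(reports, end_s):
    """Turn time-sorted (onset_s, label) reports into (label, duration_s) phases."""
    next_onsets = [onset for onset, _ in reports[1:]] + [end_s]
    return [(label, nxt - onset) for (onset, label), nxt in zip(reports, next_onsets)]


def summarize(phases):
    """Summarize time spent undecided and first versus later phase durations."""
    total = sum(d for _, d in phases)
    unsure = sum(d for label, d in phases if label == "unsure")
    determinate = [d for label, d in phases if label != "unsure"]
    later = determinate[1:]
    return {
        # Exclusivity: this fraction should be small.
        "indeterminate_fraction": unsure / total,
        # The first phase is reported to be longer than later ones.
        "first_phase_s": determinate[0],
        "mean_later_phase_s": sum(later) / len(later),
    }


# Hypothetical report log: (time of key press in seconds, reported percept).
example_reports = [
    (0.0, "integrated"),
    (12.0, "segregated"),
    (16.0, "unsure"),
    (16.5, "integrated"),
    (20.5, "segregated"),
    (25.5, "integrated"),
]
summary = summarize(phase_durations(example_reports, end_s=30.0))
print(summary)
```

Testing randomness properly would additionally require looking at the distribution of the later phase durations and at correlations between successive phases, which this sketch does not attempt.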

Neural Correlates

Place model

According to the place model, each tone in a two-stream tone sequence activates a specific population of neurons. Event-related potential (ERP) amplitude increases as the frequency difference between the two tones increases. The model hypothesizes that, as this happens, the distance between the two populations of neurons increases, so that the two populations interact less with each other, allowing for easier tone segregation.
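The intuition behind the place model can be made concrete with a toy calculation. The sketch below assumes that each tone excites a Gaussian-shaped population along a log-frequency axis and uses the normalized overlap of the two Gaussians as a stand-in for how strongly the populations interact. The Gaussian shape, the half-octave tuning width, and the function name `population_overlap` are assumptions made for illustration; the source does not specify them.

```python
import math


def population_overlap(freq_a_hz, freq_b_hz, tuning_width_octaves=0.5):
    """Normalized overlap of two equal-width Gaussian activation profiles.

    The profiles are centred on the two tone frequencies along a log2
    frequency axis. For equal widths sigma and centre distance d, the
    overlap integral divided by the self-overlap is exp(-d^2 / (4 sigma^2)).
    """
    separation_octaves = abs(math.log2(freq_b_hz / freq_a_hz))
    return math.exp(-separation_octaves ** 2 / (4 * tuning_width_octaves ** 2))


# Overlap shrinks as the second tone moves away from a 500 Hz reference tone.
for freq_b in (500.0, 530.0, 667.0, 1000.0):
    print(freq_b, round(population_overlap(500.0, freq_b), 3))
```

Under these assumptions, a larger frequency difference gives a smaller overlap, which corresponds to the weaker interaction and easier segregation that the model proposes.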

fMRI results

fMRI has been used to compare brain activity during listening to alternating tones with activity during listening to a single stream of tones. The posterior regions of the left auditory cortex were modulated by the alternating tones, indicating that there may be areas of the brain responsible for stream segregation.

Theoretical View

Sequential grouping

How to group auditory stimuli is a problem of considerable behavioral importance. When a continuous stream of auditory information is received, numerous alternative interpretations are possible, but individuals are only consciously aware of one percept at a time. For this to occur, the auditory system must segregate and group incoming sounds, the goal being to “construct, modify, and maintain dynamic representations of putative objects within its environment”. [7] It has been suggested that this process of binding sound events into groups is driven by similarity at different levels. One principle for binding is based on the perceptual similarity between individual events. Sounds that share many or all of their acoustic features are more likely to have been emitted by the same source, and thus are more likely to be linked to form a “proto-object”. [7] The other principle for binding is based on the sequential predictability of sound events. If events reliably follow each other, it is also more likely that they have a common underlying cause.

Competition

A theory explaining the alternation of auditory percepts is that different interpretations are neurally represented simultaneously, but all but the dominant one at the time are suppressed. This idea of competition among parallel hypotheses might provide an explanation for the temporal dynamics observed in auditory stream segregation. The initial perceptual phase is held longer than the subsequent ones, “with the duration of the first phase being stimulus-parameter dependent and an order of magnitude longer in duration than parameter-independent subsequent phases”. [8] At stimulus onset, the first percept might be that which is easiest to discover, based on featural proximity (and thus stimulus-parameter dependent), and it is held for relatively longer because time is required for other hypotheses to form. As more sensory information is received and processed, the “neural associations underlying the alternative sound organizations become strong and start to vie for dominance” and “the probabilities of perceiving different organizations tend to become more balanced with time”. [7]
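The competition account can be illustrated with a deliberately simplified simulation. In the sketch below, two hypotheses compete under a winner-take-all rule: the dominant one adapts (fatigues) over time, the suppressed one recovers, and a switch occurs once the suppressed hypothesis overcomes a fixed inhibition margin. The second hypothesis is assumed to need time to form, which is what lengthens the first phase here. This is not the model from the cited papers; the update rule, the function name `simulate_competition`, and every parameter value are assumptions chosen for illustration.

```python
import math
import random


def simulate_competition(total_s=60.0, dt=0.01, tau_adapt_s=2.0, tau_build_s=10.0,
                         inhibition=0.5, noise_sd=0.0, seed=0):
    """Return a list of (percept_index, duration_s) phases for two competing hypotheses."""
    rng = random.Random(seed)
    adapt = [0.0, 0.0]      # adaptation (fatigue) of each hypothesis
    dominant = 0            # hypothesis 0 is assumed to be found first
    phases = []
    phase_steps = 0
    for step in range(int(round(total_s / dt))):
        t = step * dt
        # Hypothesis 0 is fully supported at onset; hypothesis 1 builds up gradually.
        support = [1.0, 1.0 - math.exp(-t / tau_build_s)]
        # The dominant hypothesis adapts toward 1, the suppressed one recovers toward 0.
        for i in (0, 1):
            target = 1.0 if i == dominant else 0.0
            adapt[i] += dt * (target - adapt[i]) / tau_adapt_s
        other = 1 - dominant
        advantage = (support[other] - adapt[other]) - (support[dominant] - adapt[dominant])
        advantage += rng.gauss(0.0, noise_sd)   # optional noise on the comparison
        phase_steps += 1
        # The suppressed hypothesis takes over once it beats the inhibition margin.
        if advantage > inhibition:
            phases.append((dominant, phase_steps * dt))
            dominant = other
            phase_steps = 0
    if phase_steps:
        phases.append((dominant, phase_steps * dt))
    return phases


phases = simulate_competition()
print([(p, round(d, 2)) for p, d in phases[:6]])
```

With the default `noise_sd=0.0` the alternation is fully regular, so the sketch only shows a longer initial phase followed by alternation between the two organizations. Setting `noise_sd` to a small positive value makes the phase durations irregular, which is a crude stand-in for the stochastic steady state reported in the experiments.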

References

  1. Zhou, W.; Chen, D. (2009). "Binaral Rivalry Between the Nostrils and in the Cortex". Current Biology. 19 (18): 1561–1565. doi:10.1016/j.cub.2009.07.052. PMC 2901510. PMID 19699095.
  2. Blake, R.; Logothetis, N. (2002). "Visual Competition". Nature Reviews Neuroscience. 3 (1): 13–21. doi:10.1038/nrn701. PMID 11823801. S2CID 8410171.
  3. Blake, R. (2001). "A Primer on Binocular Rivalry, Including Current Controversies". Brain and Mind. 2: 5–38.
  4. Deutsch, D. (1974). "An auditory illusion". Nature. 251: 307–309.
  5. Sterzer, P.; Kleinschmidt, A.; Rees, G. (2009). "The neural bases of multistable perception". Trends in Cognitive Sciences. 13 (7): 310–318.
  6. Pressnitzer, D.; Hupé, J. (2006). "Temporal Dynamics of Auditory and Visual Bistability Reveal Common Principles of Organization". Current Biology. 16: 1351–1357.
  7. Winkler, I.; Denham, S.; Mill, R.; Bohm, T.; Bendixen, A. (2012). "Multistability in auditory stream segregation: a predictive coding view". Philosophical Transactions of the Royal Society B: Biological Sciences. 367: 1001–1012.
  8. Denham, S.; Gyimesi, K.; Stefanics, G.; Winkler, I. (2010). The Neurophysiological Bases of Auditory Perception. pp. 477–487.