Speech-to-song illusion

The phrase ‘sometimes behave so strangely’ as it appears to be sung after several repetitions

The speech-to-song illusion is an auditory illusion discovered by Diana Deutsch in 1995. A spoken phrase is repeated several times, without altering it in any way, and without providing any context. This repetition causes the phrase to transform perceptually from speech into song. [1] [2] Although the illusion is most pronounced in non-tone languages such as English and German, it can also occur in tone languages such as Thai and Mandarin. [3] [4]

Discovery and first experiment

The illusion was discovered by Deutsch in 1995 when she was preparing the spoken commentary on her CD ‘Musical Illusions and Paradoxes’. She had the phrase ‘sometimes behave so strangely’ on a loop, and noticed that after it had been repeated several times it appeared to be sung rather than spoken. [1] Later she included this illusion on her CD ‘Phantom Words and Other Curiosities’ [2] and noted that once the phrase had perceptually morphed into song, it continued to be heard as song when played in the context of the full sentence in which it occurred.

Deutsch, Henthorn, and Lapidis [5] [6] examined the illusion in detail. They showed that when this phrase was heard only once, listeners perceived it as speech, but after several repetitions, they perceived it as song. This perceptual transformation required that the intervening repetitions be exact; it did not occur when they were transposed slightly, or presented with the syllables in jumbled orderings. In addition, when listeners were asked to repeat back the phrase after hearing it once, they repeated it back as speech. Yet when they were asked to repeat back the phrase after hearing it ten times, they repeated it back as song.
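
The contrast between exact and jumbled repetition can be illustrated with a minimal sketch. It is not the procedure used in the study: the function names, the silent gap between repetitions, the toy sample values, and the choice to reshuffle the syllables on every repetition are all assumptions made for illustration, and the transposed condition is omitted.

```python
import random

def loop_exact(phrase, n_repeats, gap):
    """Concatenate identical copies of the phrase, separated by `gap` samples of silence."""
    silence = [0.0] * gap
    out = []
    for i in range(n_repeats):
        out.extend(phrase)
        if i < n_repeats - 1:
            out.extend(silence)
    return out

def loop_jumbled(syllables, n_repeats, gap, seed=0):
    """Like loop_exact, but each repetition presents the syllables in a reshuffled order.

    Reshuffling on every repetition is an assumption, not a detail taken from the study.
    """
    rng = random.Random(seed)
    silence = [0.0] * gap
    out = []
    for i in range(n_repeats):
        order = syllables[:]
        rng.shuffle(order)
        for syllable in order:
            out.extend(syllable)
        if i < n_repeats - 1:
            out.extend(silence)
    return out

# Toy "syllables": short lists of samples standing in for recorded audio (hypothetical values).
syllables = [[0.1, 0.2], [0.3], [0.4, 0.5, 0.6]]
phrase = [sample for syllable in syllables for sample in syllable]

exact = loop_exact(phrase, n_repeats=10, gap=4)
jumbled = loop_jumbled(syllables, n_repeats=10, gap=4)
```

Both sequences have the same length and contain the same samples in every repetition; only the exact loop preserves the order of the syllables, which is the condition under which the transformation to song was reported.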

Neurological substrates of the illusion

Theories of the neurological substrates of speech and song perception have been based on responses to speech and song stimuli, and these differ in their features. For example, the pitch content within spoken syllables generally changes dynamically, while the pitches of musical notes tend to be stable and the notes tend to be of longer duration. For this reason, theories of the brain substrates of speech and song perception have invoked explanations in terms of the acoustic features involved. [7] Yet in the speech-to-song illusion a phrase is repeated exactly, with no change in its features; however, it can be heard either as speech or as song. For this reason, several studies have explored the brain regions that are involved in the illusion. Increased activation has been found in the frontal and temporal lobes of both hemispheres when the listener was perceiving a repeated spoken phrase as sung rather than spoken. [8] [9] The activated regions included several that other researchers had found to be activated while listening to song. [10] [11]

Speech material conducive to the illusion

Phrases that are marked by syllables with stable pitches and that favor a metrical interpretation tend to be conducive to the illusion. [8] [12] However, the illusion is not enhanced by regular repetitions of the entire phrase. [12] [4] Further, the illusion is stronger for phrases in languages that are more difficult to pronounce [4] and when listeners are unable to understand the language of the utterance. [3]

Listeners who experience the illusion

The speech-to-song illusion occurs in listeners both with and without musical training. [8] [5] [6] [9] [12] [4] [3] It occurs in listeners who speak different languages, including the non-tone languages English, Irish, Catalan, German, Italian, Portuguese, French, Croatian, and Hindi, and the tone languages Thai and Mandarin; however, it is weaker in speakers of tone languages than in speakers of non-tone languages. [4] [3]

Margulis and Simchy-Gross have reported related illusions in which different types of sound are transformed into music by repetition. Random sequences of tones were heard as more musical when they were looped, [13] and clips consisting of a mix of environmental sounds sounded more musical following repetition. [14] These effects were weaker than the original speech-to-song illusion, perhaps because speech and song are particularly intertwined perceptually, and also because the characteristics of the speech producing the original illusion are particularly conducive to a strong effect.

Explanations of the illusion

Repetition is a particularly important characteristic of music, and so provides an important cue that a phrase should be considered as music rather than speech. [15] [16] More specifically, in song, the pitches of vowels are distinctly heard, but in speech they appear watered down. It has been suggested that in speech the neural circuitry underlying pitch perception is somewhat inhibited, enabling the listener to focus attention on consonants and vowels, which are important to verbal meaning. Exact repetition of spoken words may cause this circuitry to become disinhibited, so that pitches are heard more saliently, and so as sung. [16] Indeed, the brain structures that are activated when the illusion occurs correspond largely to those that are activated in response to song. [8] [10] [11]

In addition, several features of a spoken phrase that are likely to occur in song are conducive to the illusion. These include syllables with more stable pitches, and phrases with more regular distributions of accents. [12] Other explanations invoke higher-level musical structure and memory. Listeners are better able to discriminate pitches in repeated rather than unrepeated phrases when the pitches violate the structure of Western tonal music. [17] Long-term memory for melodies may also be involved: if the prosodic features of a spoken phrase are similar to those of a well-known melody, the brain circuitries underlying musical pitch patterns and rhythms can be invoked, so that the phrase is heard as song. [16]

Relationship to musical composition

Many composers, including Gesualdo, Monteverdi, and Mussorgsky, have argued that expressivity in music can be derived from inflections in speech, and they have included features of speech in their music. [16] Another relationship was invoked by Steve Reich in compositions such as Come Out and It's Gonna Rain. He presented spoken phrases in stereo and looped them, gradually offsetting the sounds from the two sources so as to create musical effects, and these were enhanced as the discrepancy widened. [18] [19] Further, in Reich’s composition Different Trains, brief excerpts of speech were embedded in instrumental music so as to bring out their musical quality. [20] Today, much popular music, particularly rap music, consists of rhythmically chanted speech with musical accompaniment.
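
The gradual offsetting of two looped sources can be sketched numerically. This is a simplified illustration, not a description of how Reich produced the pieces: it assumes a fixed extra delay per loop on one channel, and the function name and parameter values are hypothetical.

```python
def phase_offsets(loop_len, drift, n_loops):
    """Return (left_start, right_start, offset) in samples for each loop iteration.

    The left channel restarts every `loop_len` samples; the right channel is
    assumed to restart every `loop_len + drift` samples, so it falls further
    behind on each pass.
    """
    rows = []
    for i in range(n_loops):
        left = i * loop_len
        right = i * (loop_len + drift)
        rows.append((left, right, right - left))
    return rows

# Hypothetical values: a 1000-sample loop with 5 samples of added delay per pass.
schedule = phase_offsets(loop_len=1000, drift=5, n_loops=4)
```

Under these assumptions the discrepancy between the two channels grows linearly with the number of repetitions, which corresponds to the widening offset described above.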

References

  1. Diana Deutsch (1995). Musical Illusions and Paradoxes, Track 1 (CD). La Jolla: Philomel Records. 1377600012.
  2. Diana Deutsch (2003). Phantom Words and Other Curiosities, Tracks 21-26 (CD). La Jolla: Philomel Records. 1377600022.
  3. Jaisin, K.; Suphanchaimat, R.; Figueroa, M. A. F. C.; Warren, J. D. (2016). "The speech-to-song illusion is reduced in speakers of tonal (vs. non-tonal) languages". Frontiers in Psychology. 7: 662. doi:10.3389/fpsyg.2016.00662. PMC 4860502. PMID 27242580.
  4. Margulis, E. H.; Simchy-Gross, R.; Black, J. L. (2015). "Pronunciation difficulty, temporal regularity, and the speech-to-song illusion". Frontiers in Psychology. 6: 48. doi:10.3389/fpsyg.2015.00048. PMC 4310215. PMID 25688225.
  5. Deutsch, D.; Henthorn, T.; Lapidis, R. (2008). "The speech-to-song illusion". Journal of the Acoustical Society of America. 124 (4): 2471. Bibcode:2008ASAJ..124.2471D. doi:10.1121/1.4808987.
  6. Deutsch, D.; Henthorn, T.; Lapidis, R. (2011). "Illusory transformation from speech to song". Journal of the Acoustical Society of America. 129 (4): 2245–2252. Bibcode:2011ASAJ..129.2245D. doi:10.1121/1.3562174. PMID 21476679.
  7. Zatorre, R. J.; Belin, P.; Penhune, V. B. (2002). "Structure and function of auditory cortex: music and speech". Trends in Cognitive Sciences. 6 (1): 37–46. doi:10.1016/S1364-6613(00)01816-7. PMID 11849614. S2CID 3076176.
  8. Tierney, A.; Dick, F.; Deutsch, D.; Sereno, M. (2013). "Speech versus song: multiple pitch-sensitive areas revealed by a naturally occurring musical illusion". Cerebral Cortex. 23 (2): 249–254. doi:10.1093/cercor/bhs003. PMC 3539450. PMID 22314043.
  9. Hymers, M.; Prendergast, G.; Liu, C.; Schulze, A.; Young, M. L.; Wastling, S. J.; Barker, G. J.; Millman, R. E. (2015). "Neural mechanisms underlying song and speech perception can be differentiated using an illusory percept". NeuroImage. 108: 225–233. doi:10.1016/j.neuroimage.2014.12.010. PMID 25512041.
  10. Callan, D.; Tsytsarev, V.; Hanakawa, T.; Callan, A.; Katsuhara, M.; Fukuyama, H.; Turner, R. (2006). "Song and speech: brain regions involved with perception and covert production". NeuroImage. 31 (3): 1327–1342. doi:10.1016/j.neuroimage.2006.01.036. PMID 16546406. S2CID 12480888.
  11. Schön, D.; Gordon, R.; Campagne, A.; Magne, C.; Astésano, C.; Anton, J.; Besson, M. (2010). "Similar cerebral networks in language, music and song perception". NeuroImage. 51 (1): 450–461. doi:10.1016/j.neuroimage.2010.02.023. PMID 20156575. S2CID 6386437.
  12. Falk, S.; Rathcke, T.; Dalla Bella, S. (2014). "When speech sounds like music". Journal of Experimental Psychology: Human Perception and Performance. 40 (4): 1491–1506. doi:10.1037/a0036858. PMID 24911013. S2CID 2380724.
  13. Margulis, E. H.; Simchy-Gross, R. (2016). "Repetition enhances the musicality of randomly generated tone sequences". Music Perception. 33 (4): 509–514. doi:10.1525/MP.2016.33.4.509.
  14. Simchy-Gross, R.; Margulis, E. H. (2018). "The sound-to-music illusion: Repetition can musicalize nonspeech sounds". Music and Science. doi:10.1177/2059204317731992.
  15. Margulis, E. H. (2014). On repeat: How music plays the mind. New York: Oxford University Press.
  16. Deutsch, D. (2019). Musical Illusions and Phantom Words: How Music and Speech Unlock Mysteries of the Brain. Oxford University Press. ISBN 9780190206833. LCCN 2018051786.
  17. Vanden Bosch der Nederlanden, C.; Hannon, E. E.; Snyder, J. S. (2015). "Finding the music of speech: Musical knowledge influences pitch processing in speech". Cognition. 142: 135–140. doi:10.1016/j.cognition.2015.06.015. PMID 26151370.
  18. Steve Reich (1966). Come Out. Columbia Records, a division of Sony Music Entertainment.
  19. Steve Reich (1965). It’s Gonna Rain. Nonesuch.
  20. Steve Reich (1988). Different Trains. Nonesuch.

Further reading