Voice confrontation

Last updated January 15, 2024

In psychology, voice confrontation, which is related to self-confrontation,^[1] is the phenomenon of a person not liking the sound of their own voice.^[2]^[3]^[4]^[5] The phenomenon is generally caused by disappointment due to differences between what a person expects their voice to sound like to other people and what they actually hear in recordings.^[2]^[3] These differences arise both in audio quality, including factors such as audio frequency, and in extra-linguistic cues about their personality.^[2]^[3]^[4]

Causes

A person perceives their own voice differently when hearing it live than when hearing it in a recording. Upon hearing a recording of their own voice, a person may experience disappointment due to cognitive dissonance between how they expect their voice to sound and how they perceive it.

Audio differences

Humans hear when the auditory system receives mechanical sound waves at the eardrum. When the source of sound is another person, the sound waves are received only through the air (an external stimulus). However, when the source of sound is the listener's own vocal cords, sound waves also travel through the person's body to their ears (an internal stimulus).^[2]

Laryngologist Martin Birchall described hearing one's own voice while talking as "hearing it through a cave complex inside our own heads" due to the sound traveling through the sinuses and various parts of the cranial cavity.^[2] As a result, the combination of internal and external stimuli has a different sound quality and different frequencies than external stimuli alone.^[2]^[3]

However, differences in audio quality are not the sole factor in voice confrontation: participants in a 2013 study gave significantly higher ratings to their own voice when they did not recognise it as theirs.^[3]^[6]

Extra-linguistic cues

In 1966, Philip Holzman and Clyde Rousey concluded from their studies that voice confrontation also arises from differences in "extra-linguistic cues" that reveal aspects of personality which are only perceivable through recordings, such as anxiety levels, indecision, sadness, and anger.^[3]^[4] Because people are accustomed to hearing their voice as a combination of internal and external stimuli, they "build our self-image and vocal self image around what we hear, rather than the reality", according to Birchall.^[2] In a 1967 study, only 38% of people were able to identify recordings of their own voice within 5 seconds.^[1]^[2]^[7]

The disappointment from extra-linguistic cues can be especially problematic for people with body dysmorphia and gender dysphoria since they may perceive their voice as sounding like someone of the opposite gender.^[2]

In specific populations

In 1967, Holzman, Andrew Berger, and Rousey published a follow-up study on voice confrontation in bilingual people who had learned a second language after age 16. The study showed that the bilingual subjects experienced greater discomfort when hearing their own recorded voices in their first language.^[1]^[3]^[8]

Another study, in 1970, found that people with speech disorders experienced greater voice confrontation than those without such disorders. The same study also found that females show a "significantly greater semantic differential reaction to hearing their own voices".^[5]

References

[1] Shapiro, David (2012). Consciousness and Self-Regulation. Advances in Research and Theory. Vol. 2. Springer Science+Business Media. pp. 154–169. ISBN 9781468425710. Archived from the original on 2020-10-17. Retrieved 2020-05-25.
[2] Samuelson, Kate (19 June 2017). "Why Do I Hate the Sound of My Own Voice?". Time. Archived from the original on 12 August 2020. Retrieved 8 May 2020.
[3] Jaekl, Philip (12 July 2018). "The real reason the sound of your own voice makes you cringe". The Guardian. Archived from the original on 24 April 2020. Retrieved 27 April 2020.
[4] Holzman, Philip S.; Rousey, Clyde (1966). "The voice as a percept". Journal of Personality and Social Psychology. 4 (1): 79–86. doi:10.1037/h0023518. ISSN 1939-1315. PMID 5965194.
[5] Weston, Alan J.; Rousey, Clyde L. (1970). "Voice Confrontation in Individuals with Normal and Defective Speech Patterns". Perceptual and Motor Skills. 30 (1): 187–190. doi:10.2466/pms.1970.30.1.187. ISSN 0031-5125. PMID 5476102. S2CID 46635019.
[6] Hughes, Susan M.; Harrison, Marissa A. (2013). "I Like My Voice Better: Self-Enhancement Bias in Perceptions of Voice Attractiveness". Perception. 42 (9): 941–949. doi:10.1068/p7526. PMID 24386714. S2CID 31006486.
[7] Rousey, Clyde; Holzman, Philip S. (1967). "Recognition of one's own voice". Journal of Personality and Social Psychology. 6 (4, Pt.1): 464–466. doi:10.1037/h0024837. ISSN 1939-1315. PMID 6082480.
[8] Holzman, Philip S.; Berger, Andrew; Rousey, Clyde (1967). "Voice confrontation: A bilingual study". Journal of Personality and Social Psychology. 7 (4, Pt.1): 423–428. doi:10.1037/h0025233. ISSN 1939-1315. PMID 6065872.

External links

Hughes, Susan M; Harrison, Marissa A (2013). "I like My Voice Better: Self-Enhancement Bias in Perceptions of Voice Attractiveness". Perception. 42 (9): 941–949. doi:10.1068/p7526. ISSN 0301-0066. PMID 24386714. S2CID 31006486.
Schwartz, Gary (2012). Consciousness and Self-Regulation. Advances in Research and Theory. Vol. 2. Springer Science & Business Media. p. 154. ISBN 9781468425710.
Hollien, Harry; Hollien, Patricia, eds. (1979). Current Issues in the Phonetic Sciences. John Benjamins Publishing Company. p. 684. ISBN 9789027281265.
Pavia, Will (19 December 2019). "Marriage Story star Adam Driver can't stand watching his performances". The Times. Retrieved 27 April 2020.
Desai, Rajvi (19 August 2019). "Is This Normal? "I Can't Stand the Sound of My Voice"". The Swaddle. Retrieved 8 May 2020.
