Emotional prosody

Emotional prosody or affective prosody refers to the various paralinguistic aspects of language use that convey emotion. [1] It includes an individual's tone of voice in speech, which is conveyed through changes in pitch, loudness, timbre, speech rate, and pauses. It can be isolated from semantic information, and it interacts with verbal content (e.g. sarcasm). [2]
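As an illustration of how these acoustic cues can be quantified, the following sketch extracts a fundamental-frequency (pitch) contour, a loudness proxy, and a rough pause ratio from a recording. It is a minimal example assuming the Python library librosa is available; the file name and the pause threshold are illustrative and not drawn from any cited study.

```python
import librosa
import numpy as np

# Load a speech recording (file name is illustrative).
y, sr = librosa.load("utterance.wav", sr=None)

# Pitch: fundamental frequency (F0) contour via the pYIN algorithm.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
f0_voiced = f0[~np.isnan(f0)]  # keep only voiced frames

# Loudness proxy: root-mean-square energy per frame.
rms = librosa.feature.rms(y=y)[0]

# Pauses: fraction of frames whose energy falls below a heuristic threshold.
pause_ratio = np.mean(rms < 0.05 * rms.max())

print(f"mean F0: {f0_voiced.mean():.1f} Hz, F0 range: {np.ptp(f0_voiced):.1f} Hz")
print(f"mean RMS energy: {rms.mean():.4f}, pause ratio: {pause_ratio:.2f}")
```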

Emotional prosody in speech is perceived or decoded slightly less accurately than facial expressions, but accuracy varies across emotions. Anger and sadness are perceived most easily, followed by fear and happiness, with disgust being the most poorly perceived. [3]

Production of vocal emotion

Studies have found that some emotions, such as fear, joy and anger, are produced with a higher vocal frequency (pitch) than emotions such as sadness. [4]

Perception of vocal emotion

Decoding emotions in speech involves three stages: determining acoustic features, forming meaningful connections with these features, and processing the acoustic patterns in relation to the connections established. In the processing stage, connections with basic emotional knowledge are stored in separate, association-specific memory networks. These associations can be used to form a baseline for emotional expressions encountered in the future. Emotional meanings of speech are registered implicitly and automatically after the circumstances, importance and other surrounding details of an event have been analyzed. [6]

On average, listeners are able to perceive intended emotions at a rate significantly better than chance (chance being approximately 10%). [5] However, error rates are also high. This is partly because listeners are more accurate at emotional inference from particular voices and perceive some emotions better than others. [4] Vocal expressions of anger and sadness are perceived most easily, fear and happiness are only moderately well perceived, and disgust has low perceptibility. [3]
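To make "significantly better than chance" concrete: in a forced-choice task with roughly ten emotion categories, chance is about 1/10. A minimal sketch of such a check, assuming SciPy and using made-up counts purely for illustration:

```python
from scipy.stats import binomtest

# Hypothetical forced-choice experiment: 10 emotion categories, so chance = 1/10.
n_trials = 200          # number of stimuli presented (illustrative)
n_correct = 64          # number identified correctly (illustrative)
chance_level = 1 / 10

# One-sided binomial test: is the hit rate greater than chance?
result = binomtest(n_correct, n_trials, p=chance_level, alternative="greater")
print(f"hit rate = {n_correct / n_trials:.2f}, p-value vs. chance = {result.pvalue:.3g}")
```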

Vocal emotions and the brain

Language can be split into two components: the verbal channel and the vocal channel. The verbal channel is the semantic content carried by the speaker's chosen words; in this channel, the semantic content of the speaker's words determines the meaning of the sentence. The vocal channel is the way a sentence is spoken, which can change its meaning. This channel conveys the emotions felt by the speaker and gives listeners a better idea of the intended meaning. Nuances in this channel are expressed through intonation, intensity and rhythm, which combine to form prosody. Usually these channels convey the same emotion, but sometimes they differ. Sarcasm and irony are two forms of humor based on this incongruence. [7]

The neurological processes that integrate the verbal and vocal (prosodic) components are relatively unclear. However, it is assumed that verbal content and prosody are processed in different hemispheres of the brain. Verbal content, composed of syntactic and semantic information, is processed in the left hemisphere: syntactic information is processed primarily in the frontal regions and a small part of the temporal lobe, while semantic information is processed primarily in the temporal regions with a smaller contribution from the frontal lobes. In contrast, prosody is processed primarily along the same pathway, but in the right hemisphere. Neuroimaging studies using functional magnetic resonance imaging (fMRI) provide further support for this hemispheric lateralization and temporo-frontal activation. Some studies, however, show evidence that prosody perception is not exclusively lateralized to the right hemisphere and may be more bilateral. There is also some evidence that the basal ganglia play an important role in the perception of prosody. [7]

Impairment of emotion recognition

Deficits in expressing and understanding prosody, caused by right hemisphere lesions, are known as aprosodias. These can take different forms and appear in various mental illnesses or diseases; aprosodia can also be caused by stroke and alcohol abuse. The types of aprosodia include motor (the inability to produce vocal inflection), expressive (when brain limitations rather than motor functions cause this inability), and receptive (when a person cannot decipher emotional speech). [8]

It has been found that recognizing vocal expressions of emotion becomes increasingly difficult with age. Older adults have slightly more difficulty than young adults labeling vocal expressions of emotion, particularly sadness and anger, but have much greater difficulty integrating vocal emotions with corresponding facial expressions. A possible explanation is that combining two sources of emotional information requires greater activation of emotion areas of the brain, in which older adults show decreased volume and activity. Another possible explanation is that hearing loss leads to mishearing of vocal expressions; high-frequency hearing loss is known to begin around the age of 50, particularly in men. [9]

Because the right hemisphere of the brain is associated with prosody, patients with right hemisphere lesions have difficulty varying speech patterns to convey emotion, and their speech may therefore sound monotonous. In addition, people with right-hemisphere damage have been shown to be impaired at identifying the emotion in intoned sentences.

Difficulty in decoding both syntactic and affective prosody is also found in people with autism spectrum disorder and schizophrenia, where "patients have deficits in a large number of functional domains, including social skills and social cognition. These social impairments consist of difficulties in perceiving, understanding, anticipating and reacting to social cues that are crucial for normal social interaction." This has been examined in multiple studies, such as Hoekert et al.'s meta-analysis of emotional prosody in schizophrenia, which concluded that more research must be done to fully confirm the correlation between the illness and emotional prosody. However, people with schizophrenia have no problem deciphering non-emotional prosody. [10]

Non-linguistic emotional prosody

Emotional states such as happiness, sadness, anger, and disgust can be determined solely from the acoustic structure of a non-linguistic speech act, such as a grunt, sigh, or exclamation. There is some research supporting the notion that these non-linguistic acts are universal, eliciting the same interpretations even from speakers of different languages.

In addition, it has been shown that emotion is expressed differently in non-linguistic vocalizations than in speech. As Laukka et al. state: Speech requires highly precise and coordinated movement of the articulators (e.g., lips, tongue, and larynx) in order to transmit linguistic information, whereas non-linguistic vocalizations are not constrained by linguistic codes and thus do not require such precise articulations. This entails that non-linguistic vocalizations can exhibit larger ranges for many acoustic features than prosodic expressions.

In their study, actors were instructed to vocalize an array of different emotions without words. The study showed that listeners could identify a wide range of positive and negative emotions above chance. However, emotions like guilt and pride were less easily recognized. [11]

In a 2015 study by Verena Kersken, Klaus Zuberbühler and Juan-Carlos Gomez, non-linguistic vocalizations of infants were presented to adults to see whether the adults could distinguish infant vocalizations indicating requests for help, pointing to an object, or indicating an event. Infants show different prosodic elements in crying depending on what they are crying for, and they also produce differing outbursts for positive and negative emotional states. The ability to decipher this information was found to apply across cultures and to be independent of the adults' level of experience with infants.

Sex differences

Men and women differ both in how they use language and in how they understand it. There are known differences in the rate of speech, the range of pitch, the duration of speech, and pitch slope (Fitzsimons et al.). For example, "In a study of relationship of spectral and prosodic signs, it was established that the dependence of pitch and duration differed in men and women uttering the sentences in affirmative and inquisitive intonation. Tempo of speech, pitch range, and pitch steepness differ between the genders" (Nesic et al.). One such illustration is that women are more likely to speak faster, elongate the ends of words, and raise their pitch at the end of sentences.
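Measures such as pitch range and the pitch rise at the end of a sentence can be computed directly from a fundamental-frequency (F0) contour. Below is a minimal sketch, assuming an F0 array like the one produced in the earlier librosa example; the example contour, frame rate, and function name are invented purely for illustration.

```python
import numpy as np

def final_pitch_slope(f0, frame_rate_hz, final_fraction=0.2):
    """Fit a line to the last portion of a voiced F0 contour (Hz per second).

    f0: array of F0 values in Hz, with NaN for unvoiced frames.
    frame_rate_hz: number of F0 frames per second.
    """
    voiced = f0[~np.isnan(f0)]
    n_final = max(2, int(len(voiced) * final_fraction))
    tail = voiced[-n_final:]
    t = np.arange(n_final) / frame_rate_hz   # time axis in seconds
    slope, _ = np.polyfit(t, tail, 1)        # positive slope = rising pitch
    return slope

# Illustrative use: a contour rising from 180 Hz to 220 Hz over its final frames
# yields a positive slope, consistent with a sentence-final pitch rise.
example_f0 = np.concatenate([np.full(80, 180.0), np.linspace(180.0, 220.0, 20)])
print(f"final pitch slope: {final_pitch_slope(example_f0, frame_rate_hz=100):.1f} Hz/s")
```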

Women and men are also different in how they neurologically process emotional prosody. In an fMRI study, men showed a stronger activation in more cortical areas than female subjects when processing the meaning or manner of an emotional phrase. In the manner task, men had more activation in the bilateral middle temporal gyri. For women, the only area of significance was the right posterior cerebellar lobe. Male subjects in this study showed stronger activation in the prefrontal cortex, and on average needed a longer response time than female subjects. This result was interpreted to mean that men need to make conscious inferences about the acts and intentions of the speaker, while women may do this sub-consciously. Therefore, men needed to integrate linguistic semantics and emotional intent "at a higher stage than the semantic processing stage." [12]

Considerations

Most research regarding the vocal expression of emotion has been conducted using synthetic speech or portrayals of emotion by professional actors; little research has been done with spontaneous, "natural" speech samples. Artificial speech samples have been considered close to natural speech, but portrayals by actors in particular may be influenced by stereotypes of emotional vocal expression and may exhibit intensified speech characteristics that skew listeners' perceptions. Another consideration lies in listeners' individual perceptions: studies typically take the average of responses, but few examine individual differences in great depth. Examining these differences may provide better insight into the vocal expression of emotions. [5]

See also

Language center
Sadness
Pathognomy
Face perception
Paralanguage
Prosody (linguistics)
Affective neuroscience
Speech
Categorical perception
Speech perception
Dysprosody
Aprosodia
Superior temporal sulcus
Neuroscience of music
Emotional lateralization
Right hemisphere brain damage
Dichotic listening
Culture in music cognition
Music and emotion
Emotion perception

References

1. Leentjens, AF; Wielaert, SM; van Harskamp, F; Wilmink, FW (1998). "Disturbances of affective prosody in patients with schizophrenia; a cross sectional study". Journal of Neurology, Neurosurgery, and Psychiatry. 64 (3): 375–378. doi:10.1136/jnnp.64.3.375. PMC 2169997. PMID 9527153.
2. Esteve-Gibert, Núria; Guellaï, Bahia (2018). "Prosody in the Auditory and Visual Domains: A Developmental Perspective". Frontiers in Psychology. 9: 338. doi:10.3389/fpsyg.2018.00338. ISSN 1664-1078. PMC 5868325. PMID 29615944.
3. "The Social and Emotional Voice" (PDF). Archived from the original (PDF) on 3 February 2014. Retrieved 29 March 2012.
4. Bachorowski, Jo-Anne (1999). "Vocal Expression and Perception of Emotion". Current Directions in Psychological Science. 8 (2): 53–57. doi:10.1111/1467-8721.00013. S2CID 18785659.
5. Sauter, Disa A.; Eisner, Frank; Calder, Andrew J.; Scott, Sophie K. (2010). "Perceptual cues in nonverbal vocal expressions of emotion". The Quarterly Journal of Experimental Psychology. 63 (11): 2251–2272. doi:10.1080/17470211003721642. PMC 4178283. PMID 20437296.
6. Pell, Marc D.; Kotz, Sonja A. (2011). "On the Time Course of Vocal Emotion Recognition". PLoS ONE. 6 (11): e27256. Bibcode:2011PLoSO...627256P. doi:10.1371/journal.pone.0027256. PMC 3210149. PMID 22087275.
7. Berckmoes, Celine; Vingerhoets, Guy (2004). "Neural Foundations of Emotional Speech Processing". Current Directions in Psychological Science. 13 (5): 182–185. doi:10.1111/j.0963-7214.2004.00303.x. S2CID 143618738.
8. Gorelick, PB; Ross, ED (May 1987). "The aprosodias: further functional-anatomical evidence for the organisation of affective language in the right hemisphere". Journal of Neurology, Neurosurgery, and Psychiatry. 50 (5): 553–560. doi:10.1136/jnnp.50.5.553. PMC 1031966. PMID 2438386.
9. Ryan, Melissa; Murray, Janice; Ruffman, Ted (2010). "Aging and the perception of emotion: Processing vocal emotions alone and with faces". Experimental Aging Research. 36 (1): 1–22. doi:10.1080/03610730903418372. PMID 20054724. S2CID 205555217.
10. Hoekert, L. M. (2009). "Impaired recognition and expression of emotional prosody in schizophrenia: review and meta-analysis" (PDF). Beyond what is being said: emotional prosody.
11. Laukka, P; Elfenbein, HA; Söder, N; Nordström, H; Althoff, J; Chui, W; Iraki, FK; Rockstuhl, T; Thingujam, NS (July 2013). "Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations". Frontiers in Psychology. 4: 353. doi:10.3389/fpsyg.2013.00353. PMC 3728469. PMID 23914178.
12. Fitzsimons, M; Sheahan, N; Staunton, H (July 2001). "Gender and the integration of acoustic dimensions of prosody: implications for clinical studies". Brain and Language. 78 (1): 94–108. doi:10.1006/brln.2000.2448. PMID 11412018. S2CID 45581597.