Sounds good! Acoustic patterns of positive emotional expressions

My friend recently exclaimed with delight, “You will not believe this!” but before she could tell me what I wouldn’t be able to believe, she answered her phone. What was the news? She sounded happy, for sure. Was she awarded major funding? Did her crush just ask her out? Did she solve a difficult problem?

Much research has been conducted on the perception and acoustic configurations of emotions, such as joy or happiness, in the voice (for example, here and here). In their recent paper, published in Psychonomic Bulletin & Review, Kamiloğlu, Fischer, and Sauter (pictured below) reviewed over 100 studies on the acoustic characteristics of nuanced positive emotions to get a better understanding of a wider range of positive emotions. If you’re curious about how different positive vocal emotions sound, you can listen to examples here.

Going beyond joy

Most research on the acoustics of emotional expressions focuses on basic emotions. Usually, only one positive emotion is included in these types of studies, referred to as either “joy,” “happiness,” or “amusement”. However, there are many ways in which one can feel good, and, consequently, many ways to sound good.

The key to studying the acoustic configurations of different positive emotions is understanding how they are expressed and what makes them different from each other. Emotions are expressed vocally with:

prosody (aka tone of voice)
non-linguistic vocalizations, such as when we laugh, sigh, and cheer

Within a functional framework, all emotions serve important adaptive goals, such as promoting goal achievement or influencing other people’s behavior. The latter is particularly important in the case of emotional expressions. As you can see below, the realm of positive emotions goes way beyond joy (see the interactive version of the figure).

Some of the terms are more similar to each other than to the rest, and therefore they tend to cluster together. These clusters give rise to different families of emotion, which encompass different terms (i.e., discrete emotions), based on similarities in the functions they serve.

Positive emotions are classified in the following families:

Epistemological. These emotions, such as amusement and interest, occur in response to changes in the individual’s knowledge about the world.
Savoring. These emotions, such as sensory pleasure and sexual desire, are triggered by thinking about or experiencing different kinds of sensory enjoyment.
Prosocial. These emotions, such as love, gratitude, and admiration, are linked to concern for others.
Agency approach. These emotions, such as elation and pride, are characterized by approach tendencies.

What is it to “sound good”?

There are three main acoustic domains that make up “sounding good”:

Frequency refers to pitch (fundamental frequency, or f0). For example, we tend to speak with a higher pitch when we are happy than when we are sad.
Amplitude refers to loudness. For example, expressions of anger tend to have a higher amplitude than expressions of disgust.
Duration refers to speed (speech rate). For example, we talk faster when we are excited than when we are in awe.

These domains can be used alone or in tandem to derive measures that describe the acoustic properties of vocal emotions. You can hear examples of voices differing in pitch and loudness here.

These measures can be relatively simple, such as the mean of f0, which captures whether the pitch of a recording is higher or lower (think about the difference between a female and a male voice). They can also be more complex, such as jitter and shimmer, which measure the instability of the frequency and amplitude, respectively, of the speech signal over time and lead to impressions of different vocal qualities. Think about voices differing in roughness (compare Louis Armstrong’s voice with Frank Sinatra’s voice) and breathiness (compare Norah Jones’s voice with Ariana Grande’s voice).
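For readers who like to see the arithmetic, here is a minimal Python sketch of mean f0, jitter, and shimmer. It assumes that the duration and peak amplitude of each vocal fold cycle have already been extracted from a recording, which is the hard part and is not shown. The function names and the numbers are made up for illustration and do not come from the reviewed paper. The jitter and shimmer formulas follow the common "local" definition (mean absolute cycle-to-cycle difference divided by the mean), which is only one of several variants in use.

```python
from statistics import mean


def mean_f0(periods_s):
    """Mean fundamental frequency (Hz) from cycle durations in seconds."""
    return mean(1.0 / p for p in periods_s)


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods_s):
    """Cycle-to-cycle instability of the period (frequency domain)."""
    return relative_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle instability of the peak amplitude."""
    return relative_perturbation(peak_amplitudes)


# Hypothetical cycle durations in seconds (illustrative values only).
steady_periods = [0.005, 0.005, 0.005, 0.005]  # perfectly regular voice
rough_periods = [0.004, 0.006, 0.004, 0.006]   # irregular voice, same mean period

print(mean_f0(steady_periods))       # about 200 Hz
print(local_jitter(steady_periods))  # 0.0, no instability
print(local_jitter(rough_periods))   # about 0.4, much less stable
```

Both example voices have the same average cycle duration, so a measure based only on the average would not separate them. Jitter does, because it looks at how much each cycle differs from the next.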

Various measures have been used to describe differences between positive vocal emotions and neutral-sounding speech (in the left panel below and its interactive version), and to compare different positive emotions with each other (in the right panel below and its interactive version).

In the figure above, the larger the size of the point, the more frequently a given feature has been studied, and the thicker the connection line, the more frequently two acoustic features have been studied together.

Good vibrations: Main acoustic characteristics of positive emotional expressions

Ok, sounds good! But then, how do positive emotions differ from each other in terms of acoustic patterns? The answer depends partly on what type of vocal expressions they are (prosody or vocalizations), and on what we compare them with.

Compared with a neutral baseline, a happy voice has:

A higher and more variable pitch that also covers a wider pitch range (i.e., a larger difference between the minimum and maximum pitch).
Greater loudness, with more loudness variation throughout the utterance.
Higher first and second formants, which reflect variations in the distribution of spectral energy due to how the vocal cavity is shaped while producing the vocalization.
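The pitch part of this comparison can be sketched in a few lines of Python. The sketch assumes that an f0 contour (one pitch value in Hz per voiced frame) is already available for each recording. The two contours below are invented for illustration and are not data from the review, and the helper names are hypothetical.

```python
from statistics import mean, stdev


def pitch_summary(f0_hz):
    """Summarise an f0 contour: level (mean), variability (SD), and range (max - min)."""
    return {
        "mean": mean(f0_hz),
        "sd": stdev(f0_hz),
        "range": max(f0_hz) - min(f0_hz),
    }


def higher_on_all(a, b):
    """True if contour a exceeds contour b on every summary measure."""
    sa, sb = pitch_summary(a), pitch_summary(b)
    return all(sa[key] > sb[key] for key in sa)


# Invented f0 contours in Hz (illustrative values only).
neutral = [190, 195, 200, 205, 210]
happy = [220, 260, 240, 300, 230]

print(pitch_summary(neutral))
print(pitch_summary(happy))
print(higher_on_all(happy, neutral))
```

In this toy example the happy contour is higher on average, varies more, and spans a wider range than the neutral one, which is the pattern described in the first point above.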

When comparing different positive emotions with each other:

Pitch is higher for epistemological emotions, moderate for savoring emotions, and low for prosocial emotions. This is the case for speech prosody and emotional vocalizations alike.
The same pattern is observed for loudness (epistemological > savoring > prosocial), but only for speech prosody.
In the time domain, pride (an agency-approach emotion) is produced at a faster rate than pleasure and contentment (savoring emotions) and admiration (a prosocial emotion). Note that, in this case, the patterns differentiate some discrete emotions belonging to different emotion families, but tend not to generalize to all emotions within each family.

Keep in mind that there are differences between patterns in speech prosody and vocalizations. For example, pleasure was louder than amusement and relief for vocalizations, but quieter than them in prosody.

Below is a visual summary that shows the main acoustic patterns by emotion families and discrete emotions.

In the figure above, larger circles reflect higher values on the acoustic measure, and asterisks indicate that the illustrations refer to speech prosody only.

These acoustic differences reflect physiological factors and functions of emotions. For example, vocalizations of joy, a high-arousal emotion, had a higher frequency, loudness, and speech rate than vocalizations of contentment, which is characterized by lower arousal. In addition, emotions with communicative and social functions (such as amusement and interest) were produced with more salient cues related to respiratory effort (i.e., higher pitch, loudness, and speech rate) than emotions with a less obvious social function (such as savoring emotions).

From feeling good to sounding good

There are many ways of feeling good. The realm of positive emotions goes well beyond joy. In everyday life, we can express happiness, love, pleasure, admiration, and lots of other great things using our voice. People around us can often perceive what we are feeling and react to it in meaningful ways: they bond with us, they cheer with us, or sometimes they simply know something is going well, without us needing to let them know with words.

This may be thanks to the complex acoustic characteristics that differentiate positive emotions, a subject that we are understanding better each day. Ultimately, this will allow a better understanding of our positive emotional experiences and the important functions they serve for us and those around us.

Psychonomic Society article featured in this post:

Kamiloğlu, R., Fischer, A., & Sauter, D. (2020). Good vibrations: A review of vocal expressions of positive emotions. Psychonomic Bulletin & Review, 27, 237-265. https://doi.org/10.3758/s13423-019-01701-x