How the voices around us shape our own speech

Have you ever visited another country or region and found your own speech drifting toward the local accent? Or perhaps you live far from home, and your speech shifts back whenever you return? This happens to me every time I visit my family in Baltimore, Maryland: I suddenly sound like Tracy Turnblad’s parents in Hairspray.

Two people engaged in conversation, most likely aligning their accents. Source: pexels.com.

Our speech often converges toward that of the people around us, and this process begins at a very young age. Some children show strong speech patterns that reflect the adults in their community. Take, for example, 8-year-old Jackson from Indiana, who talks about his love of tractors in this video. Jackson has picked up a particular regional cadence and lexicon because that’s what surrounds him.

Vocal convergence can happen at higher levels, like word choice and grammar, as well as at lower levels, like the acoustic-phonetic properties of speech. One theory suggests that we converge our speech to conform with our conversation partner, building social attunement through talk. But what if this convergence instead results from internal processes? That is, perhaps speech perception directly influences speech production in the moment of speaking, because perception and production share mechanisms that shape the structure of our speech in real time. This is the question that Bradshaw and colleagues (pictured below) investigated in a recent paper in Psychonomic Bulletin & Review.

Authors of the featured article “Sensorimotor learning during synchronous speech is modulated by the acoustics of the other voice.” Abigail R. Bradshaw (first), Emma D. Wheeler (second), Carolyn McGettigan (third), Daniel R. Lametti (last).

The researchers examined vocal convergence using a synchronous speech task. Saying the same thing at the same time is common in certain settings, like stadiums and places of worship, where people may adapt their speech rate, loudness, pitch, and so on to the speech they hear around them.

A stadium crowd cheering in synchrony. Source: pexels.com.

For the current study, the authors looked at perturbations in formants, the resonant frequencies of the vocal tract that give each vowel its identity. As vowel formants shift, the sound morphs into a different vowel category: for example, changing the F1 and F2 values in the word “head” can make it sound like “had” (a minimal synthesis sketch after the list below illustrates this). In this study, the researchers used 50 sentences spoken by a female speaker and manipulated the formants into two conditions:

  • Condition 1: F1 increased, F2 decreased
  • Condition 2: F1 decreased, F2 increased
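
To make the “head”/“had” example concrete, here is a minimal sketch assuming a crude source-filter model in Python with NumPy and SciPy. The formant values are rough textbook approximations, not the study’s stimuli (the study manipulated recorded sentences, not synthetic vowels).

```python
# A minimal sketch, assuming a crude source-filter model; formant values
# below are rough textbook approximations, not the study's stimuli.
import numpy as np
from scipy.signal import lfilter
from scipy.io import wavfile

FS = 16000  # sample rate (Hz)

def resonator(signal, freq, bandwidth, fs=FS):
    """Pass a signal through a two-pole resonance centered at `freq` Hz."""
    r = np.exp(-np.pi * bandwidth / fs)        # pole radius from bandwidth
    theta = 2 * np.pi * freq / fs              # pole angle from center frequency
    a = [1.0, -2 * r * np.cos(theta), r ** 2]  # all-pole filter coefficients
    return lfilter([1.0], a, signal)

def vowel(f1, f2, duration=0.5, f0=120):
    """Crude vowel: a glottal pulse train shaped by resonances at F1 and F2."""
    n = int(duration * FS)
    source = np.zeros(n)
    source[:: FS // f0] = 1.0                  # impulse train at pitch f0
    out = resonator(resonator(source, f1, 80), f2, 120)
    return (out / np.max(np.abs(out))).astype(np.float32)

# Approximate formants for "head" (/ɛ/) vs. "had" (/æ/):
wavfile.write("head.wav", FS, vowel(f1=600, f2=1900))
wavfile.write("had.wav",  FS, vowel(f1=850, f2=1700))
```

Raising F1 and lowering F2 moves the /ɛ/-like vowel toward /æ/, the same direction as the Condition 1 manipulation.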

Participants read the sentences aloud in time with one of these manipulated voices. After a baseline period, they began to hear their own speech through altered real-time feedback, with F1 increased and F2 decreased. In response to such a perturbation, speakers often adapt their speech to correct for the difference; here, adaptation would mean lowering F1 and raising F2 to offset the shift in the feedback.
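
As a toy illustration of that logic, here is a short Python sketch. The formant values and shift sizes are hypothetical, chosen for readability; the study’s actual perturbation magnitudes are reported in the paper.

```python
# Hypothetical formant values and shift sizes, for illustration only.
baseline = {"F1": 550.0, "F2": 1800.0}   # a speaker's produced formants (Hz)
perturb  = {"F1": 1.25,  "F2": 0.80}     # feedback shift: F1 up, F2 down

# What the participant hears through headphones during perturbed trials:
feedback = {f: hz * perturb[f] for f, hz in baseline.items()}

# Full adaptation would shift production the opposite way, so that the
# altered feedback lands back on the intended auditory target:
adapted = {f: hz / perturb[f] for f, hz in baseline.items()}

print(feedback)  # {'F1': 687.5, 'F2': 1440.0} -> sounds shifted off-target
print(adapted)   # {'F1': 440.0, 'F2': 2250.0} -> lower F1, higher F2
```

In practice, speakers compensate only partially, which is why the size of the adaptation effect, rather than its mere presence, is the measure of interest.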

Participants who heard the Condition 2 speaker (the “congruent” group) showed significantly more adaptation than those who heard the Condition 1 speaker (the “incongruent” group). That is, the speaker with lower F1 and higher F2 matched the direction of adaptation, so those participants adapted more to the perturbation in feedback. Conversely, when participants heard the speaker with higher F1 and lower F2, they showed a weaker adaptation effect.

Illustration of experimental manipulations in the study.

This difference in adaptation effects suggests that vocal convergence and speech motor adaptation both operate on acoustic targets—and those targets are flexible based on experience with other voices.

Bradshaw commented,

“Our findings underscore the importance of studying the phenomenon of speech motor learning ‘outside the vacuum’ of repetitive solo speech, in more naturalistic speaking contexts involving interactions with other voices (the central reason for which we speak).”

This research is an exciting step toward fine-tuning theories of speech production, and one step closer to understanding synchronous speech… something that Garth and Kat could work on.

Featured Psychonomic Society article

Bradshaw, A.R., Wheeler, E.D., McGettigan, C., & Lametti, D.R. (2024). Sensorimotor learning during synchronous speech is modulated by the acoustics of the other voice. Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-024-02536-x

Author

  • Brett Myers

    Brett Myers is an Assistant Professor in the Department of Communication Sciences and Disorders at the University of Utah. He received his doctorate from Vanderbilt University, where he studied with Duane Watson and Reyna Gordon. His research investigates planning processes during speech production, including parameters related to prosody, and their role in neural models of motor speech control.


