You heard that right: accent judgment but not accent perception is influenced by expectations

Everyone “has an accent,” even if you think you don’t. Most likely, your accent is shaped both by social factors, such as your cultural identity and socioeconomic status, and by more cognitive processes, like emulating another person’s style in a conversation.

Accents are such a strong indicator of social factors that they become associated with stereotypes, from the seemingly innocuous (this speaker comes from money) to judgments that actively lead to discrimination (this person is not very intelligent).

When we say that someone has an accent, we typically mean that we have a hard time understanding them, potentially because they do not come from the same background as we do. Our beliefs about how someone will sound can affect our judgments of their speech in what is called reverse linguistic stereotyping.

In an increasingly globalized world, we hear a greater variety of accents today than ever before. If we think that someone is hard to understand, that belief might affect our ability to learn from them. In this post I will focus on one line of research that examines how the judgment and perception of accents can affect learning in the laboratory and in the classroom.

It is important to distinguish between the judgment of an accent and the perception of an accent, or more broadly between offline and online processing. Colloquially, what we call “perceiving” a person’s accent is really a judgment: we ultimately generate a label, based partly on what we know about the speaker and partly on the actual speech signal. For the purposes of this article, perception occurs online, as the speech unfolds, and comes before any judgment is made.

One classic effect demonstrates that when participants listen to recordings of American English, they rate the speech as more foreign-accented if the recording is accompanied by a static picture of an Asian face.

This experimental paradigm is, however, rife with task demands: it is easy to guess what the experimenter is testing in an accent judgment task that features two different ethnicities. In addition, static pictures provide only weak visual cues, and visual cues are known to be a major component of language processing:

[youtube https://www.youtube.com/watch?v=G-lN8vWm3m0]

A recent study by Yi Zheng and Arthur Samuel, published in the Psychonomic Society journal Attention, Perception, & Psychophysics, therefore looked at whether video, rather than a static picture, influences these accent judgments, and whether perception, or only judgment, is affected by a speaker’s ethnicity.

Zheng and Samuel extended the findings from the static-picture paradigm across six experiments. They first replicated the original effect, then added extensions to the paradigm, and finally included a task that explicitly tested online perception. In doing this, Zheng and Samuel sought to tease apart whether accent perception itself changed, or only the judgments.

In their first experiment, Zheng and Samuel sought a more rigorous test of how the mode of presentation affects these accent-judgment biases. First, they created recordings of single words that were blends of American English and heavily accented English produced by a native speaker of Mandarin Chinese.

They varied the proportion of American English to accented English in each recording to create eight levels of accentedness. This is similar to approaches in classic speech perception studies, where the stimuli are often manipulated recordings of syllables that combine “ba” and “pa” to varying extents. You can experience this continuum for yourself at this page run by the University of Calgary, which demonstrates how a sound like “ba” can gradually morph into “pa.” This familiar technique allowed Zheng and Samuel to relate any results they found in the perception of accents to prior work on how expectations and context affect speech perception.
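To give a concrete feel for how such a continuum can be constructed, here is a minimal Python sketch that simply cross-fades two time-aligned recordings of the same word in eight steps. This is a crude stand-in for the more sophisticated speech-morphing used in the actual study, and the file names are hypothetical.

```python
# Illustrative sketch only: a naive 8-step continuum made by linearly
# cross-fading two time-aligned recordings of the same word.
# The stimuli in Zheng & Samuel (2017) were built with more sophisticated
# morphing; the file names below are hypothetical placeholders.
import numpy as np
import soundfile as sf  # pip install soundfile

native, sr = sf.read("native_word.wav")        # hypothetical recording
accented, sr2 = sf.read("accented_word.wav")   # hypothetical recording
assert sr == sr2, "recordings must share a sampling rate"

# Trim to a common length so the two signals can be mixed sample by sample.
n = min(len(native), len(accented))
native, accented = native[:n], accented[:n]

# Eight mixing proportions from fully native (0.0) to fully accented (1.0).
for step, w in enumerate(np.linspace(0.0, 1.0, 8), start=1):
    blend = (1 - w) * native + w * accented
    sf.write(f"continuum_step_{step}.wav", blend, sr)
```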

Second, instead of static pictures, the recordings were dubbed onto videos of two speakers, one Asian and one Caucasian. In the first two experiments of their article, Zheng and Samuel replicated the classic finding that a photograph of an Asian face leads to judgments of greater accentedness. Once video was introduced into the mix, however, the tendency of participants to rate the Asian speaker as more accented disappeared. In fact, it appears possible to completely eliminate the bias toward labeling Asian speakers as accented, suggesting that judgments of accented speech depend on the context and on how much additional information is available.

The final set of experiments of the paper brought the question back to perception, by drawing an analogy between accent perception and other speech perception tasks.

In experiments that test a continuum from “ba” to “pa”, there is an important but intuitive effect of repetition on behavior. When participants hear “ba” over and over, they become less likely to report hearing “ba” for ambiguous sounds from the continuum. This phenomenon is known as selective adaptation: effectively, a participant’s definitions of “ba” and “pa” shift in response to repeated exposure to one of the sounds. Given that people can become accustomed to a speaker’s accent, Zheng and Samuel extended this paradigm to the accent-perception domain, for audio recordings as well as for videos.
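To make the idea of a shifting category boundary concrete, here is a toy simulation, not a model from the paper: labeling along an eight-step continuum is described by a logistic function, and adaptation is modeled as nothing more than a shift of that function’s boundary.

```python
# Toy illustration (not from the paper): selective adaptation as a shift in
# the category boundary along a "ba"-"pa" continuum.
import numpy as np

def p_report_pa(step, boundary, slope=1.5):
    """Probability of labeling a continuum step as "pa" given a category boundary."""
    return 1.0 / (1.0 + np.exp(-slope * (step - boundary)))

steps = np.arange(1, 9)                  # 8-step continuum: 1 = clear "ba", 8 = clear "pa"
baseline = p_report_pa(steps, boundary=4.5)
# After hearing "ba" repeatedly, the boundary moves toward the "ba" end,
# so fewer steps are heard as "ba" (i.e., P("pa") rises for ambiguous items).
after_ba_adaptation = p_report_pa(steps, boundary=3.5)

for s, b, a in zip(steps, baseline, after_ba_adaptation):
    print(f"step {s}: baseline P(pa) = {b:.2f}, after 'ba' adaptation P(pa) = {a:.2f}")
```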

In the final pair of experiments, participants came into the lab on two days. During each session, participants first completed an identification task in which they rated, on a scale from 1 to 4, the accentedness of eight forms of the same word that varied in how native the speaker sounded.

Participants heard each of the eight forms 20 times, in random order. Then the task switched to the same judgment interspersed with an adaptation “listening” phase consisting of 14 cycles. In each cycle, participants heard a single form of the word repeated 30 times, and then rated all eight forms, presented in random order, for how accented each sounded.

On the first day, the adaptation phase contained either only heavily accented forms of the word or only the native pronunciations; on the second day, participants heard whichever form they had not heard before.
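As a rough outline of that procedure, the sketch below lays out the trial structure in Python. The playback and rating functions are hypothetical placeholders, and counterbalancing details are simplified; it is meant only to show the shape of the design described above.

```python
# Sketch of the two-day identification + adaptation design (illustrative only).
import random

CONTINUUM = list(range(1, 9))  # the 8 forms of the word: 1 = most native ... 8 = most accented

def play_word(form):
    """Placeholder for presenting continuum step `form` (audio or dubbed video)."""
    return form

def get_rating(stimulus, scale=(1, 4)):
    """Placeholder for collecting an accentedness rating on a 1-4 scale."""
    return random.randint(*scale)

def identification_block(n_repeats=20):
    """Baseline: each of the eight forms is rated 20 times, in random order."""
    trials = CONTINUUM * n_repeats
    random.shuffle(trials)
    return [(form, get_rating(play_word(form))) for form in trials]

def adaptation_block(adaptor, n_cycles=14, n_repetitions=30):
    """Each cycle: 30 repetitions of the adaptor, then all eight forms are rated."""
    results = []
    for _ in range(n_cycles):
        for _ in range(n_repetitions):
            play_word(adaptor)
        for form in random.sample(CONTINUUM, len(CONTINUUM)):
            results.append((form, get_rating(play_word(form))))
    return results

# Day 1: adapt to one end of the continuum; Day 2: the opposite end.
day1 = (identification_block(), adaptation_block(adaptor=8))  # heavily accented adaptor
day2 = (identification_block(), adaptation_block(adaptor=1))  # native adaptor
```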

The results were striking but straightforward. After repeated exposure to native pronunciations, foreign-accented speech was perceived as substantially more accented. Conversely, after repeated exposure to a foreign accent, foreign-accented speech was perceived as less accented. In other words, with enough experience with foreign accents, we judge speakers as having less of an accent than we initially would have.

To get at the question of whether knowledge of a speaker’s ethnicity affects the actual perception of speech, Zheng and Samuel then repeated the experiment, but with dubbed videos of the two speakers producing an ambiguous blend of foreign- and native-accented speech during the adaptation phase. In this experiment, all the clear examples of native and non-native speech were removed, so the speech signal itself provided no clue as to whether the speaker was a native or non-native speaker of English. Only the videos, which suggested each speaker’s linguistic background, differed; they were dubbed onto the same ambiguous recordings. Remarkably, but in line with the earlier experiments, participants’ definition of “no accent” did not shift after exposure to either speaker, suggesting that listeners’ perception of the words the two speakers produced did not differ across contexts.

Altogether, this paper sheds light on the non-perceptual origins of accent bias and presents an interesting path forward for educational contexts as well as social policy. Since perception itself seems to be unaffected by the ethnicity of the speaker, reducing the discrimination that non-native speakers face requires directly targeting deeply held beliefs about people of other ethnicities.

Building programs to combat these kinds of biases, which arise independently of perception, will be an important component of eliminating judgments made on the basis of race and ethnicity.

Featured Psychonomic Society article:                                      

Zheng, Y., & Samuel, A. G. (2017). Does seeing an Asian face make speech sound more accented? Attention, Perception, & Psychophysics. DOI: 10.3758/s13414-017-1329-2.

Author

  • Cassandra Jacobs is a graduate student in Psychology at the University of Illinois. Before this, she was a student of linguistics, psychology, and French at the University of Texas, where she worked under Zenzi Griffin and Colin Bannard. Currently she is applying machine learning methods from computer science to understand human language processing under the direction of Gary Dell.


