Face or mask? A Turing test for hyperrealistic masks

Will computers ever think like us? And if they do, how would we know? In 1950, Alan Turing proposed that computers could be considered intelligent if an observer can no longer distinguish which of two partners in a conversation is a “real” human and which is a computer. To date, no computer has passed this famous Turing test.

But that doesn’t mean that computers haven’t come a long way from their humble origins as glorified and over-sized arithmetic devices. In fact, computers have developed faster and further than most people imagined possible just a few decades ago. (Just ask Siri about the weather in Montreal right now.)

Although no computer has passed the original Turing test, there are several circumscribed task settings in which machines have reached or exceeded human levels of performance. For example, computers now routinely outperform even highly expert chess players.

A recent article in the Psychonomic Society’s journal Cognitive Research: Principles and Implications addressed a perceptual arena in which modern technology has created a whole set of new challenges: hyper-realistic face masks, which are made from flexible materials such as silicone and are designed to imitate real human faces.

Researchers Jet Gabrielle Sanders, Yoshiyuki Ueda, Sakiko Yoshikawa, and Rob Jenkins examined whether hyper-realistic face masks can pass the Turing test. And if so, under what circumstances?

In earlier work, which we blogged about here, the same research team showed that people were unable to detect when a confederate of the experimenter wore a mask in a field experiment. The figure below shows the types of masks used in that study, with a low-realism mask (left), a hyper-realistic mask (middle), and a real face (right).

Although participants in that study (obviously!) picked out the low-realism mask, fewer than half of them detected the hyper-realistic mask, even when the confederate was only 5 m away.

In the new study, Sanders and colleagues took this approach one step further and asked whether participants would be able to differentiate between photos of hyper-realistic masks and photos of actual faces. The rationale for this new study was to create a situation in which people knew that one of two photos presented to them always involved a mask, thus removing the problem of the low incidence of masks in real life. That is, whereas in the above field study the unsuspecting participants had no idea that they might be viewing a person wearing a hyper-realistic mask, in the present study the question was not if a mask was present but which of two photos involved a mask.

The procedure is outlined in the figure below. On each trial, participants were shown a pair of pictures for 500 ms (1/2 second) and then decided which of the two faces was wearing a mask. Some masks were hyper-realistic and others were low-realism masks.

An additional experimental variable was the race of the stimulus face and of the participants: half the photos were of Asian people and the other half of Caucasians. Similarly, half the participants were members of the campus community at a Japanese university and the other half were volunteers from a British campus community. The race variable was included in the experiment because of the well-known “other race effect”, which refers to the fact that people find faces of other races more difficult to discriminate than faces of their own race. It is therefore of interest to discover whether similar skill differences also emerge with hyper-realistic masks.

The data are shown in the figure below, with response time in the left panel and accuracy in the right panel.

Considering response times first, people clearly took longer to make the decision for hyper-realistic masks than for their low-realism counterparts. People also took longer to make the decision for faces of a race other than their own (i.e., the Japanese participants judging Caucasian stimuli, and the British participants judging Asian faces).

A very similar pattern arose in the accuracy data: people had more difficulty with the hyper-realistic masks than the low-realism masks, and the judgments were more accurate for participants’ own race than the other race. (For the low-realism mask there was a ceiling effect because the decision was so easy, and hence no differences emerged between own- and other-race stimuli.)

The most important aspect of these results is the high error rate with hyper-realistic masks (33%). In a second experiment that was identical to the first one except that the stimuli now stayed on until participants responded, the hyper-realistic error rate was still substantial, but reduced to 20%.

Although performance was significantly above chance, it must be recalled that these data were observed under ideal conditions: participants knew about the masks, they only had two photos to compare, and in one of the experiments they had as much time as they needed.
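To make "above chance" concrete: in a two-photo choice, pure guessing picks the mask half the time, so the reported error rates of 33% and 20% correspond to accuracies of 67% and 80%. The sketch below converts the error rates from the post into accuracies and asks how likely such a score would be under guessing alone. The trial count is a hypothetical number chosen for illustration (the post does not report it), and the function names are made up here, so this is not the analysis Sanders and colleagues ran.

```python
from math import comb

CHANCE = 0.5  # two-photo choice: a pure guesser is right half the time


def accuracy_from_error(error_rate):
    """Convert an error rate into a proportion correct."""
    return 1.0 - error_rate


def binomial_tail(successes, trials, p=CHANCE):
    """Probability of at least `successes` correct out of `trials` when guessing."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Error rates for hyper-realistic masks, as reported in the post.
error_rates = {"500 ms exposure": 0.33, "unlimited exposure": 0.20}

# ASSUMPTION: hypothetical trial count for illustration only; not from the study.
HYPOTHETICAL_TRIALS = 100

for label, err in error_rates.items():
    acc = accuracy_from_error(err)
    correct = round(acc * HYPOTHETICAL_TRIALS)
    p_guess = binomial_tail(correct, HYPOTHETICAL_TRIALS)
    print(
        f"{label}: accuracy {acc:.0%}, "
        f"P(at least {correct}/{HYPOTHETICAL_TRIALS} by guessing) = {p_guess:.2g}"
    )
```

With any reasonable number of trials, both accuracies sit well clear of the 50% guessing baseline, yet a one-in-three or one-in-five miss rate is still large for a task in which viewers know a mask is present.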

When conditions are less than ideal, hyper-realistic masks may be more problematic. To illustrate, consider the case of an elderly white man who boarded an Air Canada flight from Hong Kong to Vancouver in 2010. During the flight, the “elderly white man” visited the bathroom and, having removed his mask, emerged an Asian man in his early 20s. The mask had gone unnoticed through multiple pre-boarding check-points.

Accordingly, Sanders and colleagues conclude:

“Our findings suggest that synthetic faces are at the point where they can fool viewers frequently. We see no reason to expect this imitation technology to stop improving. People are rightly wary of photorealistic images because they know that images can be manipulated. We may be entering a time where the same concerns apply to facial appearance in the real world.”

Psychonomics article highlighted in this post:

Sanders, J. G., Ueda, Y., Yoshikawa, S., & Jenkins, R. (2019). More human than human: a Turing test for photographed faces. Cognitive Research: Principles and Implications. DOI: 10.1186/s41235-019-0197-9.

Author

Stephan Lewandowsky

Stephan Lewandowsky's research examines memory, decision making, and knowledge structures, with a particular emphasis on how people update information in memory. He has also contributed nearly 50 opinion pieces to the global media on issues related to climate change "skepticism" and the coverage of science in the media.