Gaming the 1-back reinforcement task: Birds vs humans

Sports have been part of my life since I was a kid, with team sports like basketball, volleyball, and softball being some of my favorites. When I no longer played competitively, I stayed with my favorite sports by refereeing – youth, high school, collegiate, and even a brief foray into professional leagues. And when my kids came along, they too gravitated to sports, both individual and team. Interestingly, neither kid has so far wished to pursue the difficulty of being a referee, but that’s a story for another time!

Rules Rule!

Rules for any type of game may be learned explicitly or implicitly. Explicit rules are explained and can be verbalized back – usually. Implicit rules are learned as you play, and sometimes they can be explained and sometimes they can’t.

When I refereed youth basketball, kids eventually learned that they couldn’t run with the ball, because each time they did, a whistle blew and the ball went to the other team. When kids are first learning basketball, most have a hard time remembering all the rules and can only tell you that they did something wrong, but maybe not why it was wrong. At that stage, the kids were learning to play basketball implicitly rather than explicitly. As kids mature, they eventually learn the rules of the game and can explain them to others, thus demonstrating explicit learning.

Children playing basketball. — *Image of post author’s child learning to play basketball. Photo taken by H. Manitzas Hill.*

Animal Games

Humans aren’t the only animals to learn about games and sports. B. F. Skinner taught pigeons to play ping pong with positive reinforcement, as shown in the cartoon image below. Each time the pigeon pecked at the ball and pushed it across the court, the bird was rewarded. This form of learning, operant conditioning, is a type of associative learning. It occurs very gradually through reinforcement and is often considered a form of implicit learning.

Cartoon of pigeons playing ping pong — *Picture of ping pong playing pigeons and B.F. Skinner, guru of all things operant conditioning. Image from* *Skinner, 1962*.

Marine mammals, such as belugas and dolphins, spontaneously engage in games that are also acquired through implicit learning – most likely. I was an observer of such an experience. One beluga mother and her calf created a water spit game in which the mother spit water up into the air and her calf caught it. However, when the calf spat the water back, the mother did not catch it. The calf had to learn implicitly that while his mother was not going to catch his water spit, he could continue the game as long as he caught his mother’s water spit!

Two belugas spouting water to each other. — *Picture of beluga mother-calf water spit game. Picture taken by H. Manitzas Hill.*

Learning to Learn

Learning experts have long investigated the boundary between implicit learning and explicit learning, with the only certainty being that a clear boundary is hard to identify. Working with both human and non-human animals, comparative psychologists can begin to unravel shared abilities based on performance in a comparable task.

Two kinds of tasks have been developed to tease apart the types of learning used by humans and animals: a discrimination along a continuum in which an uncertain response option is available, and a rule-based category-learning task contrasted with an information-integration task.

In a continuum discrimination task, an individual may have to decide if a stimulus is a circle or an oval, much like the dogs in Pavlov’s studies of experimental neurosis, in which the stimuli became too similar to tell apart. Research with pigeons and primates has suggested that when an uncertain response is available, the task draws on explicit learning, that is, learning under conscious control. However, if reinforcement contingencies are provided, the task can become an implicit learning task (much like Skinner’s ping pong playing pigeons).

The second set of tasks used to determine if implicit or explicit learning is occurring contrasts two kinds of category learning. In a rule-based category-learning task, participants learn a single rule about one feature of a stimulus (e.g., choose the red stimulus or choose the smaller stimulus). Humans typically learn this task quickly and explicitly. In the contrasting information-integration task, participants must combine two features (e.g., color and size) to make the correct choice. Humans can learn this task, but it takes more time and appears to involve implicit learning.

Pigeons Take an Unexpected Turn

Curiously, when pigeons were tested on these tasks, a conundrum emerged: the pigeons learned all the tasks slowly and presumably through associations or implicit learning. To unravel this new comparative puzzle, researchers tested a new procedure that was thought to disrupt the implicit learning or associative processing of reinforcement contingencies. This new task, the 1-back reinforcement task, was proposed to rely solely on explicit learning.

In the 1-back reinforcement task, a trial is conducted, but a correct choice is not reinforced immediately. Rather, another trial is conducted, and the reinforcement earned on the first trial is delivered after the second trial has been completed, whether the second choice was right or wrong. This iterative process continues throughout the session. Pigeons were able to learn this task steadily but at a somewhat low level of accuracy, unlike monkeys, which were able to learn the task with greater accuracy.
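For readers who like to see a schedule written out, here is a minimal Python sketch of how 1-back feedback lines up with trials. It is only an illustration of the description above, not the procedure used in any of the studies. The function name `one_back_feedback` is made up for this example, and the choice to deliver no outcome-based feedback after the very first trial is an assumption.

```python
from typing import List, Optional


def one_back_feedback(correct: List[bool]) -> List[Optional[bool]]:
    """Return the feedback delivered after each trial under a 1-back schedule.

    correct[i] says whether the choice on trial i was correct.
    The feedback after trial i reflects trial i - 1, not trial i.
    """
    feedback: List[Optional[bool]] = []
    # Assumption: there is no earlier trial to reinforce after trial 1,
    # so the first entry is None (no outcome-based feedback).
    previous: Optional[bool] = None
    for outcome in correct:
        feedback.append(previous)  # deliver what the previous trial earned
        previous = outcome         # hold this trial's outcome for the next one
    return feedback


# Four trials: right, wrong, wrong, right
print(one_back_feedback([True, False, False, True]))
# [None, True, False, False]
```

Notice that the wrong choice on the second trial is followed by a reward, and the right choice on the fourth trial is followed by none. A learner who credits feedback to the trial just completed would be misled on both.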

The Human Litmus Test

In the study summarized here and published in the Psychonomic Society’s Learning & Behavior journal, Tom Zentall, Peyton Mueller, and Daniel Peng tested humans on the 1-back reinforcement task to compare human performance with pigeon performance while expanding the task to include different types of instructions. The instructions were meant to prime participants in different ways, to determine whether any one type of instruction would lead them to learn the reinforcement strategy.

Over 100 undergraduate students participated in a symbolic match-to-sample task on a computer. In the basic task, participants who saw a yellow circle were supposed to select the red circle, while participants who saw a blue circle were supposed to select the green circle. Of course, the directions did not include this information. They had to determine it organically from their experiences.

However, the difficulty in the task was the 1-back reinforcement contingency. That is, the feedback in the symbolic matching task was delayed by one trial, as in the pigeon study. The participants had to figure out the rule to be successful and gain points.

Interestingly, the students were fairly successful at the task (unlike the pigeons, which fell below the 70% accuracy criterion) and could identify which color matched which, but they did not realize that the feedback was delayed by a trial. Moreover, the directions to solve the task intuitively or consciously did not provide any clues, and all groups performed similarly, as shown in the figure below. As the authors noted, the human participants learned the matching rule explicitly but completely missed the 1-back reinforcement rule, which the pigeons may have picked up.

Plot from paper — *Performance on 1-back reinforcement symbolic matching task based on instructions given at the beginning of the task.*

The Final Score?

Zentall and co-authors argued that the levels of accuracy on this task (high or otherwise) cannot discriminate between implicit and explicit learning for either pigeons or humans. And, while the humans performed at a higher level of accuracy than the pigeons, the pigeons may have actually learned the 1-back reinforcement rule, whereas the humans did not.

It seems, then, that the pigeons and humans are tied in their score on the 1-back reinforcement task, suggesting that this procedure may not be the critical test in discriminating between implicit vs explicit learning. Thus, a new game must be invented to assess this question, and in the words of all great referees, it is time to “Play On”.

Featured Psychonomic Society article

Zentall, T. R., Mueller, P. M., & Peng, D. N. (2023). 1-Back reinforcement symbolic-matching by humans: How do they learn it? Learning & Behavior. https://doi.org/10.3758/s13420-022-00558-w