Bayes prevails in implicit learning categorization and beyond

Researchers have long argued over two leading statistical approaches: Bayesian analysis and the Frequentist approach. Each side has its own complex (and convincing) reasoning, but well-meaning researchers can all agree on the goal of their analyses: reaching conclusions with the least amount of bias and error. The war between Bayesians and Frequentists is likely far from over, but there is some compelling recent evidence for Bayes as it applies to categorization techniques (i.e., grouping participants based on their performance on a given task).

Cartoon Bayesians vs. Frequentists. Source credit: Borbála Tölgyesi

The evidence is laid out in an article published by Mateo Leganes-Fonteneau, Ryan Scott, Theodora Duka, and Zoltan Dienes (pictured below) in the Psychonomic Society journal Psychonomic Bulletin & Review. The authors provide a solution to the fundamental problem introduced by the Frequentist approach in implicit learning research: measurement errors that arise in the classification of participants as aware or unaware of their cognitive processes. The authors set out to demonstrate how these errors in standard categorization methodology can be reduced.

Authors of the featured article. From left to right: Mateo Leganes-Fonteneau, Ryan Scott, Theodora Duka, Zoltan Dienes.

When studying implicit learning, or learning that happens beneath one’s awareness, participants are first given a conditioning task. In this task, they learn to associate a particular stimulus (e.g., an image of a house) with either a reward (e.g., money) or an aversive outcome (e.g., an unpleasantly loud noise). On a subset of trials, participants respond using a 1-9 scale assessing their perceived likelihood of receiving the reward or aversive outcome (this is the ‘expectancy rating’) after being presented with a stimulus. An example of this task using a reward outcome is shown below.

The conditioning task. Stimuli are associated with either a high reward (80% probability of winning) or a low reward (20% probability of winning). After participants respond to the stimulus, they are then asked how likely they are to win.

 

Unlike other implicit learning research, the authors included a second scale to assess the participants’ confidence in their awareness. By combining this metacognitive assessment with the expectancy ratings, the authors gathered more sensitive evidence for awareness (or lack thereof).

Following the conditioning task, participants completed another task in which attentional responses toward the conditioned stimuli were measured. The authors found that, even in people who were unaware of the relationship between a stimulus and its reward (as shown by the results from the conditioning task), conditioned stimuli generated more distraction, providing evidence of implicit learning.

Using the measurements gathered from both tasks, other researchers have used the frequentist approach (primarily by conducting t-tests) to classify participants into aware and unaware groups by analyzing expectancy ratings only. Participants classified as aware have mean expectancy ratings that are significantly above 5 for the high-reward stimulus and below 5 for the low-reward one (on the 1-9 scale). That is, they perform significantly above chance level in determining if they will receive the reward or not. Participants are classified as unaware if their scores either do not significantly differ from or are below chance level.

This sort of categorization may be problematic. Traditional frequentist approaches do not allow for extracting sensitive information from non-significant results. This means that when someone is categorized as unaware in this way, they may actually be aware.

A problem that arises when categorizing participants using the frequentist approach is regression to the mean. Regression to the mean (RTM) arises from random errors in measurement (the categorization technique, in this case): extreme values obtained during categorization analysis are likely to be extreme due to some random error, and when analyzed again, may be less extreme or extreme in the opposing direction. After many iterations, the value will begin to reflect its true state (the mean) due to some cancellation of these random measurement errors.
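The regression to the mean effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' analysis: the function name, the standard-normal "true" scores, the noise level, and the selection cut-off are all hypothetical. Each simulated person is measured twice with independent random error, and people are selected because their first score was extremely low.

```python
import random
import statistics

def simulate_regression_to_mean(n_people=2000, noise_sd=1.0, cutoff=-1.0, seed=0):
    """Select people with extremely low first scores, then re-measure them.

    All parameters are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    # Stable underlying score for each person.
    true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    # Two measurements of the same person, each with independent random error.
    first = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    second = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    # Keep only people whose FIRST measurement was extreme.
    selected = [i for i in range(n_people) if first[i] < cutoff]
    mean_first = statistics.mean(first[i] for i in selected)
    mean_second = statistics.mean(second[i] for i in selected)
    return mean_first, mean_second

mean_first, mean_second = simulate_regression_to_mean()
print(f"selected group, first measurement:  {mean_first:.2f}")
print(f"selected group, second measurement: {mean_second:.2f}")
```

The second measurement of the selected group sits closer to the overall mean of 0 than the first, even though nothing about the people changed. Part of what made their first scores extreme was random error, and that error does not repeat.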

To combat regression to the mean effects, the authors provide an alternative method of categorization by using Bayes factors to classify participants into three distinct groups: aware, unaware, and insensitive. Using signal detection theory, d’ scores were obtained from the expectancy and confidence ratings to create these Bayes factors. This is a rare approach—few, if any, implicit learning researchers have used Bayesian analysis to categorize their participants.
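For readers unfamiliar with signal detection theory, d’ is the distance between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below is a generic textbook computation, not necessarily how the authors derived their scores from the expectancy and confidence ratings. The function name, the four response counts, and the log-linear correction (adding 0.5 to each count so that rates of exactly 0 or 1 do not produce infinite values) are assumptions made for illustration.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Generic d' from response counts (hypothetical helper)."""
    # Log-linear correction keeps both rates strictly between 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Made-up counts: equal hit and false alarm rates give d' = 0 (chance).
print(d_prime(10, 10, 10, 10))
# Made-up counts: more hits than false alarms give d' > 0.
print(d_prime(18, 2, 3, 17))
```

A d’ of 0 means responses do not discriminate between the two kinds of trial at all, which is what chance-level awareness would look like.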

Using Bayes factors, as opposed to t-tests, allows one to identify data that provide only inconclusive evidence for either category: the insensitive group. This avoids significant error in the classification. The authors coined this process the Bayesian Awareness Categorization Technique (BACT).
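The three-way logic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation of BACT. It assumes each participant's awareness is summarized by a mean score and its standard error, that chance corresponds to 0, and that the alternative hypothesis is modeled as a half-normal distribution with a scale chosen by the analyst. The function names, parameter names, example numbers, and the threshold of 3 used in the example are all hypothetical.

```python
from statistics import NormalDist

def bayes_factor_half_normal(obs_mean, se, h1_sd, steps=4000):
    """Bayes factor for H1 (effect > 0, half-normal prior) over H0 (effect = 0)."""
    data = NormalDist(obs_mean, se)   # likelihood of the data as a function of theta
    prior = NormalDist(0.0, h1_sd)    # folded at 0 to form the half-normal prior
    upper = 6.0 * h1_sd               # the prior is negligible beyond this point
    width = upper / steps
    marginal_h1 = 0.0
    for k in range(steps):            # midpoint-rule numerical integration
        theta = (k + 0.5) * width
        marginal_h1 += data.pdf(theta) * 2.0 * prior.pdf(theta) * width
    return marginal_h1 / data.pdf(0.0)

def categorize(bf, threshold):
    """Three-way classification from a Bayes factor and a symmetric threshold."""
    if bf >= threshold:
        return "aware"        # sensitive evidence for above-chance awareness
    if bf <= 1.0 / threshold:
        return "unaware"      # sensitive evidence for chance-level awareness
    return "insensitive"      # the data do not discriminate between the two

# Made-up scores for three hypothetical participants.
for score in (0.0, 0.15, 0.5):
    bf = bayes_factor_half_normal(score, se=0.1, h1_sd=0.5)
    print(score, round(bf, 2), categorize(bf, threshold=3.0))
```

The key design point is the middle branch. A non-significant t-test cannot distinguish "evidence for chance" from "not enough evidence", whereas a Bayes factor between the two thresholds flags the second case explicitly.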

Here’s the important point: in the standard t-test classification, participants classified as insensitive using Bayes factors would instead be incorrectly classified as unaware, compromising researchers’ conclusions. The image below shows participant awareness scores and their corresponding categorization obtained after completing an attentional task. The x points correspond to the insensitive group, which is not interpreted as showing implicit learning in the Bayesian approach but would be in the Frequentist approach.

Performance scores on a flanker task and awareness score for each group according to the Bayesian categorization.

 

To provide evidence that Bayesian categorization reduced regression to the mean effects, the authors compared the two approaches using a resampling procedure. This involved randomly taking half of the data points for each participant (the X-half), running the categorization technique on them to obtain an awareness category, and then using that category to label the remainder of the data (the Y-half). This was done 1000 times for each participant, and the resulting X-scores and Y-scores were compared to assess whether regression to the mean occurred for the Y-halves.
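The split-half procedure described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name, the coding of trials as 1 (correct) or 0 (incorrect), and the toy categorizer that labels any X-half at or below 50% correct as unaware are all hypothetical. The toy rule plays the role of a categorization technique that is vulnerable to regression to the mean.

```python
import random
import statistics

def split_half_resample(trials, categorize, n_iter=1000, seed=0):
    """Repeatedly split one participant's trials into X and Y halves.

    The category is decided from the X-half only and then applied to the
    Y-half. Returns {category: [(x_mean, y_mean), ...]}.
    """
    rng = random.Random(seed)
    results = {}
    for _ in range(n_iter):
        shuffled = trials[:]          # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        x_half, y_half = shuffled[:half], shuffled[half:]
        label = categorize(x_half)    # category comes from the X-half alone
        results.setdefault(label, []).append(
            (statistics.mean(x_half), statistics.mean(y_half))
        )
    return results

# Made-up participant who is correct on 12 of 20 trials (60%).
trials = [1] * 12 + [0] * 8

def toy_rule(x_half):
    return "unaware" if statistics.mean(x_half) <= 0.5 else "aware"

results = split_half_resample(trials, toy_rule)
x_means = [x for x, _ in results["unaware"]]
y_means = [y for _, y in results["unaware"]]
print("unaware iterations:", len(x_means))
print("mean of X-halves:", round(statistics.mean(x_means), 3))
print("mean of Y-halves:", round(statistics.mean(y_means), 3))
```

In the iterations labeled unaware, the X-halves sit at or below chance by construction, while the matching Y-halves sit above it. That gap is the regression to the mean signature the authors looked for when comparing the two techniques.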

The image below shows the result using the classic t-test categorization technique (on the right) versus the Bayesian one (on the left).

On the left: density plots of the resampling procedure on awareness scores for X and Y halves on Unaware (a) and Insensitive (b) groups using BACT. Y-halves for the unaware group remained below chance level, while they were above chance level for the insensitive group. On the right: density plots for X and Y halves on Unaware (a) and Aware (b) groups using t-tests. Y-halves for the unaware group show regression to the mean effects, where Y-halves were above chance level.

 

The Y-halves for the unaware group regress above chance level (0.5) when using Frequentist categorization, but do not surpass chance level for the Bayesian unaware group. This shows that using Bayes factors allows one to be confident that participants categorized as unaware are labeled as such not because of measurement error, but because there is sensitive evidence that they are unaware.

While I’m convinced that using Bayes factors could reduce error in measurement, how stringent should the Bayes factor cut-off be? The authors repeated the resampling analysis to further evaluate the regression to the mean effects that could result from altering the stringency of the categorization boundaries (see image below).

Density plots for X and Y halves in Unaware iterations using the BACT at varying levels of significance. This shows that with more stringent cut-offs, the scores on X-halves are more extreme and separated from Y-halves. More liberal cut-offs produce less regression to the mean.

 

Note that the Y-half did not regress to above-chance levels when using a 1/5 – 5 Bayes factor boundary. According to the authors, this is the most stringent cut-off that can be used while still avoiding regression to the mean effects. Using this boundary allows researchers to be confident that their categorization of participants reflects the true value while still implementing strict analyses.

Although this article focuses on categorization techniques within implicit learning research, the methodologies it advocates can be used in other domains. For example, errors in classifying clinical data could have detrimental effects on how we interpret laboratory findings and therefore on how we diagnose, treat, and assess outcomes of patients. Adopting Bayesian analyses in categorization allows researchers to be more confident in their interpretation of data because, as researchers know, our scientific advancement is bound by the validity of our statistical approaches.

Featured Psychonomic Society article

Leganes-Fonteneau, M., Scott, R., Duka, T., & Dienes, Z. (2021). Avoiding pitfalls: Bayes factors can be a reliable tool for post hoc data selection in implicit learning. Psychonomic Bulletin & Review, 28, 1848–1859. https://doi.org/10.3758/s13423-021-01901-4

