You must tug that thread: why treating preregistration as a gold standard might incentivize poor behavior

It is too early to know whether the recent period of methodological introspection in psychological science, and the sciences in general, will lead to positive changes in practices. As with any revolution, there is the potential for moving backward. One problem that has been consistently acknowledged is the problem of incentives: what gets one attention and advancement is not necessarily good for science. These incentives are just as real for reformers as they are for everyone else, and so we must be careful. Under these circumstances, even a well-thought-out reform might backfire if it is oversold.

I have seen this with Bayesian statistics. As discomfort with existing methods spreads, some authors have included Bayesian analyses simply to differentiate their manuscripts from others. Ill-considered Bayesian analyses that add nothing to a manuscript actually make it less transparent and introduce more potential for statistical errors. Part of the problem is that good statistical practice is not incentivised; any reform that does not address this core problem will become part of the problem rather than part of the solution.

I have fears about another reform: preregistration. I believe preregistration is an important tool for scientists to increase transparency, clarify thinking, and make science more systematic. These are important goals, and when used in the right context, preregistration—and its big brother, the Registered Report—has the potential to help improve science. The current incentive structure, however, might cause problems rather than solve them. I will focus here on some implications for preregistration of statistical analyses, which has been offered as a solution for p-hacking and other opportunistic practices. My post therefore connects well to yesterday’s contribution by Klaus Oberauer.

Opportunistic practices are real and they have multiple causes, including incentives to publish and human cognitive biases. Let’s consider what results preregistered analyses might produce in such an environment. First, imagine a system in which preregistered analyses are trusted more, and so people begin to preregister their analyses to help them publish. They may have a weak understanding of statistics and typically perform the same (parametric) statistical test on every data set. They might preregister this test, along with whatever assumption checks SPSS returns by default.

Preregistration prevents them from, for instance, transforming the data in multiple ways to obtain a significant result—at least, they cannot do this without arousing suspicion (as long as someone reads the preregistration document). That is, it prevents them from doing what bad scientists do. This can be a good thing. But what would a good scientist do? A bad scientist opportunistically looks for reasons to trust their own hypothesis. A good scientist does the opposite: she opportunistically looks for reasons to doubt herself, including her analysis and hypothesis. She checks for robustness, both in ways she might have thought of beforehand (and potentially preregistered) and in ways she might not have.

A bad scientist illicitly mines “truth” from their data; a good scientist mines doubt.

An advocate of preregistration will note that preregistration does not prevent checking for robustness. So-called “exploratory” analyses are allowed, and so a good scientist is not prevented from due diligence. But the reason why preregistration is being suggested now is to prevent bad practices. If good statistical practices were properly incentivised, we would not have a methodological crisis in the first place. The scientist who was willing to hide the fact that they performed multiple analyses and picked the best, or who opportunistically removed “outliers”, will also be willing to ignore clues that their data is not as high-quality as they thought, or that their analysis is mis-specified. Why would they tug on that thread, and watch their entire paper unravel?

Alternative approaches to preregistration exist. One that I think better emulates what good scientists do is multiverse analysis (Steegen, Tuerlinckx, Gelman, & Vanpaemel, 2015). A multiverse analysis formalizes what a good data analyst would do: approaching a data analysis problem in many ways and getting a sense of the robustness of the result. It is one way of looking for reasons to doubt your conclusions, even when a single test might seem to confirm your hypothesis. (Yesterday’s post also touched on this.)
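To make the idea concrete, here is a minimal sketch in Python—not the procedure from Steegen and colleagues, just an illustration on simulated, hypothetical data with arbitrary analysis choices. The same group comparison is run under several defensible combinations of outlier rule, transformation, and test, and one then looks at how much the result moves around.

```python
# A toy multiverse: run one group comparison under several defensible
# combinations of outlier rule, transformation, and test, then look at
# how much the p-value moves around. All data and choices are hypothetical.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.lognormal(mean=0.0, sigma=0.6, size=40)  # placeholder data
group_b = rng.lognormal(mean=0.3, sigma=0.6, size=40)

def drop_outliers(x, rule):
    """Remove observations further than k standard deviations from the mean."""
    if rule == "none":
        return x
    k = {"2sd": 2.0, "3sd": 3.0}[rule]
    return x[np.abs(x - x.mean()) < k * x.std()]

transforms = {"raw": lambda x: x, "log": np.log}
tests = {
    "student_t": lambda a, b: stats.ttest_ind(a, b).pvalue,
    "welch_t": lambda a, b: stats.ttest_ind(a, b, equal_var=False).pvalue,
}

results = []
for rule, tr_name, test_name in itertools.product(["none", "2sd", "3sd"],
                                                   transforms, tests):
    a = transforms[tr_name](drop_outliers(group_a, rule))
    b = transforms[tr_name](drop_outliers(group_b, rule))
    results.append(((rule, tr_name, test_name), tests[test_name](a, b)))

for spec, p in results:
    print(spec, f"p = {p:.3f}")
pvals = [p for _, p in results]
print(f"p-values range from {min(pvals):.3f} to {max(pvals):.3f}")
```

If the conclusion survives across the whole grid of reasonable choices, that is a reason for confidence; if it depends on one particular combination, that is exactly the doubt a good scientist should want to surface.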

But even this is not enough. We need to incentivise people to tug the thread on their own analyses. If preregistration can lull people into ignoring potential problems, what might counteract that? One possibility is the expectation of data sharing and more secondary analysis of data. If you knew that someone else might tug the thread later, you would have an incentive to make sure your analysis is robust.

People also need to be asked to move away from simple default analyses like t-tests with bar plots. Alternative ways of approaching an analysis, such as nonparametric statistics, are not just something you do in special cases (such as when a default assumption check fails). Recently, in response to my suggestion that nonparametric methods be taught to undergraduate students alongside parametric ones, I was told that this was unnecessary because one could always look up the nonparametric test in a book when it was needed. This attitude clearly shows the tyranny of the default: the “standard” methods are the ones you need, except in special cases. Looking at a statistical problem from multiple perspectives as part of one’s scientific process is a foreign idea to many people. This attitude needs to change.
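As a rough sketch of what “multiple perspectives” could look like in practice (simulated placeholder data, not an example from the post), the same comparison can be examined with a default t-test, a nonparametric test, and a resampling estimate, each of which answers a slightly different question about the data.

```python
# A small sketch of examining one comparison from several angles rather than
# defaulting to a t-test with a bar plot. Data here are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control = rng.exponential(scale=1.0, size=30)    # skewed placeholder data
treatment = rng.exponential(scale=1.4, size=30)

# Parametric default: difference in means (Welch's t-test).
t_res = stats.ttest_ind(treatment, control, equal_var=False)

# Nonparametric view: do treatment values tend to exceed control values?
u_res = stats.mannwhitneyu(treatment, control, alternative="two-sided")

# Resampling view: bootstrap interval for the difference in medians.
boot = stats.bootstrap(
    (treatment, control),
    lambda t, c: np.median(t) - np.median(c),
    vectorized=False, n_resamples=2000,
    method="percentile", random_state=rng,
)

print(f"Welch t-test:   p = {t_res.pvalue:.3f}")
print(f"Mann-Whitney U: p = {u_res.pvalue:.3f}")
ci = boot.confidence_interval
print(f"95% bootstrap CI for median difference: [{ci.low:.2f}, {ci.high:.2f}]")
```

None of these is “the” correct analysis; the point is that running more than one, as routine practice rather than as a fallback, gives you a chance to notice when your conclusion depends on the default.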

In a world where researchers only use a single test type and a single graphical method—and where their statistical software allows little else—the problem is not that preregistration is a straitjacket, but rather that it lulls one into thinking that methodological vice is virtue. The tyranny of the default is real, and any practice that encourages such thinking should be carefully considered.

There is much to lose if preregistered analyses become a trusted gold standard rather than a useful tool in some contexts.

1 Comment

  1. Dear Richard,

    With all due respect, I don’t think we need to “have fears” about preregistration. As you rightly pointed out, preregistration does not prevent checking for robustness through additional exploratory analyses (or even through preregistered outcome-neutral checks), and as long as preregistration is coupled with the sharing of raw data, we have the best of both worlds, as you point out in the conclusion of your post.

    If there is a case for fearing preregistration, perhaps the issue is one of disclosure and trust. Two decades ago, we were trained to conduct research in cosy ways, in the safety of our private offices and labs, refining our thoughts, research questions, hypotheses, and data files as we went along, deciding a posteriori when we were ready to make our work public (often the polished, cleaned, supposedly error-free version of our work). I am not talking about faking data here. I am talking about praxis, that is, practice as distinct from theory, research as it happens in real life. There was no training in data management back then. How many of us can claim to have never created multiple versions of a data file under different names, or in different folders? How many can say they have never consolidated their data analysis plan(s) while they were analysing their data? If you did, that’s OK! This is how we learnt to become researchers, through trial and error.

    Now we are told, and are telling our students, that we must plan everything ahead. The problem is that those among us who are more experienced with research know: plans rarely, if ever, survive contact with reality.

    So perhaps the fear that preregistration instills is the fear of “being wrong” a priori or “being found out”. Open disclosures are frightening because they make us vulnerable. So the biggest challenge is for us, as a community, to change our outlook on other people’s work. To recognise that those who open up their work to scrutiny via preregistration are making themselves more vulnerable to criticism for their design and analysis choices. If we happen to know or believe there is a better way, we owe it to Science to share it, but we also owe it to our colleagues to do so with respect and constructive feedback. We need to recognise that, contrary to what the old ways of conducting and reporting research may have led us to believe, in fact, no-one is omniscient or should be expected to be. We should remind ourselves that preregistration is not about “getting everything right” a priori, that doing research and making hypotheses is not about “being right” or “being wrong” as a person, or conducting the “right kind” or the “wrong kind” of data analysis. It is about trying to find out, to the best of our abilities, both individually and collectively, what is true and what is not. That should include preregistration but also registered replications, and positive feedback suggesting or implementing alternative analyses where needed, but always with the utmost consideration for others and their work.

    Sincerely,

    Gaëlle.