A puppy in a cup, for open science

Since the Enlightenment, openness has been a core part of the ethos of science. Scientific openness takes many forms: from its inception the Royal Society, for instance, published reports from all over the world, not just Great Britain. Science is politically open, a collective search for truth and human well-being ideally not concerned with national identity.

Another form of scientific openness we all recognize is the obligation to share what we have learned (or indeed, how we learned) in the scientific process. This obligation stems from many sources, including the need to back up our claims, our reciprocal responsibility to those who have shared their knowledge with us, and our responsibility to the society that supports us. Interestingly, in the inaugural issue of the Royal Society's Philosophical Transactions in 1665, Henry Oldenburg writes that both scientific researchers and those curious about science are “entitle[d]…to the knowledge of what this Kingdom, or other parts of the World…as well as of the progress of the Studies, Labours, and attempts of the Curious and learned in things of this kind, as of their complete Discoveries and performances…” The obligation of openness is as old as scientific culture itself.

Just as Oldenburg advocated using the printing press to promote openness, we today use computers to do the same. Today I will discuss a pair of papers published in Behavior Research Methods that help to make psychology more open: Jeff Rouder’s Born Open Data, which discusses how experimenters can ensure that their data are made public; and Benedek Kurdi, Shayn Lozano, and Mahzarin Banaji’s Open Affective Standardized Image Set, which offers an alternative to other restriction-laden stimulus sets.

Born open data

Traditionally, researchers think of themselves as owning the data they generate. Even if this is not the case (e.g., some universities specify that the university itself owns data), researchers still think of data as theirs, and act accordingly. Anyone who has asked for data from another researcher, or been given data with strong conditions on its use, has experienced this.

Jeff Rouder would like us to rethink this idea. We should instead think of ourselves as stewards of our data. What makes a good data steward? A good data steward curates the data and helps preserve it for the purpose it was created: to add to our collective knowledge. Of course, we’re not the only ones capable of extracting knowledge from our data. Our data stewardship should be evaluated by how well we make our data available to others.

Rouder suggests that what stands in the way of good stewardship is not bad intentions, but rather the fact that it takes effort. “Born open data” is his attempt to remove that effort from the equation. “Born-open” data are automatically uploaded, without identifying information, to an open repository the night after they are collected. Because this is done by a script, no intervention is needed to make it happen. You can see some of these data on GitHub. The linked file was automatically uploaded to GitHub on February 4, 2016.
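To make the idea of a hands-off nightly upload concrete, here is a minimal sketch in R. It is not Rouder's actual script; see his article for that. The function name born_open_sync, the repository path, the remote, and the branch are all hypothetical, and the sketch assumes that the data files have already been stripped of identifying information and that a scheduler (such as cron) runs the function each night. By default it only returns the git commands it would run.

```r
# Hypothetical nightly sync: stage new data files, commit them, and push.
# All names and paths are illustrative assumptions, not the lab's real setup.
born_open_sync <- function(repo_dir, remote = "origin", branch = "master",
                           dry_run = TRUE) {
  msg <- paste("Automatic nightly data upload", format(Sys.Date(), "%Y-%m-%d"))
  # Each element is the argument vector for one call to git
  commands <- list(
    c("-C", repo_dir, "add", "--all"),
    c("-C", repo_dir, "commit", "-m", msg),
    c("-C", repo_dir, "push", remote, branch)
  )
  if (!dry_run) {
    for (args in commands) {
      # shQuote protects arguments that contain spaces (e.g., the message)
      status <- system2("git", shQuote(args))
      # Stop at the first failure (e.g., nothing new to commit)
      if (status != 0) break
    }
  }
  invisible(commands)
}

# Dry run: inspect the commands without touching any repository
cmds <- born_open_sync("/path/to/lab-data")
```

Setting dry_run = FALSE would actually run the commands, which requires git to be installed and the directory to be a repository with push access configured.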

If you wanted to, you could even read these data in R directly from GitHub, removing any need to download them:

# Read the raw data file straight from GitHub (requires an internet connection)
data.url = 'https://raw.githubusercontent.com/PerceptionCognitionLab/data1/master/bayesObserver/dotsBayes/countDots/countDots1.dat.001'
read.table(data.url)

The data are useless for now because only the Rouder lab knows what most of the columns mean (although you can probably guess what several of the columns represent). But coupled with a simple description of the columns and a tech report or the eventual manuscript, these data will be useful. GitHub keeps a record of all changes to files; even though the data are useless now, the clear audit trail from collection to use in a publication makes born open data as transparent as it is reasonable to be.
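As a rough illustration of what a "simple description of the columns" might look like in practice, the sketch below pairs a small made-up data file with a codebook. The values, column names, and descriptions are all invented for illustration; they are not the real layout of the Rouder lab's files.

```r
# Toy stand-in for a born-open data file; the values are made up.
raw <- read.table(text = "1 3 12 0.84
1 4 9 1.02
2 1 15 0.77", header = FALSE)

# Hypothetical codebook: one row per column of the data file.
codebook <- data.frame(
  name = c("participant", "trial", "n_dots", "rt_sec"),
  description = c("Participant identifier",
                  "Trial number",
                  "Number of dots shown",
                  "Response time in seconds"),
  stringsAsFactors = FALSE
)

# Check that the codebook covers every column before applying the names
stopifnot(nrow(codebook) == ncol(raw))
names(raw) <- codebook$name
```

Keeping the codebook as its own small table means it can live in the repository next to the data and be versioned along with it.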

Born open data is not for everyone. For discussion of both the nuts and bolts of born open data, and some critiques and responses, read Rouder’s article. Interested readers might also appreciate this discussion on Andrew Gelman’s blog.

The Open Affective Standardized Image Set (OASIS)

Although open data gets a lot of attention, open materials are perhaps even more important. When we publish the results of an experiment, the record of that experiment should include the experimental materials. Without these materials, interpretation and replication of the experiment become difficult.

The International Affective Picture System (IAPS) is a popular set of images for studying the effects of emotional stimuli. The IAPS set has been used in thousands of studies on diverse topics, including fear, anxiety, motivation, and memory. The set, however, is copyrighted and its use comes with severe restrictions. Images in the set cannot be shared or used in teaching or research, nor can they be used in any study accessible from the internet. The last restriction is particularly onerous: Online studies, in which people are recruited and participate via the internet, are a powerful new tool for collecting data. The IAPS set cannot be used for such studies.

This led Kurdi, Lozano, and Banaji to put together their own collection of 900 images, such as this one of a puppy in a cup:

The source for this image is here on Pixabay, and the image is listed as having a Creative Commons CC0 license, which means that even if the OASIS team wanted to restrict its use, they could not. The image is in the public domain. The OASIS team did more than put together hundreds of images; they also normed them on valence (negative-positive) and arousal. The puppy above was rated as highly positive (average 6.5 on a 7-point Likert scale) and arousing (average 5 on a 7-point Likert scale) by 102 adults who participated via Amazon Mechanical Turk. Perhaps unsurprisingly, the puppy in a cup was the most positively rated image in the set.
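For readers unfamiliar with norming, the averages reported above are simply per-image means of individual ratings. The sketch below shows that computation on a handful of made-up ratings; the image labels, the numbers, and the data layout are assumptions for illustration and are not taken from the OASIS files.

```r
# Toy ratings on a 7-point scale; every value here is invented.
ratings <- data.frame(
  image = rep(c("puppy", "storm"), each = 6),
  dimension = rep(rep(c("valence", "arousal"), each = 3), times = 2),
  rating = c(7, 6, 7,   # puppy, valence
             5, 4, 6,   # puppy, arousal
             2, 3, 2,   # storm, valence
             5, 6, 4)   # storm, arousal
)

# Norms: the mean rating for each image on each dimension
norms <- aggregate(rating ~ image + dimension, data = ratings, FUN = mean)
print(norms)
```

With real norms in the same shape, picking out the most positive image amounts to sorting the valence rows by their mean rating.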

For more details, see the OASIS paper or download the whole picture set, with norms, from the OASIS website.

Open science: Good for everyone

This pair of papers, by Rouder and by Kurdi, Lozano, and Banaji, represents two separate advances in open science: Rouder outlines a way of lessening the effort required to publicly release data; the OASIS team offers a more open set of stimuli for studying human emotion. Hopefully, such openness will become typical scientific practice rather than the exception. Openness and transparency help maximize the impact of our science.

For more on open science (and how it can make you some money!) see one of our previous posts, “Corralling the Texas Sharp Shooter: $1,000,000 Reward”.

Articles focused on in this post:

Rouder, J. N. (2015). The what, why, and how of born-open data. Behavior Research Methods. doi:10.3758/s13428-015-0630-z
Kurdi, B., Lozano, S., and Banaji, M. R. (2016). Introducing the Open Affective Standardized Image Set (OASIS). Behavior Research Methods. doi:10.3758/s13428-016-0715-3

Author

Richard Morey

Richard Morey is a Senior Lecturer in the School of Psychology at Cardiff University. In 2008, he earned a PhD in Cognition and Neuroscience and a Master's degree in Statistics from the University of Missouri. He is the author of over 50 articles and book chapters, and in 2011 he was awarded a Veni Research Talent grant from the Netherlands Research Organization's Innovational Research Incentives Scheme for work in cognitive psychology. His work spans cognitive science, where he develops and critiques statistical models of cognitive phenomena; statistics, where he is interested in the philosophy of statistical inference and the development of new statistical tools for research use; and the practical side of science, where he is interested in increasing openness in scientific methodology. Dr. Morey is an in-demand speaker on topics related to statistical inference in science, having spoken and given workshops across Europe, Australia, and North America. He is the author of the BayesFactor software for Bayesian inference and writes regularly on methodological topics at his blog.