A few days ago I reported on a new paper that tested the timing accuracy of experimental software packages. The paper suggested that PsychoPy had some issues with presenting very fast stimuli that only lasted one (or a few) screen refreshes.
However, the creator of PsychoPy (Jon Peirce, of Nottingham University) has already responded to the data in the paper. The authors of the paper made all the data and (importantly) the program scripts used to generate it available on the Open Science Framework site. As a result it was possible for Jon to examine the scripts and see what the issue was. It turns out the authors used an older version of PsychoPy, where the ability to set a stimulus duration in frames (or screen refreshes) wasn’t available. Up-to-date versions have this feature, and as a result are now much more reliable. Those who need the full story should read Jon’s comment (and the response by the authors) on the PLoS comments page for the article.
So, great news if, like me, you’re a fan of PsychoPy. However, there’s a wider picture that’s revealed by this little episode, and a very interesting one I think. As a result of the authors posting their code up to the OSF website, and the fact that PLoS One allows comments to be posted on its articles, the issue could be readily identified and clarified, in a completely open and public manner, and within a matter of days. This is one of the best examples I’ve seen of the power of an open approach to science, and the ability of post-publication review to have a real impact.
Interesting times, fo’ shiz.
The first is some recent publicity around the Many Labs replication project – a fantastic effort to try and perform replications of some key psychological effects, with large samples, and in labs spread around the world. Ed Yong has written a really great piece on it here for those who are interested. The Many Labs project is part of the Open Science Framework – a free service for archiving and sharing research materials (data, experimental designs, papers, whatever).
The second was a recent paper by Tom Stafford and Mike Dewar in Psychological Science. This is a really impressive piece of research from a very large sample of participants (854,064!) who played an online game. Data from the game was analysed to provide metrics of perception, attention and motor skills, and to see how these improved with training (i.e. more time spent playing the game). The original paper is here (paywalled, unfortunately), but Tom has also written about it on the Mind Hacks site and on his academic blog. The latter piece is interesting (for me anyway) as Tom says that he found his normal approach to analysis just wouldn’t work with this large a dataset and he was obliged to learn Python in order to analyse the data. Python FTW!
Anyway, the other really nice thing about this piece of work is that the authors have made all the data, and the code used to analyse it, publicly available in a GitHub repository here. This is a great thing to do, particularly for a large, probably very rich dataset like this – potentially there are a lot of other analyses that could be run on these data, and making it available enables other researchers to use it.
These two things crystallised an important realisation for me: It’s now possible, and even I would argue preferential, for the majority of not-particularly-technically-minded psychology researchers to perform their research in a completely open manner. Solid, free, user-friendly cross-platform software now exists to facilitate pretty much every stage of the research process, from conception to analysis.
Some examples: PsychoPy is (in my opinion) one of the best pieces of experiment-building software around at the moment, and it’s completely free, cross-platform, and open-source. The R language for statistical computing is getting to be extremely popular, and is likewise free, cross-platform, etc. For analysis of neuroimaging studies, there are several open-source options, including FSL and NiPype. It’s not hard to envision a scenario where researchers who use these kinds of tools could upload all their experimental files (experimental stimulus programs, resulting data files, and analysis code) to GitHub or a similar service. This would enable anyone else in the world who had suitable (now utterly ubiquitous) hardware to perform a near-as-dammit exact replication of the experiment, or (more likely) tweak the experiment in an interesting way (with minimal effort) in order to run their own version. This could potentially really help accelerate the pace of research, and the issue of poorly-described and ambiguous methods in papers would become a thing of the past, as anyone who was interested could simply download and demo the experiment themselves in order to understand what was done. There are some issues with uploading very large datasets (e.g. fMRI or MEG data) but initiatives are springing up, and the problem seems like it should be a very tractable one.
The benefit for researchers should hopefully be greater visibility and awareness of their work (indexed in whatever manner; citations, downloads, page-views etc.). Clearly some researchers (like the authors of the above-mentioned paper) have taken the initiative and are already doing this kind of thing. They should be applauded for taking the lead, but they’ll likely remain a minority unless researchers can be persuaded that this is a good idea. One obvious prod would be if journals started encouraging this kind of open sharing of data and code in order to accept papers for publication.
One of the general tenets of the open-source movement (that open software benefits everyone, including the developers) is doubly true of open science. I look forward to a time when the majority of research code, data, and results are made public in this way and the research community as a whole can benefit from it.