Towards open-source psychology research
The first is some recent publicity around the Many Labs replication project - a fantastic effort to try and perform replications of some key psychological effects, with large samples, and in labs spread around the world. Ed Yong has written a really great piece on it here for those who are interested. The Many Labs project is part of the Open Science Framework - a free service for archiving and sharing research materials (data, experimental designs, papers, whatever).
The second was a recent paper by Tom Stafford and Mike Dewar in Psychological Science. This is a really impressive piece of research from a very large sample of participants (854,064!) who played an online game. Data from the game was analysed to provide metrics of perception, attention and motor skills, and to see how these improved with training (i.e. more time spent playing the game). The original paper is here (paywalled, unfortunately), but Tom has also written about it on the Mind Hacks site and on his academic blog. The latter piece is interesting (for me anyway) as Tom says that he found his normal approach to analysis just wouldn’t work with this large a dataset and he was obliged to learn Python in order to analyse the data. Python FTW!
Anyway, the other really nice thing about this piece of work is that the authors have made all the data, and the code used to analyse it, publicly available in a GitHub repository here. This is a great thing to do, particularly for a large, probably very rich dataset like this – potentially there are a lot of other analyses that could be run on these data, and making it available enables other researchers to use it.
These two things crystallised an important realisation for me: It’s now possible, and even I would argue preferential, for the majority of not-particularly-technically-minded psychology researchers to perform their research in a completely open manner. Solid, free, user-friendly cross-platform software now exists to facilitate pretty much every stage of the research process, from conception to analysis.
Some examples: PsychoPy is (in my opinion) one of the best pieces of experiment-building software around at the moment, and it’s completely free, cross-platform, and open-source. The R language for statistical computing is getting to be extremely popular, and is likewise free, cross-platform, etc. For analysis of neuroimaging studies, there are several open-source options, including FSL and NiPype. It’s not hard to envision a scenario where researchers who use these kinds of tools could upload all their experimental files (experimental stimulus programs, resulting data files, and analysis code) to GitHub or a similar service. This would enable anyone else in the world who had suitable (now utterly ubiquitous) hardware to perform a near-as-dammit exact replication of the experiment, or (more likely) tweak the experiment in an interesting way (with minimal effort) in order to run their own version. This could potentially really help accelerate the pace of research, and the issue of poorly-described and ambiguous methods in papers would become a thing of the past, as anyone who was interested could simply download and demo the experiment themselves in order to understand what was done. There are some issues with uploading very large datasets (e.g. fMRI or MEG data) but initiatives are springing up, and the problem seems like it should be a very tractable one.
The benefit for researchers should hopefully be greater visibility and awareness of their work (indexed in whatever manner; citations, downloads, page-views etc.). Clearly some researchers (like the authors of the above-mentioned paper) have taken the initiative and are already doing this kind of thing. They should be applauded for taking the lead, but they’ll likely remain a minority unless researchers can be persuaded that this is a good idea. One obvious prod would be if journals started encouraging this kind of open sharing of data and code in order to accept papers for publication.
One of the general tenets of the open-source movement (that open software benefits everyone, including the developers) is doubly true of open science. I look forward to a time when the majority of research code, data, and results are made public in this way and the research community as a whole can benefit from it.
Posted on January 8, 2014, in Commentary, Programming, Software and tagged GitHub, ManyLabs, MindHacks, Open Science Framework, Open Sesame, open-science, Open-source, PsychoPy, Software. Bookmark the permalink. 10 Comments.