The first is some recent publicity around the Many Labs replication project – a fantastic effort to try and perform replications of some key psychological effects, with large samples, and in labs spread around the world. Ed Yong has written a really great piece on it here for those who are interested. The Many Labs project is part of the Open Science Framework – a free service for archiving and sharing research materials (data, experimental designs, papers, whatever).
The second was a recent paper by Tom Stafford and Mike Dewar in Psychological Science. This is a really impressive piece of research from a very large sample of participants (854,064!) who played an online game. Data from the game was analysed to provide metrics of perception, attention and motor skills, and to see how these improved with training (i.e. more time spent playing the game). The original paper is here (paywalled, unfortunately), but Tom has also written about it on the Mind Hacks site and on his academic blog. The latter piece is interesting (for me anyway) as Tom says that he found his normal approach to analysis just wouldn’t work with this large a dataset and he was obliged to learn Python in order to analyse the data. Python FTW!
Anyway, the other really nice thing about this piece of work is that the authors have made all the data, and the code used to analyse it, publicly available in a GitHub repository here. This is a great thing to do, particularly for a large, probably very rich dataset like this – potentially there are a lot of other analyses that could be run on these data, and making it available enables other researchers to use it.
These two things crystallised an important realisation for me: it’s now possible, and I would even argue preferable, for the majority of not-particularly-technically-minded psychology researchers to perform their research in a completely open manner. Solid, free, user-friendly cross-platform software now exists to facilitate pretty much every stage of the research process, from conception to analysis.
Some examples: PsychoPy is (in my opinion) one of the best pieces of experiment-building software around at the moment, and it’s completely free, cross-platform, and open-source. The R language for statistical computing is getting to be extremely popular, and is likewise free, cross-platform, etc. For analysis of neuroimaging studies, there are several open-source options, including FSL and NiPype. It’s not hard to envision a scenario where researchers who use these kinds of tools could upload all their experimental files (experimental stimulus programs, resulting data files, and analysis code) to GitHub or a similar service. This would enable anyone else in the world who had suitable (now utterly ubiquitous) hardware to perform a near-as-dammit exact replication of the experiment, or (more likely) tweak the experiment in an interesting way (with minimal effort) in order to run their own version. This could potentially really help accelerate the pace of research, and the issue of poorly-described and ambiguous methods in papers would become a thing of the past, as anyone who was interested could simply download and demo the experiment themselves in order to understand what was done. There are some issues with uploading very large datasets (e.g. fMRI or MEG data) but initiatives are springing up, and the problem seems like it should be a very tractable one.
The benefit for researchers should hopefully be greater visibility and awareness of their work (indexed in whatever manner; citations, downloads, page-views etc.). Clearly some researchers (like the authors of the above-mentioned paper) have taken the initiative and are already doing this kind of thing. They should be applauded for taking the lead, but they’ll likely remain a minority unless researchers can be persuaded that this is a good idea. One obvious prod would be if journals started asking for this kind of open sharing of data and code before accepting papers for publication.
One of the general tenets of the open-source movement (that open software benefits everyone, including the developers) is doubly true of open science. I look forward to a time when the majority of research code, data, and results are made public in this way and the research community as a whole can benefit from it.
Researchers typically use a lot of different pieces of software in the course of their work; it’s part of what makes the job so varied. Separate packages might be used for creating experimental stimuli, programming an experiment, logging data, statistical analysis, and preparing work for publication or conferences. Until fairly recently there was little option but to use commercial software in at least some of these roles. For example, SPSS is the de facto analysis tool in many departments for statistics, and the viable alternatives were also commercial – there was little choice but to fork over the money. Fortunately, there are now pretty viable alternatives for cash-strapped departments and individual researchers. There’s a lot of politics around the open-source movement, but for most people the important aspect is that the software is provided for free, and (generally) it’s cross-platform (or can be compiled to be so). All that’s required is to throw off the shackles of the evil capitalist oppressors, or something.
So, there’s a lot of software listed on my Links page, but I thought I’d pick out my favourite bits of open-source software: the ones that are most useful for researchers and students in psychology.
First up – general office-type software; there are a couple of good options here. The Open Office suite has been around for 20 years, and contains all the usual tools (word processor, presentation-maker, spreadsheet tool, and more). It’s a solid, well-designed system that can pretty seamlessly read and write the Microsoft Office XML-based (.docx, .pptx) file formats. The other option is Libre Office, which has the same roots as Open Office, and similar features. Plans are apparently underway to port Libre Office to iOS and Android – nice. The other popular free option for presentations is, of course, Prezi.
There are lots of options for graphics programs; however, the two best in terms of features are without a doubt GIMP (designed to be a free alternative to Adobe Photoshop) and Inkscape (vector graphics editor – good replacement for Adobe Illustrator). There’s a bit of a steep learning curve for these, but that’s true of their commercial counterparts too.
Programming experiments – if you’re still using a paid system like E-Prime or Presentation, you should consider switching to PsychoPy – it’s user-friendly, genuinely cross-platform, and absolutely free. I briefly reviewed it before, here. Another excellent option is Open Sesame.
For statistical analysis there are a couple of options. Firstly, if you’re an SPSS user and pretty comfortable with it (but fed up of the constant hassles of the licensing system), you should check out PSPP, a free stats program designed to look and feel like SPSS and replicate many of its functions. You can even use your SPSS syntax – awesome. The only serious issue is that it doesn’t contain the SPSS options for complex GLM models (repeated measures ANOVA, etc.). Hopefully these will be added at some future point. The other popular option is the R language for statistical computing. R is really gaining traction at the moment. The command-line interface is a bit of a hurdle for beginners, but that can be mitigated somewhat by graphical front ends like R-Commander or RStudio.
For neuroscience there’s the NeuroDebian project – not just a software package, but an entire operating system, bundled with a comprehensive suite of neuroscience tools, including FSL, AFNI and PyMVPA, plus lots of others. There really are too many bits of open-source neuro-software to list here, but a good place to find some is NITRC.org.
So, there you are, people; go open-source. You have nothing to lose but your over-priced software subscriptions.