How to pilot an experiment

I got a serious question for you: What the fuck are you doing? This is not shit for you to be messin’ with. Are you ready to hear something? I want you to see if this sounds familiar: any time you try a decent crime, you got fifty ways you’re gonna fuck up. If you think of twenty-five of them, then you’re a genius… and you ain’t no genius.
Body Heat (1981, Lawrence Kasdan)

To consult the statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of.
R.A. Fisher (1938)

Don’t crash and burn your experiment.

Doing a pilot run of a new psychology experiment is vital. No matter how well you think you’ve designed and programmed your task, there are (almost) always things that you didn’t think of. Going ahead and spending a lot of time and effort collecting a set of data without running a proper pilot is (potentially) a recipe for disaster. Several times I’ve seen data-sets where there was some subtle issue with the data logging, or the counter-balancing, or something else, which meant that the results were, at best,  compromised, and at worst completely useless.

All of the resultant suffering, agony, and sobbing could have been avoided by running a pilot study in the right way. It’s not sufficient to run through the experimental program a couple of times; a comprehensive test of an experiment has to include a test of the analysis as well. This is particularly true of any experiment involving methods like fMRI/MEG/EEG where a poor design can lead to a data-set that’s essentially uninterpretable, or perhaps even un-analysable. You may think you’ve logged all the data variables you think you’ll need for the analysis, and your design is a work of art, but you can’t be absolutely sure unless you actually do a test of the analysis.

This might seem like over-kill, or a waste of effort, however, you’re going to have to design your analysis at some point anyway, so why not do it at the beginning? Analyse your pilot data in exactly the way you’re planning on analysing your main data, save the details (using SPSS syntax, R code, SPM batch jobs – or whatever you’re using) and when you have your ‘proper’ data set, all you’ll (in theory) have to do is plug it in to your existing analysis setup.

These are the steps I normally go through when getting a new experiment up and running. Not all will be appropriate for all experiments, your mileage may vary etc. etc.

1. Test the stimulus program. Run through it a couple of times yourself, and get a friend/colleague to do it once too, and ask for feedback. Make sure it looks like it’s doing what you think it should be doing.

2. Check the timing of the stimulus program. This is almost always essential for a fMRI experiment, but may not be desperately important for some kinds of behavioural studies. Run through it with a stopwatch (the stopwatch function on your ‘phone is probably accurate enough). If you’re doing any kind of experiment involving rapid presentation of stimuli (visual masking, RSVP paradigms) you’ll want to do some more extensive testing to make sure your stimuli are being presented in the way that you think – this might involve plugging a light-sensitive diode into an oscilloscope, sticking it to your monitor with a bit of blu-tack and measuring the waveforms produced by your stimuli. For fMRI experiments the timing is critical. Even though the Haemodynamic Response Function (HRF) is slow (and somewhat variable) you’re almost always fighting to pull enough signal out of the noise, so why introduce more? A cumulative error of only a few tens of milliseconds per trial can mean that your experiment is a few seconds out by the end of a 10 minute scan – this means that your model regressors will be way off – and your results will likely suck.*

3. Look at the behavioural data files. I don’t mean do the analysis (yet), I mean just look at the data. First make sure all the variables you want logged are actually there, then dump it into Excel and get busy with the sort function. For instance, if you have 40 trials and 20 stimuli (each presented twice) make sure that each one really is being presented twice, and not some of them once, and some of them three times; sorting by the stimulus ID should make it instantly clear what’s going on. Make sure the correct responses and any errors are being logged correctly. Make sure the counter-balancing is working correctly by sorting on appropriate variables.

4. Do the analysis. Really do it. You’re obviously not looking for any significant results from the data, you’re just trying to validate your analysis pipeline and make sure you have all the information you need to do the stats. For fMRI experiments – look at your design matrix to see that it makes sense and that you’re not getting warnings about non-orthogonality of the regressors from the software. For fMRI data using visual stimuli, you could look at some basic effects (i.e. all stimuli > baseline) to make sure you get activity in the visual cortex. Button-pushing responses should also be visible as activity in the motor cortex in a single subject too – these kinds of sanity checks can be a good indicator of data quality. If you really want to be punctilious, bang it through a quick ICA routine and see if you get a) component(s) that look stimulus-related, b) something that looks like the default-mode network, and c) any suspiciously nasty-looking noise components (a and b = good, c = bad, obviously).

5. After all that, the rest is easy. Collect your proper set of data, analyse it using the routines you developed in point 4. above, write it up, and then send it to Nature.

And that, ladeez and gennulmen, is how to do it. Doing a proper pilot can only save you time and stress in the long run, and you can go ahead with your experiment in the certain knowledge that you’ve done everything in your power to make sure your data is as good as it can possibly be. Of course, it still might be total and utter crap, but that’ll probably be your participants’ fault, not yours.

Happy piloting! TTFN.

*Making sure your responses are being logged with a reasonable level of accuracy is also pretty important for many experiments, although this is a little harder to objectively verify. Hopefully if you’re using some reasonably well-validated piece of software and decent response device you shouldn’t have too many problems.

Advertisements

About Matt Wall

I do brains. BRAINZZZZ.

Posted on November 4, 2012, in Commentary, Experimental techniques, Neuroimaging, Programming, Study Skills, Uncategorized and tagged , , , , , , , . Bookmark the permalink. 5 Comments.

  1. One more thing that I started doing recently to test my scripts was to generate fake data. So for example if I expect reaction time for condition 1 to be faster than that for condition 2, I would generate (automatic) responses in my scripts such that for condition one there would be a fake response at, say, 600 ms + noise while for condition 2 I would set it at 700 ms + noise.

    That way, I can test if the experiment is taking the input properly, is running the entire duration well and terminates nicely and I immediatelly get a fake pilot data that I can use to verify my analysis scripts too. Often I even speed up this automatic responding (and stimulus presentation) so that I can do such basic verification in 1 min rather that 1 hour.

  2. YES!!!! And if you’re doing an fMRI experiment, spend a few hours just looking at your raw data. Look at the data on the scanner (or a dicom viewer), look at it again after it has been ported into your analysis package of choice. (Changing data types can introduce errors.) Better yet, also have your scanner physicist (or a highly experienced fellow fMRIer) take a look-see at your raw data, too. You presumably spent a bit of time talking to your physicist about your acquisition protocol so why wouldn’t you solicit feedback on your data? Iterate your physicist’s opinion once or twice. (If you don’t show it to me it’s really difficult to have an opinion on it!)

    Remember, just running the pilot does not a robust experiment make. You also have to *evaluate* the pilot to make it truly worthwhile. Almost every fMRI pilot I’ve ever been privy to has triggered substantial changes to the acquisition. Which is entirely the point.

    • Great stuff – thanks! I shied away from mentioning the acquisition side of fMRI pilot experiments in the article as it’s a bit more specialised, and a whole topic in its own right, but of course it’s *really* important too – thanks for bringing it up.

  1. Pingback: research-funds/collabs | Pearltrees

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: