Running Experiments 1: Timing is Everything.
There is a tide in the affairs of men,
Which, taken at the flood, leads on to fortune;
Julius Caesar, Act IV, Scene III
This will be the first in a (probably fairly lengthy) series of posts about how to design, program and run a successful psychology experiment. For this initial post I want to go over some basics about how computers work, and what that means for running successful experiments. Of course there are many different kinds of experiments that it’s possible to run, but for the purposes of the present discussion I’m going to use as an example a canonical kind of cognitive experiment, where the dependent variable is reaction time, measured using a button-press. This kind of experiment is still widely used, in paradigms like the dot-probe attentional task and the Implicit Association Test.
The first thing you need to understand is that modern operating systems are very, very complicated. When you boot up Windows (for example) all you see at first is a nice, clean, uncluttered desktop, but examining the Windows Task Manager reveals a whole host of ‘background’ processes which are buzzing away invisibly all the time. These might be search indexers, printer services, network connections, anti-virus software, firewalls, and a whole mess of other stuff. If you then open a few different applications (a web browser with a few tabs open, Microsoft Word, an email program, some instant messaging application) the computer has a few more processes to juggle, as well as all the background stuff. This is normally fine, as long as you have enough RAM to handle all the requirements of these different processes – modern OSs are multi-tasking, meaning they can handle having lots of things going on at once.
However, the multi-tasking nature of Windows is essentially an illusion. A computer has only one processing core, meaning, very crudely, that it can only perform one calculation at a time.* Since it can only run one process at a time, and there are many processes, the processor switches its attention between all the active processes, and does so very, very fast. If you’re writing an email in Outlook and then switch your attention to Microsoft Word, so does the computer (as well as keeping all those background processes going). Since computer processors work very, very fast in human terms, the computer can switch its attention away from what you’re working on and then back again without you even noticing (the switch itself takes on the order of a few microseconds). The computer is like a one-armed juggler, who perpetually has 20 or 30 balls in the air – it can only catch and throw one ball at a time so it has to catch and throw each ball at very high speed. Of course, if the computer is old and slow and you try to throw too many balls at it at once, eventually it’ll drop one – that’s when your applications crash or your system grinds to a halt completely.
How does the computer know which process to switch its attention to at any given time? Each process is assigned what’s called a ‘priority interrupt’ value. Processes with a high value (like the application you’re currently working on) can demand more processor time, and interrupt the operation of processes with a low value (like most background processes). The pattern of priority interrupts determines how much attention the computer gives to each process at any given time.
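To make the idea concrete, here is a toy sketch in Python of how a priority value can translate into a share of processor time. It is emphatically not how Windows (or any real operating system) schedules processes; the process names, the priority numbers and the `schedule` function are all made up for illustration. Each time slice goes to whichever process has built up the most 'credit', and credit accumulates in proportion to priority.

```python
from collections import Counter

def schedule(priorities, n_slices):
    """Toy scheduler: hand out time slices in proportion to priority.

    priorities maps a (hypothetical) process name to a positive number.
    Returns the list of process names in the order they were given a slice.
    """
    total = sum(priorities.values())
    credit = {name: 0 for name in priorities}
    order = []
    for _ in range(n_slices):
        # Every process earns credit according to its priority...
        for name, prio in priorities.items():
            credit[name] += prio
        # ...and the process with the most credit gets this slice.
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        order.append(chosen)
    return order

# Illustrative values only: one foreground task and two background tasks.
slices = schedule({"experiment": 8, "indexer": 1, "antivirus": 1}, 10)
print(slices)
print(Counter(slices))
```

With these made-up numbers the 'experiment' process gets 8 of every 10 slices, but the two background processes still get their turn, and it is during those turns that the experiment is not being attended to.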
What does all this mean for running psychology experiments? Well, psychology experiments often have features that are critically dependent on exact timing, otherwise your experiment (and the data derived therefrom) might be unreliable. Imagine an experiment investigating unconscious visual priming, where a picture is flashed very briefly on a screen – so briefly that the participant doesn’t even ‘consciously’ register what the content of the picture was. It’s possible to demonstrate that participants can react meaningfully to a picture that they did not consciously ‘see’. An example might be faster reaction times when identifying the word ‘chair’ if a related picture (e.g. a table), rather than an unrelated picture (e.g. a clock), was flashed just beforehand. If you flash the picture too briefly, the participant won’t even have time to register it unconsciously, whereas if you flash the picture too slowly (i.e. for too long) the participant will be able to consciously see the table. Depending on the exact circumstances, it’s been shown that you can get non-conscious priming effects like this with picture presentations which last about 15-30 milliseconds. Now imagine that the computer draws the image on the screen, but while the image is still displayed it switches its attention to a different process. You may have specified in your experimental program that the picture should only be displayed for, say, 17ms, but it ends up being on the screen for 50ms, or even longer. Your participant sees the picture clearly, and your experiment is blown. An even worse situation is when the computer behaves inconsistently and displays pictures correctly on some trials, but on others leaves them up on the screen too long, or not long enough. In this situation you have no way of knowing how many of your trials were compromised in terms of timing, and your data is much, much noisier than would otherwise be the case.
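If you want to get a feel for this on your own machine, here is a rough Python sketch. It asks the operating system to wait for 17 ms, twenty times over, and records how long each wait really took. Note that `time.sleep` is only a stand-in for 'keep the picture up for this long'; real experiment software presents stimuli quite differently, so treat the output as an illustration of scheduling delays and not as a measurement of display timing. The function name `measure_durations` is my own invention.

```python
import time

def measure_durations(requested_ms, n_trials=20):
    """Wait requested_ms, n_trials times, and return each actual wait in ms."""
    actual = []
    for _ in range(n_trials):
        start = time.perf_counter()          # high-resolution clock
        time.sleep(requested_ms / 1000.0)    # ask the OS to pause this process
        actual.append((time.perf_counter() - start) * 1000.0)
    return actual

durations = measure_durations(17)
print(f"requested 17.0 ms, "
      f"shortest {min(durations):.1f} ms, longest {max(durations):.1f} ms")
```

The numbers you get will vary from run to run and from machine to machine, which is rather the point. Try it once on a quiet computer and once with a lot of other applications open.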
A related problem is with the timing of participant responses. Say you’re measuring reaction times by having your participant press the space-bar on the keyboard in response to something appearing on the screen. The computer has to ‘listen’ for the space-bar press and record the time at which it occurs, relative to your stimulus. However, this ‘listening’ to the keyboard is a process, run by the processor, so if the computer has its attention somewhere else at the time that the space-bar is pressed, it might be a few milliseconds before it actually registers the response. Since humans are pretty fast at switching their attention systems and performing other cognitive functions too, the size of effect you get in cognitive reaction-time experiments is often pretty small – e.g. you might have a difference of maybe 10 or 20 ms between two conditions. This can be a significant effect if your sample size and number of trials are large enough. However, reaction time data is inevitably quite noisy in the first place, and if there is some variability in the timing of the recording of the response too, this can easily be enough to completely destroy your effect and, again, blow your experiment.
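Here is a deliberately simplified Python sketch of that ‘listening’ delay. It assumes the computer only gets round to checking the keyboard once every 8 ms (a number I have picked purely for illustration; on a real system the gaps are irregular, which is worse). A key-press that lands between two checks is not noticed until the next one, so the logged time is always a little later than the true time. The function `recorded_time` and the three reaction times are hypothetical.

```python
import math

def recorded_time(true_press_ms, poll_interval_ms):
    """Time at which a press is first noticed, if the keyboard is only
    checked once every poll_interval_ms (an assumed, fixed interval)."""
    return math.ceil(true_press_ms / poll_interval_ms) * poll_interval_ms

# Three made-up 'true' reaction times, all within 8 ms of each other.
for true_rt in (412.0, 415.5, 419.9):
    logged = recorded_time(true_rt, poll_interval_ms=8)
    print(f"true {true_rt:.1f} ms -> logged {logged} ms "
          f"(late by {logged - true_rt:.1f} ms)")
```

The three presses are logged 4.0, 0.5 and 4.1 ms late respectively, and the first two end up with exactly the same logged time even though they were 3.5 ms apart. The size of the error depends on where the press happens to fall relative to the check, so it behaves like extra noise added to every trial.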
When I first started programming experiments back in the late ’90s the solution I used was an old piece of software called Micro Experimental Laboratory (MEL), a package which ran under MS-DOS. Since DOS is not a multi-tasking operating system, the problem simply didn’t exist – the experiment demanded 100% of the computer’s attention while it was running, and the timing was therefore always accurate. Even at the time though, the hardware and software I used were out-of-date, plus programming in DOS is about as horrendous and soul-destroying an experience as I could wish on anybody. Nowadays you can’t get the right kind of hardware, and most modern Windows computers don’t even run DOS.**
What’s the solution to this timing problem then? The simplest answer is to close down as many background processes as possible before running an experiment. Unplug your experimental computer from the network and disable anything that runs in the background, like firewalls, anti-virus software, instant-messaging applications, that kind of thing. Basically, anything that sticks an icon in the Windows system-tray (bottom right of the screen, where the clock is) is bad – get rid of it all. You can also kill individual processes using the task manager; however, use caution here as you might disable something critical – only close down processes here if you know what they are. iTunes is one example of a piece of software which runs invisible processes that you can safely close from the task manager. This minimises the problem, but doesn’t solve it completely.
The solution, unfortunately, is that there is no solution. If you’re using a multi-tasking operating system (which practically everybody does), you cannot guarantee that timing errors due to priority-interrupt switching will not occur. You can, however, ensure that if they do occur, they are very, very rare indeed. There are many packages available which are optimised for programming and running experiments (I’ll cover some of the more popular ones in a subsequent post) and most of them are reasonably successful at minimising timing errors. They might do this by assigning their experiment-execution processes the highest-possible priority interrupt value that the system allows, which means that other processes won’t interrupt them; not too often, anyway… Some of them also have their own internal timing checks based on super-accurate, sub-microsecond timers, which log data as the experiment executes and allow the experimenter to identify trials on which the timing might have been bad; these trials can then be removed from analysis, preserving the data quality for the rest of the trials.
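As a rough idea of what that last step might look like, here is a minimal Python sketch of throwing out trials with bad timing. It assumes your software has logged, for each trial, how long the picture was actually on screen. The field names (`shown_ms`, `rt_ms`), the 5 ms tolerance and the three trials are all invented for the example; real packages log this information in their own formats, and the right cut-off depends on your experiment.

```python
TOLERANCE_MS = 5.0  # assumed cut-off, choose your own

def split_trials(trials, intended_ms, tolerance_ms=TOLERANCE_MS):
    """Separate trials whose logged display time is within tolerance_ms
    of the intended duration from those whose timing went wrong."""
    good, bad = [], []
    for trial in trials:
        error = abs(trial["shown_ms"] - intended_ms)
        if error <= tolerance_ms:
            good.append(trial)
        else:
            bad.append(trial)
    return good, bad

# Made-up log entries: trial 2 stayed on screen far too long.
trials = [
    {"trial": 1, "shown_ms": 17.1, "rt_ms": 512},
    {"trial": 2, "shown_ms": 50.3, "rt_ms": 430},
    {"trial": 3, "shown_ms": 16.8, "rt_ms": 498},
]
good, bad = split_trials(trials, intended_ms=17.0)
print("keep:", [t["trial"] for t in good])
print("drop:", [t["trial"] for t in bad])
```

Here trials 1 and 3 are kept and trial 2 is dropped. It is worth reporting how many trials you excluded this way, since a large number would suggest a problem with the set-up.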
So, there you have it – the reason why modern computers find it difficult to measure time accurately. In future posts I’ll go into much more detail about specific tools and methods you can use in order to ensure that your experiments have (for the vast majority of trials anyway) accurate timing. Then your experiments will run smoothly, your data will be wonderful, you’ll discover grand new vistas of scientific inquiry, receive Nobel prizes, be irresistible to the opposite sex,*** etc. etc.
Much more techy-stuff about Priority Interrupts here:
Webpage of the Experimental Timing Standards Laboratory:
* In actual fact these days pretty much any computer you might buy has a dual-core, or even quad-core processor. However, the implementation of how those multiple cores are used is not straightforward (i.e. it’s not the case that half the active processes run on one core and the other half run on the other) and all the points I make here still apply – unless you’re running some kind of 64-core Linux cluster or something crazy like that.
** Well, they do… but only as an emulation, and let’s not get into that.
*** Or I guess, the same sex, or either, if that’s your thing.