Image Morphing and Psychology Research – A Case Study
As an example of how technology and psychology have developed together, I thought it would be fun to do a little case study of one area of research which has benefitted from advances in computer software over recent years. Rather than talk about the very technical disciplines like brain imaging (which have of course advanced enormously) I thought it would be more fun to concentrate on an area of relatively ‘pure’ psychology, and on one of the most important and fundamental cognitive processes, which is present pretty much from birth: face perception.
In November 1991 Michael Jackson released the single ‘Black or White’, the first to be released from his eighth album ‘Dangerous’. The single is of some significance as it marked the beginning of Jackson’s descent from the firmament of music stardom into the spiral of musical mediocrity and personal weirdness which only ended with his death in 2009, but for the purposes of the present discussion it is interesting because of one part of its accompanying video. Towards the end of the video a series of people of both sexes and of various ethnic groups are shown singing along with the song, and the images of their faces morph into each other in series:
These days, this kind of special effect is absolutely ubiquitous in movies, TV shows and commercials and barely warrants a second glance, but in 1991 it was amazing. I can remember thinking to myself as a teenager “how did they possibly do that?” The answer of course was computers – very big ones, and lots of them. Probably something like the Silicon Graphics workstations that were also being used around that time for the effects work on Jurassic Park (where a single frame of the dinosaur effects shots could reportedly take hours to render!). Anyway, I’m certain it took a great deal of programming effort, and a huge number of person-hours, to create. Nowadays, with the right software you could probably knock up something half-decent in a similar vein with a £300 laptop and a wet Sunday afternoon.
By the mid-90s it was possible to use a decent desktop computer to produce some quite nice effects by morphing together two still pictures, using software like Morpheus, or plugins for Adobe Photoshop. The way that most morphing software works is actually pretty simple. You take two images and define a set of corresponding points on each – a point is placed on the first face, say on the left eye, and then a corresponding point is placed on the left eye in the second picture. Once enough points have been defined on both pictures, the algorithm gradually distorts the shape of the first picture towards that of the second (and the second towards the first), while at the same time cross-fading between the two. A series of intermediate images can then be produced which represent any point on the transformation between the two pictures, like the 50% point in the image below:
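For the programmers among you, here is a minimal sketch of that warp-plus-cross-fade idea in plain Python. It is only an illustration, not what Morpheus or any other real package does: images are tiny greyscale grids stored as lists of lists, points are (x, y) pairs, and the warp simply blends the landmark displacements by inverse distance, where real morphing tools typically use a triangle mesh or line-based warp. The function names (`lerp_points`, `warp`, `morph`) are made up for this example.

```python
def lerp_points(pts_a, pts_b, t):
    """Slide each landmark a fraction t of the way from face A to face B."""
    return [((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
            for (xa, ya), (xb, yb) in zip(pts_a, pts_b)]


def warp(img, src_pts, dst_pts):
    """Distort img so that whatever sits at src_pts ends up at dst_pts.

    Assumption: for every output pixel, the landmark displacements are
    blended with inverse-square-distance weights (a simple stand-in for
    a proper mesh warp), and the source is sampled nearest-neighbour.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = dy = wsum = 0.0
            for (sx, sy), (tx, ty) in zip(src_pts, dst_pts):
                d2 = (x - tx) ** 2 + (y - ty) ** 2
                if d2 == 0:
                    # Exactly on a landmark: use its displacement as-is.
                    dx, dy, wsum = sx - tx, sy - ty, 1.0
                    break
                weight = 1.0 / d2
                dx += weight * (sx - tx)
                dy += weight * (sy - ty)
                wsum += weight
            # Look up the source pixel, clamped to the image bounds.
            u = min(max(int(round(x + dx / wsum)), 0), w - 1)
            v = min(max(int(round(y + dy / wsum)), 0), h - 1)
            out[y][x] = img[v][u]
    return out


def morph(img_a, img_b, pts_a, pts_b, t):
    """Intermediate image at position t (0 = all A, 1 = all B)."""
    mid = lerp_points(pts_a, pts_b, t)
    warped_a = warp(img_a, pts_a, mid)   # A pulled towards the in-between shape
    warped_b = warp(img_b, pts_b, mid)   # B pulled towards the same shape
    # Cross-fade the two shape-aligned images.
    return [[(1 - t) * pa + t * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(warped_a, warped_b)]


def dot_image(x, y, size=5):
    """Toy 'face': a black square with a single bright pixel at (x, y)."""
    return [[1 if (col, row) == (x, y) else 0 for col in range(size)]
            for row in range(size)]


corners = [(0, 0), (4, 0), (0, 4), (4, 4)]   # pinned so the edges stay put
face_a, pts_a = dot_image(1, 1), [(1, 1)] + corners
face_b, pts_b = dot_image(3, 3), [(3, 3)] + corners

halfway = morph(face_a, face_b, pts_a, pts_b, 0.5)
```

In this toy example the single bright ‘feature’ ends up at full brightness midway between its two original positions, at (2, 2). A plain cross-fade without the warp would instead leave two half-bright dots sitting where they started, which is exactly the ghostly double-image look that the corresponding points are there to avoid.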
In 1996, a bunch of clever chaps from Cambridge, St Andrews and Harvard published one of the first psychology papers to use stimuli produced using morphed faces. The paper (pdf here) demonstrated that the perception of morphed emotional faces operates in a categorical manner. In other words, when presented with stimuli like this…
…and asked to identify the emotion in each picture, people have little problem with the pictures at either end of the spectrum. However, the pictures around the middle tend to get identified as either one emotion or the other, despite being a mathematically precise blend of the two. This kind of categorical perception is found with all kinds of other stimuli and is extremely interesting for all kinds of reasons – in essence it may be one of the most fundamental foundations of many of our cognitive processes.
One of the clearest examples is speech sounds. The English sounds ‘b’ and ‘p’ are similar but quite distinct to English speakers; the Thai language, however, contains a third phoneme which is intermediate between the English ‘b’ and ‘p’. English speakers generally cannot pick out this intermediate sound as a separate category – it tends to be heard as either b or p, each roughly half the time. Thai speakers can of course distinguish the three sounds perfectly. Something similar is true of the English ‘l’ and ‘r’ sounds for many Japanese speakers, who find the two very hard to tell apart. These examples demonstrate that what we are able to perceive is shaped by our early experience with language, and probably by early experience in many other domains as well.
Returning to psychology and faces, since the mid-90s this kind of computer-graphical manipulation of faces has become a highly developed and very useful technique for investigating face perception, generating a huge number of publications (Google Scholar currently lists 13,700 results). Scotland seems to be a particular focus of this kind of research for some reason, with major labs at the University of St. Andrews (run by Dave Perrett, who wins the award for Professor-with-the-craziest-hair at conferences every time) and also in Aberdeen. Image processing of faces has been expanded into a highly sophisticated set of software tools which can do all kinds of funky things to faces, like create averages across a large set, and precisely manipulate particular characteristics of faces like masculinity/femininity, health, perceived age, dominance, or particular emotional expressions. The understanding of how we perceive faces has come a long way in the last 10-15 years, and it’s been driven to a very significant extent by advances in computer software techniques like this.
For those who are interested, there’s a great deal of information on the web about this. The Aberdeen lab runs another website at www.faceresearch.org, which has lots of great information about the technology they use, as well as links to online experiments which you can actually participate in like this one. More examples here from the St. Andrews lab. For a bit of fun, you can upload a picture of yourself (or someone else) and transform it in various ways here, including into the style of famous artists, and there’s also even an associated iPhone app which does the same thing – get that here. There’s also another (non-academic) website called MorphThing which lets you upload pics and do all kinds of funky stuff – honestly, is there nothing you can’t do in a browser window anymore? For the academic reader, a good place to start would be a nice (and brief) review on face perception which has just been published in the Royal Society’s B Journal here.
So that, dear readers, is how software changed, and to a large extent enabled, the modern scientific approach to the study of face perception. In only about 15 years the software went from being a highly niche product concentrated in a couple of special-effects production houses to a ubiquitous system, freely available on the web, and along the way some psychologists got hold of it and a) started doing some very serious science, and b) made themselves some very nice careers out of it too. Who knows what the next iteration of this software will reveal about human perception, or what new techniques are currently being developed? Virtual reality is still pretty clunky and, frankly, unlike actual reality, but techniques are improving all the time, and are already being used in psychology – definitely one to watch.