Phase 3: Emotion and the Uncanny Valley

Exploring incongruent emotional expressions

In my last phase, I wanted to take my investigation of the uncanny valley effect in a different direction. I had been reading a lot of research on how categorical perception and congruity affect the way people perceive near-human agents, and I had become interested in how we convey and understand emotion through facial expressions. One particularly fascinating study was Tinwell, Nabi and Charlton’s (2013) work linking perceptions of psychopathy in humans with certain patterns of emotional expressiveness in different parts of the face. Their research presented participants with a video of a near-human agent reacting to a startling sound. Agents were perceived as eerier when they showed an emotional response only in the lower part of their face, while the upper part remained static. I had also observed that many of the eerier near-human agents in my own research had exaggerated or distorted eyes, and that many of the eerier computer game characters had eyes whose expressions were blank or unconvincing. I wondered whether I could demonstrate this experimentally by presenting incongruent expressions in the eye region and the rest of the face: would very happy, angry, disgusted, frightened or sad faces with ‘dead’ eyes really be rated as the most eerie?

To do this, I created a suite of images from photographs of volunteers who had been trained to pose emotional expressions. For each model, I swapped the eye region into the base face for different combinations of emotions. A table of all the different combinations is shown below.
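As a rough illustration of how a set of eye-region and base-face pairings like this can be enumerated, here is a minimal Python sketch. The emotion labels, the function name `eye_face_combinations` and the full crossing of every emotion with every other are assumptions made for the example. They are not the actual stimulus set or the numbering scheme used in the study.

```python
from itertools import product

# Hypothetical emotion labels, chosen for illustration only;
# the study's actual stimulus set and coding scheme may differ.
EMOTIONS = ["neutral", "happy", "angry", "disgusted", "frightened", "sad"]


def eye_face_combinations(emotions):
    """Pair every base-face emotion with every eye-region emotion.

    Each entry records whether the eyes match the rest of the face.
    """
    return [
        {"face": face, "eyes": eyes, "congruent": face == eyes}
        for face, eyes in product(emotions, repeat=2)
    ]


combos = eye_face_combinations(EMOTIONS)
mismatched = [c for c in combos if not c["congruent"]]
print(len(combos), "combinations,", len(mismatched), "mismatched")
```

Flagging each pairing as congruent or not makes it straightforward to compare ratings for matched and mismatched faces later on.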

I had four broad areas of research to explore:

Would particular combinations of mismatch turn out to be eerie after all?
If I measured the level of fear, anger, sadness or disgust people felt when looking at each of the combinations, which part of the face would drive the response?
Would people be able to identify the emotions in the faces?
Would people be able to accurately classify the mismatched faces as displaying positive, negative or neutral expressions?

The results of two of those explorations will be detailed here. Further analyses are being prepared for publication from 2015.

Eeriest mismatches

I had expected that the eeriest faces would be 4 and 6A-6S in the middle row, showing strong emotional faces with blank eyes. This would be consistent with Tinwell et al.’s hypothesis and with my own observations of eerie ‘dead-eyed’ near-human agents. However, this was not the case. The eeriest faces were actually 7A and 7F: the very happy faces with angry or frightened eyes.

Emotional responses

I found that the emotions reported by participants were strongest for the faces where the expressions were not blended and were clearly and unambiguously presented. A full visualisation of all the results is shown below:

What next?

The aim of my thesis was to explore the uncanny valley effect (UVE), and across the three phases I have been able to contribute new findings in this area. However, there are certainly other areas for research within this discipline that are still to be explored. Firstly, there has been relatively little research systematically exploring the UVE in embodied agents. While virtual agents and images are now relatively well understood, studies of artificial agents in the real world have tended to be small in scale, or carried out as case studies testing a particular aspect of an individual android. One intriguing possibility for further research would be to look at the highly realistic ‘reborn’ dolls now available, which mimic the appearance of human babies.

The second suggestion for further research would be to look for evidence of cultural differences in the perception of mismatched expressions, to explore whether they are universally eerie or whether there is a cultural influence on how easily a sense of the uncanny can be elicited when happy faces are paired with fearful or angry eyes. Evidence for universal eeriness would support the idea that there may be an evolutionary component to the uncanny valley effect. If culturally specific components were found, it may be that exposure to particular types of face image is responsible for triggering or flattening the effect.

Finally, this research is concluding at an exciting time in the world of virtual humans. Rapid advances in computer technology mean that encounters with virtual agents are now commonplace, and it is becoming increasingly difficult to distinguish digital actors from real ones. Photorealistic actors such as Digital Ira are rendered at high levels of detail and can present a growing repertoire of convincing emotional expressions, and may in future be used to act alongside or even replace conventional actors in games and films. At the moment these actors are limited to short set pieces; the real challenge will be in making them portray convincing and emotionally engaging narratives in film, and even more so when they are the subjects of interaction in games. This challenge may present opportunities to develop the work in this thesis, moving from mismatched emotional expressions in static images to animated faces, virtual agents, or even beyond the screen to embodied agents. In addition, these initial findings about the perceptual mechanisms for near-human faces could well be extended to virtual agents as they become more sophisticated, and could make important contributions to the ability to present convincing and relatable characters in advanced computer graphics.

Digital Ira, like many convincing virtual actors, was produced through motion capture of a real actor, whose appearance and emotion were then digitally recreated so as to allow realistic face movements and expressions to be reproduced. This technique has also been used to create virtual versions of well-known figures, often with unsettling or amusing results when the digital recreations are less than completely realistic. This suggests a future study of whether our familiarity with an actor makes us more sensitive to the flaws in their digital double. The digital recreation of an actor who is already well known presents a double challenge for the designer: they not only have to design a realistically human appearance but also need to reproduce the minute gestures and mannerisms that the audience will expect from the original. To date, no research has been published on the relationship between digital doubles and eeriness, but it certainly seems a promising area for future enquiry, as it would allow an exploration of the factors that influence the acceptability of different types of near-human agent.

Reference:

Tinwell, A., Nabi, D. A., & Charlton, J. P. (2013). Perception of psychopathy and the Uncanny Valley in virtual characters. Computers in Human Behavior, 29(4), 1617-1625.