Emotive Response to a Hybrid-Face Robot and Translation to Consumer Social Robots
(2012.04511v1)
Published 8 Dec 2020 in cs.RO
Abstract: We introduce the conceptual formulation, design, fabrication, control and commercial translation with IoT connection of a hybrid-face social robot and validation of human emotional response to its affective interactions. The hybrid-face robot integrates a 3D printed faceplate and a digital display to simplify conveyance of complex facial movements while providing the impression of three-dimensional depth for natural interaction. We map the space of potential emotions of the robot to specific facial feature parameters and characterise the recognisability of the humanoid hybrid-face robot's archetypal facial expressions. We introduce pupil dilation as an additional degree of freedom for conveyance of emotive states. Human interaction experiments demonstrate the ability to effectively convey emotion from the hybrid-robot face to human observers by mapping their neurophysiological electroencephalography (EEG) response to perceived emotional information and through interviews. Results show main hybrid-face robotic expressions can be discriminated with recognition rates above 80% and invoke human emotive response similar to that of actual human faces as measured by the face-specific N170 event-related potentials in EEG. The hybrid-face robot concept has been modified, implemented, and released in the commercial IoT robotic platform Miko (My Companion), an affective robot with facial and conversational features currently in use for human-robot interaction in children by Emotix Inc. We demonstrate that human EEG responses to Miko emotions are comparative to neurophysiological responses for actual human facial recognition. Finally, interviews show above 90% expression recognition rates in our commercial robot. We conclude that simplified hybrid-face abstraction conveys emotions effectively and enhances human-robot interaction.
The paper "Emotive Response to a Hybrid-Face Robot and Translation to Consumer Social Robots" (Wairagkar et al., 2020) presents a novel approach to robotic facial expression, focusing on a "hybrid-face" design that combines physical and digital elements to convey emotion effectively. It details the development, validation through behavioral and neurophysiological measures, and subsequent translation of these concepts into the commercial Miko social robot platform.
Hybrid-Face Robot: Design, Fabrication, and Control
The core innovation is the hybrid-face concept, designed to mitigate the complexities and potential "uncanny valley" issues associated with fully actuated anthropomorphic robotic faces. This approach merges a static, 3D-printed physical faceplate, providing a sense of three-dimensional presence and contextual framing, with a dynamic digital display (LCD screen) that renders expressive facial features. This combination aims to simplify the mechanical requirements while retaining the capacity for nuanced emotional expression.
The physical component is a 3D-printed visage that serves as a substrate for the digitally rendered features: eyebrows, eyelids, eyeballs (including pupils), and a mouth. The digital rendering was implemented using OpenGL and Face3D software. Facial expressions are controlled by manipulating thirteen defined digital degrees of freedom (DoF):
Eyebrow Angles (Bal, Bar)
Eyebrow Heights (Bhl, Bhr)
Eyelid Openness (Ll, Lr)
Eye Pitch (Ep)
Eye Yaw (Ey)
Pupil Size (P)
Mouth Corner Height (Mc)
Mouth Width (Mw)
Upper Lip Openness (Mt)
Lower Lip Openness (Mb)
These DoF allow for the synthesis of various facial configurations corresponding to different emotional states. To enhance perceived realism, dynamic elements such as periodic eye blinks, subtle facial twitching, and continuous saccade-like eye movements were incorporated into the control scheme.
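A minimal sketch of how such a parameterisation might be represented is shown below. The parameter names follow the DoF listed above, while the neutral defaults, 0-1 ranges, and jitter magnitudes are assumptions for illustration rather than the paper's calibrated values.

```python
from dataclasses import dataclass, asdict
import random

@dataclass
class FaceDoF:
    """The 13 digital DoF; neutral defaults and 0-1 ranges are assumptions."""
    Bal: float = 0.5   # left eyebrow angle
    Bar: float = 0.5   # right eyebrow angle
    Bhl: float = 0.5   # left eyebrow height
    Bhr: float = 0.5   # right eyebrow height
    Ll: float = 1.0    # left eyelid openness
    Lr: float = 1.0    # right eyelid openness
    Ep: float = 0.5    # eye pitch
    Ey: float = 0.5    # eye yaw
    P: float = 0.5     # pupil size
    Mc: float = 0.5    # mouth corner height
    Mw: float = 0.5    # mouth width
    Mt: float = 0.0    # upper lip openness
    Mb: float = 0.0    # lower lip openness

def add_idle_motion(face: FaceDoF, blink: bool = False) -> FaceDoF:
    """Overlay a blink and small saccade-like eye jitter on a target expression."""
    params = asdict(face)
    if blink:
        params["Ll"] = params["Lr"] = 0.0           # momentary eyelid closure
    params["Ey"] += random.uniform(-0.02, 0.02)     # horizontal micro-saccade
    params["Ep"] += random.uniform(-0.02, 0.02)     # vertical micro-saccade
    return FaceDoF(**params)
```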
Two distinct models were developed to map target emotions onto the 13-DoF parameter space (a minimal sketch of both follows this list):
Categorical Affect Space: This model defines eight archetypal emotions (happy, sad, angry, afraid, surprise, tired, stern, disgust). Each expression is generated as a weighted linear combination of deviations from a neutral facial expression, using the archetypal states as a basis set.
Three-Dimensional Affect Space: Drawing inspiration from Breazeal's work, this model positions emotions within a continuous 3D space defined by Valence (positive/negative), Arousal (high/low energy), and Stance (approach/avoid). Specific emotions are represented as points or vectors within this space, enabling the generation of intermediate or blended expressions through linear interpolation along these axes.
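The sketch below illustrates both mappings: a categorical blend of archetypal DoF vectors around a neutral face, and linear interpolation between emotions in the Valence-Arousal-Stance space. The numeric vectors are invented for illustration and are not the paper's calibrated values.

```python
import numpy as np

# 13-element DoF vectors ordered as in the list above
# (Bal, Bar, Bhl, Bhr, Ll, Lr, Ep, Ey, P, Mc, Mw, Mt, Mb); values are illustrative.
NEUTRAL = np.full(13, 0.5)
ARCHETYPES = {
    "happy":    NEUTRAL + np.array([0, 0, .2, .2, 0, 0, 0, 0, .1, .3, .2, .1, 0]),
    "sad":      NEUTRAL + np.array([-.2, -.2, -.1, -.1, -.2, -.2, 0, 0, 0, -.3, -.1, 0, 0]),
    "surprise": NEUTRAL + np.array([0, 0, .3, .3, .2, .2, 0, 0, .2, 0, 0, .3, .3]),
}

def categorical_blend(weights: dict) -> np.ndarray:
    """Categorical affect space: weighted linear combination of
    archetype deviations from the neutral expression."""
    face = NEUTRAL.copy()
    for emotion, w in weights.items():
        face += w * (ARCHETYPES[emotion] - NEUTRAL)
    return face

# Three-dimensional affect space: emotions as points on the
# Valence / Arousal / Stance axes; blends via linear interpolation.
AFFECT_POINTS = {
    "happy":    np.array([ 1.0,  0.5,  0.5]),
    "sad":      np.array([-1.0, -0.5, -0.5]),
    "surprise": np.array([ 0.3,  1.0,  0.2]),
}

def interpolate_affect(a: str, b: str, t: float) -> np.ndarray:
    """Linearly interpolate between two emotions along the V-A-S axes."""
    return (1 - t) * AFFECT_POINTS[a] + t * AFFECT_POINTS[b]
```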
Emotion Conveyance and Behavioural Validation
The effectiveness of the hybrid-face in conveying intended emotions was assessed through forced-choice recognition experiments involving human participants. Participants were presented with static or animated facial expressions corresponding to the eight archetypal emotions and asked to identify the displayed emotion.
The results indicated strong performance for several key expressions, with recognition rates exceeding 80% for happy, sad, and surprise. However, confusion was observed between semantically related or visually similar expressions, such as stern being misidentified as angry or tired, and afraid sometimes being confused with sad. Static expressions generally yielded slightly higher recognition accuracy compared to dynamic transitions from a neutral state.
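A forced-choice study of this kind reduces to a confusion matrix over displayed versus chosen emotions. The short sketch below shows one way such recognition rates could be computed; the emotion set follows the paper, but the analysis code itself is an assumption.

```python
import numpy as np

EMOTIONS = ["happy", "sad", "angry", "afraid", "surprise", "tired", "stern", "disgust"]

def confusion_matrix(shown, chosen):
    """Rows: displayed emotion; columns: participant's forced choice."""
    idx = {e: i for i, e in enumerate(EMOTIONS)}
    m = np.zeros((len(EMOTIONS), len(EMOTIONS)), dtype=int)
    for s, c in zip(shown, chosen):
        m[idx[s], idx[c]] += 1
    return m

def recognition_rates(m):
    """Per-emotion recognition rate: correct choices over presentations.
    Assumes every emotion was shown at least once."""
    return np.diag(m) / m.sum(axis=1)
```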
A notable aspect investigated was the potential role of pupil dilation (the 'P' DoF) as an additional, non-mechanical channel for conveying affective information. Experiments explored correlations between pupil size and perceived emotion, with preliminary findings suggesting potential associations (e.g., larger pupils for happy/surprise, smaller for angry). However, the paper concluded that the contribution of pupil size modulation to overall expression recognizability was inconclusive and warranted further investigation.
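Purely as a hypothetical illustration of how such a pupil channel could be driven, a valence/arousal-to-pupil mapping might look as follows; the linear form and coefficients are assumptions, not the paper's model.

```python
def pupil_size(valence: float, arousal: float) -> float:
    """Map target valence/arousal (each in [-1, 1]) to the pupil-size DoF P.
    Larger pupils for positive or high-arousal states (happy, surprise) and
    smaller for angry, echoing the preliminary association reported; the
    coefficients are illustrative assumptions."""
    p = 0.5 + 0.2 * valence + 0.1 * arousal
    return max(0.0, min(1.0, p))   # clamp to the assumed [0, 1] DoF range
```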
Neurophysiological Validation via EEG and N170 ERP
Beyond behavioral measures, the paper employed Electroencephalography (EEG) to probe the subconscious neural processing elicited by the hybrid-face expressions. Specifically, the focus was on the N170 Event-Related Potential (ERP), a well-established neurophysiological marker robustly associated with the perceptual encoding of faces. The amplitude and latency of the N170 component are known to be sensitive to facial stimuli compared to non-facial objects.
Experiments compared EEG responses when participants viewed emotional expressions displayed:
a) Solely on a flat monitor (digital face only).
b) On the integrated hybrid-face robot (digital face within the 3D printed structure).
Key findings from the EEG analysis include:
Face-Specific Processing: The hybrid-face robot successfully elicited the face-specific N170 ERP, providing strong evidence that the human brain processed the robot's stylized face in a manner analogous to processing human faces.
Emotional Modulation: The pattern of N170 responses to different robotic emotions (e.g., variations in amplitude for strongly valenced vs. neutral expressions) mirrored established findings from studies using human facial stimuli, although the hybrid-face stimuli generally evoked slightly delayed (~7 ms later) and sometimes attenuated responses compared to the monitor-only condition.
Contextual Influence: A significant difference was observed in the N170 amplitude (stronger when viewed on the monitor) and latency (earlier on the monitor) between the monitor-only and the integrated hybrid-face conditions. This suggests that the physical context provided by the 3D faceplate influences the neural processing of the digital facial features, highlighting the importance of embodiment and physical presence in HRI.
These neurophysiological results provide quantitative validation that the hybrid-face design effectively taps into human neural mechanisms dedicated to face perception and emotional interpretation.
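As a rough sketch of how such an N170 measurement is typically obtained from raw EEG, one might epoch the signal around stimulus onsets, baseline-correct, average across trials, and locate the most negative deflection in the 130-200 ms window. The paper does not publish its analysis code, so the electrode choice, epoch window, and sampling rate below are assumptions.

```python
import numpy as np

def n170_peak(eeg, onsets, fs=500, window=(0.13, 0.20), baseline=0.1):
    """Estimate N170 amplitude and latency (s) from one occipito-temporal
    channel (e.g. P8/PO8; channel choice is an assumption).

    eeg:    1-D array of samples from a single electrode
    onsets: sample indices of stimulus onsets
    fs:     sampling rate in Hz (assumed 500 Hz)
    """
    pre, post = int(baseline * fs), int(0.4 * fs)            # -100 ms .. +400 ms epochs
    epochs = np.stack([eeg[o - pre:o + post] for o in onsets])
    epochs = epochs - epochs[:, :pre].mean(axis=1, keepdims=True)  # baseline correction
    erp = epochs.mean(axis=0)                                 # trial-averaged ERP
    lo, hi = pre + int(window[0] * fs), pre + int(window[1] * fs)
    seg = erp[lo:hi]
    i = int(np.argmin(seg))                                   # N170 is a negative peak
    return float(seg[i]), window[0] + i / fs
```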
Commercial Translation: The Miko Robot
Leveraging the insights gained from the hybrid-face research prototype, the core concepts were adapted and translated into Miko, a commercial social robot platform developed by Emotix Inc. and targeted at children. The translation involved significant simplification to meet the constraints of mass production, cost-effectiveness, and user acceptance for a consumer product.
The Miko I robot design represents a more abstracted implementation of the hybrid principle:
Simplified Design: Emotions are primarily conveyed through digitally rendered, expressive eyes displayed on a curved screen, which provides an illusion of depth without a full 3D faceplate.
Contextual Cues: The screen is integrated into a distinct head-like structure featuring non-actuated "ears," providing minimal but sufficient physical context.
User-Centric Adaptation: Design choices were informed by iterative feedback and testing with the target user group (children).
Expression Set: Miko utilizes a simplified set of static emotional expressions, focusing on clarity for key emotions relevant to child interaction.
IoT Integration: Miko incorporates IoT connectivity, enabling cloud-based interaction features, learning capabilities, and remote engagement, significantly expanding its social functionalities.
Validation of the Miko Platform
The effectiveness of the simplified Miko design was also subjected to validation:
Behavioural Validation: User interviews and forced-choice tests were conducted. Interviews reported expression recognition rates exceeding 90%. Formal tests (Table III) showed high accuracy for happy, sad, and angry expressions, comparable to or exceeding the original hybrid-face prototype for these core emotions, despite the increased abstraction. Consistent with the prototype, expressions like stern and disgust remained challenging, likely due to the absence of mouth cues in the Miko I design.
Neurophysiological Validation (EEG): EEG experiments were repeated using the Miko I robot displaying key emotions (happy, sad, angry, surprise) visually, without accompanying audio or light cues. Crucially, the Miko I platform also successfully elicited the face-specific N170 ERP. This confirmed that even the highly simplified and abstracted facial design of the commercial robot effectively engaged the neural correlates of face processing, demonstrating the robustness of the underlying hybrid concept when adapted for practical application.
Practical Implications for Affective HRI
This research provides several important contributions to the design and implementation of affective social robots:
Efficacy of Simplified Abstraction: It demonstrates empirically that simplified, abstract robotic faces, particularly those employing a hybrid physical-digital approach, can effectively convey core emotions. High behavioral recognition rates and, critically, the elicitation of face-specific neural responses (N170) support this conclusion.
Viable Commercial Pathway: The successful translation from the research prototype to the Miko consumer product illustrates a practical approach for incorporating affective capabilities into cost-sensitive commercial robots, bypassing the need for highly complex mechanical actuation.
Quantitative Design Assessment: The use of EEG and the N170 ERP provides a valuable quantitative methodology for objectively evaluating the perceptual and affective impact of different robotic face designs, complementing subjective user reports and behavioral metrics.
Importance of Embodiment Context: The EEG results highlight that the physical embodiment or context in which digital facial features are presented significantly influences neural processing, reinforcing the importance of integrated physical design in HRI.
In conclusion, the paper validates the hybrid-face concept as an effective means for emotional expression in robots, demonstrating its capability to elicit human-like behavioral and neurophysiological responses. The successful adaptation and commercialization in the Miko platform underscore the practical viability of this approach for developing engaging and emotionally expressive social robots for real-world applications.