- The paper demonstrates that static anthropomorphic eyes achieve the highest maze completion rate (70.59%) while animated cues significantly elevate user trust and satisfaction.
- It employs a real-world maze navigation task with 37 participants to objectively compare task performance and signal interpretability across animated and static display modalities.
- The study recommends context-specific design strategies that balance clarity and engagement by integrating static icons for precision with animated elements for improved subjective experience.
Expressiveness and Clarity in Robot Display Design: An Objective and Subjective Analysis
This paper provides a systematic investigation of non-verbal visual display design in collaborative human-robot interaction (HRI), specifically evaluating the relative impact of animated versus static displays and anthropomorphic (eyes) versus symbolic (icons) representations on both objective task outcomes and subjective user experience. Using a real-world maze navigation task with 37 passersby as participants and a consumer-grade robot, the paper establishes several results of direct relevance for robot interface designers and HRI researchers.
Experimental Design and Methodology
Participants collaborated with a robot in a shared environment where information was incomplete for both parties: obstacles were invisible to the human, while hazards were invisible to the robot. The robot’s only means of communication was an on-device screen, which alternated between:
- Animated anthropomorphic eyes
- Animated icons
- Static anthropomorphic eyes
- Static icons
Eight distinct communicative displays were crafted for each modality, representing states such as idle, command affirmation, navigational directions, and reactions to hazards. Objective performance (task completion, correct interpretation of signals) and subjective ratings (trust, satisfaction, understandability) were recorded. To further probe display interpretability, all participants completed a post-task "Interpretation Game," requiring them to identify the meaning of the visual signals encountered during the task.
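For concreteness, the 2x2 design (animated vs. static crossed with eyes vs. icons) and its mapping from communicative states to on-screen assets can be sketched as below. This is an illustrative reconstruction only: the class names, the state subset, and the asset-naming scheme are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Style(Enum):
    EYES = "anthropomorphic eyes"
    ICONS = "symbolic icons"


class Motion(Enum):
    ANIMATED = "animated"
    STATIC = "static"


class Signal(Enum):
    # Illustrative subset of the eight communicative states described above.
    IDLE = "idle"
    AFFIRM = "command affirmation"
    TURN_LEFT = "navigate left"
    TURN_RIGHT = "navigate right"
    HAZARD = "hazard reaction"


@dataclass(frozen=True)
class Condition:
    style: Style
    motion: Motion

    def asset_for(self, signal: Signal) -> str:
        # Resolve a communicative state to a display asset for this condition,
        # e.g. "static_eyes_hazard.png" or "animated_icons_idle.gif".
        ext = "gif" if self.motion is Motion.ANIMATED else "png"
        return f"{self.motion.value}_{self.style.name.lower()}_{signal.name.lower()}.{ext}"


# The four display conditions of the 2x2 between-subjects design.
CONDITIONS = [Condition(style, motion) for style, motion in product(Style, Motion)]

if __name__ == "__main__":
    for condition in CONDITIONS:
        print(condition.motion.value, condition.style.value,
              "->", condition.asset_for(Signal.HAZARD))
```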
Key Results
The paper yields several notable quantitative findings:
- Task Success Rates: Robots with static anthropomorphic eyes enabled the highest maze completion ratio (70.59%), outperforming animated anthropomorphic eyes (68.42%), static icons (56.25%), and animated icons (50.00%).
- Signal Interpretability: Static icons were interpreted most accurately (mean accuracy 1.403 out of 2), followed by animated icons (1.321), animated eyes (0.888), and static eyes (0.839). An ANOVA confirmed that these differences were statistically significant (a minimal analysis sketch follows this list).
- Subjective Experience: Participants working with robots that used animated displays rated trust (M=7.9) and satisfaction (M=8.3) significantly higher than those with static displays (M=6.1 and M=6.9, respectively; both p<0.001).
- Subjective vs. Objective Understanding: The correlation between self-reported understandability of signals and measured interpretation accuracy was consistently low and statistically non-significant across all modalities.
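The reported comparisons (an ANOVA over interpretation accuracy, group differences in trust ratings, and the subjective-objective correlation) could be computed from per-participant data roughly as sketched below. The group labels and synthetic numbers are assumptions standing in for the study's dataset, which is not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic per-participant scores, standing in for the study's data
# (interpretation accuracy on a 0-2 scale, ratings on a 1-10 scale).
accuracy = {
    "static_icons": rng.normal(1.40, 0.3, 9).clip(0, 2),
    "animated_icons": rng.normal(1.32, 0.3, 9).clip(0, 2),
    "animated_eyes": rng.normal(0.89, 0.3, 9).clip(0, 2),
    "static_eyes": rng.normal(0.84, 0.3, 10).clip(0, 2),
}

# One-way ANOVA: does mean interpretation accuracy differ across modalities?
f_stat, p_anova = stats.f_oneway(*accuracy.values())

# Independent-samples t-test: trust ratings, animated vs. static displays.
trust_animated = rng.normal(7.9, 1.0, 18).clip(1, 10)
trust_static = rng.normal(6.1, 1.0, 19).clip(1, 10)
t_stat, p_trust = stats.ttest_ind(trust_animated, trust_static)

# Correlation between self-reported understandability and measured accuracy.
self_rated = rng.normal(7.0, 1.5, 37).clip(1, 10)
measured = rng.normal(1.1, 0.4, 37).clip(0, 2)
r, p_corr = stats.pearsonr(self_rated, measured)

print(f"ANOVA on accuracy: F={f_stat:.2f}, p={p_anova:.3f}")
print(f"Trust (animated vs. static): t={t_stat:.2f}, p={p_trust:.3f}")
print(f"Subjective vs. objective understanding: r={r:.2f}, p={p_corr:.3f}")
```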
Summary Table
| Modality | Task Success (%) | Signal Interpretation (mean accuracy) | Trust Rating | Satisfaction Rating |
| --- | --- | --- | --- | --- |
| Static Eyes | 70.59 | 0.839 | 6.1 | 6.9 |
| Animated Eyes | 68.42 | 0.888 | 7.9 | 8.3 |
| Static Icons | 56.25 | 1.403 | 6.1 | 6.9 |
| Animated Icons | 50.00 | 1.321 | 7.9 | 8.3 |
Theoretical and Practical Implications
The results challenge a simplistic view that maximizing expressiveness or anthropomorphism always enhances HRI outcomes. Notably:
- Animation: While animated displays do increase subjective perceptions of trust and satisfaction, they do not confer a performance advantage in objective task completion; in fact, static cues, specifically static eyes, enable the highest collaborative success. This suggests that dynamic cues may increase "engagement bandwidth" but can simultaneously introduce ambiguity or cognitive load that undermines actionability in time-constrained decision-making.
- Anthropomorphism vs. Iconicity: Anthropomorphic cues (eyes) supported higher task completion than icons in this study, yet the most interpretable signals were the static icons modeled on familiar signage conventions.
- Subjective/Objective Divergence: The persistent misalignment between self-rated and actual interpretability indicates that relying solely on subjective usability or likeability metrics is inadequate for evaluating the real communicative effectiveness of robot displays.
Recommendations for Application
- Contextual Modality Selection: For mission-critical or high-efficiency tasks, neutral and static visual cues, especially those derived from familiar iconography (such as road signs), should be preferred, as these optimize objective performance and reduce the risk of misinterpretation.
- Engagement-Driven Scenarios: When affective engagement, user satisfaction, or trust-building is paramount (e.g., education or long-term companionship), animated and/or anthropomorphic cues have measurable benefits.
- Multimodality: A hybrid interface that combines static, high-clarity icons for core signaling with optional animated or anthropomorphic overlays may balance objective task clarity and subjective user engagement (a minimal controller sketch follows this list).
- Evaluation Practices: Designers and evaluators must incorporate both objective behavioral metrics and subjective feedback, and avoid over-reliance on self-assessment data for evaluating novel communication modalities.
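As one way to realize the hybrid recommendation above, a display controller might always render a high-clarity static icon for the core signal and add an expressive overlay only when engagement matters more than speed. The sketch below is a hypothetical design; its names and parameters are not from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DisplayPlan:
    base_icon: str                   # static, high-clarity core signal
    overlay: Optional[str] = None    # optional expressive layer


def plan_display(signal: str, *, time_critical: bool, engagement_mode: bool) -> DisplayPlan:
    """Pick display layers for one communicative signal.

    Core information is always carried by a static icon; expressive
    (animated/anthropomorphic) overlays are added only when the context
    favors engagement over fast, unambiguous interpretation.
    """
    plan = DisplayPlan(base_icon=f"icons/static/{signal}.png")
    if engagement_mode and not time_critical:
        plan.overlay = f"eyes/animated/{signal}.gif"
    return plan


if __name__ == "__main__":
    # Hazard signals stay icon-only; idle states may gain an expressive overlay.
    print(plan_display("hazard", time_critical=True, engagement_mode=True))
    print(plan_display("idle", time_critical=False, engagement_mode=True))
```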
Limitations and Directions for Future Work
The study is limited by its focus on a single class of robot platform and a confined navigation task. Broader validation will require replication with other robots (e.g., humanoids with body-based communication), different task types (e.g., manipulation, multi-step planning), and more diverse participant cohorts. Additionally, further research could illuminate the cognitive mechanisms behind the increased subjective trust in animation despite its neutral or negative impact on task success.
Outlook
This work advances the discussion about trade-offs in robot communication design, clearly demonstrating that expressiveness, social acceptability, and clarity are distinct axes that may not always be aligned. For robotics researchers and practitioners, these results urge a measured, scenario-specific approach to display design, and highlight the necessity for task-relevant, multimodal evaluation criteria. In the long term, adaptive systems capable of dynamically tailoring communicative modality based on user profile, context, and real-time feedback may provide the greatest utility across diverse collaborative HRI settings.