Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors (2308.11020v2)
Abstract: This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach based on multimodal user behaviors. Our main focus is on assessing the robot's human-likeness as the primary evaluation metric. Whereas previous research has often relied on subjective user evaluations, our approach evaluates the robot's human-likeness indirectly, through observable user behaviors, thereby improving objectivity and reproducibility. We first created a dataset annotated with human-likeness scores, drawing on user behaviors found in an attentive listening dialogue corpus. We then analyzed the correlation between multimodal user behaviors and human-likeness scores, demonstrating the feasibility of the proposed behavior-based evaluation method.
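The core of the proposed evaluation can be illustrated with a minimal sketch: compute the correlation between a per-dialogue multimodal behavior feature and the annotated human-likeness score. The feature name (backchannel rate) and all data values below are illustrative assumptions, not figures from the paper.

```python
# Hypothetical sketch of behavior-based evaluation: correlate a
# per-dialogue user-behavior feature with annotated human-likeness
# scores. Feature choice and data are illustrative only.
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Illustrative per-dialogue data (not from the paper):
backchannels_per_min = [4.0, 6.5, 2.1, 7.8, 5.2]  # user backchannel rate
human_likeness_score = [3.2, 4.1, 2.5, 4.6, 3.8]  # annotated 1-5 score

r = pearson(backchannels_per_min, human_likeness_score)
print(f"correlation between backchannel rate and human-likeness: {r:.2f}")
```

A positive correlation of a behavior feature with the annotated scores is what would justify using that behavior as an indirect, objective proxy for human-likeness.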