The paper introduces a novel approach to evaluate the mental health of chatbots, addressing a critical gap in current dialogue system assessments that primarily focus on technical metrics.
Here's a summary of the key aspects:
- The paper establishes mental health assessment dimensions for chatbots, including depression, anxiety, alcohol addiction, and empathy, using questionnaire-based methods adapted from human psychological assessments like PHQ-9, GAD-7, CAGE, and TEQ.
- Experiments on open-domain chatbots like Blender, DialoGPT, Plato, and DialoFlow revealed significant mental health issues, with moderate to severe scores in depression and anxiety, and below-average empathy scores, suggesting a potential negative impact on users.
- The authors suggest that the neglect of mental health risks during dataset building and model training contributes to these issues, advocating for increased attention to mental health in chatbot development to foster positive emotional interactions.