
Mental Health Assessment for the Chatbots (2201.05382v1)

Published 14 Jan 2022 in cs.CL and cs.HC

Abstract: Previous research on dialogue system assessment usually focuses on the quality of responses generated by chatbots (e.g., fluency, relevance), which are local, technical metrics. For a chatbot that responds to millions of online users, including minors, we argue that it should exhibit a healthy mental tendency in order to avoid negative psychological impact on them. In this paper, we establish several mental health assessment dimensions for chatbots (depression, anxiety, alcohol addiction, empathy) and introduce questionnaire-based mental health assessment methods. We assess several well-known open-domain chatbots and find severe mental health issues in all of them. We attribute this to the neglect of mental health risks during dataset construction and model training. We hope to draw researchers' attention to the serious mental health problems of chatbots and to improve chatbots' ability to engage in positive emotional interaction.

The paper introduces a novel approach to evaluating the mental health of chatbots, addressing a gap in current dialogue system assessments, which focus primarily on technical response-quality metrics.

Here's a summary of the key aspects:

  • The paper establishes mental health assessment dimensions for chatbots (depression, anxiety, alcohol addiction, and empathy) and evaluates them with questionnaire-based methods adapted from human psychological instruments such as PHQ-9, GAD-7, CAGE, and TEQ (a minimal sketch of this procedure follows the list).
  • Experiments on open-domain chatbots like Blender, DialoGPT, Plato, and DialoFlow revealed significant mental health issues, with moderate to severe scores in depression and anxiety, and below-average empathy scores, suggesting a potential negative impact on users.
  • The authors suggest that the neglect of mental health risks during dataset building and model training contributes to these issues, advocating for increased attention to mental health in chatbot development to foster positive emotional interactions.
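
Below is a minimal sketch of how such a questionnaire-based assessment might be implemented. The two example items and the severity bands come from the standard PHQ-9; the keyword heuristic in `score_reply`, the `assess` helper, and the stand-in chatbot are hypothetical illustrations, not the paper's actual answer-mapping procedure.

```python
# Hypothetical sketch of a questionnaire-based mental health assessment
# for a chatbot, in the spirit of the paper's method. Item wording is
# adapted from the standard PHQ-9; the keyword heuristic in score_reply()
# is a simplified stand-in for the paper's answer mapping.

PHQ9_ITEMS = [
    "Do you have little interest or pleasure in doing things?",
    "Do you feel down, depressed, or hopeless?",
    # ... the remaining seven PHQ-9 items would go here ...
]

# Standard PHQ-9 severity bands over the total score (0-27).
SEVERITY_BANDS = [(0, "minimal"), (5, "mild"), (10, "moderate"),
                  (15, "moderately severe"), (20, "severe")]


def score_reply(reply: str) -> int:
    """Map a free-text chatbot reply onto the PHQ-9 Likert scale (0-3).

    A crude keyword heuristic, for illustration only.
    """
    reply = reply.lower()
    if any(phrase in reply for phrase in ("not at all", "never")):
        return 0
    if any(phrase in reply for phrase in ("sometimes", "several days")):
        return 1
    if any(phrase in reply for phrase in ("often", "more than half")):
        return 2
    return 3  # treat strongly affirmative replies as "nearly every day"


def assess(chatbot_reply) -> tuple[int, str]:
    """Pose each item to the chatbot as a dialogue turn, sum the item
    scores, and map the total onto the severity bands."""
    total = sum(score_reply(chatbot_reply(item)) for item in PHQ9_ITEMS)
    label = SEVERITY_BANDS[0][1]
    for threshold, name in SEVERITY_BANDS:
        if total >= threshold:
            label = name
    return total, label


# Usage with a trivial stand-in "chatbot" that always answers affirmatively:
if __name__ == "__main__":
    total, label = assess(lambda item: "I feel that way nearly every day.")
    print(total, label)  # -> 6 mild (only two of the nine items included)
```

The same pattern would extend to the other instruments the paper uses (GAD-7, CAGE, TEQ) by swapping in their items and scoring rules.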
Authors (5)
  1. Yong Shan (7 papers)
  2. Jinchao Zhang (49 papers)
  3. Zekang Li (13 papers)
  4. Yang Feng (230 papers)
  5. Jie Zhou (687 papers)
Citations (2)