
Is ChatGPT More Empathetic than Humans? (2403.05572v1)

Published 22 Feb 2024 in cs.HC, cs.AI, and cs.CL

Abstract: This paper investigates the empathetic responding capabilities of ChatGPT, particularly its latest iteration, GPT-4, in comparison to human-generated responses to a wide range of emotional scenarios, both positive and negative. We employ a rigorous evaluation methodology, a between-groups study with 600 participants, to evaluate the level of empathy in responses generated by humans and by ChatGPT. ChatGPT is prompted in two distinct ways: a standard approach, and one that explicitly details empathy's cognitive, affective, and compassionate components. Our findings indicate that the average empathy rating of responses generated by ChatGPT exceeds that of human-written responses by approximately 10%. Additionally, instructing ChatGPT to incorporate a clear understanding of empathy makes its responses align approximately 5 times more closely with the expectations of individuals possessing a high degree of empathy, compared to human responses. The proposed evaluation framework is scalable and adaptable, and can be used to assess the empathetic capabilities of newer and updated versions of LLMs without the need to replicate the current study in future research.

Authors: Anuradha Welivita, Pearl Pu