
Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support (2405.09300v1)

Published 15 May 2024 in cs.CL, cs.AI, and cs.HC

Abstract: Background: Rapid advancements in natural language processing have led to the development of large language models (LLMs) with the potential to revolutionize mental health care. These models have shown promise in assisting clinicians and providing support to individuals experiencing various psychological challenges. Objective: This study aims to compare the performance of two LLMs, GPT-4 and Chat-GPT, in responding to a set of 18 psychological prompts, to assess their potential applicability in mental health care settings. Methods: A blind methodology was employed, with a clinical psychologist evaluating the models' responses without knowledge of their origins. The prompts encompassed a diverse range of mental health topics, including depression, anxiety, and trauma, to ensure a comprehensive assessment. Results: The results demonstrated a significant difference in performance between the two models (p < 0.05). GPT-4 achieved an average rating of 8.29 out of 10, while Chat-GPT received an average rating of 6.52. The clinical psychologist's evaluation suggested that GPT-4 was more effective at generating clinically relevant and empathetic responses, thereby providing better support and guidance to potential users. Conclusions: This study contributes to the growing body of literature on the applicability of LLMs in mental health care settings. The findings underscore the importance of continued research and development in the field to optimize these models for clinical use. Further investigation is necessary to understand the specific factors underlying the performance differences between the two models and to explore their generalizability across various populations and mental health conditions.

Author: Birger Moell