Therapy as an NLP Task: Psychologists' Comparison of LLMs and Human Peers in CBT
Abstract: LLMs are being used as ad-hoc therapists. Research suggests that LLMs outperform human counselors when generating a single, isolated empathetic response; however, their session-level behavior remains understudied. In this study, we compare the session-level behaviors of human counselors with those of an LLM prompted by a team of peer counselors to deliver single-session Cognitive Behavioral Therapy (CBT). Our three-stage, mixed-methods study involved: a) a year-long ethnography of a text-based support platform where seven counselors iteratively refined CBT prompts through self-counseling and weekly focus groups; b) the manual simulation of human counselor sessions with a CBT-prompted LLM, given the full patient dialogue and contextual notes; and c) session evaluations of both human and LLM sessions by three licensed clinical psychologists using CBT competence measures. Our results show a clear trade-off. Human counselors excel at relational strategies -- small talk, self-disclosure, and culturally situated language -- that lead to higher empathy, collaboration, and deeper user reflection. LLM counselors demonstrate higher procedural adherence to CBT techniques but struggle to sustain collaboration, misread cultural cues, and sometimes produce "deceptive empathy," i.e., formulaic warmth that can inflate users' expectations of genuine human care. Taken together, our findings imply that while LLMs might outperform counselors in generating single empathetic responses, their ability to lead sessions is more limited, highlighting that therapy cannot be reduced to a standalone NLP task. We call for carefully designed human-AI workflows in scalable support: LLMs can scaffold evidence-based techniques, while peers provide relational support. We conclude by mapping concrete design opportunities and ethical guardrails for such hybrid systems.