Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback (2505.03293v1)

Published 6 May 2025 in cs.CL

Abstract: LLMs have shown promise in providing scalable mental health support, while evaluating their counseling capability remains crucial to ensure both efficacy and safety. Existing evaluations are limited by the static assessment that focuses on knowledge tests, the single perspective that centers on user experience, and the open-loop framework that lacks actionable feedback. To address these issues, we propose Ψ-Arena, an interactive framework for comprehensive assessment and optimization of LLM-based counselors, featuring three key characteristics: (1) Realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients, (2) Tripartite evaluation that integrates assessments from the client, counselor, and supervisor perspectives, and (3) Closed-loop optimization that iteratively improves LLM counselors using diagnostic feedback. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives. Moreover, reflection-based optimization results in up to a 141% improvement in counseling performance. We hope PsychoArena provides a foundational resource for advancing reliable and human-aligned LLM applications in mental healthcare.


Interactive Assessment and Optimization of LLM-based Psychological Counselors

The paper "Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback" addresses crucial challenges in deploying LLMs for psychological counseling. Recognizing the global shortage of mental health professionals, it proposes a structured framework, Ψ-Arena, to enhance both the efficacy and safety of LLMs as psychological counselors.

The research identifies three limitations in existing evaluations of LLM-based counseling systems: a focus on static, knowledge-based assessments; reliance chiefly on user satisfaction metrics; and a lack of feedback mechanisms that guide iterative improvement of the models. To address these, Ψ-Arena introduces a framework with three core components:

1. Realistic Arena Interactions: Simulations are performed with virtual Non-Player Character (NPC) clients whose psychological profiles are built from real-world counseling records. These clients engage in multi-stage dialogues that mirror real counseling stages such as trust-building, diagnosis, and solution exploration.
2. Tripartite Evaluation Metrics: The framework evaluates counselor performance from the client, supervisor, and counselor perspectives, covering dimensions such as emotional experience, professional competence, and reflective awareness. Assessing the same session from all three roles gives a fuller picture of model capabilities than any single stakeholder's view.
3. Closed-loop Optimization: The evaluation results are turned into diagnostic feedback, which the LLM counselor uses for iterative self-reflection to improve its counseling capabilities.

Experiments with eight state-of-the-art LLMs reveal substantial variation in counseling performance across models, scenarios, and evaluation perspectives. Reflection-based optimization driven by the tripartite feedback improves counseling performance by up to 141%, indicating significant potential for practical applications in mental healthcare. The framework aims not only to strengthen the ability of LLMs to provide emotional support but also to align their behavior with clinical standards, addressing concerns about efficacy and ethical compliance.

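
The three components above can be read as one loop: run a simulated session, score it from three perspectives, and feed the weakest result back as guidance for the next session. The following is a minimal sketch of that loop under stated assumptions, not the paper's implementation. Every name in it (`closed_loop`, `toy_session`, `toy_evaluate`, `toy_reflect`, `relative_improvement`) is hypothetical, the toy functions stand in for LLM calls, and the scores are made-up numbers with no relation to the paper's metrics or results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: the paper's actual prompts, metrics, and scales are not reproduced here.
Scores = Dict[str, float]  # perspective name -> score


@dataclass
class Round:
    notes: List[str]  # guidance the counselor carried into this session
    scores: Scores    # tripartite scores for this session


def mean_score(scores: Scores) -> float:
    """Average the per-perspective scores into one number."""
    return sum(scores.values()) / len(scores)


def relative_improvement(before: float, after: float) -> float:
    """Percentage gain of `after` over `before`."""
    if before <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (after - before) / before


def closed_loop(
    run_session: Callable[[List[str]], List[str]],
    evaluate: Callable[[List[str]], Scores],
    reflect: Callable[[List[str], Scores], List[str]],
    rounds: int = 3,
) -> List[Round]:
    """Interact, evaluate from three perspectives, reflect, and repeat."""
    notes: List[str] = []
    history: List[Round] = []
    for _ in range(rounds):
        transcript = run_session(notes)       # arena interaction
        scores = evaluate(transcript)         # client / counselor / supervisor
        history.append(Round(notes, scores))
        notes = reflect(notes, scores)        # diagnostic feedback for the next round
    return history


# Toy stand-ins so the sketch runs without any model calls.
def toy_session(notes: List[str]) -> List[str]:
    return [f"counselor turn applying: {n}" for n in notes] or [
        "counselor turn with no guidance"
    ]


def toy_evaluate(transcript: List[str]) -> Scores:
    applied = sum("applying" in turn for turn in transcript)
    return {
        "client": 1.0 + applied,
        "counselor": 1.0 + applied,
        "supervisor": 1.0 + 0.5 * applied,
    }


def toy_reflect(notes: List[str], scores: Scores) -> List[str]:
    weakest = min(scores, key=scores.get)
    return notes + [f"improve on {weakest} feedback"]


if __name__ == "__main__":
    history = closed_loop(toy_session, toy_evaluate, toy_reflect, rounds=3)
    for i, r in enumerate(history):
        print(i, r.scores, round(mean_score(r.scores), 2))
    gain = relative_improvement(
        mean_score(history[0].scores), mean_score(history[-1].scores)
    )
    print(f"toy relative improvement: {gain:.1f}%")
```

The relative-improvement helper shows how a figure such as "up to 141%" is conventionally computed: a score that rises from 100 to 241 on some scale is a 141% gain over the baseline. Which scores the paper compares, and on what scale, is not specified in this summary.
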
Furthermore, the observed consistency between human expert evaluations and automated assessments supports the reliability of the tripartite evaluation system.

The implications of this research extend beyond the immediate goal of improving counseling efficacy with LLMs. The framework supports more responsible development of AI in healthcare settings, so that psychological support tools can better emulate human empathy and communication skills. Given the complex nature of human emotions, future work may explore making the simulations more realistic, scaling the feedback loops, and addressing ethical considerations in counselor-client interactions.

In conclusion, Ψ-Arena offers a robust methodology for advancing LLMs in psychological contexts, with potential applications in various AI-driven mental healthcare solutions. It represents an important step in using AI technologies to alleviate the shortage of mental health professionals and improve access to psychological care worldwide.


Authors (13)

Shijing Zhu (1 paper)
Zhuang Chen (13 papers)
Guanqun Bi (11 papers)
Binghang Li (1 paper)
Yaxi Deng (1 paper)
Dazhen Wan (6 papers)
Libiao Peng (6 papers)
Xiyao Xiao (6 papers)
Rongsheng Zhang (36 papers)
Tangjie Lv (35 papers)
Zhipeng Hu (38 papers)
Fangfang Li (16 papers)
Minlie Huang (225 papers)
