
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling (2405.16433v3)

Published 26 May 2024 in cs.CL, cs.AI, and cs.CY

Abstract: Using large language models (LLMs) to assist psychological counseling is a significant but challenging task. Attempts have been made to improve empathetic conversation or to use LLMs as effective assistants in treatment. However, existing datasets lack counseling knowledge, which leaves LLMs without professional counseling competence. Moreover, how to automatically evaluate multi-turn dialogues within the counseling process remains understudied. To bridge the gap, we propose CPsyCoun, a report-based multi-turn dialogue reconstruction and evaluation framework for Chinese psychological counseling. To fully exploit psychological counseling reports, we devise a two-phase approach to construct high-quality dialogues and develop a comprehensive benchmark for the automatic evaluation of multi-turn psychological consultations. Competitive experimental results demonstrate the effectiveness of the proposed framework in psychological counseling. We open-source the datasets and model for future research at https://github.com/CAS-SIAT-XinHai/CPsyCoun

Authors (10)
  1. Chenhao Zhang (35 papers)
  2. Renhao Li (6 papers)
  3. Minghuan Tan (15 papers)
  4. Min Yang (239 papers)
  5. Jingwei Zhu (6 papers)
  6. Di Yang (88 papers)
  7. Jiahao Zhao (12 papers)
  8. Guancheng Ye (2 papers)
  9. Chengming Li (28 papers)
  10. Xiping Hu (46 papers)