DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization (2505.16995v1)

Published 22 May 2025 in cs.CL

Abstract: Recent advances in Emotional Support Conversation (ESC) have improved emotional support generation by fine-tuning LLMs via Supervised Fine-Tuning (SFT). However, common psychological errors still persist. While Direct Preference Optimization (DPO) shows promise in reducing such errors through pairwise preference learning, its effectiveness in ESC tasks is limited by two key challenges: (1) Entangled data structure: Existing ESC data inherently entangles psychological strategies and response content, making it difficult to construct high-quality preference pairs; and (2) Optimization ambiguity: Applying vanilla DPO to such entangled pairwise data leads to ambiguous training objectives. To address these issues, we introduce Inferential Preference Mining (IPM) to construct high-quality preference data, forming the IPM-PrefDial dataset. Building upon this data, we propose a Decoupled ESC framework inspired by Gross's Extended Process Model of Emotion Regulation, which decomposes the ESC task into two sequential subtasks: strategy planning and empathic response generation. Each was trained via SFT and subsequently enhanced by DPO to align with the psychological preference. Extensive experiments demonstrate that our Decoupled ESC framework outperforms joint optimization baselines, reducing preference bias and improving response quality.


Enhancing Emotional Support Generation through Strategy-Response Decoupling

The paper "DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization" addresses Emotional Support Conversation (ESC), a task in which LLMs generate empathetic dialogue. It observes that models trained only with Supervised Fine-Tuning (SFT) still make common psychological errors, and it proposes a decoupled preference-optimization approach to reduce them.

Key Challenges in Emotional Support Generation

The paper identifies two challenges for ESC. First, existing ESC data entangles psychological strategies with response content, which makes it difficult to construct high-quality preference pairs. Second, applying vanilla Direct Preference Optimization (DPO) to such entangled pairs makes the training objective ambiguous: a rejected sample is penalized as a whole, so a strategy-response pair can be penalized incorrectly, and this negative optimization can degrade model performance.
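For reference, the standard DPO objective is written out below to make the ambiguity concrete. The notation is an assumption introduced here rather than taken from the paper: x is the dialogue context, y_w and y_l are the preferred and rejected outputs, s is a strategy, r is a response, and \pi_ref is the frozen reference model.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]

% If an output concatenates a strategy s and a response r, y = (s, r),
% the autoregressive likelihood factorizes as
\log \pi_\theta(y \mid x) = \log \pi_\theta(s \mid x) + \log \pi_\theta(r \mid x, s)
```

Under this reading, lowering the likelihood of a rejected y_l lowers both terms of the factorization together, so the loss alone cannot tell whether the strategy, the response, or both were at fault. This is one plausible way to interpret the "optimization ambiguity" described above, not a derivation from the paper.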

Proposed Solution: DecoupledESC Framework

The paper proposes the DecoupledESC framework, inspired by Gross's Extended Process Model of Emotion Regulation, which decomposes ESC into two sequential subtasks: psychological strategy planning and empathic response generation. To supply the training signal, the authors introduce Inferential Preference Mining (IPM), which constructs high-quality preference data and yields a new dataset, IPM-PrefDial. Each subtask can then be optimized on its own preference pairs, with the aim of reducing psychological errors and preference bias.
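As a rough illustration of the two sequential subtasks, the sketch below chains a strategy planner and a response generator at inference time. The function names, the prompt layout, the "[strategy]" tag, and the toy stand-in models are all assumptions made for illustration; the paper's actual prompts and models are not described in this summary.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: each "model" is just a function from a prompt to text.
TextModel = Callable[[str], str]


def format_history(history: List[Tuple[str, str]]) -> str:
    """Flatten (speaker, utterance) turns into one prompt string."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in history)


def decoupled_reply(
    history: List[Tuple[str, str]],
    strategy_planner: TextModel,
    response_generator: TextModel,
) -> Tuple[str, str]:
    """Run the two subtasks in sequence: plan a strategy, then respond."""
    context = format_history(history)
    # Subtask 1: strategy planning sees only the dialogue context.
    strategy = strategy_planner(context).strip()
    # Subtask 2: response generation is conditioned on context AND strategy.
    prompt = f"{context}\n[strategy] {strategy}\nsupporter:"
    response = response_generator(prompt).strip()
    return strategy, response


# Toy stand-ins so the sketch runs without any model; real systems would
# call fine-tuned LLMs here. The strategy label is illustrative only.
def toy_planner(context: str) -> str:
    return "Reflection of feelings"


def toy_generator(prompt: str) -> str:
    return "It sounds like you have been feeling overwhelmed lately."


history = [("seeker", "I failed my exam and I can't stop thinking about it.")]
strategy, response = decoupled_reply(history, toy_planner, toy_generator)
print(strategy)
print(response)
```

Keeping the two calls separate means each component can be swapped or retrained independently, which is the property the decoupled framework relies on.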

Methodology and Dataset

Training covers the two subtasks, Strategy Planning (SP) and Response Generation (RG). IPM constructs preference samples by dynamically routing psychological error samples to the appropriate training phase. The SP and RG components are then trained separately: each is first trained with SFT and subsequently refined with DPO to align with psychological preferences.
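The routing idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's IPM procedure: the `ErrorSample` fields, the rule "a wrong strategy goes to SP, otherwise the sample goes to RG", and the prompt format are hypothetical. The `dpo_loss` helper implements the standard per-pair DPO loss from sequence log-probabilities.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


@dataclass
class ErrorSample:
    """Hypothetical record: a model output judged to contain an error."""
    context: str
    gold_strategy: str
    gold_response: str
    pred_strategy: str
    pred_response: str


def route(sample: ErrorSample) -> Tuple[str, PreferencePair]:
    """Assumed routing rule: strategy errors train SP, the rest train RG."""
    if sample.pred_strategy != sample.gold_strategy:
        pair = PreferencePair(sample.context, sample.gold_strategy, sample.pred_strategy)
        return "SP", pair
    # Strategy is correct, so the preference pair contrasts responses only.
    rg_prompt = f"{sample.context}\n[strategy] {sample.gold_strategy}"
    return "RG", PreferencePair(rg_prompt, sample.gold_response, sample.pred_response)


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one pair, given sequence log-probabilities."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)) computed as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


samples = [
    ErrorSample("seeker: I feel alone.", "Question", "What makes you feel that way?",
                "Providing suggestions", "You should go out more."),
    ErrorSample("seeker: I feel alone.", "Question", "What makes you feel that way?",
                "Question", "Why are you like this?"),
]
sp_pairs: List[PreferencePair] = []
rg_pairs: List[PreferencePair] = []
for s in samples:
    stage, pair = route(s)
    (sp_pairs if stage == "SP" else rg_pairs).append(pair)
print(len(sp_pairs), len(rg_pairs))
```

Because each pair differs in only one component (the strategy or the response), the DPO signal for that pair is attributable to a single subtask, which is the motivation for separating the two preference sets.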

Evaluation and Results

Experiments use Qwen and Llama models as backbones and compare the DecoupledESC framework against baselines trained with vanilla SFT and DPO. The reported results show higher strategy prediction accuracy and lower preference bias, along with better ratings for fluency, professionalism, empathy, and helpfulness on a 5-point Likert scale. Responses from the decoupled models showed more empathy and fewer psychological errors than those from the jointly optimized baselines.

Implications and Future Research

The results indicate that the decoupled approach produces more accurate and more empathetic emotional support dialogues. This has practical implications for deploying scalable LLM-based mental health support systems, particularly given the shortage of mental health professionals. The decoupling framework could also be extended to other emotion regulation domains beyond ESC.

Future work could test decoupled optimization on larger models, combine it with other preference optimization methods such as IPO, KTO, and SimPO, and study real-world applications of AI-generated emotional support in collaboration with human psychological experts.

In summary, the paper's contribution is to disentangle psychological strategy from response content in ESC and to optimize each separately, which improves the quality of the empathetic dialogue that LLMs generate.


Authors (6)

Chao Zhang
Xin Shi
Xueqiao Zhang
Yifan Zhu
Yi Yang
Yawei Luo
