CSConv Dataset: Dual-Dialogue Applications

Updated 18 May 2026

CSConv names two unrelated dialogue corpora: a strategy-aware customer support benchmark and an emotionally sensitive cognitive stimulation corpus.
Both rely on explicit annotation schemes; the customer support corpus additionally uses LLM-driven normalization to structure multi-turn conversations.
Fine-tuning on the companion RoleCS corpus improves lexical, semantic, and strategy-accuracy metrics for current dialogue models on the customer support benchmark.

CSConv Dataset refers to two independent dialogue corpora that share the same acronym but serve distinct research domains: customer support conversation modeling (Zhu et al., 6 Aug 2025), and cognitive stimulation dialogue for elders with cognitive impairment (Jiang et al., 2023). This entry distinguishes both lines of research, providing technical details on their structure, annotation schemes, and utility in evaluating conversational systems.

1. Customer Support Conversation: CSConv (Zhu et al., 6 Aug 2025)

Overview and Task Framing

CSConv is a benchmark for Customer Support Conversation (CSC), addressing the challenge of training service agents—or LLM-based systems—to produce responses that are not only problem-solving but also empathetic and aligned with the COPC guidelines. The formal definition models dialogue as a sequence

$D = \left\{ (P_i, T_i, U_i) \right\}_{i=1}^N$

with $P_i \in \{S, C\}$ (supporter/customer), $T_i$ a strategy label from a set $G$ (defined only when $P_i = S$), and $U_i$ the textual utterance. At each supporter turn $k$, the system receives the context $X_k = \left\{(P_i, T_i, U_i)\right\}_{i=1}^{k-1}$ and must (1) predict the next conversational strategy $T_k \in G$ and (2) generate the supporter response $U_k$ conditioned on $X_k$ and $T_k$.

High-quality support is organized into five stages and twelve strategies, governed by formal COPC standards:

Stage	Description
Connecting	Greeting, rapport building
Identifying	Understanding issue/data/emotion
Exploring	Discussing/evaluating solutions
Resolving	Delivering, confirming resolution
Maintaining	Closing, relationship preservation

Strategy Name	Abbrev.	Core Action
Greeting	GT	Friendly welcome/rapport
Identity Verification	IV	Confirm identity for security
Emotional Management	EM	Express empathy/understanding
Restatement/Paraphrase	RP	Clarify via rephrasing
Problem Refinement	PR	Refine via targeted questioning
Providing Suggestions	PS	Advise next steps/options
Information Delivery	ID	Explain policy/process
Resolution Implementation	RI	Execute concrete resolution
Feedback Request	FR	Inquire about resolution satisfaction
Appreciation & Closure	AC	Thank and close the conversation
Relationship Continuation	RC	Encourage further engagement
Others	—	Out-of-scope actions
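The task definition and strategy inventory above can be sketched as a small data structure. This is an illustrative sketch only: the class name `Turn`, the helper `supporter_examples`, and the sample utterances are assumptions made for this example, not part of the released dataset format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Strategy set G: the abbreviations listed in this entry, plus "Others".
G = {"GT", "IV", "EM", "RP", "PR", "PS", "ID", "RI", "FR", "AC", "RC", "Others"}


@dataclass(frozen=True)
class Turn:
    speaker: str             # P_i: "S" (supporter) or "C" (customer)
    strategy: Optional[str]  # T_i: a label from G on supporter turns, else None
    utterance: str           # U_i

    def __post_init__(self):
        if self.speaker not in {"S", "C"}:
            raise ValueError(f"unknown speaker: {self.speaker!r}")
        if self.speaker == "C" and self.strategy is not None:
            raise ValueError("customer turns carry no strategy label")
        if self.strategy is not None and self.strategy not in G:
            raise ValueError(f"unknown strategy: {self.strategy!r}")


def supporter_examples(dialogue: List[Turn]) -> List[Tuple[List[Turn], Optional[str], str]]:
    """Return one (X_k, T_k, U_k) triple per supporter turn k."""
    examples = []
    for k, turn in enumerate(dialogue):
        if turn.speaker == "S":
            # X_k is every turn strictly before position k.
            examples.append((dialogue[:k], turn.strategy, turn.utterance))
    return examples


# Made-up utterances, for illustration only.
dialogue = [
    Turn("S", "GT", "Hello, how can I help you today?"),
    Turn("C", None, "My order has not arrived."),
    Turn("S", "EM", "I am sorry about the delay, let me check."),
]
examples = supporter_examples(dialogue)
```

Each triple pairs the context with the two prediction targets, so the same structure serves both strategy prediction and response generation.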

2. Construction and Curation Pipeline

Data Acquisition and LLM-Driven Normalization

CSConv originates from 690,000 anonymized, professional transcriptions of Chinese customer service conversations in both pre-sales and post-sales contexts. Pre-filtering discards dialogues outside the 6–60 utterance range, caps utterance length at 500 characters, limits turn imbalance between supporter and customer, requires that customer turns contribute meaningfully, and removes unprofessional content using Qwen2.5-72B.
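The rule-based part of this pre-filter might look like the sketch below. The utterance-count and character limits come from the description above; the balance threshold `max_turn_ratio` is a hypothetical value chosen for illustration, and the LLM-based checks (contribution quality, unprofessional content) are not modeled.

```python
from typing import List, Tuple

Utterance = Tuple[str, str]  # (speaker, text); speaker is "S" or "C"


def passes_prefilter(dialogue: List[Utterance],
                     min_utts: int = 6,
                     max_utts: int = 60,
                     max_chars: int = 500,
                     max_turn_ratio: float = 2.0) -> bool:
    """Rule-based checks only; max_turn_ratio is an assumed threshold."""
    # Keep dialogues with 6 to 60 utterances.
    if not (min_utts <= len(dialogue) <= max_utts):
        return False
    # Reject any utterance longer than 500 characters.
    if any(len(text) > max_chars for _, text in dialogue):
        return False
    # Reject dialogues dominated by one speaker.
    n_supporter = sum(1 for speaker, _ in dialogue if speaker == "S")
    n_customer = len(dialogue) - n_supporter
    if n_supporter == 0 or n_customer == 0:
        return False
    ratio = max(n_supporter, n_customer) / min(n_supporter, n_customer)
    return ratio <= max_turn_ratio


balanced = [("S", "hi"), ("C", "hello")] * 3
too_short = balanced[:4]
too_long_text = balanced[:-1] + [("C", "x" * 501)]
lopsided = [("S", "a")] * 5 + [("C", "b")]
```

Here only `balanced` passes; the other three samples each violate exactly one rule.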

To enforce explicit strategic structure, DeepSeek-R1 is prompted to rewrite sampled dialogues (≤500 per topic) so that each supporter turn is annotated with an inferred strategy $T_i \in G$. Customer responses are optionally refined for coherence, yielding a consistent, strategy-aware corpus.

Post-Processing and Annotation

Dialogues are retained only if they satisfy structural constraints (e.g., ≥10 utterances, inclusion of the GT, IV, and AC strategies, strict speaker alternation). Further LLM checks filter instances for coherence and empathy. Certified experts then manually annotate each supporter turn with one of the twelve strategies and mark stage boundaries according to the COPC-aligned guidelines. No numerical inter-annotator agreement is reported, but all annotators hold domain credentials.
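The structural constraints listed above are simple enough to express as a predicate. The following is a minimal sketch under the assumption that each turn is stored as a `(speaker, strategy, text)` tuple; that layout and the function name `satisfies_structure` are illustrative, and the LLM-based coherence and empathy checks are out of scope.

```python
from typing import List, Optional, Tuple

Turn = Tuple[str, Optional[str], str]  # (speaker, strategy, text)

REQUIRED_STRATEGIES = {"GT", "IV", "AC"}


def satisfies_structure(dialogue: List[Turn], min_utts: int = 10) -> bool:
    # At least 10 utterances.
    if len(dialogue) < min_utts:
        return False
    # Strict alternation: no two consecutive turns by the same speaker.
    speakers = [speaker for speaker, _, _ in dialogue]
    if any(a == b for a, b in zip(speakers, speakers[1:])):
        return False
    # GT, IV, and AC must each label at least one supporter turn.
    used = {strategy for speaker, strategy, _ in dialogue
            if speaker == "S" and strategy}
    return REQUIRED_STRATEGIES <= used


def build(strategies: List[str]) -> List[Turn]:
    """Helper for the examples: one supporter turn per strategy, each answered."""
    turns: List[Turn] = []
    for strategy in strategies:
        turns += [("S", strategy, "..."), ("C", None, "...")]
    return turns


good = build(["GT", "IV", "EM", "PS", "AC"])
missing_iv = build(["GT", "EM", "EM", "PS", "AC"])
not_alternating = good[:-1] + [("S", "RC", "...")]
```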

Corpus Statistics

Statistic	Original	Rewritten
Conversations	1,855	1,855
Total utterances	35,350	50,587
Supporter utterances	17,862	25,810
Customer utterances	17,488	24,777
Avg. supporter utt.	9.63	13.91
Avg. customer utt.	9.43	13.36
Avg. supporter len.	41.16	48.72
Avg. customer len.	21.60	17.17
Strategy-labeled (%)	55.28	97.82
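As a quick consistency check, the per-conversation averages in the table follow directly from the utterance counts and the 1,855 conversations. The short script below only redoes that arithmetic; the function name is illustrative.

```python
CONVERSATIONS = 1855

COUNTS = {
    "original": {"supporter": 17862, "customer": 17488, "total": 35350},
    "rewritten": {"supporter": 25810, "customer": 24777, "total": 50587},
}


def per_conversation_averages(counts, conversations):
    """Return {version: (avg supporter utts, avg customer utts)}, rounded to 2 places."""
    result = {}
    for version, c in counts.items():
        # Supporter and customer utterances must add up to the reported total.
        if c["supporter"] + c["customer"] != c["total"]:
            raise ValueError(f"counts for {version!r} do not sum to the total")
        result[version] = (
            round(c["supporter"] / conversations, 2),
            round(c["customer"] / conversations, 2),
        )
    return result


averages = per_conversation_averages(COUNTS, CONVERSATIONS)
print(averages)
```

The computed values match the table: 9.63 and 9.43 for the original dialogues, 13.91 and 13.36 for the rewritten ones.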

Topic coverage spans eight topical domains plus "Others," each representing roughly 11–16% of dialogues. The most frequent strategies are Information Delivery (14.9%), Emotional Management (11.9%), and Providing Suggestions (10.0%).

3. RoleCS: Synthetic Training Corpus

To overcome the scarcity of strategy-rich customer support interactions for model training, a role-playing LLM framework generates RoleCS, a large-scale synthetic dataset. Five LLM roles are orchestrated by DeepSeek-R1:

Planner selects (topic, persona) pairs, crafting customer goals and context scenarios.
Supporter Assistant recommends the next strategy $T_k$ based on the supporter's dialogue history.
Supporter generates responses following the given strategy.
Customer Assistant directs customer dialogue progression.
Customer replies in alignment with persona and scenario.
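One way to picture how the five roles interact is the loop below. It is a sketch under several assumptions: the supporter speaks first, the Customer Assistant signals termination with the directive `"end"`, and each role is a plain callable (in practice each would wrap an LLM call). None of these details, nor any of the names, are taken from the original framework.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str, str]  # (speaker, strategy or directive, utterance)


def simulate_dialogue(roles: Dict[str, Callable],
                      max_supporter_turns: int = 3) -> List[Turn]:
    scenario = roles["planner"]()  # topic, persona, customer goal
    history: List[Turn] = []
    for _ in range(max_supporter_turns):
        # Supporter Assistant picks the strategy, Supporter realizes it.
        strategy = roles["supporter_assistant"](history)
        history.append(("S", strategy, roles["supporter"](history, strategy)))
        # Customer Assistant steers the customer, or ends the dialogue.
        directive = roles["customer_assistant"](history, scenario)
        if directive == "end":
            break
        reply = roles["customer"](history, scenario, directive)
        history.append(("C", directive, reply))
    return history


# Deterministic stand-ins for the LLM-backed roles (hypothetical behavior).
stub_roles = {
    "planner": lambda: {"topic": "billing", "persona": "impatient", "goal": "refund"},
    "supporter_assistant": lambda history: "GT" if not history else "PS",
    "supporter": lambda history, strategy: f"[{strategy}] supporter reply",
    "customer_assistant": lambda history, scenario: "end" if len(history) >= 5 else "continue",
    "customer": lambda history, scenario, directive: f"customer reply about {scenario['topic']}",
}
dialogue = simulate_dialogue(stub_roles)
```

Separating the assistant roles from the speaking roles keeps strategy selection explicit, which is what makes the generated supporter turns strategy-labeled by construction.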

Customer personas are extracted from 15,980 real dialogues and de-duplicated by cosine embedding similarity, yielding a profile pool of 1,948. After generation and filtering, RoleCS comprises 11,232 dialogues with high strategy diversity.
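De-duplication by cosine similarity can be illustrated with a greedy pass over the persona embeddings. The source does not state the embedding model, the similarity threshold, or the exact procedure, so the threshold of 0.9 and the greedy strategy here are assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat zero vectors as dissimilar to everything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def deduplicate(embeddings: List[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Greedy pass: keep an item only if it is below `threshold` (assumed value)
    in cosine similarity to every item kept so far. Returns kept indices."""
    kept: List[int] = []
    for i, vector in enumerate(embeddings):
        if all(cosine(vector, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy 2-D "embeddings": items 1 and 3 nearly duplicate items 0 and 2.
toy = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.0, 2.0]]
kept_indices = deduplicate(toy)
```

The greedy pass is quadratic in the number of kept items, which is manageable at the scale of roughly 16,000 personas but would call for an approximate nearest-neighbor index on much larger pools.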

Statistic	Value
Dialogues	11,232
Utterances	263,580
Avg. utterances per conv.	23.47
Supporter utterances	137,406
Avg. supporter len.	66.98
Customer utterances	126,174
Avg. customer len.	46.43

4. Benchmarking and Empirical Insights

Fine-tuning SOTA LLMs (LLaMA 3.1, Qwen2.5, DeepSeek) on RoleCS yields significant performance gains on CSConv across lexical (BLEU-n, ROUGE-L), semantic (BERTScore, BLEURT), and strategy-accuracy metrics:

Fine-tuning on RoleCS raises BLEU-2 by up to 5 points, BLEU-4 by up to 2 points, ROUGE-L by about 2 points, and strategy accuracy by 5–6 points.
Qwen2.5-72B, despite being substantially smaller than DeepSeek, achieves competitive metrics.
Performance drops under generated-context (free-running) evaluation, indicating that context drift remains a challenge in multi-turn deployment.

Model (ft)	BLEU-2	BLEU-4	ROUGE-L	ACC
Qwen2.5-72B + RoleCS	12.15	5.32	7.97	43.29
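The ACC column is read here as the percentage of supporter turns whose predicted strategy matches the annotated one. That reading is an assumption, since this entry does not spell out the definition; under it, the metric reduces to a few lines.

```python
from typing import Sequence


def strategy_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Percentage of supporter turns with an exactly matching strategy label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one supporter turn is required")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Made-up labels for illustration: 3 of 5 predictions match.
gold_labels = ["GT", "IV", "EM", "PS", "AC"]
predicted_labels = ["GT", "EM", "EM", "ID", "AC"]
accuracy = strategy_accuracy(predicted_labels, gold_labels)
```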

Human annotators and LLM evaluators (GPT-4o, Qwen-Plus) rate Qwen2.5-72B + RoleCS highest in response quality (3.79/5 from humans; about 91/100 from LLM evaluators), with Fleiss' kappa indicating substantial inter-rater agreement (0.628 among humans, 0.658 between humans and LLMs).
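For readers unfamiliar with the agreement statistic, the standard Fleiss' kappa formula is sketched below. It takes a table with one row per rated item and one column per rating category, where each cell counts the raters who chose that category. Whether the original evaluation used exactly this computation is not stated; the toy tables are invented.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """counts[i][j] = number of raters assigning item i to category j.
    Every item must be rated by the same number of raters (at least 2)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])
    # Share of all ratings that fall into each category.
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(n_categories)]
    # Observed agreement per item, then averaged over items.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance (undefined if all ratings share one category).
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]  # three raters always agree
split = [[2, 1], [1, 2], [2, 1], [1, 2]]    # raters never fully agree
kappa_perfect = fleiss_kappa(perfect)
kappa_split = fleiss_kappa(split)
```

On these toy tables the statistic is 1.0 for perfect agreement and negative for the split case, where observed agreement falls below chance.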

5. Cognitive Stimulation CSConv (Jiang et al., 2023)

Design and Annotation

The CSConv dataset of Jiang et al. (2023) targets research at the intersection of cognitive stimulation therapy and dialogue systems. It consists of 2,643 dialogues (16,845 utterances) for Chinese-speaking elders with cognitive impairment, combining video transcripts (BrainLive project, ~1,800) and hand-scripted sessions (~900), all translated into or authored in Mandarin.

Each utterance carries three layers of labels: (1) a Cognitive Stimulation (CS) principle (7-way), (2) an emotion (8-way), and (3) an emotional support strategy (7-way). Labels are assigned via BERT-based classifiers and iterative manual review.

CS Label	Count	%
None	5,296	31.4
Inquiry	4,156	24.7
Respect	2,134	12.7
Reminiscence	464	2.8
Expression	2,651	15.7
Enjoyment	1,862	11.1
Comfort	281	1.7

Strategy	Count	%
None	7,060	41.9
Question	4,195	24.9
Reflection of Feelings	293	1.7
Self-disclosure	3,022	17.9
Providing Suggestions	262	1.6
Information	819	4.9
Others	1,190	7.1

Utterances average 9.5 tokens. Dialogue scenarios are open-ended, encompassing reminiscence, comfort, chit-chat, and games.

The dataset is not pre-split into training, development, or test sets. Users establish their own partitions for modeling.
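Because no official split exists, a reproducible partition has to be created by the user. A minimal sketch follows, assuming an 80/10/10 ratio and integer dialogue identifiers (both illustrative choices); splitting at the dialogue level keeps all utterances of one conversation in the same partition.

```python
import random
from typing import Dict, Iterable, List, Tuple


def split_dialogues(dialogue_ids: Iterable[int],
                    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),
                    seed: int = 0) -> Dict[str, List[int]]:
    """Shuffle dialogue ids with a fixed seed, then cut into train/dev/test.
    The ratios are assumed defaults, not values from the dataset authors."""
    ids = sorted(dialogue_ids)          # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)    # local RNG, global state untouched
    n_train = int(ratios[0] * len(ids))
    n_dev = int(ratios[1] * len(ids))
    return {
        "train": ids[:n_train],
        "dev": ids[n_train:n_train + n_dev],
        "test": ids[n_train + n_dev:],  # remainder goes to test
    }


splits = split_dialogues(range(2643))
```

Publishing the seed and the resulting id lists alongside any reported results makes comparisons across papers possible despite the missing official split.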

6. Access, Licensing, and Usage

Both datasets are publicly accessible:

CSConv (customer support): https://github.com/aliyun/qwen-dianjin
CSConv (cognitive stimulation): https://github.com/jiangjyjy/CSD

No explicit license or usage restrictions are provided in the original reports; prospective users should consult each repository for up-to-date terms. There are no reported limitations on research or academic usage for either corpus.

7. Significance and Use Cases

The customer support CSConv (Zhu et al., 6 Aug 2025), paired with RoleCS, supports evaluation and fine-tuning of LLMs in strategy-aware dialogue generation, establishing empirical baselines for COPC-guided, empathetic agent systems in Chinese. The cognitive stimulation CSConv (Jiang et al., 2023) is the only large-scale, annotated dataset of its kind for conversational cognitive support, useful for modeling emotionally grounded, therapeutic interactions.

Because two unrelated corpora share the name "CSConv", any use of the acronym should be clarified in context. A plausible implication is that future work should state the task (customer support vs. cognitive stimulation) when referring to a "CSConv" dataset.


References

Zhu et al. (2025). Evaluating, Synthesizing, and Enhancing for Customer Support Conversation.

Jiang et al. (2023). A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment.
