PsyCLIENT-CP: Chinese Client Profile Dataset
- PsyCLIENT-CP is a comprehensive dataset featuring 120 virtual client profiles across 60 counseling topics with detailed demographic and behavioral data.
- It utilizes structured frameworks like SOAP notes and case conceptualization, with expert triple-review annotation achieving high agreement scores (Cohen’s κ ≈ 0.82).
- The JSON-formatted dataset supports up to 38,880 simulation instances, enabling robust LLM-based research, training, and evaluation in mental health counseling.
PsyCLIENT-CP is an open-source Chinese client profile dataset designed for mental health counseling simulation, introduced as part of the PsyCLIENT framework for client simulation via conversational trajectory modeling (Qiu et al., 12 Jan 2026). It provides a rigorous foundation for generating diverse, behaviorally grounded virtual client agents in LLM-based counselor training and model evaluation contexts, addressing deficits in existing simulation tools concerning authenticity, demographic scope, and behavioral annotation within Chinese-language settings.
1. Dataset Composition and Demographic Scope
PsyCLIENT-CP comprises 120 virtual client profiles distributed over 60 distinct counseling topics. These topics include family conflict, suicidal ideation, intimacy fears, career adjustment, grief, anxiety, depression, interpersonal conflict, life-stage transitions, adjustment disorders, and identity crises, with a mean of two profiles per topic. The demographic attributes of the client profiles span:
- Age range: 18–65 (mean ≈ 35)
- Gender: approximately 50% female, 50% male
- Marital status: single (≈ 40%), married (≈ 35%), divorced or separated (≈ 25%)
- Occupations: student, teacher, office staff, healthcare professionals, among others
- Educational attainment: high-school through doctoral levels
- Income levels: low, medium, high
Each profile presents a detailed characterization including basic demographic information, family and developmental history, presenting problems, symptom history, physical health, personality traits and communication style, interpersonal relationships, social circle, lifestyle, hobbies/aspirations, additional issues, and a timeline of problem development.
| Attribute | Range or Distribution | Example Values |
|---|---|---|
| Age | 18–65 (mean ≈ 35) | 42 |
| Gender | ≈50% female, 50% male | "Female" |
| Marital status | Single 40%, Married 35%, Div/Separated 25% | "Divorced" |
| Occupation | Diverse: student, teacher, etc. | "Teacher" |
| Education | High-school–Doctoral | "Masters" |
| Income | Low–Medium–High | "Medium" |
| Topic | 60 topics, ≈2 profiles/topic | "Parental Grief" |
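The near-uniform topic coverage in the table can be checked with a simple count. The following is a minimal sketch, assuming each profile is a dict with a "topic" key (the key name comes from the profile schema example in this article); the three sample records are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def profiles_per_topic(profiles):
    """Count how many profiles are attached to each counseling topic."""
    return Counter(p["topic"] for p in profiles)


# Hypothetical toy records; the real dataset has 120 profiles over 60 topics.
sample_profiles = [
    {"id": "P001", "topic": "Parental Grief"},
    {"id": "P002", "topic": "Parental Grief"},
    {"id": "P003", "topic": "Career Adjustment"},
]

counts = profiles_per_topic(sample_profiles)
# Mean number of profiles per topic (about 2 for the full dataset).
mean_per_topic = sum(counts.values()) / len(counts)
```

On the full dataset the same function would be expected to return a mean close to 2, matching the reported distribution.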
2. Construction and Annotation Methodology
Profile authoring was conducted by two licensed counselors, employing SOAP notes (Subjective, Objective, Assessment, Plan), detailed intake forms, and case conceptualization frameworks including CBT-based templates. Each profile averages 3,930±2,002 Chinese characters.
Conversational trajectories were extracted from Chinese counseling transcripts (Li et al., 2023), restricted to dialogues of at least 30 turns that are fully anonymized and free of personally identifiable information (PII). Each of the 324 extracted trajectories is annotated at the utterance level by three expert annotators. The annotation acts as an oracle function that maps each client utterance to a non-empty set of labels drawn from a space of 12 atomic behavioral types.
The behavioral label set comprises:
- co (Confirming)
- gi (Giving Information)
- rr (Reasonable Request)
- ex (Extending)
- re (Reformulating)
- ec (Expressing Confusion)
- de (Defending)
- sa (Self-criticism/Hopelessness)
- sh (Shifting Topics)
- st (Focus Disconnection)
- fd (Sarcastic Answer)
- ot (Other)
Multi-label combinations are permitted for any client turn. Quality control entails independent annotation by three experts, retaining only unanimous label assignments. Pilot annotation agreement was measured with Cohen’s κ for annotator pairs (κ ≈ 0.82) and with Fleiss’ κ across the three annotators.
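The unanimity rule and the pairwise agreement statistic can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, and the kappa function treats each item as carrying a single category, whereas the source does not say how multi-label turns were handled when agreement was computed.

```python
from collections import Counter


def unanimous_labels(a, b, c):
    """Return the shared label set if all three annotators agree exactly, else None."""
    return set(a) if set(a) == set(b) == set(c) else None


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items (one category per item)."""
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical annotations for four client turns.
ann1 = ["gi", "co", "de", "gi"]
ann2 = ["gi", "co", "gi", "gi"]
kappa = cohen_kappa(ann1, ann2)
```

A turn labeled {gi, ex} by all three annotators would be kept by `unanimous_labels`, while a turn where one annotator adds or omits a label would be dropped.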
Because profiles and trajectories are independent of each other, any profile can be paired with any trajectory. This combinatorial augmentation yields up to 120 × 324 = 38,880 distinct simulation instances.
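Since any of the 120 profiles can be paired with any of the 324 trajectories, the instance count is the size of their Cartesian product. A short sketch of that arithmetic follows; the identifier formats ("P001"-style strings and consecutive integers) are assumptions modeled on the schema examples, not the dataset's actual numbering.

```python
from itertools import product

N_PROFILES = 120
N_TRAJECTORIES = 324

# Hypothetical identifiers; the real files may number records differently.
profile_ids = [f"P{i:03d}" for i in range(1, N_PROFILES + 1)]
trajectory_ids = list(range(1, N_TRAJECTORIES + 1))

# product() is lazy, so the pairs are never all held in memory at once.
n_instances = sum(1 for _ in product(profile_ids, trajectory_ids))
```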
3. Data Structure and Format
The dataset is formatted in JSON, supporting programmatic access and integration. Profile, trajectory, and simulation instance schemas are specified as follows.
- Profile Schema Example:
```json
{
  "id": "P030",
  "topic": "Parental Grief",
  "demographics": {
    "name": "Li Hua",
    "age": 42,
    ...
  },
  "background": {
    "family_history": "Mother died of illness…",
    "developmental_history": "Eldest child in working‐class family…"
  },
  "presenting_problem": "Overwhelming grief since mother’s death…",
  "symptoms": {
    "sleep": "insomnia",
    "headache": "2–3×/month",
    ...
  },
  "personality": "Verbose",
  "social_circle": "2 close colleagues, 3 college friends…",
  "lifestyle": "teaches weekdays, reads/writes spare time…",
  "issue_timeline": [
    {"age": 41, "event": "Mother diagnosed…"},
    {"age": 42, "event": "Mother passed…"}
  ],
  "communication_style": "Verbose, detailed on teaching…",
  "additional_issues": ["Emotional regulation difficulty"]
}
```
- Trajectory Schema Example:
```json
{
  "traj_id": 105,
  "turns": [
    {"t": 1, "behavior_labels": ["gi"], "client_utterance": ""},
    ...
  ]
}
```
- Instance Structure for Simulation:
```python
from dataclasses import dataclass


@dataclass
class ProfileInstance:
    id: str
    demographics: dict
    topic: str
    behavior_labels_seq: list[set[str]]  # one label set per client turn
    content_constraints: list[str]
    dialogue_history: list[str]
```
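To make the relationship between the three structures concrete, the sketch below pairs one profile with one trajectory and fills the instance fields listed above. It is an illustration only: `make_instance` and the composite "P030-T105" identifier are hypothetical, and `content_constraints` is left empty because the source does not say how it is populated.

```python
def make_instance(profile, trajectory):
    """Pair one profile with one trajectory into a simulation instance dict."""
    return {
        # Hypothetical composite id built from the two source ids.
        "id": f'{profile["id"]}-T{trajectory["traj_id"]}',
        "demographics": profile["demographics"],
        "topic": profile["topic"],
        # One label set per client turn, in dialogue order.
        "behavior_labels_seq": [set(t["behavior_labels"]) for t in trajectory["turns"]],
        "content_constraints": [],  # assumption: not specified in the source
        "dialogue_history": [],     # filled in as the simulation proceeds
    }


# Toy records shaped like the schema examples.
profile = {
    "id": "P030",
    "topic": "Parental Grief",
    "demographics": {"name": "Li Hua", "age": 42},
}
trajectory = {
    "traj_id": 105,
    "turns": [
        {"t": 1, "behavior_labels": ["gi"], "client_utterance": ""},
        {"t": 2, "behavior_labels": ["gi", "ex"], "client_utterance": ""},
    ],
}
instance = make_instance(profile, trajectory)
```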
4. Annotation Guidelines and Dataset Quality
Annotation procedures are governed by formal definitions, provided with illustrative examples for all 12 behavioral categories. Annotators are instructed to assign at least one behavior label to every client turn and never to leave a turn unlabeled. The guidelines mandate strict adherence to the defined categories. Each annotation undergoes expert triple review, with only unanimous decisions included in the public dataset. Inter-annotator agreement is reported with Cohen’s κ (≈ 0.82 for annotator pairs) and Fleiss’ κ, as noted above.
Preprocessing steps involve discarding turns shorter than a set threshold and those lacking sufficient context, as well as sweeping for and removing any remnant personal identifiers.
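The labeling constraint and the length filter can be expressed as a small validation pass. The sketch below is an assumption-laden illustration: the label codes are copied from the list in Section 2, `MIN_CHARS` is a made-up threshold (the source does not state the value), and the context and PII checks are omitted because the source gives no rule for them.

```python
BEHAVIOR_LABELS = {"co", "gi", "rr", "ex", "re", "ec", "de", "sa", "sh", "st", "fd", "ot"}
MIN_CHARS = 3  # hypothetical threshold; the source does not state the value


def is_valid_turn(turn, min_chars=MIN_CHARS):
    """Check that a turn has at least one known label and is long enough."""
    labels = set(turn["behavior_labels"])
    has_labels = bool(labels) and labels <= BEHAVIOR_LABELS
    long_enough = len(turn["client_utterance"].strip()) >= min_chars
    return has_labels and long_enough


def clean_trajectory(traj):
    """Return a copy of the trajectory that keeps only valid turns."""
    kept = [t for t in traj["turns"] if is_valid_turn(t)]
    return {**traj, "turns": kept}


# Toy trajectory with placeholder English utterances.
raw = {
    "traj_id": 1,
    "turns": [
        {"t": 1, "behavior_labels": ["gi"], "client_utterance": "I have not slept well lately."},
        {"t": 2, "behavior_labels": ["co"], "client_utterance": "Mm"},          # too short
        {"t": 3, "behavior_labels": [], "client_utterance": "I do not know."},  # unlabeled
        {"t": 4, "behavior_labels": ["xx"], "client_utterance": "Maybe so."},   # unknown code
    ],
}
cleaned = clean_trajectory(raw)
```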
5. Dataset Access, Integration, and Simulation Workflow
Profiles and trajectories are available from GitHub (https://github.com/qiuhuachuan/PsyCLIENT) as profiles.json and trajectories.json. They can be loaded with standard Python data-handling tools; the example below uses the HuggingFace datasets interface:
```python
from datasets import load_dataset

# Load the two splits by name.
ds_profiles = load_dataset("psyclient-cp", split="profiles")
ds_trajs = load_dataset("psyclient-cp", split="trajectories")

profile = ds_profiles[0]  # first profile record
traj = ds_trajs[12]       # trajectory at index 12
```
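The downloaded files can also be read with the standard library alone. The sketch below assumes that each file holds a top-level JSON array of records keyed by "id" (profiles) or "traj_id" (trajectories), as the schema examples suggest; the actual file layout in the repository may differ, and both helper names are hypothetical.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a JSON file that is assumed to hold a list of records."""
    with Path(path).open(encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError(f"expected a JSON array in {path}")
    return records


def index_by(records, key):
    """Build a lookup table from a record field to the record itself."""
    return {r[key]: r for r in records}


# Usage, once the files have been downloaded next to the script:
# profiles = index_by(load_records("profiles.json"), "id")
# trajectories = index_by(load_records("trajectories.json"), "traj_id")
```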
For simulation, profiles and trajectories are composed in prompt templates for LLMs. The simulation.py script provides exemplars:
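```python
from openai import OpenAI


def build_prompt(profile, traj, turn_idx):
    """Compose the system prompt for the client turn at turn_idx."""
    p_info = profile["demographics"]
    beh = ",".join(traj["turns"][turn_idx]["behavior_labels"])
    # Join the utterance text of earlier turns; the turn dicts themselves cannot be joined.
    history = "\n".join(t["client_utterance"] for t in traj["turns"][:turn_idx])
    return f"""
You are a client…
Personal Profile: {p_info}
Conversation Rules: …
Next Behavior: {beh}
History: {history}"""


def simulate_turn(prompt):
    """Request one simulated client utterance from the chat model."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    # openai>=1.0 interface; openai.ChatCompletion.create was removed in 1.0.
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": prompt}],
        temperature=0.7,
    )
    return resp.choices[0].message.content
```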
A plausible implication is that the modular design allows researchers to instantiate a wide variety of client agents by recombining independently authored profiles and annotated trajectories.
6. Evaluation Metrics and Utility
PsyCLIENT-CP evaluations encompass:
- Profile Realism & Diversity: Coverage across age, gender, marital status; uniformity in topic-profile distribution (≈2 profiles per topic); lexical diversity (vocabulary size and average sentence length).
- Behavioral Coverage: Exhaustive representation of all 12 behavioral labels in profiles and trajectories; multi-label correlation analysis via Jaccard index matrices over dialogue turns.
- Expert-Rated Authenticity: 7-point Likert ratings for fluency, emotional expressiveness, coherence, appropriateness, overall authenticity.
- Discrimination Tasks: Expert confusion rate approaches 95% under the PsyCLIENT methodology; LLM-based automatic detection accuracy near chance level.
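The multi-label correlation analysis mentioned under Behavioral Coverage can be illustrated with a pairwise Jaccard computation. This is a sketch of one plausible formulation, in which each label is represented by the set of turn indices where it occurs; the source does not specify the exact definition used, and the toy label sequence is made up.

```python
from itertools import combinations


def label_jaccard(turn_label_sets, labels):
    """Jaccard index for every label pair, based on the turns in which each label occurs."""
    occurs = {
        lab: {i for i, s in enumerate(turn_label_sets) if lab in s}
        for lab in labels
    }
    matrix = {}
    for a, b in combinations(labels, 2):
        union = occurs[a] | occurs[b]
        # Define the index as 0.0 when neither label occurs at all.
        matrix[(a, b)] = len(occurs[a] & occurs[b]) / len(union) if union else 0.0
    return matrix


# Toy sequence of per-turn label sets.
turns = [{"gi"}, {"gi", "ex"}, {"ex"}, {"de"}]
jaccard = label_jaccard(turns, ["gi", "ex", "de"])
```

Here "gi" and "ex" share one of the three turns in which either occurs, so their index is 1/3, while "de" never co-occurs with the others and scores 0.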
The dataset enables the generation of up to 38,880 distinct simulation instances (profiles × trajectories), supporting both research and practical training settings.
7. Research Implications and Applications
PsyCLIENT-CP provides a resource for simulating complex and realistic client interactions in Chinese-language mental health counseling. It functions as both a research benchmark and a practical training tool for novice counselors and automated system evaluators. This suggests significant utility in mental health education and empirical study design requiring controlled and diverse client agent populations.
The framework’s principled approach to trajectory modeling, rigorous behavioral annotation, and extensive demographic normalization offer new avenues for cross-lingual and cross-cultural simulation studies. As an open-source resource with ready-to-use integration pipelines and expert-reviewed metadata, it is positioned to facilitate future advancements in conversational AI, clinical training, and automated counselor evaluation (Qiu et al., 12 Jan 2026).