SPASM: Stable Persona-driven Agent Simulation for Multi-turn Dialogue Generation

Published 10 Apr 2026 in cs.CL and cs.MA | (2604.09212v1)

Abstract: LLMs are increasingly deployed in multi-turn settings such as tutoring, support, and counseling, where reliability depends on preserving consistent roles, personas, and goals across long horizons. This requirement becomes critical when LLMs are used to generate synthetic dialogues for training and evaluation, since LLM-LLM conversations can accumulate identity-related failures such as persona drift, role confusion, and "echoing", where one agent gradually mirrors its partner. We introduce SPASM (Stable Persona-driven Agent Simulation for Multi-turn dialogue generation), a modular, stability-first framework that decomposes simulation into (i) persona creation via schema sampling, plausibility validation, and natural-language persona crafting, (ii) Client-Responder dialogue generation, and (iii) termination detection for coherent stopping. To improve long-horizon stability without changing model weights, we propose Egocentric Context Projection (ECP): dialogue history is stored in a perspective-agnostic representation and deterministically projected into each agent's egocentric view before generation. Across three LLM backbones (GPT-4o-mini, DeepSeek-V3.2, Qwen-Plus) and nine Client-Responder pairings, we construct a dataset of 4,500 personas and 45,000 conversations (500 personas × 10 conversations per pairing). Ablations show ECP substantially reduces persona drift and, under human validation, eliminates echoing; embedding analyses recover persona structure and reveal strong responder-driven interaction geometry. Our code is available at https://github.com/lhannnn/SPASM.

Authors (2)

Summary

The paper introduces a stability-first framework that mitigates persona drift and echoing in multi-turn dialogue generation by using Egocentric Context Projection.
The method decouples dialogue history from role conditioning through perspective-agnostic storage and agent-centered projection to ensure consistent persona behavior.
Experiments on 45,000 synthetic dialogues show a substantial reduction in persona drift and, under human validation, no echoing in ECP-based simulations.

SPASM: A Stability-First Persona-Driven Simulation Framework for Multi-Turn LLM Dialogue

Problem Motivation and Theoretical Context

Multi-turn dialogue generation with LLMs is critical for applications in tutoring, customer support, counseling, and emotional support, yet these settings are vulnerable to persistent failures: persona drift, role confusion, and echoing, in which one agent gradually mirrors its partner and loses its specified identity. While LLM-LLM dialogue synthesis offers scalable, controllable data generation for benchmarking and alignment, prior architectures have been unstable when required to maintain strict persona and role constraints across extended interaction horizons. Standard template-based context construction and history concatenation do not adequately preserve role semantics; the resulting context misalignments and self-reinforcing feedback loops lead to these identity failures.

Methodology: The SPASM Framework and Egocentric Context Projection

SPASM (Stable Persona-driven Agent Simulation for Multi-turn dialogue generation) is a modular architecture with three principal stages: (1) persona generation, (2) Client-Responder dialogue simulation, and (3) termination detection for coherent stopping. Persona creation itself has three steps: schema sampling over demographic, context, affective, and behavioral axes; plausibility validation, which enforces logical consistency; and persona crafting, which generates a detailed role specification in natural language.
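
The three persona-creation steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the axis names and values in SCHEMA, the single rule in is_plausible, and the string template in craft_persona (a stand-in for natural-language crafting) are hypothetical and are not taken from the paper or its code.

```python
import random

# Hypothetical schema: axis names and values are illustrative only.
SCHEMA = {
    "age_group": ["teen", "adult", "senior"],                         # demographic
    "situation": ["exam stress", "job loss", "retirement planning"],  # context
    "emotion": ["anxious", "frustrated", "hopeful"],                  # affective
    "style": ["terse", "talkative"],                                  # behavioral
}


def sample_schema(rng):
    """Draw one value per axis, independently and uniformly."""
    return {axis: rng.choice(values) for axis, values in SCHEMA.items()}


def is_plausible(attrs):
    """Toy rule-based check that rejects an inconsistent combination."""
    if attrs["age_group"] == "teen" and attrs["situation"] == "retirement planning":
        return False
    return True


def craft_persona(attrs):
    """Template stand-in for natural-language persona crafting."""
    return (
        f"You are in the {attrs['age_group']} age group and are dealing with "
        f"{attrs['situation']}. You feel {attrs['emotion']} and your replies "
        f"are {attrs['style']}."
    )


def generate_persona(rng, max_tries=100):
    """Rejection-sample until a plausible attribute set is found."""
    for _ in range(max_tries):
        attrs = sample_schema(rng)
        if is_plausible(attrs):
            return attrs, craft_persona(attrs)
    raise RuntimeError("no plausible persona found")


attrs, persona_text = generate_persona(random.Random(0))
print(persona_text)
```

Separating sampling from validation keeps the schema easy to extend: a new axis only needs new values and, where relevant, new plausibility rules.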

The central innovation is Egocentric Context Projection (ECP), which decouples dialogue history storage from agent perspective. Instead of conditioning agents on a shared, role-labeled transcript, the framework stores all dialogue content in a perspective-agnostic format and projects it at generation time into each agent's egocentric coordinate system. That is, every message is relabeled deterministically as SELF/PARTNER for the target agent, maintaining content and temporal order but explicitly resolving ambiguity in role interpretation. This history rendering protocol implements strict role-consistent input normalization, preventing misattributions of intent, instruction drift, and feedback-induced behavioral echoing.
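
The projection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Message type, the "client"/"responder" speaker identifiers, and the "LABEL: text" rendering format are hypothetical choices. Only the SELF/PARTNER relabeling and the preservation of content and order come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    speaker: str  # perspective-agnostic id, here "client" or "responder"
    text: str


def project(history, agent):
    """Relabel every message as SELF or PARTNER from `agent`'s point of view.

    Content and temporal order are left unchanged; only the label depends
    on which agent is about to generate.
    """
    return [
        ("SELF" if msg.speaker == agent else "PARTNER", msg.text)
        for msg in history
    ]


def render(history, agent):
    """Turn the projected history into a prompt-ready string."""
    return "\n".join(f"{label}: {text}" for label, text in project(history, agent))


history = [
    Message("client", "I keep failing my statistics quizzes."),
    Message("responder", "Which topics feel hardest right now?"),
    Message("client", "Mostly hypothesis testing."),
]

print(render(history, "client"))
print(render(history, "responder"))
```

Because the stored history carries no egocentric labels, the same list serves both agents, and the projection is a pure function of the history and the target agent.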

Empirical Results: Drift, Echoing, and Semantic Geometry

SPASM's evaluation corpus comprises 45,000 synthetic dialogues generated across nine Client-Responder pairings of three backbones (GPT-4o-mini, DeepSeek-V3.2, Qwen-Plus), with 500 validated personas per pairing (4,500 in total) and 10 conversations per persona. Embedding analysis shows that dialogues grouped by persona form distinct, well-separated clusters. Separation is most pronounced in same-backbone pairings; in cross-backbone pairings it degrades, with higher intra-persona variance but preserved inter-persona distances.
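
The corpus size follows directly from the pairing matrix. The short Python sketch below enumerates the pairings and checks the totals. It assumes the reading that each of the 500 personas in a pairing yields 10 conversations, which matches the reported 4,500 personas and 45,000 conversations; the constant names are illustrative.

```python
from itertools import product

BACKBONES = ["GPT-4o-mini", "DeepSeek-V3.2", "Qwen-Plus"]
PERSONAS_PER_PAIRING = 500
CONVERSATIONS_PER_PERSONA = 10  # assumed reading of "500 personas x 10 conversations"

# Every ordered (client, responder) combination, including same-backbone pairs.
pairings = list(product(BACKBONES, repeat=2))

total_personas = len(pairings) * PERSONAS_PER_PAIRING
total_conversations = total_personas * CONVERSATIONS_PER_PERSONA

print(len(pairings), total_personas, total_conversations)  # prints: 9 4500 45000
```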

ECP consistently reduces persona drift across semantic dimensions, particularly on the emotion and concerns probes, with large effect sizes (e.g., Cohen's d = -0.75 for emotion drift in GPT-4o-mini pairs). Notably, under human validation, echoing is eliminated entirely in ECP-augmented simulations across all backbone pairings, while standard concatenation exhibits echoing in up to 24% of conversations depending on the pairing. Persona retrieval diagnostics confirm that ECP maintains robust persona cues, as evidenced by near-perfect Acc@1 rates in same-backbone settings.
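
For readers unfamiliar with the effect-size measure, the sketch below computes Cohen's d with a pooled standard deviation, which is the common textbook form; whether the paper uses exactly this variant is an assumption. The drift scores are made-up numbers for illustration and are not results from the paper. A negative d means the first group (here, ECP) has lower drift than the second.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical per-conversation drift scores (illustrative values only).
drift_with_ecp = [0.20, 0.30, 0.25, 0.35]
drift_baseline = [0.40, 0.50, 0.45, 0.55]

d = cohens_d(drift_with_ecp, drift_baseline)
print(f"Cohen's d = {d:.2f}")
```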

Additionally, the geometry of interactions reveals that the Responder model backbone dominates the embedding space structure, confirming an asymmetric influence of the dialogue partner on emergent behavioral patterns, and suggesting responder control as a key axis for population-level simulation.

Mechanistic Analysis and Theoretical Implications

The authors articulate and validate several mechanistic hypotheses:

Role-label ambiguity: Perspective-invariant transcripts induce interpretational errors, which ECP resolves by agent-centric labeling.
Post-training alignment bias: RLHF- and SFT-aligned LLMs revert toward assistant-like behaviors, leading to persona drift; ECP reduces the assistant-priming context.
Closed-loop feedback amplification: In multi-agent settings, identity drift in one agent quickly induces reciprocal drift in the partner; ECP's view separation dampens this loop.

This analysis situates the persona drift and echoing problem as fundamentally architectural and context-driven rather than solely attributable to model pretraining or alignment protocols; thus, ECP provides a lightweight, model-agnostic mitigation.

Implications, Limitations, and Future Directions

SPASM treats stability over long horizons as an explicit design target and operational constraint in controllable dialogue data generation. In practice, ECP and modular persona generation offer a template for building agent simulators and synthetic corpora for downstream training, evaluation, and psychosocial safety assessments. The asymmetry between responder and client backbone influence suggests further research into asymmetric role configuration and compositional simulation.

Notable limitations include: (1) the evaluation scope is limited to English, instruction-tuned, two-agent settings; (2) the architecture's efficacy in larger/more heterogeneous agent groups remains underexplored; (3) current schemas, while broad, do not yet match the richness of real-world personality structure.

Promising directions include extension to multi-party settings, adaptive persona enrichment, context-aware persona transitions, and rigorous evaluation in non-English and zero-shot transfer domains. Integration with alignment and debiasing pipelines could further improve robustness for deployment and evaluation tasks.

Conclusion

SPASM provides a stability-first, modular, and scalable solution for long-horizon, persona-driven LLM-LLM dialogue simulation. The key advance, Egocentric Context Projection, offers role-consistent context conditioning that decisively mitigates persona drift and echoing without requiring model retraining. The resulting empirical and analytic contributions establish robust groundwork for future research and deployment in synthetic dialogue generation for evaluation, alignment, and human-agent simulation studies (2604.09212).
