CHORUS Framework: Synthetic Deliberation Simulation
- CHORUS Framework is an agent-driven simulation system that produces synthetic, temporally realistic deliberation data by leveraging LLM actors with distinct personas and memory modules.
- It employs calibrated Poisson processes to mimic bursty online interactions, ensuring that the timing of posts and actions closely reflects real-world discussion dynamics.
- The framework integrates a structured tool suite, including web search for expert actors, to support platform benchmarking, policy evaluation, and NLP pipeline development.
CHORUS (CHaracter-driven Orchestrated Response User Simulation) is an agentic simulation framework for generating large-scale, temporally realistic synthetic deliberation data on interactive web platforms. By orchestrating multiple LLM-powered actors—each grounded in a distinct, behaviorally consistent persona, endowed with contextual memory, and controlled via a stochastic temporal engagement model—CHORUS produces discussions that closely approximate the diversity, burstiness, and conversational interaction patterns of real online deliberation. The system is designed to address pressing challenges in data accessibility, privacy, and methodological rigor that afflict empirical research on online discourse and to furnish high-quality synthetic data for platform analysis and NLP pipeline development (Koursaris et al., 22 Apr 2026).
1. Motivation and Problem Scope
CHORUS addresses three core bottlenecks in computational analysis of online discussion:
- Data scarcity: Empirical deliberation data are restricted by platform access, stringent privacy policies, and inconsistent annotation or quality, hampering both social science and NLP research.
- Lack of temporal realism: Prior simulation frameworks inadequately capture the heterogeneous, bursty participation patterns observed in real user populations, often imposing uniform or synchronous activity schedules.
- Persona drift and coherence: Without explicit persona grounding and memory, LLM-based multi-actor simulations exhibit behavioral collapse or incoherence over multi-turn and multi-party interactions, undermining realism.
The framework is built to produce synthetic debates that preserve argumentation diversity, fine-grained engagement timing, and long-range persona coherence.
2. Architecture and Actor Model
CHORUS instantiates a population of actor agents, each constructed as an independent LLM-driven agent with the following modules:
- Persona Module: Each actor is assigned a detailed persona specification encompassing biographical context, communication style, core beliefs, and topical engagement parameters. Persona prompts are provided to the LLM at every event, constraining generation to selected archetypes (e.g., Casual User, Expert, Advocate, Skeptic).
- Memory Module: Each actor maintains two local histories: a post history (all prior posts by that actor) and an action history (all of its voting actions). In addition to their own histories, actors access the global discussion history, enabling context-sensitive generation and voting.
- Timing Controller: The scheduled timings of “post” and “action” events are governed by independent Poisson processes per actor.
- Tool Interface: Exposes structured actions: publishing, voting, context retrieval, and, for experts, evidence lookup via web search.
Actor scheduling and event execution are coordinated by a global priority queue. At each step, the orchestrator pops the earliest pending event (post or action) across all actors, executes it, updates that actor's memory, resamples the actor's next event time, and reinserts the new event into the queue (Koursaris et al., 22 Apr 2026).
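The scheduling cycle described above can be sketched as a small discrete-event loop. This is a minimal illustration, not the authors' implementation: the `Actor` class, the `run_simulation` function, and the event dictionaries are hypothetical names, and the LLM call is replaced by a placeholder record.

```python
import heapq
import random
from dataclasses import dataclass, field


@dataclass
class Actor:
    """Hypothetical actor record: two Poisson rates plus actor-local histories."""
    name: str
    post_rate: float      # rate of "post" events
    action_rate: float    # rate of "action" (voting) events
    posts: list = field(default_factory=list)
    actions: list = field(default_factory=list)


def run_simulation(actors, horizon, rng):
    """Execute events in time order until `horizon`; return the global history."""
    queue = []    # heap entries: (event_time, tie_breaker, actor_index, kind)
    counter = 0   # tie-breaker so equal times never compare further fields

    def schedule(now, idx, kind):
        nonlocal counter
        actor = actors[idx]
        rate = actor.post_rate if kind == "post" else actor.action_rate
        # Exponential inter-event gap for a Poisson process with this rate.
        heapq.heappush(queue, (now + rng.expovariate(rate), counter, idx, kind))
        counter += 1

    for idx in range(len(actors)):
        schedule(0.0, idx, "post")
        schedule(0.0, idx, "action")

    history = []
    while queue:
        t, _, idx, kind = heapq.heappop(queue)
        if t > horizon:
            break  # the heap is time-ordered, so nothing earlier remains
        actor = actors[idx]
        # Placeholder for the persona-conditioned LLM call and tool use.
        event = {"time": t, "actor": actor.name, "kind": kind}
        history.append(event)                                    # global history
        (actor.posts if kind == "post" else actor.actions).append(event)
        schedule(t, idx, kind)                                   # resample, reinsert
    return history


if __name__ == "__main__":
    demo = [Actor("expert", 0.4, 0.6), Actor("casual", 1.0, 1.4)]
    log = run_simulation(demo, horizon=20.0, rng=random.Random(0))
    print(len(log), "events executed")
```

Keeping one pending event per actor and event type in the heap means the cost of finding the next event grows only logarithmically with the number of actors.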
3. Temporal Engagement and Decision Modeling
Temporal realism in CHORUS is driven by independent Poisson process modeling:
- Posting events for actor $i$: arrival rate $\lambda_i^{\text{post}}$
- Action (voting) events for actor $i$: arrival rate $\lambda_i^{\text{action}}$
Empirically calibrated per-archetype rates from the deployment study:
| Archetype | Post rate $\lambda^{\text{post}}$ | Action rate $\lambda^{\text{action}}$ | Reply probability | Vote probability |
|---|---|---|---|---|
| Casual User 1 | 1.0 | 1.4 | 0.45 | 0.35 |
| Casual User 2 | 0.7 | 1.0 | 0.40 | 0.40 |
| Casual User 3 | 1.2 | 1.5 | 0.50 | 0.30 |
| Casual User 4 | 0.5 | 0.7 | 0.35 | 0.45 |
| Expert | 0.4 | 0.6 | 0.60 | 0.65 |
| Advocate 1 | 1.0 | 1.8 | 0.55 | 0.25 |
| Advocate 2 | 1.2 | 2.0 | 0.60 | 0.20 |
| Advocate 3 | 0.8 | 1.6 | 0.50 | 0.30 |
| Skeptic 1 | 0.55 | 1.3 | 0.70 | 0.55 |
| Skeptic 2 | 0.45 | 1.1 | 0.75 | 0.60 |
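
As a reading aid for the table, the standard properties of a Poisson process can be applied to the tabulated values. The sketch below assumes that the first two numeric columns are the posting and action rates, and it leaves the time unit abstract because this section does not state one. The symbols $\Delta t$, $N(T)$, and $\Lambda$ are notation introduced here for illustration.

$$
\begin{aligned}
\mathbb{E}[\Delta t] &= \frac{1}{\lambda}
  &&\text{e.g. Expert posting: } \tfrac{1}{0.4} = 2.5 \text{ time units between posts} \\
\mathbb{E}[N(T)] &= \lambda T
  &&\text{expected number of events in a window of length } T \\
\Lambda^{\text{post}} &= \sum_{i=1}^{10} \lambda_i^{\text{post}} = 7.8
  &&\text{merged posting stream over all ten actors} \\
\Lambda^{\text{action}} &= \sum_{i=1}^{10} \lambda_i^{\text{action}} = 13.0
  &&\text{merged action stream over all ten actors}
\end{aligned}
$$

The last two lines use the superposition property: independent Poisson processes merge into a Poisson process whose rate is the sum of the individual rates.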
Event timings are updated by sampling inter-event times $\Delta t$ from the exponential distribution corresponding to each rate parameter. Decision policies (reply vs. new post, vote vs. no action) are stochastic: at decision time, a uniform random draw is compared against the actor's reply and vote probabilities. Combined with heterogeneous per-actor rates, this approach is designed to reproduce the bursty, non-stationary engagement dynamics observed in real-world interactive platforms.
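The sampling and decision rules above can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the reading of the two probability parameters as "reply" and "vote" thresholds follows the order in which the text lists the decisions rather than a confirmed specification.

```python
import random


def next_event_time(now, rate, rng):
    """Return `now` plus an exponential inter-event gap with the given rate."""
    return now + rng.expovariate(rate)


def decide_post_kind(p_reply, has_existing_posts, rng):
    """Reply when a uniform draw falls below p_reply and a post exists to reply to."""
    if has_existing_posts and rng.random() < p_reply:
        return "reply"
    return "new_post"


def decide_action(p_vote, rng):
    """Vote when a uniform draw falls below p_vote; otherwise take no action."""
    return "vote" if rng.random() < p_vote else "no_action"


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative parameters taken from the "Skeptic 1" row (assumed column meaning).
    p_reply, p_vote, post_rate = 0.70, 0.55, 0.55
    draws = [decide_post_kind(p_reply, True, rng) for _ in range(10_000)]
    print("reply share:", draws.count("reply") / len(draws))
    print("vote or not:", decide_action(p_vote, rng))
    print("next post at:", next_event_time(0.0, post_rate, rng))
```

Over many draws the share of replies approaches the reply probability, while any single decision remains random, which is what makes two runs with the same personas differ in their thread structure.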
4. Tool Use and External Platform Integration
Actors interact with the simulation environment via a structured tool suite:
- All actors utilize generic tools: publishing, history retrieval, upvoting/downvoting.
- The Expert archetype uniquely invokes web search, enabling retrieval of external data or citations for evidence-augmented argumentation.
- All generative actions (posts, replies) and voting behaviors are mapped to HTTP API calls that interface with external platforms such as Deliberate, rendering CHORUS compatible with live or simulated online discussion services.
The tool interface design supports instrumented integration for downstream logging, analysis, and platform benchmarking (Koursaris et al., 22 Apr 2026).
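One way to picture the mapping from tools to HTTP calls is a small dispatcher that gates tools by archetype and builds a request description. Everything concrete here is an assumption made for illustration: the base URL, the `/posts` and `/votes` paths, the tool names, and the payload fields are hypothetical and are not taken from the Deliberate API. No network call is made.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpRequest:
    """Plain description of a request; sending it is left to the caller."""
    method: str
    url: str
    body: str


BASE_URL = "https://deliberate.example/api"   # hypothetical endpoint
GENERIC_TOOLS = {"publish", "vote", "get_history"}
EXPERT_ONLY_TOOLS = {"web_search"}
PLATFORM_ROUTES = {                           # hypothetical routes
    "publish": ("POST", "/posts"),
    "vote": ("POST", "/votes"),
    "get_history": ("GET", "/posts"),
}


def can_use(archetype, tool):
    """Generic tools are open to everyone; web search is reserved for experts."""
    if tool in GENERIC_TOOLS:
        return True
    return tool in EXPERT_ONLY_TOOLS and archetype == "Expert"


def build_request(archetype, tool, payload):
    """Translate a permitted platform tool call into an HttpRequest."""
    if not can_use(archetype, tool):
        raise PermissionError(f"{archetype} may not call {tool}")
    if tool not in PLATFORM_ROUTES:
        raise ValueError(f"{tool} is not a platform call")
    method, path = PLATFORM_ROUTES[tool]
    body = "" if method == "GET" else json.dumps(payload, sort_keys=True)
    return HttpRequest(method, BASE_URL + path, body)


if __name__ == "__main__":
    print(build_request("Skeptic", "vote", {"post_id": 7, "value": -1}))
    print(can_use("Expert", "web_search"), can_use("Advocate", "web_search"))
```

Returning a request description instead of sending it keeps the tool layer easy to log and replay, which fits the instrumented integration the section describes.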
5. Implementation and Experimentation
In the prototypical deployment of CHORUS on the Deliberate platform (“Extreme Weather Events Due to Climate Change” topic), the following configuration was used:
- 10 actors covering four archetypes: 4 Casual Users, 1 Expert, 3 Advocates, 2 Skeptics.
- LLM agents powered by Claude Sonnet 4.5, selected for its quality/cost tradeoff (72 € for a 20-minute run with 10 actors).
- Persona-conditioned, context-windowed LLM prompts: full discussion history for experts/skeptics, recent context for casuals/advocates.
- Orchestrator-driven event execution following Algorithms 1–3 (Simulation Cycle, Action, Post).
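The archetype-dependent context windowing in the configuration above could look roughly like the following. This is a hedged sketch: the function names, the prompt layout, and the window size `recent_k=5` are assumptions made for illustration, since the section does not specify how much "recent context" is passed.

```python
FULL_HISTORY_ARCHETYPES = {"Expert", "Skeptic"}


def select_context(archetype, history, recent_k=5):
    """Experts and Skeptics see the full history; others see the last recent_k posts."""
    if archetype in FULL_HISTORY_ARCHETYPES:
        return list(history)
    # Guard recent_k == 0, because history[-0:] would return the whole list.
    return list(history[-recent_k:]) if recent_k > 0 else []


def build_prompt(persona, archetype, history, recent_k=5):
    """Assemble a persona-conditioned prompt from (author, text) pairs."""
    context = select_context(archetype, history, recent_k)
    lines = [f"Persona: {persona}", "Discussion so far:"]
    lines += [f"- {author}: {text}" for author, text in context]
    return "\n".join(lines)


if __name__ == "__main__":
    thread = [(f"user{i}", f"message {i}") for i in range(8)]
    print(build_prompt("cautious climate scientist", "Expert", thread))
    print(build_prompt("busy commuter", "Casual User", thread, recent_k=3))
```

Limiting casual and advocate actors to recent posts reduces prompt length for the most active archetypes, while the less frequent Expert and Skeptic events can afford the full thread.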
Empirical Evaluation
Thirty expert evaluators (QA, linguistics, AI research) rated synthetic discussions on three dimensions (5-point Likert scale):
| Dimension | Mean Score | Representative notes |
|---|---|---|
| Content realism | 4.6 | Tone, vocabulary, and diversity high |
| Discussion coherence | 4.1 | Multi-party coherence a slight challenge |
| Analytical utility | 4.3 | Thematic trends reflected real data |
These ratings support CHORUS's realism, internal consistency, and compatibility with NLP pipelines. Multi-turn, multi-actor argumentation and voting patterns are retained, and downstream pipelines yield interpretable outputs similar to those obtained on authentic data.
6. Limitations and Directions for Advancement
Identified limitations in the present CHORUS implementation:
- No ablation/isolation studies have been conducted to disentangle the effects of persona grounding, Poisson-based timing, or tool usage individually.
- All evaluation has been performed with a single LLM (Claude Sonnet 4.5); robustness under other LLMs (GPT-4, Gemini Ultra, etc.) is untested.
- The operational frequency and impact of tool invocation (especially web search) on content quality are not quantified.
- Potential enhancements under development include adversarial archetype introduction (misinformation modeling, polarization), adaptive persona evolution, and benchmarking against real-world observational data at scale.
7. Significance and Research Impact
CHORUS provides a principled, extensible substrate for generating high-fidelity synthetic deliberation traces—enabling systematic experimentation, evaluation of moderation or voting policies, platform development, and pretraining or benchmarking of deliberation-oriented NLP systems in data-constrained or privacy-sensitive settings. Its combination of personalized, memory-rich LLM actors, calibrated temporal engagement, and structured tool interaction sets a methodological standard for realism and analytical tractability in synthetic online discourse generation (Koursaris et al., 22 Apr 2026).