Context-Agent: Dynamic Discourse Trees for Non-Linear Dialogue

Published 7 Apr 2026 in cs.CL and cs.AI | (2604.05552v1)

Abstract: LLMs demonstrate outstanding performance in many language tasks but still face fundamental challenges in managing the non-linear flow of human conversation. The prevalent approach of treating dialogue history as a flat, linear sequence is misaligned with the intrinsically hierarchical and branching structure of natural discourse, leading to inefficient context utilization and a loss of coherence during extended interactions involving topic shifts or instruction refinements. To address this limitation, we introduce Context-Agent, a novel framework that models multi-turn dialogue history as a dynamic tree structure. This approach mirrors the inherent non-linearity of conversation, enabling the model to maintain and navigate multiple dialogue branches corresponding to different topics. Furthermore, to facilitate robust evaluation, we introduce the Non-linear Task Multi-turn Dialogue (NTM) benchmark, specifically designed to assess model performance in long-horizon, non-linear scenarios. Our experiments demonstrate that Context-Agent enhances task completion rates and improves token efficiency across various LLMs, underscoring the value of structured context management for complex, dynamic dialogues. The dataset and code is available at GitHub.

Abstract PDF Upgrade to Chat

Authors (5)

Summary

The paper introduces a dynamic framework that models dialogue as a set of discourse trees, improving context utilization in complex conversations.
It employs semantic clustering and a two-stage branch management process to achieve up to +9.7% relative TCR improvement and an 88.9% absolute TCR on GPT-4.1.
Experimental results show a 45–52% reduction in context tokens, underscoring enhanced efficiency and reduced redundancy in dialogue processing.

Dynamic Discourse Trees for Non-Linear Dialogue: The Context-Agent Framework

Introduction

"Context-Agent: Dynamic Discourse Trees for Non-Linear Dialogue" (2604.05552) addresses the inefficacy of modeling dialogue history as flat, linear token sequences in LLM-based dialogue systems. The paper posits that the inherent branching and hierarchical structure of human dialogue, such as frequent topic shifts and instruction refinements, is mismatched with the prevailing linear approaches. This misalignment leads to diminished context utilization and a loss of coherence in long, dynamic multi-turn interactions—a challenge unmitigated by simple context window expansion or naive compression strategies.

The proposed solution, Context-Agent, formalizes and implements dialogue history as a dynamic forest of topic trees, explicitly modeling topic branching and discourse-level intent. Alongside, the authors introduce the Non-linear Task Multi-turn Dialogue (NTM) benchmark, designed to stress-test context management paradigms with dialogues featuring multiple topic shifts, complex instruction refinements, and long-horizon task dependencies.

Methodology: Discourse Trees and Dynamic Context Construction

The Context-Agent framework structures dialogue as a set of topic trees, where each node corresponds to a (user, agent) exchange, and branches model distinct dialogue threads under the same or diverging topics. The system continuously updates the state based on queries, utilizing lightweight models for topic/branch decision and semantic clustering for fork-point identification.

Each dialogue turn involves:

Topic identification (new, switch, or continue), using branch and topic summaries.
Fork-point detection, leveraging embeddings to discover the closest semantic ancestor.
Branch management, decided via a two-stage process (heuristic filtering + LM-driven action).
Selective construction of context: full active-path history plus hierarchical branch/topic summaries.

This structure directly enforces logical separation of threads, minimizing retrieval confounds common in flat history compression and flat RAG, and ensuring efficient path-aware context retrieval.

The NTM Benchmark: Measuring Non-Linear Dialogue Competence

The NTM benchmark comprises 405 dialogues (~6900 turns), with lengths of 10–25 turns each, across daily planning and coding domains. Dialogues are engineered for multiple topic shifts and instruction refinements rather than purely information-centric QA, with each session terminating in an objectively verifiable multi-checkpoint task.

Evaluation focuses on Task Completion Rate (TCR, per-turn checkpoint success) and Average Context Tokens (ACT, per-turn token usage), supporting joint assessment of coherence, task orientation, and computational efficiency.

Experimental Results

Extensive experiments across four distinct LLMs (GPT-4.1, DeepSeek-V3, GLM-4-Plus, Llama 3.1-70B) demonstrate several notable findings:

Context-Agent consistently improves TCR over both truncation and even Full-History linear baselines, with relative gains up to +9.7% (Llama 3.1-70B) and absolute TCR of 88.9% (GPT-4.1).
Token efficiency is substantially improved: ACT is reduced by 45–52% compared to Full-History concatenation, indicating more focused, less redundant context presentation.
For TopiOCQA, Context-Agent increases factual answer accuracy (EM, F1), using only ~57% of baseline context tokens, demonstrating generalization to open-domain QA with topic shifts.

The results decisively indicate that merely increasing the context window size is insufficient for non-linear dialogue management. Even massive-window models like GPT-4.1 and Llama 3.1-70B benefit significantly from the structured context.

Figure 2: (a) TCR comparison across context management methods and models. (b) Example trajectory of context token usage in a 20-turn dialogue, highlighting greater efficiency with Context-Agent.

Analysis shows that token consumption with Context-Agent grows sub-linearly with dialogue length, while task accuracy remains robust or even improves—a strong endorsement of the value in explicit discourse modeling.

Figure 4: The trade-off between TCR and ACT, with optimal methods in the top-left quadrant (high accuracy, low token usage); Context-Agent dominates the efficiency frontier.

Ablation studies further underscore the necessity of both tree structure and semantic RAG: removing either yields sharp declines in TCR (-29.5% to -35.5% absolute).

Discussion, Implications, and Future Directions

The Context-Agent methodology reifies the intuition from Attentional State theory: human dialogue management is inherently hierarchical and path-sensitive, not simply a function of linear context accumulation. By dynamically representing conversational state as a set of discourse trees and reconciling multiple dialogue threads, LLM-driven agents can sharply reduce token and reasoning redundancy endemic to concatenation/truncation schemes.

Practical implications are significant:

For LLM agents, Context-Agent allows precise focus amid extended, multi-party, or multi-topic sessions—a necessity for agentic workflows in enterprise and collaborative settings.
The reduction in ACT unlocks longer interactions before encountering prompt-token limits, or lowers inference cost for otherwise equivalent task accuracy.
The NTM benchmark fills a clear gap in evaluating discourse-structured, task-oriented dialogue modeling, providing a rigorous standard as LLM applications grow in complexity.

Theoretically, this work points toward integration of explicit, dynamically learned discourse graphs within LLM memory and context modeling architectures, potentially extending to graph-induced attention mechanisms within the model itself. Further, optimizing or end-to-end learning of the decision policies (rather than heuristic/LMM mediation) may yield even stronger gains.

Conclusion

Context-Agent brings explicit branching discourse modeling to the forefront of non-linear dialogue management for LLMs, yielding substantial improvements in both accuracy and context efficiency on challenging, long-horizon conversational tasks. The approach shows robustness across both open- and closed-source LLMs of varying context capacities. These findings strongly support advancing towards explicitly structured, path-aware context management as a foundational component in the development of next-generation conversational agents.

Markdown Report Issue