LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

Published 6 May 2026 in cs.AI | (2605.05191v1)

Abstract: Long-horizon search agents must manage a rapidly growing working context as they reason, call tools, and observe information. Naively accumulating all intermediate content can overwhelm the agent, increasing costs and the risk of errors. We propose that effective context management should be adaptive: parts of the agent's trajectory are maintained at different levels of detail depending on their current relevance to the task. To operationalize this principle, we introduce Context-ReAct, a general agentic paradigm for elastic context orchestration that integrates reasoning, context management, and tool use in a unified loop. Context-ReAct provides five atomic operations: Skip, Compress, Rollback, Snippet and Delete, which allow the agent to dynamically reshape its working context, preserving important evidence, summarizing resolved information, discarding unhelpful branches, and controlling context size. We prove that the Compress operator is expressively complete, while the other specialized operators provide efficiency and fidelity guarantees that reduce generation cost and hallucination risk. Building on this paradigm, we develop LongSeeker, a long-horizon search agent fine-tuned from Qwen3-30B-A3B on 10k synthesized trajectories. Across four representative search benchmarks, LongSeeker achieves 61.5% on BrowseComp and 62.5% on BrowseComp-ZH, substantially outperforming Tongyi DeepResearch (43.2% and 46.7%) and AgentFold (36.2% and 47.3%). These results highlight the potential of adaptive context management, showing that agents can achieve more reliable and efficient long-horizon reasoning by actively shaping their working memory.


Authors (6)

Summary

The paper demonstrates that elastic context orchestration via the Context-ReAct paradigm effectively manages long-horizon search tasks by dynamically controlling context length.
It uses five atomic operations (Skip, Compress, Snippet, Delete, and Rollback) to keep different parts of the trajectory at different levels of detail and to avoid exhausting the token budget.
Empirical evaluations show that LongSeeker outperforms Tongyi DeepResearch and AgentFold on BrowseComp and BrowseComp-ZH while keeping its working context compact.

Elastic Context Management in Long-Horizon Search Agents: The LongSeeker Framework

Motivation and Problem Formulation

Agentic search tasks are growing more complex, and this has exposed the limits of conventional context management in long-horizon agents. The ReAct paradigm supports iterative reasoning and tool use, but its context window grows monotonically, which accumulates noise and eventually exhausts the token budget. Prior approaches such as sliding-window truncation, threshold-based resets, and periodic summarization lack selectivity and continuity: they often sacrifice critical evidence or fail to purge outdated or misleading information. The paper "LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents" (2605.05191) identifies "elastic context orchestration" as a core requirement for scalable search agents and operationalizes it through the Context-ReAct paradigm.

Context-ReAct Paradigm: Formalization and Operations

Context-ReAct extends ReAct with proactive, content-aware context management, integrating reasoning, context transformation, and tool execution in a unified loop. At every step, the agent jointly generates its reasoning trace, a sequence of context meta-operations, motivation, and tool calls.
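The unified loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `StepOutput` and `run_context_react`, the dictionary encodings of meta-operations and tool calls, and the "finish" tool are all assumptions made for the example, and the policy, context transform, and tool are passed in as plain callables.

```python
from dataclasses import dataclass


@dataclass
class StepOutput:
    """One structured step with the four fields named in the text."""
    reasoning: str
    meta_ops: list    # hypothetical encoding, e.g. [{"op": "delete", "index": 1}]
    motivation: str
    tool_call: dict   # hypothetical encoding, e.g. {"name": "search", "query": "..."}


def run_context_react(policy, apply_meta_ops, call_tool, task, max_steps=8):
    """Alternate between reshaping the context and calling a tool."""
    context = [task]
    for _ in range(max_steps):
        out = policy(context)                             # reasoning + ops + tool call
        context = apply_meta_ops(context, out.meta_ops)   # reshape history first
        if out.tool_call["name"] == "finish":             # assumed stop signal
            return out.tool_call["answer"], context
        observation = call_tool(out.tool_call)
        context.append(f"[{out.tool_call['name']}] {observation}")
    return None, context


# Toy stand-ins so the loop can be run end to end.
def demo_policy(context):
    if len(context) < 3:
        return StepOutput("need more evidence", [], "look it up",
                          {"name": "search", "query": "q"})
    return StepOutput("enough evidence", [{"op": "delete", "index": 1}],
                      "answer now", {"name": "finish", "answer": "42"})


def demo_apply(context, ops):
    drop = {op["index"] for op in ops if op["op"] == "delete"}
    return [c for i, c in enumerate(context) if i not in drop]


def demo_tool(call):
    return f"results for {call['query']}"
```

Calling `run_context_react(demo_policy, demo_apply, demo_tool, "task")` runs two search steps, then deletes one redundant observation in the same step that produces the final answer. In the real agent the policy is the fine-tuned model and the meta-operations are the five listed in the next section.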

Figure 1: The Context-ReAct paradigm adds a meta-operation layer for fine-grained context control, enabling elastic orchestration that spans lossless extraction, compression, deletion, and structural rollback.

The paradigm defines five atomic operations:

Skip: Identity operation; retains full context.
Compress: Abstractive summarization over arbitrary segments; enables retroactive condensation.
Snippet: Exact substring extraction; ensures lossless preservation of precision-critical data.
Delete: Removes uninformative steps.
Rollback: Structural backtracking to earlier states; discards failed branches while preserving causal rationale.
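The five operations above can be pictured as pure functions over a list of trajectory steps. The sketch below is illustrative only: the `Step` record, the function signatures, and the "[rolled back]" note format are assumptions, and the summary text is passed in by hand where the real agent would generate it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    idx: int    # position in the original trajectory
    text: str   # reasoning, tool call, or observation recorded at that step


def skip(ctx):
    """Identity: keep the context as it is."""
    return list(ctx)


def compress(ctx, start, end, summary):
    """Replace positions start..end (inclusive) with one summary step."""
    return ctx[:start] + [Step(ctx[start].idx, summary)] + ctx[end + 1:]


def snippet(ctx, pos, exact):
    """Keep only a verbatim substring of one step; reject paraphrases."""
    if exact not in ctx[pos].text:
        raise ValueError("snippet must be an exact substring of the step")
    return ctx[:pos] + [Step(ctx[pos].idx, exact)] + ctx[pos + 1:]


def delete(ctx, pos):
    """Drop one uninformative step."""
    return ctx[:pos] + ctx[pos + 1:]


def rollback(ctx, pos, rationale):
    """Discard everything after pos, leaving a note on why the branch failed."""
    note = Step(ctx[pos].idx + 1, "[rolled back] " + rationale)
    return ctx[:pos + 1] + [note]


history = [
    Step(0, "task: find the founding year of X"),
    Step(1, "visit page A -> ... Founded in 1987 in Oslo ... (long page)"),
    Step(2, "search 'X logo' -> nothing relevant"),
    Step(3, "visit forum -> speculative thread"),
    Step(4, "follow forum link -> dead end"),
]

ctx = delete(history, 2)                                # drop the irrelevant search
ctx = snippet(ctx, 1, "Founded in 1987 in Oslo")        # keep the exact evidence
ctx = rollback(ctx, 1, "forum branch was speculative")  # abandon the failed branch
compact = compress(ctx, 1, 2, "X founded 1987 (Oslo); forum branch abandoned")
```

Each function returns a new list, so `history` is left untouched and any operation can be applied at any position, which mirrors the targeted, position-level intervention that the paradigm is built around.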

Unlike fixed-size windows or periodic summaries, Context-ReAct allows targeted intervention at any position in the trajectory, so different parts of the history can be kept at different resolutions. The paper proves that Compress is expressively complete, meaning it can simulate every other operation; the specialized operations add efficiency and fidelity guarantees, along with an inductive bias that helps agent training.
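An informal way to see why Compress can stand in for the other operations is to treat it as "replace a span with some text" and vary the text. The sketch below is not the paper's proof; the list-of-strings context and every function name are assumptions chosen for illustration.

```python
def compress(ctx, start, end, summary):
    """Replace ctx[start:end] with `summary`; an empty summary leaves nothing."""
    return ctx[:start] + ([summary] if summary else []) + ctx[end:]


def skip_via_compress(ctx):
    # Compressing an empty span with an empty summary changes nothing.
    return compress(ctx, 0, 0, "")


def delete_via_compress(ctx, i):
    # A one-step span replaced by an empty summary disappears.
    return compress(ctx, i, i + 1, "")


def snippet_via_compress(ctx, i, exact):
    # The "summary" is a verbatim substring of the step it replaces.
    if exact not in ctx[i]:
        raise ValueError("not an exact substring")
    return compress(ctx, i, i + 1, exact)


def rollback_via_compress(ctx, i, rationale):
    # Everything after step i collapses into a note on why the branch failed.
    return compress(ctx, i + 1, len(ctx), rationale)


ctx = ["task", "page text with Founded in 1987", "noise", "dead end"]
```

In this toy setting the summary is supplied for free. In an agent, a Compress summary is generated token by token, so emulating Snippet this way would cost generation and could paraphrase the evidence. That is the efficiency and fidelity argument for keeping the dedicated operations.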

Figure 2: Managed context after sequential application of meta-operations, producing an information-dense history; structured output includes reasoning, meta-operations, motivation, and tool call.

Implementation and Training Protocol

LongSeeker is instantiated by fine-tuning Qwen3-30B-A3B on 10k synthesized trajectories incorporating context management supervision. The data pipeline leverages DeepSeek-V3.2 as a teacher agent, explicitly annotating meta-operations and structured outputs for each step. Supervised fine-tuning enforces joint learning of meta-operation invocation, reasoning continuity, and tool usage.

Empirical Evaluation

LongSeeker is evaluated on four search benchmarks: BrowseComp, BrowseComp-ZH, xbench, and GAIA. On BrowseComp and BrowseComp-ZH it achieves 61.5% and 62.5%, well ahead of Tongyi DeepResearch (43.2%, 46.7%) and AgentFold (36.2%, 47.3%). It also scores 78.0% on xbench and 77.7% on GAIA-text, which suggests that the paradigm transfers across search benchmarks.

Figure 3: LongSeeker-30B delivers strong results on challenging long-horizon benchmarks, matching or surpassing several foundation models and search agents.

Context growth analysis shows token counts plateauing at around 15k tokens, in contrast to the linear growth of standard append-only ReAct agents. This supports the claim that elastic orchestration keeps the working memory compact and information-dense, leaving headroom for longer trajectories.
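The difference between linear growth and a plateau can be made concrete with a toy calculation. The sketch below does not model LongSeeker's learned policy or reproduce its measurements: the per-step token count, the fixed "keep recent steps, fold older ones into a capped summary" rule, and all numbers are arbitrary assumptions chosen only to show the shape of the two curves.

```python
def append_only_sizes(n_steps, step_tokens):
    """Context size after each step when every step is kept in full."""
    return [step_tokens * (i + 1) for i in range(n_steps)]


def managed_sizes(n_steps, step_tokens, keep_recent, summary_cap):
    """Context size when only recent steps stay in full (toy rule, see lead-in)."""
    sizes = []
    for i in range(n_steps):
        n = i + 1
        recent = min(n, keep_recent) * step_tokens       # steps kept verbatim
        summary = summary_cap if n > keep_recent else 0  # older steps, folded
        sizes.append(recent + summary)
    return sizes


# Hypothetical numbers: 1,000 tokens per step over 50 steps.
baseline = append_only_sizes(50, 1000)
managed = managed_sizes(50, 1000, keep_recent=5, summary_cap=2000)

print(baseline[-1])  # 50000: grows with every step
print(managed[-1])   # 7000: bounded by keep_recent * step_tokens + summary_cap
```

Under these assumptions the managed context stops growing once the number of steps exceeds `keep_recent`, while the append-only context keeps growing in proportion to the step count.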

Figure 4: Context growth dynamics of LongSeeker show stable context sizes, diverging sharply from the linear accumulation observed in append-only baselines.

Ablation experiments comparing Context-ReAct to summary and discard-all strategies confirm superior performance under identical step budgets.

Figure 5: Context-ReAct paradigm on BrowseComp yields better performance than summary or discard-all management approaches under the same step budget.

Case Study and Structured Reasoning

A comprehensive case study reveals dynamic application of Compress, Rollback, Delete, and Snippet operations, producing minimal yet information-dense context trajectories and maintaining explicit reasoning, meta-operation rationale, and tool call structures.

Figure 6: Case study demonstrates the compositional use of Compress, Rollback, Delete, and Snippet in a live trajectory, optimizing context compaction and reasoning clarity.

Figure 7: LongSeeker's structured output encapsulates chain-of-thought, meta-tool calls, motivation, and tool invocation within each step.

Implications and Forward Directions

Elastic context orchestration, as realized in Context-ReAct and LongSeeker, shifts context management from ad-hoc engineering to a learnable, integral component of the agent. Practically, this supports reliable operation of long-horizon agents across extended tasks while reducing generation cost and the risk of hallucination and error. Theoretically, it aligns with the Minimum Description Length principle, favoring memory representations that are both compact and faithful enough for reasoning.

Future work includes RL optimization of meta-operation policies, domain transfer to settings beyond web search (e.g., large-scale legal reasoning, autonomous code synthesis), and architectural generalization of the paradigm. Context-ReAct's explicit multi-resolution control could serve as a blueprint for scalable agentic systems wherever long-horizon exploration and information synthesis are required.

Conclusion

LongSeeker demonstrates that agentic context can be managed precisely and adaptively through atomic operations within the Context-ReAct paradigm. The framework outperforms the compared agents on demanding benchmarks and maintains compact, information-dense context trajectories, and the paper proves the expressive completeness of its operation set. The work positions elastic context management as a central principle for scalable agentic reasoning and opens avenues for further applications and algorithmic development in long-horizon AI systems.
