
Table-as-Search: Formulate Long-Horizon Agentic Information Seeking as Table Completion

Published 6 Feb 2026 in cs.CL | (2602.06724v1)

Abstract: Current Information Seeking (InfoSeeking) agents struggle to maintain focus and coherence during long-horizon exploration, as tracking search states, including planning procedure and massive search results, within one plain-text context is inherently fragile. To address this, we introduce Table-as-Search (TaS), a structured planning framework that reformulates the InfoSeeking task as a Table Completion task. TaS maps each query into a structured table schema maintained in an external database, where rows represent search candidates and columns denote constraints or required information. This table precisely manages the search states: filled cells strictly record the history and search results, while empty cells serve as an explicit search plan. Crucially, TaS unifies three distinct InfoSeeking tasks: Deep Search, Wide Search, and the challenging DeepWide Search. Extensive experiments demonstrate that TaS significantly outperforms numerous state-of-the-art baselines across three kinds of benchmarks, including multi-agent frameworks and commercial systems. Furthermore, our analysis validates TaS's superior robustness in long-horizon InfoSeeking, alongside its efficiency, scalability and flexibility. Code and datasets are publicly released at https://github.com/AIDC-AI/Marco-Search-Agent.

Summary

  • The paper presents TaS, a framework that reformulates long-horizon agentic information seeking as a table completion task to enhance state management.
  • It leverages a structured tabular schema to decouple planning from execution, enabling precise candidate filtering and efficient sub-agent orchestration.
  • Empirical results demonstrate TaS’s superiority over baseline models in Deep, Wide, and DeepWide search scenarios, showing significant gains in accuracy and efficiency.

Table-as-Search: Reformulating Long-Horizon Agentic Information Seeking as Table Completion

Motivation and Problem Statement

Long-horizon agentic information seeking tasks involve sequential, multi-step reasoning and large-scale retrieval across the web. Current agent frameworks, such as ReAct, maintain search state and planning information within plain-text contexts, leading to context fragility, state dilution, and frequent hallucinations in extended interactions. As task complexity and search horizon expand, agents display a pronounced "lost-in-the-middle" phenomenon, where critical information cannot be reliably tracked or synthesized.

TaS Framework: Design and Architecture

The Table-as-Search (TaS) framework addresses context fragility by introducing tabular, externally structured state management for InfoSeeking agents. TaS transforms each query into a structured schema, where rows correspond to candidate entities and columns represent constraints and required information. Filled cells record search history and results, while empty cells are explicit plan items pending completion.

Figure 1: TaS reformulates InfoSeeking as a Table Completion task, explicitly managing search state and supporting Deep, Wide, and DeepWide paradigms.
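The idea that empty cells act as the search plan can be illustrated with a small sketch. This is not the authors' implementation: it assumes an in-memory SQLite database as the external store, and the table name `search_state`, the column names, and the candidate names are all hypothetical.

```python
import sqlite3

def create_table(conn, columns):
    # One row per candidate; one nullable TEXT column per constraint or
    # required field. Column names are interpolated directly into SQL, which
    # is acceptable only because they are trusted constants in this sketch.
    cols = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f"CREATE TABLE search_state (candidate TEXT PRIMARY KEY, {cols})")

def add_candidate(conn, name):
    # Row expansion: a new candidate starts with every cell empty (NULL).
    conn.execute("INSERT OR IGNORE INTO search_state (candidate) VALUES (?)", (name,))

def fill_cell(conn, candidate, column, value):
    # Cell population: record a search result in the table.
    conn.execute(
        f'UPDATE search_state SET "{column}" = ? WHERE candidate = ?',
        (value, candidate),
    )

def pending_cells(conn, columns):
    # Empty (NULL) cells are read back as the remaining search plan.
    plan = []
    for c in columns:
        rows = conn.execute(
            f'SELECT candidate FROM search_state WHERE "{c}" IS NULL ORDER BY candidate'
        ).fetchall()
        plan.extend((r[0], c) for r in rows)
    return plan

conn = sqlite3.connect(":memory:")
columns = ["founded_after_2015", "headquarters"]  # hypothetical schema
create_table(conn, columns)
for name in ["Acme", "Globex"]:
    add_candidate(conn, name)
fill_cell(conn, "Acme", "headquarters", "Berlin")
print(pending_cells(conn, columns))
# [('Acme', 'founded_after_2015'), ('Globex', 'founded_after_2015'), ('Globex', 'headquarters')]
```

Because the plan is derived from the table rather than from the conversation history, it stays accurate however long the interaction runs.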

The tabular schema unifies three InfoSeeking paradigms:

  • Deep Search: Precise candidate filtering with multi-constraint verification.
  • Wide Search: Broad candidate aggregation with minimal constraints.
  • DeepWide Search: Simultaneous breadth-oriented exploration and depth-oriented verification and information collection.
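One way to read the three paradigms is as different views over the same completed table. The sketch below is an interpretation rather than the paper's code, and the table contents and column names are invented for illustration.

```python
# Hypothetical completed table: candidate -> {column: value}.
TABLE = {
    "Acme":    {"founded_after_2015": True,  "is_public": True,  "headquarters": "Berlin"},
    "Globex":  {"founded_after_2015": True,  "is_public": False, "headquarters": "Paris"},
    "Initech": {"founded_after_2015": False, "is_public": True,  "headquarters": "Austin"},
}

def deep_search(table, constraints):
    # Deep: keep only candidates that satisfy every constraint column.
    return [name for name, cells in table.items() if all(cells[c] for c in constraints)]

def wide_search(table, info):
    # Wide: aggregate the requested information for every candidate.
    return {name: {c: cells[c] for c in info} for name, cells in table.items()}

def deepwide_search(table, constraints, info):
    # DeepWide: filter by constraints, then collect information for the survivors.
    return {name: {c: table[name][c] for c in info} for name in deep_search(table, constraints)}

print(deep_search(TABLE, ["founded_after_2015", "is_public"]))
# ['Acme']
print(deepwide_search(TABLE, ["founded_after_2015"], ["headquarters"]))
# {'Acme': {'headquarters': 'Berlin'}, 'Globex': {'headquarters': 'Paris'}}
```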

TaS is implemented as a multi-agent system with a planner Main-Agent orchestrating specialized Sub-Agents. The planner initializes the schema, manages row expansion, and coordinates cell population in parallel, facilitating efficient search and deep attribute extraction. All state is offloaded to a persistent external database, overcoming unstructured context window limitations.

Task Formulation and Implementation Details

Formally, a query $q$ is mapped via $\phi(q) \to \mathcal{S}$ to a schema $\mathcal{S} = \langle \mathcal{K}, \mathcal{C}, \mathcal{I} \rangle$, with $\mathcal{K}$ for candidates, $\mathcal{C}$ for constraints, and $\mathcal{I}$ for required information. The agent policy is then $\pi(\cdot \mid q, \tau_t, T_t)$, conditioning on the structured table $T_t$ and the trajectory $\tau_t$ rather than on unstructured text.
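The formulation can be made concrete with a few type definitions. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, the schema is written by hand rather than produced by a learned mapping, and `policy` is a rule-based toy standing in for the model-driven policy that the paper describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass(frozen=True)
class Schema:
    """S = <K, C, I>: candidate, constraint, and required-information columns."""
    candidate_columns: Tuple[str, ...]   # K
    constraints: Tuple[str, ...]         # C
    info: Tuple[str, ...]                # I

@dataclass
class Table:
    """T_t: the table state at step t; None marks an empty cell."""
    schema: Schema
    rows: Dict[str, Dict[str, Optional[str]]] = field(default_factory=dict)

    def empty_cells(self) -> List[Tuple[str, str]]:
        return [
            (candidate, column)
            for candidate, cells in self.rows.items()
            for column, value in cells.items()
            if value is None
        ]

def policy(query: str, trajectory: List[str], table: Table) -> str:
    """Toy stand-in for pi(. | q, tau_t, T_t).

    It looks only at the table state; a real policy would also use the
    query and the trajectory, which are accepted here but ignored.
    """
    if not table.rows:
        return "expand_rows"
    if table.empty_cells():
        return "fill_cells"
    return "synthesize"

schema = Schema(("company",), ("founded_after_2015",), ("headquarters",))
table = Table(schema)
print(policy("example query", [], table))   # expand_rows
table.rows["Acme"] = {"founded_after_2015": None, "headquarters": None}
print(policy("example query", [], table))   # fill_cells
table.rows["Acme"] = {"founded_after_2015": "yes", "headquarters": "Berlin"}
print(policy("example query", [], table))   # synthesize
```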

The execution pipeline follows three phases:

  1. Table Initialization: Parsing query, constructing schema, and initializing database table.
  2. Dynamic Orchestration: Planner selects between row expansion (candidate discovery) and cell population (attribute filling) based on table state, dispatching Sub-Agents in parallel.
  3. Answer Synthesis: Final response synthesis based on completed structured evidence.
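The three phases above can be sketched as a short loop. The code is illustrative only: the sub-agents are stubs backed by a fixed lookup table (`FAKE_WEB`) instead of real search, the table lives in a plain dictionary rather than an external database, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the web: (candidate, column) -> value.
FAKE_WEB = {
    ("Acme", "founded_after_2015"): "yes",
    ("Acme", "headquarters"): "Berlin",
    ("Globex", "founded_after_2015"): "no",
    ("Globex", "headquarters"): "Paris",
}

def discover_candidates(query):
    # Sub-agent stub for row expansion (candidate discovery).
    return ["Acme", "Globex"]

def fill_cell(cell):
    # Sub-agent stub for cell population (attribute filling).
    return cell, FAKE_WEB.get(cell, "unknown")

def run(query, constraints, info, max_steps=10):
    columns = constraints + info
    table = {}          # Phase 1: an initialized, empty table for this schema
    trajectory = []
    for _ in range(max_steps):  # Phase 2: orchestration driven by table state
        empty = [(r, c) for r, cells in table.items() for c in columns if cells[c] is None]
        if not table:
            for name in discover_candidates(query):
                table[name] = {c: None for c in columns}
            trajectory.append("expand_rows")
        elif empty:
            # Dispatch one stub sub-agent call per empty cell, in parallel.
            with ThreadPoolExecutor(max_workers=4) as pool:
                for (r, c), value in pool.map(fill_cell, empty):
                    table[r][c] = value
            trajectory.append("fill_cells")
        else:
            break
    # Phase 3: synthesize the answer from the completed table only.
    answer = sorted(r for r, cells in table.items() if all(cells[c] == "yes" for c in constraints))
    return answer, table, trajectory

answer, table, trajectory = run("example query", ["founded_after_2015"], ["headquarters"])
print(answer)      # ['Acme']
print(trajectory)  # ['expand_rows', 'fill_cells']
```

The `max_steps` bound is a safeguard added for the sketch so that the loop terminates even if candidate discovery returns nothing.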

TaS supports "plug-and-play" integration of advanced search models as Sub-Agents and persistent scalable storage, offering high flexibility and architectural modularity.
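A plug-and-play sub-agent boundary could look like the following sketch. The `SubAgent` protocol and both agent classes are hypothetical and return canned strings; the point is only that the planner-side code depends on an interface, so the model behind it can be swapped.

```python
from typing import Dict, Iterable, Protocol, Tuple

Cell = Tuple[str, str]  # (candidate, column)

class SubAgent(Protocol):
    def fill(self, candidate: str, column: str) -> str: ...

class GeneralSearchAgent:
    # Stub for a general-purpose search model.
    def fill(self, candidate: str, column: str) -> str:
        return f"general:{candidate}:{column}"

class FineTunedSearchAgent:
    # Stub for a smaller, specialized model used as a drop-in replacement.
    def fill(self, candidate: str, column: str) -> str:
        return f"finetuned:{candidate}:{column}"

def populate(cells: Iterable[Cell], agent: SubAgent) -> Dict[Cell, str]:
    # Planner-side code calls only the interface, never a concrete agent.
    return {cell: agent.fill(*cell) for cell in cells}

cells = [("Acme", "headquarters")]
print(populate(cells, GeneralSearchAgent()))
# {('Acme', 'headquarters'): 'general:Acme:headquarters'}
print(populate(cells, FineTunedSearchAgent()))
# {('Acme', 'headquarters'): 'finetuned:Acme:headquarters'}
```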

Empirical Evaluation and Results

Extensive experiments are conducted across Deep Search (GAIA, BrowseComp-ZH), Wide Search (WideSearch), and DeepWide Search (new BD benchmark) scenarios. TaS is compared against single- and multi-agent ReAct baselines, compute-scaled variants, commercial systems (e.g., Gemini DeepResearch), and specialized agentic RL-trained models.

Deep Search: TaS consistently achieves superior accuracy, particularly with cost-efficient models (e.g., Gemini-2.5-Flash), which, under TaS, outperformed larger Multi-Agent ReAct baselines by a margin of +14% on GAIA. When a task does not require external search, the overhead of table management may cause a minor regression, which is consistent with TaS's specialization for open-world InfoSeeking.

Wide Search: TaS demonstrates holistic superiority, achieving higher Success Rates with smaller models and outperforming computation-heavy baselines in maximum recall. Precision and recall are both improved, with structured table constraints effectively filtering noise during large-scale aggregation.

Figure 2: Robustness analysis reveals TaS's superiority as task complexity increases in both BrowseComp-ZH and WideSearch.

DeepWide Search: On real-world BD cases, TaS outperforms both ReAct and Gemini DeepResearch, with gains of +4.7% in entity accuracy and +5.1% in information precision. By decoupling planning and execution, TaS allows efficient sub-agent replacement (e.g., fine-tuned 32B models), demonstrating architectural scalability.

Efficiency and Scaling: TaS consistently achieves higher performance at comparable or lower tool usage volumes. Test-time scaling experiments on BrowseComp-ZH and WideSearch illustrate that TaS benefits more from increased compute allocation than baseline methods.

Figure 3: Gemini-2.5-Flash shows higher search efficiency under TaS on both Deep and Wide benchmarks.

Figure 4: Test-time scaling analysis confirms TaS's effectiveness in leveraging expanded compute, amplifying performance margins.

Analytical Insights and Ablation Studies

TaS's robustness is validated: performance gaps over baselines widen as complexity increases due to precise state tracking and explicit planning. Efficiency analyses demonstrate that performance stems from planning quality, not brute-force search scaling. Ablation studies further reveal that the main planner's reasoning capability is critical, while sub-agents may be efficiently replaced with specialized or smaller models without significant degradation.

Qualitative Case Analyses

TaS prevents failure modes endemic to unstructured agents:

  • Deep Search: Premature convergence is eliminated by enforcing global constraint verification across all candidates.
  • Wide Search: Lazy omission is replaced by systematic schema decomposition and targeted cell-filling.
  • DeepWide Search: Breadth-plus-depth complexity is addressed by strict candidate eligibility screening and parallel deep attribute extraction.

Figure 5: TaS’s stepwise process in a complex DeepWide Search scenario.

Practical and Theoretical Implications

TaS enables decoupling of control and execution in agentic InfoSeeking, improving scalability, robustness, and query fidelity for real-world applications. By externalizing state, TaS overcomes inherent limitations of context compression, paving the way for high-density retrieval and stable performance in industrial-scale tasks. The plug-and-play compatibility with state-of-the-art search models ensures forward-compatibility and operational flexibility. Architecturally, TaS can serve as a canonical framework for explicit state management and planning in AGI-scale research systems.

Limitations and Future Directions

TaS's structured approach introduces rigidity for tasks better served by free-form reasoning. Adaptive mechanisms for switching between tabular and text-centric modalities are warranted. Performance is bounded by planner model strength; future optimization via agentic RL may further unlock TaS's potential. While context compression strategies are orthogonal to TaS and can be integrated, TaS fundamentally distinguishes itself by persistent externalization of state.

Scalability of evaluation, particularly in DeepWide benchmarks, remains contingent on human-in-the-loop verification because the tasks are open-ended. Iterative ground-truth maintenance offers partial mitigation but does not fully resolve reproducibility constraints.

Conclusion

Table-as-Search (TaS) redefines agentic InfoSeeking by shifting from brittle, unstructured context management to explicit, tabular state completion. Robust empirical results confirm TaS's effectiveness across depth, width, and hybrid paradigms, with demonstrated gains in robustness, efficiency, and scalability. TaS thus provides a viable path forward for large-scale, long-horizon agent architectures, offering a blueprint for structured planning in future AI research systems (2602.06724).
