
WebWeaver: Dual-Agent Research Framework

Updated 19 September 2025
  • WebWeaver is a dual-agent framework that combines iterative planning with hierarchical synthesis to generate reliable, well-structured research reports.
  • It features a planner that continuously refines research outlines and a writer that selectively retrieves and synthesizes target evidence, reducing context overflow.
  • Empirical validation on OEDR benchmarks demonstrates its effectiveness in improving citation accuracy and report quality and in mitigating long-context failures.

WebWeaver is a dual-agent framework designed for open-ended deep research (OEDR), in which AI agents must synthesize large volumes of web-scale evidence into reliable, well-structured reports. Addressing fundamental limitations in current research automation — specifically static pipelines and long-context failures — WebWeaver introduces an adaptive, human-centric methodology that interleaves planning, evidence acquisition, and hierarchical synthesis. The system is empirically validated on a range of open-ended deep research benchmarks, establishing new state-of-the-art results in report quality, reliability, and structure (Li et al., 16 Sep 2025).

1. Dual-Agent Architecture

WebWeaver’s architecture comprises two specialized agents: the planner and the writer.

  • Planner: Implements an iterative cycle of evidence acquisition and outline optimization. Rather than following a rigid plan fixed before evidence collection, the planner continuously searches for relevant sources, integrating each discovery back into an evolving outline. This results in a dynamic, citation-linked outline that reflects emerging evidence rather than static hypotheses.
  • Writer: Executes hierarchical retrieval and synthesis. The writer decomposes the report into manageable sections, retrieving only the necessary evidence for each part from a memory bank. Each section is written exclusively with the evidence that supports its specific content, greatly reducing context overflow and hallucination risk.

Formally, a complete agent trajectory is defined as $\mathcal{H}_T = (\tau_0, a_0, o_0, \dots, \tau_T, a_T)$, where round $i$ includes a thought $\tau_i$, an action $a_i$, and an observation $o_i$.
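The trajectory definition above can be represented as a small data structure. The following is a minimal sketch, not the authors' implementation: the class names `Step` and `Trajectory`, the `flatten` helper, and the example strings are all assumptions introduced for illustration. It mirrors the formula in that the final round carries a thought and an action but no observation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One round i: thought tau_i, action a_i, and (optionally) observation o_i."""
    thought: str
    action: str
    observation: Optional[str] = None  # None for the final round T


@dataclass
class Trajectory:
    """Hypothetical container for H_T = (tau_0, a_0, o_0, ..., tau_T, a_T)."""
    steps: List[Step] = field(default_factory=list)

    def append(self, thought: str, action: str,
               observation: Optional[str] = None) -> None:
        self.steps.append(Step(thought, action, observation))

    def flatten(self) -> List[str]:
        """Return the trajectory as a flat sequence, skipping missing observations."""
        flat: List[str] = []
        for step in self.steps:
            flat.extend([step.thought, step.action])
            if step.observation is not None:
                flat.append(step.observation)
        return flat


# Illustrative two-round trajectory (all strings are made up).
trajectory = Trajectory()
trajectory.append("Need background sources", "search", "3 pages found")
trajectory.append("Outline is complete", "terminate")
```

Calling `trajectory.flatten()` on this example yields five elements, ending with the final action, which matches the shape of $\mathcal{H}_T$ for $T = 1$.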

2. Overcoming Deep Research Bottlenecks

Traditional OEDR systems suffer from two primary limitations: planning that is decoupled from evidence acquisition, and a "one-shot" generation approach that presents all context to the model at once. The result is frequent "loss in the middle" failures, where critical evidence is dropped from attention, and increased hallucination risk.

WebWeaver addresses these via:

  • Interleaved Planning and Acquisition: Rather than separating search and writing, the planner’s loop adaptively acquires new evidence and integrates it into the outline in real time.
  • Hierarchical Section-Wise Writing: Only the section-relevant evidence is retrieved from the memory bank, so long-context attention failures are mitigated.

This design ensures the report remains both comprehensive and strictly source-grounded at all levels.
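To make the context-size argument concrete, the sketch below compares how much evidence the model would see under one-shot generation versus section-wise writing. It is only an illustration under stated assumptions: the `evidence` and `outline` dictionaries are invented toy data, and whitespace-separated word counts stand in as a rough proxy for tokens.

```python
from typing import Dict, List

# Hypothetical evidence store: evidence id -> text snippet (toy data).
evidence: Dict[str, str] = {
    "e1": "alpha " * 400,
    "e2": "beta " * 300,
    "e3": "gamma " * 500,
}

# Hypothetical outline: section title -> ids of the evidence it cites.
outline: Dict[str, List[str]] = {
    "Background": ["e1"],
    "Methods": ["e2", "e3"],
}


def word_count(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def one_shot_context_size() -> int:
    """One-shot generation places every piece of evidence in context at once."""
    return sum(word_count(text) for text in evidence.values())


def section_context_sizes() -> Dict[str, int]:
    """Section-wise writing loads only the evidence cited by each section."""
    return {
        title: sum(word_count(evidence[eid]) for eid in ids)
        for title, ids in outline.items()
    }


sizes = section_context_sizes()
largest_section = max(sizes.values())
```

With this toy data the one-shot context holds 1200 words, while the largest single-section context holds 800. The gap grows as the number of sections increases, since per-section context depends on the evidence cited by one section rather than on the whole corpus.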

3. Methodological Principles

WebWeaver’s methodology is explicitly aligned with human-centric research conduct:

  • Adaptive Planning: The planner alternates between search actions and outline refinement, so new insights immediately influence report structure.
  • Focused Synthesis: The writer generates each section with only its supporting evidence, avoiding distraction from unrelated material.
  • Memory Bank Management: All retrieved evidence (summaries, quotations, key data) is stored in a dedicated memory bank. For each subsection, targeted retrieval operations supply only what is needed.
  • Attentional Pruning: Upon section completion, the system clears extraneous evidence from context, maintaining model attentional fidelity.
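The memory bank and pruning principles above can be sketched as follows. This is a hedged illustration rather than the system's actual code: `MemoryBank`, `write_report`, and the stub `count_writer` are hypothetical names, and the stub stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MemoryBank:
    """Hypothetical store mapping citation ids to distilled evidence."""
    items: Dict[str, str] = field(default_factory=dict)

    def add(self, citation_id: str, text: str) -> None:
        self.items[citation_id] = text

    def retrieve(self, citation_ids: List[str]) -> List[str]:
        # Targeted retrieval: return only the requested evidence; unknown ids are skipped.
        return [self.items[cid] for cid in citation_ids if cid in self.items]


def write_report(
    outline: Dict[str, List[str]],
    bank: MemoryBank,
    write_section: Callable[[str, List[str]], str],
) -> List[str]:
    """Write each section with only its own evidence, then prune the context."""
    report: List[str] = []
    context: List[str] = []
    for title, citation_ids in outline.items():
        context.extend(bank.retrieve(citation_ids))
        # Pass a copy so the writer's view is unaffected by later pruning.
        report.append(write_section(title, list(context)))
        context.clear()  # attentional pruning: drop evidence once the section is done
    return report


def count_writer(title: str, evidence: List[str]) -> str:
    """Stub writer (assumption): a real system would call a language model here."""
    return f"{title}: {len(evidence)} evidence item(s)"


bank = MemoryBank()
bank.add("c1", "Summary of source 1")
bank.add("c2", "Quotation from source 2")
bank.add("c3", "Key data from source 3")

report = write_report({"Intro": ["c1"], "Results": ["c2", "c3"]}, bank, count_writer)
```

Because the context list is cleared after every section, the "Results" writer in this example sees two evidence items rather than three, which is the behavior the pruning principle describes.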

4. Empirical Performance Across OEDR Benchmarks

WebWeaver’s dual-agent, iterative design has demonstrated strong empirical performance:

  • DeepResearch Bench: Achieves state-of-the-art scores in comprehensiveness, insight, instruction-following, readability, and citation accuracy.
  • DeepConsult: Outperforms competitive baselines in both win rates and average scores for actionable consulting reports.
  • DeepResearchGym: Excels in metrics of depth, breadth, and support, which is attributed to systematic context pruning and targeted retrieval. The modular writing approach suppresses interference between sections.

These findings support the view that dynamic planning and focused synthesis are needed for reliable open-ended deep research.

5. Technical Implementation

The technical specifications central to WebWeaver include:

  • Planner Agent Actions: "search", "outline optimization", and "terminate", performed in a sequence of ($\tau$, $a$, $o$) iterations.
  • Writer Mechanisms: For each section, a "retrieve" action loads the relevant evidence into context before the "write" operation begins.
  • Memory Bank: Stores distilled representations of web-scale evidence. Lookups are citation-driven, maintaining grounding to underlying sources.
  • Context Management: By segmenting writing into subsections and introducing only the necessary evidence into context, the system avoids the overlong context windows that degrade LLM inference.
  • Optimization Operators: While explicit formulas such as $\arg\max$ and $\arg\min$ are not foregrounded, they are implicit in agent reasoning for best-evidence selection.
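The planner's action cycle listed above can be sketched as a simple loop. This is a minimal sketch under several assumptions: the scripted policy (search, then integrate, then terminate when no topics remain) replaces the LLM's reasoning, and `run_planner`, `fake_search`, and the citation-id scheme are hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Outline = Dict[str, List[str]]  # section title -> linked citation ids
Round = Tuple[str, str, str]    # (thought, action, observation)


def run_planner(
    topics: List[str],
    search: Callable[[str], Dict[str, str]],
    max_rounds: int = 20,
) -> Tuple[Outline, Dict[str, str], List[Round]]:
    """Alternate search and outline optimization until every topic is covered."""
    outline: Outline = {}
    memory: Dict[str, str] = {}       # citation id -> distilled evidence
    history: List[Round] = []
    pending = list(topics)
    fresh: Dict[str, List[str]] = {}  # evidence found but not yet in the outline

    for _ in range(max_rounds):
        if fresh:
            # Outline optimization: link newly found citations to their section.
            topic, ids = fresh.popitem()
            outline.setdefault(topic, []).extend(ids)
            history.append((f"Integrate evidence for {topic}",
                            "outline optimization",
                            f"{len(ids)} citation(s) linked"))
        elif pending:
            # Search: acquire evidence and store it in the memory bank.
            topic = pending.pop(0)
            results = search(topic)
            memory.update(results)
            fresh[topic] = list(results)
            history.append((f"Need evidence for {topic}", "search",
                            f"{len(results)} result(s)"))
        else:
            # Terminate: the final round has no observation.
            history.append(("Outline covers all topics", "terminate", ""))
            break
    return outline, memory, history


def fake_search(topic: str) -> Dict[str, str]:
    """Stub search tool (assumption): returns one made-up result per topic."""
    return {f"{topic}-1": f"Summary of a source about {topic}"}


outline, memory, history = run_planner(["planning", "writing"], fake_search)
actions = [action for _, action, _ in history]
```

In this run the action sequence is search, outline optimization, search, outline optimization, terminate, so each discovery is folded into the outline before the next search begins. A real planner would choose actions from its own reasoning rather than from a fixed script.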

6. Significance in AI-Assisted Deep Research

WebWeaver establishes the importance of dynamic, adaptive research workflows for high-quality output in contexts where static or unstructured generative agents fail. Its validated methodology — dual-agent design with interleaved planning and modular synthesis — sets a precedent for future open-ended research automation. Empirical evidence from major OEDR benchmarks demonstrates that mitigating long-context failures and hallucinations is paramount to robust, reliable, and granular content production.

This framework marks a key advancement toward human-centric, source-grounded research automation, with practical implications for scientific literature synthesis, consulting, and large-scale evidence-based reporting.
