WebWeaver: Dual-Agent Research Framework

Updated 19 September 2025

WebWeaver is a dual-agent framework that combines iterative planning with hierarchical synthesis to generate reliable, well-structured research reports.
It features a planner that continuously refines research outlines and a writer that selectively retrieves and synthesizes targeted evidence, reducing context overflow.
Empirical validation on open-ended deep research (OEDR) benchmarks shows improved citation accuracy and report quality, along with fewer long-context failures.

WebWeaver is a dual-agent framework designed for open-ended deep research (OEDR), in which AI agents must synthesize large volumes of web-scale evidence into reliable, well-structured reports. Addressing fundamental limitations in current research automation — specifically static pipelines and long-context failures — WebWeaver introduces an adaptive, human-centric methodology that interleaves planning, evidence acquisition, and hierarchical synthesis. The system is empirically validated on a range of open-ended deep research benchmarks, establishing new state-of-the-art results in report quality, reliability, and structure (Li et al., 16 Sep 2025).

1. Dual-Agent Architecture

WebWeaver’s architecture comprises two specialized agents: the planner and the writer.

Planner: Implements an iterative cycle of evidence acquisition and outline optimization. Rather than following a rigid plan fixed before evidence collection, the planner continuously searches for relevant sources, integrating each discovery back into an evolving outline. This results in a dynamic, citation-linked outline that reflects emerging evidence rather than static hypotheses.
Writer: Executes hierarchical retrieval and synthesis. The writer decomposes the report into manageable sections, retrieving only the necessary evidence for each part from a memory bank. Each section is written exclusively with the evidence that supports its specific content, greatly reducing context overflow and hallucination risk.

Formally, a complete agent trajectory is defined as $\mathcal{H}_T = (\tau_0, a_0, o_0, \dots, \tau_T, a_T)$, where round $i$ consists of a thought $\tau_i$, an action $a_i$, and an observation $o_i$. The final round $T$ ends with the action $a_T$ and therefore has no observation.
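The trajectory definition above can be mirrored in a small data structure. This is a minimal sketch for illustration only: the names `Step`, `Trajectory`, and `flatten` are hypothetical and do not come from the WebWeaver paper, and the example strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One round i of the trajectory: thought, action, optional observation."""
    thought: str                       # tau_i
    action: str                        # a_i
    observation: Optional[str] = None  # o_i; absent for the final round T


@dataclass
class Trajectory:
    """H_T = (tau_0, a_0, o_0, ..., tau_T, a_T)."""
    steps: List[Step] = field(default_factory=list)

    def add(self, thought: str, action: str, observation: Optional[str] = None) -> None:
        self.steps.append(Step(thought, action, observation))

    def flatten(self) -> List[str]:
        """Return the flat sequence, skipping the missing final observation."""
        flat: List[str] = []
        for step in self.steps:
            flat.extend([step.thought, step.action])
            if step.observation is not None:
                flat.append(step.observation)
        return flat


history = Trajectory()
history.add("need background", "search: dual-agent research", "3 pages found")
history.add("outline is complete", "terminate")  # final round has no observation
print(history.flatten())
```

Keeping the observation optional reflects the fact that the sequence ends on an action rather than an observation.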

2. Overcoming Deep Research Bottlenecks

Traditional OEDR systems suffer from two primary limitations: planning that is decoupled from evidence acquisition, and "one-shot" generation that presents all context to the model at once. The result is a frequent "lost in the middle" effect, in which critical evidence buried mid-context receives little attention, together with an increased risk of hallucination.

WebWeaver addresses these via:

Interleaved Planning and Acquisition: Rather than separating search and writing, the planner’s loop adaptively acquires new evidence and integrates it into the outline in real time.
Hierarchical Section-Wise Writing: Only the section-relevant evidence is retrieved from the memory bank, so long-context attention failures are mitigated.

This design ensures the report remains both comprehensive and strictly source-grounded at all levels.
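The interleaving of search and outline refinement can be sketched as a simple loop. This is an illustrative toy under stated assumptions, not the authors' implementation: the function `plan`, the `search` callback, the follow-up query format, and the stopping rule (stop when a search returns nothing new) are all hypothetical choices made for the example.

```python
from typing import Callable, Dict, List


def plan(
    question: str,
    search: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Dict[str, object]:
    """Alternate search and outline refinement until no new evidence appears."""
    memory_bank: Dict[int, str] = {}        # citation id -> evidence snippet
    outline: List[Dict[str, object]] = []   # entries: {"heading", "citations"}
    query = question
    for _ in range(max_rounds):
        # Keep only snippets that are not already stored.
        snippets = [s for s in search(query) if s not in memory_bank.values()]
        if not snippets:                    # "terminate": nothing new to integrate
            break
        ids = []
        for snippet in snippets:            # "search": store new evidence
            cid = len(memory_bank) + 1
            memory_bank[cid] = snippet
            ids.append(cid)
        # "outline optimization": add a citation-linked entry (assumed format)
        outline.append({"heading": query, "citations": ids})
        query = f"{question} follow-up {len(outline)}"
    return {"outline": outline, "memory_bank": memory_bank}


def toy_search(query: str) -> List[str]:
    """Hypothetical stand-in for a web search tool."""
    corpus = {
        "solar storage": ["battery costs fell", "grid demand rose"],
        "solar storage follow-up 1": ["battery costs fell", "new policy passed"],
    }
    return corpus.get(query, [])


result = plan("solar storage", toy_search)
print(result["outline"])
```

In this toy run the second search contributes only the one snippet that was not already stored, so the outline entry it creates cites that snippet alone.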

3. Methodological Principles

WebWeaver’s methodology is explicitly aligned with human-centric research conduct:

Adaptive Planning: The planner alternates between search actions and outline refinement, so new insights immediately influence report structure.
Focused Synthesis: The writer generates each section with only its supporting evidence, avoiding distraction from unrelated material.
Memory Bank Management: All retrieved evidence (summaries, quotations, key data) is stored in a dedicated memory bank. For each subsection, targeted retrieval operations supply only what is needed.
Attentional Pruning: Upon section completion, the system clears that section's evidence from context, so the model's attention stays on material relevant to the next section.
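Memory bank management and attentional pruning, as described in the list above, might look like the following. This is a hedged sketch: the `MemoryBank` class, its `add` and `retrieve` methods, the integer citation ids, and the evidence strings are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, List


class MemoryBank:
    """Stores evidence under citation ids and serves targeted lookups."""

    def __init__(self) -> None:
        self._items: Dict[int, str] = {}

    def add(self, evidence: str) -> int:
        """Store one piece of evidence and return its citation id."""
        cid = len(self._items) + 1
        self._items[cid] = evidence
        return cid

    def retrieve(self, citation_ids: List[int]) -> Dict[int, str]:
        """Return only the requested evidence; unknown ids are skipped."""
        return {cid: self._items[cid] for cid in citation_ids if cid in self._items}


bank = MemoryBank()
a = bank.add("Summary: storage costs fell over the decade.")   # placeholder text
b = bank.add("Quote: 'demand doubled'.")                        # placeholder text
c = bank.add("Data: 40 GW installed.")                          # placeholder text

context: Dict[int, str] = {}
context.update(bank.retrieve([a, c]))    # targeted retrieval for one subsection
in_context_during_write = sorted(context)
context.clear()                           # attentional pruning after the section
print(in_context_during_write, context)
```

The bank itself is never cleared; only the working context is, so evidence remains available if a later subsection needs it.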

4. Empirical Performance Across OEDR Benchmarks

WebWeaver’s dual-agent, iterative design has demonstrated strong empirical performance:

DeepResearch Bench: Achieves state-of-the-art scores in comprehensiveness, insight, instruction-following, readability, and citation accuracy.
DeepConsult: Outperforms competitive baselines in both win rates and average scores for actionable consulting reports.
DeepResearchGym: Scores highly on depth, breadth, and support, a result attributed to systematic context pruning and targeted retrieval. The modular writing approach suppresses interference between sections.

These findings support the importance of dynamic planning and focused synthesis for reliable open-ended deep research.

5. Technical Implementation

The technical specifications central to WebWeaver include:

Planner Agent Actions: "search", "outline optimization", and "terminate", performed in a sequence of $(\tau, a, o)$ iterations.
Writer Mechanisms: For each section, a "retrieve" action loads the relevant evidence into context before the "write" operation begins.
Memory Bank: Stores distilled representations of web-scale evidence. Lookups are citation-driven, maintaining grounding to underlying sources.
Context Management: By segmenting writing into subsections and exclusively introducing necessary evidence to context, the system avoids overlong context windows that plague LLM inference.
Optimization Operators: Explicit formulas such as $\arg\max$ and $\arg\min$ are not foregrounded, but they are implicit in the agents' reasoning when selecting the best evidence.
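The writer's retrieve-then-write cycle from the list above can be sketched as follows. This is an assumption-laden illustration rather than the paper's code: `write_report`, `toy_draft`, the outline format (heading plus citation ids), and the sample evidence are hypothetical, and `toy_draft` merely stands in for a language model call.

```python
from typing import Callable, Dict, List, Tuple


def write_report(
    outline: List[Tuple[str, List[int]]],
    memory_bank: Dict[int, str],
    draft: Callable[[str, Dict[int, str]], str],
) -> Tuple[str, List[str]]:
    """Write each section from its own evidence and log the actions taken."""
    sections: List[str] = []
    actions: List[str] = []
    for heading, citation_ids in outline:
        # "retrieve": pull only the evidence cited by this outline entry
        evidence = {cid: memory_bank[cid] for cid in citation_ids}
        actions.append(f"retrieve:{heading}")
        # "write": the drafting function sees this section's evidence only
        sections.append(draft(heading, evidence))
        actions.append(f"write:{heading}")
    return "\n\n".join(sections), actions


def toy_draft(heading: str, evidence: Dict[int, str]) -> str:
    """Stand-in for a language model call: cites each snippet by id."""
    body = " ".join(f"{text} [{cid}]" for cid, text in evidence.items())
    return f"{heading}\n{body}"


bank = {1: "Costs fell.", 2: "Demand rose.", 3: "Policy changed."}  # placeholder evidence
outline = [("Economics", [1, 2]), ("Regulation", [3])]
report, actions = write_report(outline, bank, toy_draft)
print(report)
```

Because a fresh `evidence` dictionary is built for every section, nothing retrieved for one section carries over into the next, which is the context-management property the list describes.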

6. Significance in AI-Assisted Deep Research

WebWeaver establishes the importance of dynamic, adaptive research workflows for high-quality output in contexts where static or unstructured generative agents fail. Its validated methodology — dual-agent design with interleaved planning and modular synthesis — sets a precedent for future open-ended research automation. Empirical evidence from major OEDR benchmarks demonstrates that mitigating long-context failures and hallucinations is paramount to robust, reliable, and granular content production.

This framework marks a key advancement toward human-centric, source-grounded research automation, with practical implications for scientific literature synthesis, consulting, and large-scale evidence-based reporting.

References

Li et al. (2025). WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research.
