
WebWeaver: Dual-Agent Research Framework

Updated 19 September 2025
  • WebWeaver is a dual-agent framework that combines iterative planning with hierarchical synthesis to generate reliable, well-structured research reports.
  • It features a planner that continuously refines research outlines and a writer that selectively retrieves and synthesizes target evidence, reducing context overflow.
  • Empirical validations on OEDR benchmarks demonstrate its effectiveness in enhancing citation accuracy, report quality, and mitigating long-context failures.

WebWeaver is a dual-agent framework designed for open-ended deep research (OEDR), in which AI agents must synthesize large volumes of web-scale evidence into reliable, well-structured reports. It targets two limitations of current research automation, static pipelines and long-context failures, with an adaptive, human-centric methodology that interleaves planning, evidence acquisition, and hierarchical synthesis. The system is evaluated on several open-ended deep research benchmarks, where it reports new state-of-the-art results in report quality, reliability, and structure (Li et al., 16 Sep 2025).

1. Dual-Agent Architecture

WebWeaver’s architecture comprises two specialized agents: the planner and the writer.

  • Planner: Implements an iterative cycle of evidence acquisition and outline optimization. Rather than following a rigid plan fixed before evidence collection, the planner continuously searches for relevant sources, integrating each discovery back into an evolving outline. This results in a dynamic, citation-linked outline that reflects emerging evidence rather than static hypotheses.
  • Writer: Executes hierarchical retrieval and synthesis. The writer decomposes the report into manageable sections, retrieving only the necessary evidence for each part from a memory bank. Each section is written exclusively with the evidence that supports its specific content, greatly reducing context overflow and hallucination risk.

Formally, a complete agent trajectory is defined as $\mathcal{H}_T = (\tau_0, a_0, o_0, \dots, \tau_T, a_T)$, where round $i$ includes a thought $\tau_i$, an action $a_i$, and an observation $o_i$.
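One way to picture this trajectory is as a list of (thought, action, observation) records whose final round carries no observation, matching the way $\mathcal{H}_T$ ends in $a_T$. The sketch below is illustrative only: the `Step` and `Trajectory` names and the example strings are assumptions, not identifiers from the WebWeaver paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Step:
    """One round i of the trajectory (hypothetical container)."""
    thought: str                       # tau_i
    action: str                        # a_i
    observation: Optional[str] = None  # o_i; absent in the final round T


@dataclass
class Trajectory:
    steps: List[Step] = field(default_factory=list)

    def add(self, thought: str, action: str,
            observation: Optional[str] = None) -> None:
        self.steps.append(Step(thought, action, observation))

    def flatten(self) -> Tuple[str, ...]:
        """Return (tau_0, a_0, o_0, ..., tau_T, a_T) as a flat tuple."""
        flat: List[str] = []
        for step in self.steps:
            flat.extend([step.thought, step.action])
            if step.observation is not None:
                flat.append(step.observation)
        return tuple(flat)


traj = Trajectory()
traj.add("need background sources", "search", "3 pages found")
traj.add("outline is complete", "terminate")  # final round: no observation
```

Flattening the two rounds above gives five elements, with the terminating action as the last entry.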

2. Overcoming Deep Research Bottlenecks

Traditional OEDR systems suffer from two primary limitations: planning is decoupled from evidence acquisition, and "one-shot" generation presents all context to the model at once. The result is a frequent "lost in the middle" effect, in which critical evidence buried in a long context is effectively ignored, along with increased hallucination risk.

WebWeaver addresses these via:

  • Interleaved Planning and Acquisition: Rather than separating search and writing, the planner’s loop adaptively acquires new evidence and integrates it into the outline in real time.
  • Hierarchical Section-Wise Writing: Only the section-relevant evidence is retrieved from the memory bank, so long-context attention failures are mitigated.

This design ensures the report remains both comprehensive and strictly source-grounded at all levels.
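The interleaved planner loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `plan` function, the `search` callback signature, the toy corpus, and the rule that each search result may suggest follow-up topics are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical search result: citation id -> (evidence text, follow-up topics)
SearchHits = Dict[str, Tuple[str, List[str]]]


def plan(query: str, search: Callable[[str], SearchHits], max_rounds: int = 5):
    """Alternate search and outline refinement until nothing is left to explore."""
    memory_bank: Dict[str, str] = {}    # citation id -> evidence text
    outline: Dict[str, List[str]] = {}  # section title -> citation ids
    frontier = [query]
    for _ in range(max_rounds):
        if not frontier:
            break                       # "terminate" action
        topic = frontier.pop(0)
        hits = search(topic)            # "search" action
        new_ids = []
        for cid, (text, follow_ups) in hits.items():
            if cid not in memory_bank:
                memory_bank[cid] = text
                new_ids.append(cid)
                frontier.extend(follow_ups)
        if new_ids:                     # "outline optimization" action
            outline.setdefault(topic, []).extend(new_ids)
    return outline, memory_bank


# Toy stand-in for web search (assumption, for illustration only).
TOY_WEB: Dict[str, SearchHits] = {
    "solar power": {"c1": ("Panel costs fell sharply.", ["solar storage"])},
    "solar storage": {"c2": ("Batteries smooth output.", [])},
}


def toy_search(topic: str) -> SearchHits:
    return TOY_WEB.get(topic, {})


outline, bank = plan("solar power", toy_search)
```

Each new piece of evidence is attached to the outline in the same round it is found, so the outline stays citation-linked while it grows. A real planner would let the model decide how to restructure the outline rather than appending one section per topic.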

3. Methodological Principles

WebWeaver’s methodology is explicitly aligned with human-centric research conduct:

  • Adaptive Planning: The planner alternates between search actions and outline refinement, so new insights immediately influence report structure.
  • Focused Synthesis: The writer generates each section with only its supporting evidence, avoiding distraction from unrelated material.
  • Memory Bank Management: All retrieved evidence (summaries, quotations, key data) is stored in a dedicated memory bank. For each subsection, targeted retrieval operations supply only what is needed.
  • Attentional Pruning: Upon section completion, the system clears extraneous evidence from context, maintaining model attentional fidelity.
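The memory bank and per-section pruning described above might look like the sketch below. The `MemoryBank` class, `write_report`, and `toy_writer` are hypothetical names introduced for illustration; in the actual system an LLM call would replace the toy writer.

```python
from typing import Callable, Dict, List, Tuple


class MemoryBank:
    """Stores evidence keyed by citation id (illustrative assumption)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def store(self, cid: str, text: str) -> None:
        self._items[cid] = text

    def retrieve(self, cids: List[str]) -> Dict[str, str]:
        # Targeted retrieval: return only the requested evidence.
        return {cid: self._items[cid] for cid in cids if cid in self._items}


def write_report(
    outline: Dict[str, List[str]],
    bank: MemoryBank,
    write_section: Callable[[str, Dict[str, str]], str],
) -> Tuple[str, int]:
    """Write section by section; return the report and the peak context size."""
    sections: List[str] = []
    peak_context = 0
    for title, cids in outline.items():
        context = bank.retrieve(cids)   # load only this section's evidence
        peak_context = max(peak_context, len(context))
        sections.append(write_section(title, context))
        context.clear()                 # prune before moving to the next section
    return "\n\n".join(sections), peak_context


def toy_writer(title: str, evidence: Dict[str, str]) -> str:
    # Stand-in for the model: quote each evidence item with its citation id.
    cited = " ".join(f"{text} [{cid}]" for cid, text in evidence.items())
    return f"{title}\n{cited}"


bank = MemoryBank()
bank.store("c1", "Panel costs fell sharply.")
bank.store("c2", "Batteries smooth output.")
bank.store("c3", "Grid upgrades lag demand.")
outline = {"Costs": ["c1"], "Storage and grid": ["c2", "c3"]}
report, peak = write_report(outline, bank, toy_writer)
```

Although the bank holds three items, at most two are in context at any time, which is the point of section-wise retrieval followed by pruning.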

4. Empirical Performance Across OEDR Benchmarks

WebWeaver’s dual-agent, iterative design has demonstrated strong empirical performance:

  • DeepResearch Bench: Achieves state-of-the-art scores in comprehensiveness, insight, instruction-following, readability, and citation accuracy.
  • DeepConsult: Outperforms competitive baselines in both win rates and average scores for actionable consulting reports.
  • DeepResearchGym: Excels in metrics of depth, breadth, and support, which is attributed to systematic context pruning and targeted retrieval. The modular writing approach also suppresses interference between sections.

These findings support the case for dynamic planning and focused synthesis in reliable open-ended deep research.

5. Technical Implementation

The technical specifications central to WebWeaver include:

  • Planner Agent Actions: "Search", "outline optimization", and "terminate" performed in a sequence of ($\tau$, $a$, $o$) iterations.
  • Writer Mechanisms: For each section, a "retrieve" action loads the relevant evidence into context before the "write" operation begins.
  • Memory Bank: Stores distilled representations of web-scale evidence. Lookups are citation-driven, maintaining grounding to underlying sources.
  • Context Management: By segmenting writing into subsections and exclusively introducing necessary evidence to context, the system avoids overlong context windows that plague LLM inference.
  • Optimization Operators: While explicit formulas like $\arg\max$ and $\arg\min$ are not foregrounded, they are implicit in agent reasoning for best-evidence selection.
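To make the implicit $\arg\max$ concrete, best-evidence selection can be written as choosing the stored item that maximizes a relevance score. The example below is purely illustrative: the source states that selection happens within agent reasoning, so the word-overlap score and the `overlap_score` and `best_evidence` functions are assumptions standing in for that judgment.

```python
from typing import Dict


def overlap_score(section_title: str, evidence_text: str) -> int:
    """Count words shared by the section title and the evidence (toy relevance)."""
    title_words = set(section_title.lower().split())
    text_words = set(evidence_text.lower().split())
    return len(title_words & text_words)


def best_evidence(section_title: str, bank: Dict[str, str]) -> str:
    """Return the citation id that maximizes the score (an explicit arg max)."""
    return max(bank, key=lambda cid: overlap_score(section_title, bank[cid]))


bank = {
    "c1": "solar panel costs fell",
    "c2": "battery storage costs and capacity",
    "c3": "wind turbine siting",
}
best = best_evidence("battery storage", bank)
```

Here `best` is `"c2"`, the only item sharing words with the section title. Because lookups return citation ids rather than raw text, the selected evidence stays tied to its source.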

6. Significance in AI-Assisted Deep Research

WebWeaver highlights the importance of dynamic, adaptive research workflows for high-quality output in settings where static or unstructured generative agents fail. Its methodology, a dual-agent design with interleaved planning and modular synthesis, sets a precedent for future open-ended research automation. Results on major OEDR benchmarks indicate that mitigating long-context failures and hallucinations is central to robust, reliable, and fine-grained content production.

This framework marks a key advancement toward human-centric, source-grounded research automation, with practical implications for scientific literature synthesis, consulting, and large-scale evidence-based reporting.

References (1)