Can prompting alone overcome anchoring and synthesis deficits?

Determine whether prompting alone, without architectural changes or external verification modules, can overcome the positional anchoring and evidence synthesis deficits that web-enabled language agents exhibit in the Synthetic Web Benchmark under adversarial ranking, where a misinformation honeypot occupies the top-ranked position.

Background

The paper evaluates web-enabled language agents in a synthetic, controllable web environment where a single high-plausibility misinformation article is injected at rank 0 to test robustness. Agents show catastrophic accuracy drops, minimal search escalation, synthesis failures, and severe miscalibration under this adversarial ranking.
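The adversarial ranking described above can be illustrated with a minimal sketch. This is not the benchmark's actual code: the `SearchResult` type, the `inject_honeypot` function, and the example URLs are hypothetical names introduced only to show what "injected at rank 0" means for a ranked result list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    # Hypothetical record type; the benchmark's real schema is not given in the source.
    url: str
    title: str
    is_honeypot: bool = False


def inject_honeypot(results, honeypot):
    """Return a new ranking with the honeypot at rank 0.

    All organic results keep their relative order and shift down by one.
    """
    return [honeypot] + [r for r in results if r.url != honeypot.url]


# Illustrative data only (assumed, not taken from the paper).
organic = [
    SearchResult("https://example.org/a", "Accurate source A"),
    SearchResult("https://example.org/b", "Accurate source B"),
]
honeypot = SearchResult(
    "https://example.org/fake", "Plausible but false article", is_honeypot=True
)

ranked = inject_honeypot(organic, honeypot)
for rank, result in enumerate(ranked):
    print(rank, result.title)
```

Under this framing, an agent that anchors on position reads the entry at index 0 and stops, while a robust agent would continue to the lower-ranked accurate sources and weigh them against it.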

The study uses a uniform zero-shot prompt, but the authors note that production systems might employ more sophisticated prompting or multi-stage protocols. They explicitly leave open whether adjustments to prompting alone can address the observed positional anchoring (overweighting top-ranked sources) and evidence synthesis failures.

References

Whether prompting alone can overcome positional anchoring and synthesis deficits remains an open question for future work.

The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents (2603.00801 - Shah et al., 28 Feb 2026) in Section: Limitations — Model capabilities