Effect of grounding scheme on observation representation trends

Determine whether the dependence of web-agent performance on observation representation (HTML versus accessibility tree) and thinking token budget, observed under id-based DOM element grounding, also holds under coordinate-based and other grounding schemes.

Background

All experiments in the paper employ id-based grounding, where actions reference DOM elements by identifiers for execution through Playwright.
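To make the distinction between grounding schemes concrete, the following minimal sketch contrasts an id-based click with a coordinate-based click. It is illustrative only and does not call Playwright: the names `IdClick`, `CoordClick`, `to_click_point`, the element id `e12`, and the `layout` dictionary of bounding boxes are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Hypothetical bounding box: (x, y, width, height) in viewport pixels.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class IdClick:
    """Id-based grounding: the agent names an element identifier."""
    element_id: str


@dataclass(frozen=True)
class CoordClick:
    """Coordinate-based grounding: the agent names a viewport point."""
    x: float
    y: float


Action = Union[IdClick, CoordClick]


def to_click_point(action: Action, layout: Dict[str, BBox]) -> Tuple[float, float]:
    """Resolve either action type to the point that would be clicked.

    For an id-based action, the executor looks the element up and clicks
    the center of its bounding box. For a coordinate-based action, the
    agent has already committed to a point, so no lookup takes place.
    """
    if isinstance(action, IdClick):
        x, y, w, h = layout[action.element_id]
        return (x + w / 2, y + h / 2)
    return (action.x, action.y)


# Assumed toy layout with a single element.
layout = {"e12": (100.0, 40.0, 80.0, 20.0)}

print(to_click_point(IdClick("e12"), layout))           # (140.0, 50.0)
print(to_click_point(CoordClick(140.0, 50.0), layout))  # (140.0, 50.0)
```

In this sketch the id-based action depends on the element identifier being present in the observation the agent reads, whereas the coordinate-based action depends on the agent localizing the target itself. That difference is one reason the observation-representation trends might not transfer unchanged between schemes.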

Many web-agent systems use alternative grounding schemes, such as coordinate-based grounding. The paper explicitly leaves open whether the performance trends observed for HTML versus accessibility-tree observations and for thinking token budgets persist under those schemes.
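One way to frame the question is as a factorial comparison in which grounding scheme is crossed with the two factors already studied. The sketch below only enumerates such a condition grid. The specific thinking budgets and all identifiers (`OBSERVATIONS`, `GROUNDINGS`, `THINKING_BUDGETS`, `condition_grid`) are assumptions chosen for illustration, not settings reported in the paper.

```python
from itertools import product

OBSERVATIONS = ("html", "accessibility_tree")
GROUNDINGS = ("id", "coordinate")
# Hypothetical thinking token budgets, chosen only for illustration.
THINKING_BUDGETS = (0, 1024, 4096)


def condition_grid():
    """Return every (observation, grounding, thinking budget) combination."""
    return [
        {"observation": o, "grounding": g, "thinking_budget": b}
        for o, g, b in product(OBSERVATIONS, GROUNDINGS, THINKING_BUDGETS)
    ]


grid = condition_grid()
print(len(grid))  # 12 conditions: 2 observations x 2 groundings x 3 budgets
print(grid[0])
```

Comparing the observation and budget trends within each grounding scheme, rather than pooling across schemes, would show whether the id-based pattern persists or changes.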

References

Our experiments are limited to id-based grounding; whether similar trends hold under coordinate-based or other grounding schemes is an open question.

Read More, Think More: Revisiting Observation Reduction for Web Agents (2604.01535 - Enomoto et al., 2 Apr 2026) in Limitation, Section: Grounding scheme