Mechanism of the HTML benefit and the role of CSS layout cues

Establish, via controlled ablation, whether CSS layout cues present in HTML (e.g., z-index determining stacking order) causally contribute to the performance improvements that higher-capability models show when given HTML instead of the accessibility tree, and identify any additional contributing factors.

Background

The paper’s error analysis suggests that richer HTML, which includes CSS layout information, helps stronger models reduce grounding errors (e.g., using z-index cues to avoid clicks that would be intercepted by overlapping elements).

However, this hypothesized mechanism has not been validated by ablation, and the authors note that additional factors may also influence the observed improvements.
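One way such an ablation could be set up is to feed the agent two versions of each HTML observation, one intact and one with the layout cues removed, and compare task performance. The following is a minimal sketch of the removal step only, not the authors' method. The function name `strip_layout_cues` and the property list `LAYOUT_PROPS` are hypothetical. The sketch assumes the cues appear in double-quoted inline `style` attributes, so it does not handle stylesheets, computed styles, or style values that themselves contain semicolons.

```python
import re

# Hypothetical list of properties to ablate; the paper only names z-index.
LAYOUT_PROPS = ("z-index",)

# Matches double-quoted inline style attributes (an assumption of this sketch).
_STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)


def strip_layout_cues(html, props=LAYOUT_PROPS):
    """Return html with the given CSS properties removed from inline styles."""
    drop = {p.lower() for p in props}

    def _clean(match):
        kept = []
        for decl in match.group(1).split(";"):
            name, sep, _value = decl.partition(":")
            # Keep well-formed declarations whose property is not ablated.
            if sep and name.strip().lower() not in drop:
                kept.append(decl.strip())
        return 'style="' + "; ".join(kept) + '"'

    return _STYLE_ATTR.sub(_clean, html)


if __name__ == "__main__":
    original = '<div style="z-index: 10; color: red">Overlay</div>'
    print(strip_layout_cues(original))
```

Running the same agent on the intact and the stripped observations would isolate the contribution of the removed properties. Extending `LAYOUT_PROPS` (for example with `position` or `display`) would test other candidate cues.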

References

Our error analysis suggests that CSS layout cues contribute to the HTML benefit, but this has not been directly verified through ablation and additional factors may be involved.

Read More, Think More: Revisiting Observation Reduction for Web Agents (2604.01535 - Enomoto et al., 2 Apr 2026) in Limitation, Section: Mechanism of HTML benefit