SusBench: Dark Pattern Benchmark
- SusBench is an online benchmark that systematically evaluates the susceptibility of LLM computer-use agents and humans to deceptive dark patterns on consumer websites.
- It employs controlled code injections and nine distinct dark pattern categories to simulate realistic manipulations across 55 consumer-oriented sites.
- Findings reveal comparable resilience in agents and humans, though both show marked vulnerability to covert patterns, with implications for regulation and interface design.
SusBench is an online benchmark designed to systematically evaluate the susceptibility of LLM computer-use agents (CUAs) to deceptive user interface patterns, known as dark patterns, on real-world consumer websites. The benchmark employs controlled code injections to synthesize realistic manipulative designs and measures the capacity of both agents and human users to resist these manipulations across hundreds of tasks. By providing an ecologically valid protocol, SusBench informs the development of CUA architectures, their deployment in consumer environments, and the design of regulatory or prevention strategies regarding the manipulation of autonomous agents.
1. Motivation and Scope
SusBench addresses the emerging concern that LLM-powered agents, which autonomously interact with consumer-facing web interfaces, may be vulnerable to manipulative UI elements intended to deceive, nudge, or exploit users. The benchmark is motivated by two converging trends: the prevalence of dark patterns in online commerce and services, and the increasing use of CUAs to act as proxies for human intent in environments such as online shopping, travel booking, and digital media. The core premise is that if CUAs cannot robustly avoid manipulative choices, their reliability and trustworthiness in autonomous decision-making are compromised.
The benchmark is structured to enable direct comparison between CUAs and human users, serving a dual function: evaluating agent resilience and providing reference points for regulatory intervention.
2. Taxonomy and Construction of Dark Patterns
SusBench operationalizes nine high-level dark pattern categories, consolidated from prior taxonomies and empirical UI studies:
- Disguised Ad
- False Hierarchy
- Preselection
- Pop-Up Ad
- Trick Wording
- Confirm Shaming
- Fake Social Proof
- Forced Action
- Hidden Information
Each pattern targets specific cognitive or interface-level vulnerabilities, ranging from visual obfuscation (Disguised Ad) to behavioral nudges exploiting default choices (Preselection) or semantic ambiguity (Trick Wording). Figure 1 in the source material exemplifies these injections, such as preselected upsells or confirmation dialogs employing misleading text.
To construct realistic manipulations, SusBench uses three code-level functions per injection:
| Function Type | Purpose | Methodology |
|---|---|---|
| Page Matching | Detects suitable web page contexts | HTML/CSS pattern recognition |
| Injection | Alters UI to synthesize pattern | Programmatic DOM modification |
| Evaluation | Checks whether avoidance occurred | Element selection auditing |
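The three per-injection functions can be illustrated with a minimal Python sketch. All class and field names here are hypothetical (the paper does not publish this API); the example uses a Preselection injection, matching pages by a simple HTML heuristic, inserting a preselected upsell, and auditing the final element state:

```python
from dataclasses import dataclass
import re

@dataclass
class PreselectionInjection:
    """Illustrative sketch of one SusBench-style injection (hypothetical API).

    Each injection bundles three functions: page matching, DOM-level
    injection, and evaluation of whether the pattern was avoided.
    """

    # Page matching: detect a suitable context via HTML pattern recognition.
    def matches(self, html: str) -> bool:
        # Hypothetical heuristic: pages containing a checkout form.
        return bool(re.search(r'<form[^>]*class="[^"]*checkout', html))

    # Injection: synthesize the dark pattern by modifying the page markup.
    def inject(self, html: str) -> str:
        upsell = ('<label><input type="checkbox" id="sus-upsell" checked>'
                  ' Add premium protection (+$4.99)</label>')
        return html.replace("</form>", upsell + "</form>", 1)

    # Evaluation: audit the final element state to decide avoidance.
    def avoided(self, final_form_state: dict) -> bool:
        # Avoidance means the preselected upsell was deselected before submit.
        return final_form_state.get("sus-upsell") is False
```

In the real benchmark these functions operate on live DOM state through the browser rather than on HTML strings, but the division of labor is the same.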
The avoidance rate is defined as

$$A = \frac{N_{\text{avoided}}}{N_{\text{total}}}$$

where $A$ quantifies successful resistance: the fraction of evaluation tasks in which the injected dark pattern is not acted upon.
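The avoidance rate described above can be aggregated per pattern category from logged task outcomes. A minimal sketch, assuming each log record carries a `pattern` label and an `avoided` flag (illustrative field names, not SusBench's actual schema):

```python
from collections import defaultdict

def avoidance_rates(results: list[dict]) -> dict[str, float]:
    """Per-category avoidance rate: avoided tasks / total tasks.

    Field names ('pattern', 'avoided') are hypothetical placeholders.
    """
    avoided = defaultdict(int)
    total = defaultdict(int)
    for r in results:
        total[r["pattern"]] += 1
        avoided[r["pattern"]] += int(r["avoided"])
    return {p: avoided[p] / total[p] for p in total}
```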
3. Website and Task Selection Protocol
The benchmark covers 55 consumer-oriented websites, sampled primarily from Online-Mind2Web and existing dark pattern datasets. Websites employing CAPTCHAs or blocking non-human interactions are excluded. The sample spans nine sectors (retail, apparel, lifestyle, etc.), targeting typical user processes such as product search, checkout, and order confirmation.
A total of 313 evaluation tasks are generated to ensure each dark pattern surfaces along the critical execution path. Tasks are seeded using prior benchmarks and LLM-generated variants, then manually validated for realism and robustness. Specific attention is paid to temporal generalization (dynamic date references) and style congruity, such that manipulations are indistinguishable from authentic site components.
A controlled Playwright browser environment and extension are leveraged for reliably loading and rendering both the manipulative designs and the evaluation logic.
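Injection via a Playwright-controlled browser can be sketched as follows. The helper names and the JS template are hypothetical (the paper's actual extension logic is more involved), but `add_init_script` is Playwright's standard mechanism for registering code that runs before any page script:

```python
def build_injection_script(selector: str, banner_html: str) -> str:
    """Return JS that appends markup to a host element after page load.

    Hypothetical helper; real injections also match page context first.
    """
    return (
        "window.addEventListener('load', () => {"
        f"  const host = document.querySelector('{selector}');"
        f"  if (host) host.insertAdjacentHTML('beforeend', `{banner_html}`);"
        "});"
    )

def run_injected_session(url: str, script: str) -> None:
    """Load a page with the injection registered (requires Playwright).

    Needs `pip install playwright` and `playwright install chromium`.
    """
    from playwright.sync_api import sync_playwright
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.add_init_script(script)  # registered before any page JS runs
        page.goto(url)
        browser.close()
```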
4. Human Perception Study
The paper presents a controlled study with 29 human participants, revealing that the dark pattern injections are perceived as highly realistic:
- 86% of users believe injected elements are part of the native website.
- Only ~14% express suspicion; even these cannot consistently discriminate injected from genuine dark patterns.
Qualitative analysis identifies that covert manipulations (e.g., Trick Wording, Hidden Information, Preselection) are more effective, correlating with user habits favoring rapid confirmation, pop-up dismissal, and minimal inspection of interface text or options.
The normalized avoidance rate formula, as above, formalizes human performance on SusBench.
5. Agent-Based Evaluation
Five state-of-the-art CUAs are assessed on SusBench, representing both “Browser Use” agents (processing screenshots and HTML structures) and vision-only agents. Benchmarked systems include variants based on GPT-4o, Anthropic’s LLMs, and a Browser Use agent with a GPT-5 backbone.
Aggregate findings:
- Human average dark pattern avoidance: 67.5%
- Best agent dark pattern avoidance: 68.3%
Both humans and agents demonstrate strong resilience (>85%) to overt patterns (False Hierarchy, Confirm Shaming, Forced Action), but are notably vulnerable (<15% avoidance) to covert patterns (Preselection, Trick Wording, Hidden Information).
Architecture-dependent effects are observed: Browser Use agents with access to HTML structure outperform vision-only models on tasks involving pop-up detection, whereas certain screenshot-based cues (e.g., small “Advertisement” labels) are more readily detected by vision-only agents. This points to design trade-offs in agent input representation.
6. Methodological Considerations and Ecological Validity
SusBench incorporates strict design criteria to enforce ecological validity in manipulation and evaluation:
- Each dark pattern injection is iteratively verified—first via LLM-generated code, then via human review.
- Robustness is ensured by dynamic references and authentic visuals congruent with each website’s native structure.
- The evaluation function operationalizes avoidance and non-avoidance, enabling precise measurement of susceptibility per pattern and per task.
By maintaining indistinguishability of manipulations, the benchmark circumvents confounding effects due to suspicion or experimental demand, supporting credible inferences about inherent vulnerabilities in agent and human navigation.
7. Implications, Applications, and Future Directions
The comparable vulnerability of agents and humans to dark patterns suggests several research and regulatory implications:
- Agent training protocols should incorporate explicit manipulation resistance policies—potentially via reinforcement or avoidance rewards rather than mere completion metrics.
- Structural input sources (beyond screenshots) are promising for increasing agent resilience. Future agent designs may benefit from multi-modal or HTML-aware processing strategies.
- CUAs may function as scalable proxies for regulatory audits of web design, allowing consumer agencies to assess manipulativeness without reliance on costly human studies.
The paper proposes that prioritization of covert patterns (Trick Wording, Hidden Information, Preselection) is warranted for both agent development and regulatory enforcement, due to their observed potency.
Future work may include expanding SusBench to new UI manipulations, simulating more varied human personas, and developing agent architectures capable of exceeding human robustness through advanced input fusion and training.
In summary, SusBench delivers a rigorously constructed, online evaluation suite that quantitatively benchmarks dark pattern susceptibility in both LLM agents and human users. Its injection methodology, coverage of pattern types and website sectors, and comparative findings on agent versus human resilience establish it as a foundational resource for research into safe, trustworthy autonomous web interactions and for auditing deceptive digital design practices (Guo et al., 13 Oct 2025).