AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents (2503.09780v2)

Published 12 Mar 2025 in cs.AI

Abstract: Autonomous AI agents that can follow instructions and perform complex multi-step tasks have tremendous potential to boost human productivity. However, to perform many of these tasks, the agents need access to personal information from their users, raising the question of whether they are capable of using it appropriately. In this work, we introduce a new benchmark AgentDAM that measures if AI web-navigation agents follow the privacy principle of data minimization''. For the purposes of our benchmark, data minimization means that the agent uses a piece of potentially sensitive information only if it isnecessary'' to complete a particular task. Our benchmark simulates realistic web interaction scenarios end-to-end and is adaptable to all existing web navigation agents. We use AgentDAM to evaluate how well AI agents built on top of GPT-4, Llama-3 and Claude can limit processing of potentially private information, and show that they are prone to inadvertent use of unnecessary sensitive information. We also propose a prompting-based defense that reduces information leakage, and demonstrate that our end-to-end benchmarking provides a more realistic measure than probing LLMs about privacy. Our results highlight that further research is needed to develop AI agents that can prioritize data minimization at inference time.

Summary

Privacy Leakage Evaluation for Autonomous Web Agents

The paper, "AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents," investigates privacy concerns in LLM-powered AI agents tasked with autonomous web navigation. LLM-powered AI agents hold promise for enhancing productivity by automating tasks and interactions on the web. However, these agents frequently require access to users’ sensitive information, increasing the risk of privacy breaches. This research proposes the AgentDAM benchmark to evaluate these risks and introduce methods for reducing privacy leakage, aiming to ensure adherence to the principle of data minimization.

Overview of Proposed Work

AgentDAM Benchmark: The paper introduces a benchmark that gauges an AI agent's ability to limit processing and exposure of sensitive information during web interactions. It utilizes simulated web scenarios that reflect realistic navigation tasks, allowing for comprehensive testing across diverse web applications.
Privacy Principle of Data Minimization: The benchmark centers on data minimization, defined as sharing private information only when necessary to fulfill a task-relevant purpose. This principle guides the agent to avoid unnecessary exposure of sensitive data.
Application to Web Agents: The benchmark is applied to AI agents developed using frameworks like GPT-4, Llama-3, and Claude, assessing their handling of potentially private information.

Key Contributions

The paper delivers significant contributions to AI privacy benchmarks:

Dataset of Web Interaction Scenarios: The authors compile a realistic dataset of web navigation tasks across platforms like GitLab, Shopping, and Reddit, integrating simulated private data for testing.
LLM-Based Privacy Evaluator: This evaluation mechanism analyzes the agents' trajectories, employing models like gpt-4o to detect and report privacy breaches.
Mitigation Strategies: Two prompting-based approaches are implemented. The more effective utilizes privacy-aware system prompts and CoT demonstrations which improve privacy performance, albeit at a slight reduction in task performance.

Experimental Results

Results indicate variable privacy performance across models, highlighting gaps:

Initial results without mitigation show privacy performance between 25–46% for GPT-based agents, indicating substantial leakage risks. However, Llama and Claude models demonstrate higher privacy awareness, reaching around 90% performance without mitigation.
Privacy mitigation strategies considerably bolster privacy performance across models, although they somewhat degrade task efficacy due to cautious behavior limiting data usage.

Implications and Future Directions

Practical Implications: As AI agents become more autonomous, integrating robust privacy measures is crucial. The AgentDAM benchmark offers a foundation for evaluating AI agents under realistic conditions, pushing for enhanced privacy safeguards in AI systems.
Theoretical Implications: The paper highlights the nuanced role of context in privacy breaches and showcases the potential for CoT reasoning in improving data protection within AI systems.
Future Research Directions: Further exploration could extend the benchmark's applicability to more diverse task types, web applications, and broader agent scenarios outside web interactions. Additionally, examining larger-scale mitigation techniques potentially aiding agent training at a foundational level could prove worthwhile.

This research emphasizes the necessity for safeguarding privacy in rapidly evolving AI systems, aiming to standardize benchmarks that ensure AI agents operate within an ethical framework akin to human privacy expectations.

Related Papers

Find Related Papers

Tweets

https://twitter.com/rsalakhu/status/1901024714803286369

https://twitter.com/kamalikac/status/1910641041578881066

https://twitter.com/kamalikac/status/1900693210256822483

https://twitter.com/ArmanZharmagam1/status/1902429526917050634