Poisoning Web-Scale Training Datasets is Practical (2302.10149v2)

Published 20 Feb 2023 in cs.CR and cs.LG

Abstract: Deep learning models are often trained on distributed, web-scale datasets crawled from the internet. In this paper, we introduce two new dataset poisoning attacks that intentionally introduce malicious examples into a model's training set. Our attacks are immediately practical and could, today, poison 10 popular datasets. Our first attack, split-view poisoning, exploits the mutable nature of internet content to ensure a dataset annotator's initial view of the dataset differs from the view downloaded by subsequent clients. By exploiting specific invalid trust assumptions, we show how we could have poisoned 0.01% of the LAION-400M or COYO-700M datasets for just $60 USD. Our second attack, frontrunning poisoning, targets web-scale datasets that periodically snapshot crowd-sourced content -- such as Wikipedia -- where an attacker only needs a time-limited window to inject malicious examples. In light of both attacks, we notify the maintainers of each affected dataset and recommend several low-overhead defenses.

Authors (9)
  1. Nicholas Carlini (101 papers)
  2. Matthew Jagielski (51 papers)
  3. Christopher A. Choquette-Choo (49 papers)
  4. Daniel Paleka (11 papers)
  5. Will Pearce (4 papers)
  6. Hyrum Anderson (7 papers)
  7. Andreas Terzis (23 papers)
  8. Kurt Thomas (15 papers)
  9. Florian Tramèr (87 papers)
Citations (146)

Summary

Poisoning Web-Scale Training Datasets is Practical

The paper "Poisoning Web-Scale Training Datasets is Practical," authored by Nicholas Carlini et al., introduces two novel dataset poisoning attacks designed to compromise deep learning models trained on web-scale datasets. The research investigates the vulnerabilities of distributed datasets, where massive amounts of data are automatically scraped from the internet. Due to the impracticality of manual curation at such a scale, these datasets are susceptible to adversarial actions, making the risk of poisoning quite tangible and highly relevant.

Attack Methodologies

The authors present two distinct poisoning strategies:

  1. Split-View Poisoning: This attack exploits the mutable nature of content on the internet. Because web-based datasets are indexed by URL, changes to the content at those URLs can create a discrepancy between what the curator saw at collection time and what later downloaders receive. By acquiring expired domains that appear in dataset indexes, an adversary can replace the content hosted at those URLs, poisoning every subsequent download. The paper shows that for an expenditure of merely $60, an attacker could poison 0.01% of datasets like LAION-400M. A minimal sketch of the precondition this attack relies on appears after this list.
  2. Frontrunning Poisoning: This attack targets datasets built from snapshots of dynamic online content, such as Wikipedia. Here, the adversary times the introduction of malicious examples to immediately precede snapshot collection, ensuring these examples are incorporated before moderation systems can revert the changes. This technique exploits predictable snapshotting schedules, making it feasible for an attacker to achieve their objectives with precise timing.
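To make the split-view mechanism concrete, the following is a minimal sketch (not code from the paper) of the precondition the attack relies on: scanning a URL-indexed dataset for domains that no longer resolve and could therefore be re-registered by an attacker. The index file name and column layout here are assumptions for illustration.

```python
# Hypothetical sketch: scan a (url, caption) dataset index for domains that
# no longer resolve. Such domains may be expired and purchasable, so content
# later served from them is attacker-controlled -- the split-view precondition.
import csv
import socket
from urllib.parse import urlparse

def unresolvable_domains(index_path):
    """Yield unique domains from the index that fail DNS resolution."""
    seen = set()
    with open(index_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            domain = urlparse(row["url"]).hostname
            if not domain or domain in seen:
                continue
            seen.add(domain)
            try:
                socket.gethostbyname(domain)
            except socket.gaierror:
                # Resolution failed: flag the domain for closer inspection.
                yield domain

if __name__ == "__main__":
    # "image_index.csv" is a placeholder for a URL-indexed dataset shard.
    for d in unresolvable_domains("image_index.csv"):
        print(d)
```

DNS failure alone does not prove a domain is purchasable, but it is the kind of low-cost signal that makes the attack surface easy to enumerate at dataset scale.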

Empirical Evaluation and Feasibility

The authors evaluate the attacks' practical feasibility on ten popular web-scale datasets and highlight the absence of integrity mechanisms, such as cryptographic hashes, that would otherwise prevent attackers from substituting adversarial content. They show how the trend toward large, automatically collected, and distributed datasets undermines their reliability. Moreover, because these datasets are re-downloaded frequently, content poisoned after the initial curation can still propagate widely across the academic and industrial landscape.

Defensive Strategies

In response to these vulnerabilities, the paper suggests two primary defenses:

  • Integrity Verification: This involves using cryptographic hashes to ensure that downloaded data matches the curator’s original snapshot, thus preventing split-view attacks; a minimal sketch follows this list. Despite its effectiveness, this approach may face practical challenges because benign content changes over time also invalidate hashes.
  • Timing-Based Defenses: These defenses address frontrunning poisoning by randomizing snapshot orders or leveraging delays that allow for moderator intervention before finalizing a snapshot.
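The following is a hedged sketch of the integrity-verification idea, assuming a simple per-URL digest scheme; the function names and workflow are illustrative, not the affected datasets' actual tooling.

```python
# Hypothetical sketch of integrity verification: the curator records a SHA-256
# digest per URL at collection time, and downloaders discard any content whose
# digest no longer matches the curator's view.
import hashlib

def record_digest(content: bytes) -> str:
    """Curator side: compute the digest stored alongside the URL in the index."""
    return hashlib.sha256(content).hexdigest()

def verify_download(content: bytes, expected_digest: str) -> bool:
    """Client side: accept the example only if it matches the recorded digest."""
    return hashlib.sha256(content).hexdigest() == expected_digest

# Usage: any change to the bytes -- a benign recompression or an attacker's
# substituted image -- fails verification, trading some benign loss for
# protection against split-view poisoning.
original = b"bytes observed at curation time"
later = b"bytes served when the dataset is re-downloaded"
digest = record_digest(original)
assert verify_download(original, digest)
assert not verify_download(later, digest)
```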

While these defenses can mitigate the poisoning risks, the authors comment on their limitations, especially regarding the management of dynamic and evolving datasets, noting that their application might require significant changes in the current data handling methodologies.
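To illustrate the timing-based idea in concrete terms, here is a minimal sketch assuming a fixed moderation window and a revision record that marks reverted edits; both are simplifying assumptions rather than Wikipedia's actual snapshot mechanism.

```python
# Hypothetical sketch of a delay-based timing defense against frontrunning:
# a revision enters the snapshot only if it has been live for a full
# moderation window and was never reverted by moderators.
from dataclasses import dataclass
from datetime import datetime, timedelta

MODERATION_WINDOW = timedelta(hours=24)  # assumed review delay, not a real parameter

@dataclass
class Revision:
    page: str
    created: datetime
    reverted: bool  # set if moderators rolled the edit back

def snapshot(revisions, snapshot_time):
    """Keep only revisions old enough to have been reviewed and not reverted."""
    return [
        r for r in revisions
        if not r.reverted and snapshot_time - r.created >= MODERATION_WINDOW
    ]
```

The window length governs the trade-off: longer delays give moderators more time to revert malicious edits but make snapshots staler.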

Implications and Future Directions

The research underscores the need for increased diligence in the collection and use of web-scale datasets. The lack of curation in these datasets presents tangible risks to model reliability and safety, marking data collection protocols as a critical area for improvement. In the broader context, the paper suggests that emerging AI systems must treat externally sourced training data as potentially adversarial rather than implicitly trusted. Future work may focus on building resilient systems that maintain integrity under adversarial conditions, for example by exploring consensus-based collection techniques or distributed validity checks inspired by blockchain technologies.

Ultimately, this work not only sheds light on the feasibility of dataset poisoning but also calls for a reassessment of the assumptions underpinning the current large-scale data practices. Researchers and practitioners alike must be aware of the potential pitfalls these systemic vulnerabilities introduce, as well as the steps needed to ensure robust, safe, and ethical AI development.
