Analysis of "Attack of the Tails: Yes, You Really Can Backdoor Federated Learning"
This paper rigorously investigates the vulnerability of Federated Learning (FL) systems to backdoor attacks and challenges the notion that FL can be robustly protected from such threats. It highlights how easily adversaries can introduce targeted misclassifications into FL models through backdoors, focusing especially on "edge-case backdoors."
Key Contributions
The core contributions of the paper can be summarized as follows:
- Theoretical Insights:
- The paper establishes a theoretical connection between vulnerability to adversarial examples and susceptibility to backdoor attacks. Specifically, it shows that if a model is vulnerable to inference-time adversarial examples, then training-time backdoor attacks are essentially unavoidable (an informal formalization is sketched after this list).
- It also establishes the computational difficulty of detecting backdoors by linking the detection problem to NP-hard decision problems, reinforcing the challenge of building robust defenses.
- Edge-Case Backdoors:
- The paper introduces the concept of "edge-case backdoors," which target inputs that lie on the tails of the input distribution: naturally occurring examples rather than out-of-distribution ones. The authors argue that these are particularly difficult to detect and defend against because they are rare or absent in the training data (a sketch of such an attack follows this list).
- It details how such backdoors can be mounted across a variety of tasks, such as image and sentiment classification, highlighting their broad applicability and the breadth of the threat.
- Empirical Evidence:
- Through experiments across multiple datasets and tasks (including CIFAR-10, ImageNet, and sentiment analysis), the paper demonstrates that these attacks are feasible and persistent even in the presence of state-of-the-art defense mechanisms such as differential privacy, robust aggregation, and norm clipping (a sketch of the clipping-and-noise defense follows this list).
- The paper also observes that reducing model capacity can lower the risk of backdoors, although often at the cost of model accuracy on the main task.
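To make the theoretical connection between adversarial examples and edge-case backdoors concrete, here is an informal formalization. It is a paraphrase for illustration only: the notation and the constants ε, p, δ are assumptions of this summary, not the paper's exact definitions or theorem statement.

```latex
% Informal paraphrase (notation is illustrative, not the paper's exact statement).
% f_\theta : trained global model;  \mathcal{D} : benign data distribution;
% "X \approx x" : X falls in a small neighborhood of x.
\begin{align*}
&\textbf{Adversarial vulnerability:}\quad
  \exists\, x' \ \text{with}\ \|x' - x\| \le \epsilon
  \ \text{and}\ f_\theta(x') \ne f_\theta(x).\\
&\textbf{Edge-case inputs:}\quad
  \mathcal{D}_{\mathrm{edge}} \subseteq \{\, x : \Pr_{X \sim \mathcal{D}}[X \approx x] \le p \,\}
  \quad \text{(rare, yet in-distribution).}\\
&\textbf{Informal claim:}\ \text{if } f_\theta \text{ is adversarially vulnerable, an attacker can find } \theta' \approx \theta \text{ with}\\
&\qquad f_{\theta'}(x) = y_{\mathrm{target}} \ \ \forall x \in \mathcal{D}_{\mathrm{edge}},
  \qquad
  \Pr_{(x,y) \sim \mathcal{D}}\big[f_{\theta'}(x) \ne f_{\theta}(x)\big] \le \delta.
\end{align*}
```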
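To illustrate how an edge-case backdoor can be mounted in its simplest, data-poisoning-only form, the following is a minimal sketch of a compromised client's data preparation. The function and parameter names (`build_poisoned_dataset`, `poison_frac`, `target_label`) are hypothetical, and the mixing strategy is an assumption based on the attack as described above, not the paper's code.

```python
import numpy as np

def build_poisoned_dataset(local_x, local_y, edge_x, target_label, poison_frac=0.5):
    """Mix a compromised client's clean local data with rare, naturally occurring
    'edge-case' inputs relabeled to the attacker's target class.
    (Hypothetical helper: names and the mixing ratio are illustrative.)"""
    n_poison = int(poison_frac * len(edge_x))
    poison_x = edge_x[:n_poison]
    poison_y = np.full(n_poison, target_label)            # mislabel edge-case inputs
    mixed_x = np.concatenate([local_x, poison_x], axis=0)
    mixed_y = np.concatenate([local_y, poison_y], axis=0)
    perm = np.random.permutation(len(mixed_x))            # shuffle clean and poisoned samples
    return mixed_x[perm], mixed_y[perm]
```

The compromised client then trains locally on this mixture and submits its update like any honest participant; because the poisoned inputs are rare in benign clients' data, the resulting misclassifications are hard to notice from main-task accuracy alone. Stronger variants discussed in the paper additionally constrain the malicious update (e.g., keeping it close to the global model) to evade norm-based defenses.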
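For context on the defenses mentioned above, here is a minimal sketch of FedAvg-style server aggregation with norm clipping and optional Gaussian noise (the "weak differential privacy" style of defense). The function name and default values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def aggregate_with_clipping_and_noise(global_w, client_ws, clip_norm=1.0, noise_std=0.0):
    """Average client updates after bounding each update's l2 norm, then optionally
    add Gaussian noise (a sketch of norm clipping + weak DP; defaults are illustrative)."""
    updates = [w - global_w for w in client_ws]                       # per-client model deltas
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))  # bound each delta's norm
               for u in updates]
    avg_update = np.mean(clipped, axis=0)                             # uniform FedAvg average
    if noise_std > 0:
        avg_update = avg_update + np.random.normal(0.0, noise_std, size=avg_update.shape)
    return global_w + avg_update
```

The paper's empirical point is that such defenses do not stop edge-case backdoors: an attacker can keep its update within the clipping bound and still embed the backdoor, since the targeted inputs barely affect main-task accuracy.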
Implications and Future Directions
The implications of this work are significant for both theoretical and practical aspects of FL systems:
- Theoretical Impact:
- By connecting backdoor resistance to adversarial robustness, the paper suggests that the longstanding challenge of defending against adversarial examples extends into the field of FL, and resolving it may be a prerequisite for safeguarding against sophisticated backdoor attacks.
- Practical Considerations:
- FL systems in fields like finance, healthcare, and autonomous vehicles, where fairness and model integrity are critical, are shown to be at risk. This necessitates a reevaluation of defense mechanisms currently thought to be adequate.
- The paper raises issues of fairness in FL systems, as stringent defenses against backdoors can inadvertently filter out data diversity, thus impacting model fairness.
- Research Directions:
- Future research may explore smaller-capacity models that balance fairness, accuracy, and robustness, as well as new aggregation strategies that can adaptively mitigate edge-case backdoors while preserving privacy.
- Continued theoretical work on the landscape of adversarial and backdoor vulnerabilities will also be vital.
Overall, this paper serves as a compelling reminder of the vulnerabilities that exist within federated models and emphasizes the need for continued innovation in designing robust FL frameworks.