How To Backdoor Federated Learning
Overview
The paper "How To Backdoor Federated Learning" by Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov introduces a novel vulnerability in Federated Learning (FL) systems, termed "model poisoning." This particular attack enables malicious participants to insert a backdoor into the joint model, reminiscent of but significantly more potent than traditional data poisoning attacks. The efficacy of the model replacement strategy is demonstrated through rigorous experimentation on standard FL tasks such as image classification and word prediction. The paper primarily challenges the robustness of FL under adversarial conditions, addressing the limitations of existing defenses and secure aggregation.
Key Contributions
- Model Poisoning Attack: The authors propose a model replacement attack that exploits the federated averaging step: a malicious participant scales up its local update so that, after aggregation, the joint model is effectively replaced with the attacker's backdoored model, giving the attacker control over the model's behavior on attacker-chosen backdoor subtasks (a minimal sketch of the scaling step follows this list).
- Evaluation of Attack Efficacy: The attack's effectiveness is evaluated on CIFAR-10 image classification and word prediction over a Reddit-based corpus. The results show that a single adversarial participant selected in a single round can drive the joint model to near-perfect accuracy on the backdoor task, with minimal impact on main-task accuracy.
- Comparison with Traditional Data Poisoning: The paper contrasts the proposed model poisoning attack with traditional data poisoning, demonstrating the former's superior performance. For instance, in a word-prediction task with 80,000 participants, compromising just eight participants suffices to achieve 50% backdoor accuracy. In contrast, data poisoning requires 400 malicious participants for similar performance.
- Secure Aggregation and Defenses: The paper critically examines the implications of secure aggregation in FL, arguing that it cannot prevent model poisoning: because individual updates are hidden from the aggregator, anomalous contributions cannot even be inspected, let alone rejected. The authors further develop and evaluate a "constrain-and-scale" technique that folds an evasion term into the attacker's training objective, allowing the attack to slip past anomaly-detection defenses (also sketched after this list).
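A minimal sketch of the model replacement step referenced above, assuming PyTorch-style state dicts and hypothetical helper names; the attacker is assumed to have already fine-tuned a backdoored model from the current global model. This illustrates the scaling idea rather than reproducing the authors' exact implementation:

```python
import copy

def craft_model_replacement_update(global_state, backdoored_state,
                                    n_participants, global_lr):
    """Scale the attacker's backdoored model X so that federated averaging
    approximately replaces the joint model with X (single-shot attack).

    Submits L_m = gamma * (X - G_t) + G_t with gamma = n / eta, assuming the
    benign participants' updates roughly cancel out near convergence.
    Assumes floating-point parameter tensors.
    """
    gamma = n_participants / global_lr
    malicious_state = copy.deepcopy(global_state)
    for name, g_param in global_state.items():
        x_param = backdoored_state[name]
        malicious_state[name] = gamma * (x_param - g_param) + g_param
    return malicious_state

# Hypothetical usage: the attacker fine-tunes the current global model on a
# mix of clean and backdoor data to obtain `backdoored_model`, then submits
# the scaled result as its local model for this round.
# malicious_state = craft_model_replacement_update(
#     global_model.state_dict(), backdoored_model.state_dict(),
#     n_participants=80_000, global_lr=1.0)
```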
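The "constrain-and-scale" evasion can likewise be sketched as an extra term in the attacker's training loss. The anomaly term below is assumed to be the squared L2 distance to the global model; the paper's formulation allows any anomaly metric the defender might use, and `alpha` here is a hypothetical default that trades backdoor strength against stealth:

```python
import torch

def constrain_and_scale_loss(attacker_model, global_model, class_loss, alpha=0.7):
    """Attacker-side objective: alpha * L_class + (1 - alpha) * L_anomaly.

    L_class is the usual loss on the attacker's mix of clean and backdoor
    data; the anomaly term is the squared L2 distance to the current global
    model, standing in for whatever metric an anomaly-detecting aggregator
    might apply.
    """
    l2_dist = sum(
        torch.sum((p - g.detach()) ** 2)
        for p, g in zip(attacker_model.parameters(), global_model.parameters())
    )
    return alpha * class_loss + (1.0 - alpha) * l2_dist
```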
Implications and Future Directions
Practical Implications: The findings bear directly on the trustworthiness and reliability of FL systems, particularly in deployments that train on highly sensitive data. By exposing FL's vulnerability to model replacement by even a single compromised participant, the paper underscores that current secure aggregation methods are insufficient for ensuring model integrity, calling for a reevaluation of FL's robustness against adversarial manipulation.
Theoretical Implications: The work opens new theoretical avenues for improving FL's resistance to adversarial attacks, highlighting the need for aggregation and anomaly-detection methods that preserve privacy while also ensuring model integrity. In particular, the discussion of why Byzantine-tolerant aggregation falls short in non-i.i.d. settings motivates the design of robust, privacy-preserving FL mechanisms.
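For concreteness, one widely used Byzantine-tolerant rule of the kind discussed here is coordinate-wise median aggregation, sketched below (this is a standard rule from the robust-aggregation literature, not a method from this paper). In non-i.i.d. FL, such a rule can discount unusual but benign updates along with malicious ones, which is part of the paper's argument against it:

```python
import torch

def coordinate_wise_median(updates):
    """Aggregate flattened participant updates by taking the per-coordinate
    median, a common Byzantine-tolerant alternative to plain averaging."""
    stacked = torch.stack(updates)   # shape: (num_participants, num_parameters)
    return stacked.median(dim=0).values
```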
Future Research: The results suggest several promising directions for future research:
- Enhanced Aggregation Mechanisms: Developing aggregation techniques that can detect and mitigate model-poisoning without compromising the participants' privacy.
- Robust Federated Learning Protocols: Designing FL protocols resilient to single-point compromises, perhaps through layered defense strategies that combine anomaly detection with update clipping and noise addition (a sketch of one such clipping-plus-noise aggregator follows this list).
- Adversarial-Resilient Models: Exploring model architectures and training procedures inherently more resistant to backdoor attacks.
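As an illustration of the layered defenses suggested above, a server-side aggregator that clips each update's norm and adds Gaussian noise might look roughly as follows. This is a sketch of a commonly discussed mitigation against scaled (model replacement) updates, not a defense proposed in the paper, and the parameter values are placeholders:

```python
import torch

def clip_and_noise_aggregate(global_params, updates, clip_norm=1.0, noise_std=0.01):
    """Average flattened update vectors (L_i - G_t) after clipping each one's
    L2 norm, then add Gaussian noise before applying the result to the global
    parameters. clip_norm and noise_std are illustrative values only.
    """
    clipped = []
    for delta in updates:
        scale = torch.clamp(clip_norm / (delta.norm() + 1e-12), max=1.0)
        clipped.append(delta * scale)
    mean_update = torch.stack(clipped).mean(dim=0)
    noisy_update = mean_update + noise_std * torch.randn_like(mean_update)
    return global_params + noisy_update
```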
Conclusion
The paper compellingly demonstrates that federated learning, despite its privacy-preserving advantages, is vulnerable to potent model-poisoning attacks. This research challenges the current paradigms in distributed machine learning, urging the community to rethink the design and deployment of FL systems in adversarial settings. The proposed model replacement methodology sets a critical benchmark for evaluating the security of FL and outlines a clear path for future advancements in secure machine learning.