Mitigating Sybils in Federated Learning Poisoning (1808.04866v5)

Published 14 Aug 2018 in cs.LG, cs.CR, cs.DC, and stat.ML

Abstract: Machine learning (ML) over distributed multi-party data is required for a variety of domains. Existing approaches, such as federated learning, collect the outputs computed by a group of devices at a central aggregator and run iterative algorithms to train a globally shared model. Unfortunately, such approaches are susceptible to a variety of attacks, including model poisoning, which is made substantially worse in the presence of sybils. In this paper we first evaluate the vulnerability of federated learning to sybil-based poisoning attacks. We then describe \emph{FoolsGold}, a novel defense to this problem that identifies poisoning sybils based on the diversity of client updates in the distributed learning process. Unlike prior work, our system does not bound the expected number of attackers, requires no auxiliary information outside of the learning process, and makes fewer assumptions about clients and their data. In our evaluation we show that FoolsGold exceeds the capabilities of existing state of the art approaches to countering sybil-based label-flipping and backdoor poisoning attacks. Our results hold for different distributions of client data, varying poisoning targets, and various sybil strategies. Code can be found at: https://github.com/DistributedML/FoolsGold

Citations (448)

Summary

  • The paper presents FoolsGold, a defense framework that leverages cosine similarity of client updates to dynamically adjust learning rates and thwart sybil-based poisoning attacks.
  • FoolsGold requires no pre-defined attacker counts or strict client data assumptions, offering a flexible solution in diverse federated learning environments.
  • Empirical evaluations on datasets like MNIST and VGGFace2 demonstrate its robustness, maintaining low attack success rates even under extreme sybil conditions.

Mitigating Sybils in Federated Learning Poisoning

Federated learning is a major advance in ML, enabling distributed model training across many devices while keeping raw data on-device. This paper highlights a significant vulnerability in such systems: susceptibility to sybil-based poisoning attacks. The authors present FoolsGold, a defense mechanism that mitigates this risk by leveraging the diversity of client updates. The approach requires no predefined bound on the number of attackers, no auxiliary information outside the learning process, and no strict assumptions about client data distributions, making it well suited to federated learning environments.

Sybil Attacks in Federated Learning

Federated learning allows clients to train local models and share updates with a central aggregator, minimizing data transfer and enhancing privacy. However, this setup also permits malicious clients to perform model poisoning, especially when sybils—colluding adversarial nodes—participate. The sybils can execute attacks like label-flipping and backdoor poisoning, significantly altering model predictions by manipulating shared updates.
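
To ground the threat model, here is a toy label-flipping poisoner in the spirit of the paper's canonical attack (e.g., relabeling MNIST 1s as 7s). This is a hedged illustration, not the authors' code: the function name, parameters, and the specific source/target classes are assumptions for the example. In an actual attack, each colluding sybil would train its local model on such a poisoned copy and submit the resulting update to the aggregator.

```python
import numpy as np

def flip_labels(y, source=1, target=7, fraction=1.0, seed=0):
    """Toy label-flipping poisoner (illustrative sketch, not the paper's code).

    Relabels a fraction of `source`-class examples as `target` before local
    training; each sybil trains on this poisoned copy and sends the resulting
    model update to the central aggregator.
    """
    rng = np.random.default_rng(seed)
    y_poisoned = y.copy()
    idx = np.flatnonzero(y == source)          # indices of the source class
    n_flip = int(fraction * len(idx))          # how many labels to flip
    y_poisoned[rng.choice(idx, size=n_flip, replace=False)] = target
    return y_poisoned
```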

FoolsGold: A Novel Defense

FoolsGold counteracts sybil-based attacks by dynamically adapting per-client learning rates based on the similarity of client contributions. The method rests on the observation that sybils, sharing a malicious objective, tend to submit more homogeneous updates than honest clients; a minimal sketch of the resulting re-weighting pipeline follows the feature list below.

Key Features of FoolsGold:

  1. Cosine Similarity Measurement: FoolsGold assesses the angular similarity between updates, focusing on indicative model features. This choice circumvents issues that might arise from simple magnitude comparisons, which adversarial nodes could easily manipulate.
  2. Adaptive Learning Rates: By adjusting learning rates according to inter-client update similarity, FoolsGold systematically reduces the influence of similar updates indicative of sybil behavior.
  3. Historical Update Analysis: Incorporating historical data allows for a robust assessment of contribution patterns, enabling the identification of consistent similarities among sybils over multiple iterations.
  4. Pardoning Mechanism: This feature ensures genuine updates are not mistakenly penalized due to transient similarities with malicious contributions.
  5. Logit Transformation: By transforming similarity scores via the logit function, FoolsGold effectively dilutes the influence of adversarial updates as their apparent similarity strengthens.
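
The following is a minimal sketch of this re-weighting pipeline under stated simplifications: it operates on each client's full historical update vector (omitting the paper's feature-importance filtering), and the names `histories`, `k`, and `eps` are assumptions of this sketch rather than the authors' identifiers. It is not the reference implementation linked in the abstract.

```python
import numpy as np

def foolsgold_weights(histories, k=1.0, eps=1e-8):
    """Per-client weights from historical updates (simplified sketch).

    histories: (n_clients, n_params) array; row i is the running sum of
    client i's updates so far. Returns weights in [0, 1] used to scale
    each client's contribution at aggregation time.
    """
    n = histories.shape[0]
    # 1. Pairwise cosine similarity between clients' historical updates.
    unit = histories / (np.linalg.norm(histories, axis=1, keepdims=True) + eps)
    cs = unit @ unit.T
    np.fill_diagonal(cs, 0.0)

    # 2. Pardoning: an honest client that merely resembles a sybil's direction
    #    has its similarity to that sybil scaled down.
    v = cs.max(axis=1)
    for i in range(n):
        for j in range(n):
            if v[j] > v[i]:
                cs[i, j] *= v[i] / (v[j] + eps)

    # 3. Adaptive weight: the more a client resembles some other client,
    #    the smaller its weight.
    alpha = 1.0 - cs.max(axis=1)
    alpha = np.clip(alpha / (alpha.max() + eps), eps, 1.0 - eps)

    # 4. Logit transform: pushes near-duplicate (sybil-like) clients toward 0
    #    and dissimilar clients toward 1, with confidence parameter k.
    alpha = k * (np.log(alpha / (1.0 - alpha)) + 0.5)
    return np.clip(alpha, 0.0, 1.0)


def aggregate(global_w, deltas, histories):
    """One aggregation step: scale each client's update by its weight."""
    weights = foolsgold_weights(histories)
    return global_w + np.sum(weights[:, None] * deltas, axis=0)
```

In this sketch, an honest client with no near-duplicate peers keeps a weight close to 1, while a cluster of sybils pushing the same poisoned direction collapses toward 0, which is the behavior the feature list above describes.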

Evaluation and Results

The empirical assessment of FoolsGold across diverse datasets, including MNIST, VGGFace2, KDDCup99, and Amazon Reviews, demonstrates its efficacy. FoolsGold effectively mitigates a range of poisoning attacks while preserving model accuracy, even under conditions of significant sybil proliferation (up to 99%).

Additionally, the evaluation underscores FoolsGold’s robustness to variations in client data distributions and attack methodologies, including mixed-data strategies and adaptive update frequencies. Despite the increased noise and variance introduced by these tactics, FoolsGold maintains a low attack success rate.

Implications and Future Directions

This research marks a substantial step toward strengthening federated learning security, addressing a critical gap in current defenses, which often fail to adapt to dynamic attacker strategies. The insight of leveraging update diversity offers a promising direction for building resilient machine learning frameworks.

Future work can explore integrating FoolsGold with other robust learning architectures and adapting it to emerging federated learning scenarios requiring secure, scalable, and privacy-preserving solutions.

In summary, FoolsGold provides a pragmatic approach to safeguarding federated learning systems against sybil-based poisoning, setting a foundation for ongoing innovation in the secure deployment of decentralized machine learning models.
