Survey on Federated Learning Threats: Concepts, Taxonomy on Attacks and Defences, Experimental Study, and Challenges
The paper offers a comprehensive examination of threats within Federated Learning (FL), a machine learning paradigm that enables collaborative model training without centralizing data, addressing privacy concerns increasingly relevant in fields like healthcare and finance. The paper meticulously categorizes adversarial attacks and corresponding defense strategies, underscoring the vulnerabilities specific to FL and offering insights into developing more robust systems.
Federated Learning and Its Challenges
FL was conceived to address a central challenge in AI: preserving data privacy while still exploiting large, distributed data sources. The paper outlines the core goal of FL, constructing a global model collaboratively while ensuring that the local datasets held by distributed clients are never shared directly. Despite these advantages, FL models remain susceptible to adversarial attacks that threaten both the integrity of the trained model and the privacy of the underlying data.
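To make this collaborative setup concrete, the following is a minimal sketch of the federated averaging idea on which such systems are typically built. The linear model, local least-squares update, and weighting scheme are simplifying assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def local_update(global_weights, client_data, lr=0.01, epochs=1):
    """Hypothetical local training step: a client refines the global
    weights on its own data and returns only the updated weights."""
    w = global_weights.copy()
    X, y = client_data
    for _ in range(epochs):
        # Gradient of a simple least-squares objective, standing in
        # for the client's real local optimizer.
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(global_weights, clients, rounds=10):
    """The global model improves over several communication rounds.
    Raw client data never leaves the clients; only weights are shared."""
    for _ in range(rounds):
        client_weights = [local_update(global_weights, data) for data in clients]
        sizes = np.array([len(data[1]) for data in clients], dtype=float)
        # Weighted average of client models, proportional to local data size.
        global_weights = np.average(client_weights, axis=0, weights=sizes)
    return global_weights
```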
Taxonomy of Threats and Defenses
The paper distinguishes between two primary categories of adversarial attacks:
- Attacks on the Federated Model: Strategies aimed at corrupting the FL model's efficacy, further divided into:
  - Training-time attacks: Performed during model training, these attacks inject poisoned data or manipulated model updates into the aggregation process.
  - Inference-time attacks: Conducted after training, these craft adversarial inputs that cause the deployed model to misclassify.
- Privacy Attacks: These aim to extract private information from the shared model updates or the aggregated global model, despite FL's intended privacy guarantees.
For each attack type, the paper provides an exhaustive taxonomy reflecting the attack's characteristics: objective (targeted or untargeted), scope (data or model poisoning), and frequency (one-shot or continual).
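To ground the taxonomy, the sketch below illustrates one data-poisoning attack in both a targeted and an untargeted form: label flipping. The class indices and flipping rule are illustrative assumptions, not values drawn from the paper.

```python
import numpy as np

def targeted_label_flip(labels, source_class=1, target_class=7):
    """Targeted poisoning: relabel every example of one class as another,
    so the global model learns to confuse that specific pair."""
    poisoned = labels.copy()
    poisoned[labels == source_class] = target_class
    return poisoned

def untargeted_label_flip(labels, num_classes=10, rng=None):
    """Untargeted poisoning: replace each label with a random wrong class,
    aiming only to degrade overall accuracy."""
    rng = rng or np.random.default_rng(0)
    offsets = rng.integers(1, num_classes, size=labels.shape)
    return (labels + offsets) % num_classes
```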
Experimental Study
The paper incorporates a thorough empirical evaluation using datasets such as EMNIST, Fashion MNIST, and CIFAR-10, revealing insights into adversarial attack strategies and defense mechanisms. Standout findings include:
- Random-weight and label-flipping attacks are notably effective at degrading model performance.
- Robust aggregation methods, such as trimmed and truncated means, substantially mitigate these attacks, although their success often depends on the attack configuration and parameters.
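As a hedged illustration of the robust aggregation idea, the following is a minimal coordinate-wise trimmed-mean rule; the trimming fraction is an assumed parameter, not a value reported in the paper.

```python
import numpy as np

def trimmed_mean_aggregate(client_updates, trim_ratio=0.1):
    """Coordinate-wise trimmed mean: for every model coordinate, discard the
    largest and smallest fraction of client values before averaging, which
    limits the influence of a minority of poisoned updates."""
    updates = np.stack(client_updates)          # shape: (n_clients, n_params)
    k = int(trim_ratio * updates.shape[0])      # values trimmed at each end
    sorted_updates = np.sort(updates, axis=0)   # sort each coordinate independently
    if k > 0:
        sorted_updates = sorted_updates[k:-k]
    return sorted_updates.mean(axis=0)
```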
Defense Mechanisms
The paper classifies defenses by where they are implemented: at the server, at the clients, or on the communication channel. Server-based defenses, including robust aggregation strategies and the application of differential privacy, hold significant promise. However, defenses must balance security against model accuracy, a trade-off that remains a critical challenge in FL security.
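One common way to realize the differential-privacy idea at the server is to clip each client update and add Gaussian noise before aggregation, as in the sketch below. The clipping norm and noise multiplier are assumed hyperparameters, and choosing them is exactly the accuracy-versus-security calibration the paper highlights.

```python
import numpy as np

def dp_aggregate(client_updates, clip_norm=1.0, noise_multiplier=0.5, rng=None):
    """Clip each update to a fixed L2 norm, average, then add Gaussian noise
    scaled to the clipping bound, bounding any single client's influence."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(client_updates),
                       size=avg.shape)
    return avg + noise
```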
Lessons Learned and Future Directions
The field of FL threats and defenses is marked by continuous development and open challenges, particularly the need to balance federated model robustness with usability. Differential privacy mechanisms, while effective to an extent against both model and privacy attacks, demand careful calibration to avoid substantial performance degradation.
The paper concludes with several future challenges:
- Developing comprehensive defenses effective across all attack dimensions, combining the strengths of various mitigation strategies.
- Addressing the implications of non-IID data distributions for defense effectiveness.
- Extending the depth and breadth of adversarial research in less-explored FL settings, such as vertical and transfer FL, and tailoring solutions therein.
In summary, while the paper presents a meticulous taxonomy and evaluation framework for understanding federated learning threats, it underscores the ongoing journey toward achieving a robust, privacy-preserving federated learning ecosystem.