Privacy and Robustness in Federated Learning: Attacks and Defenses
The paper "Privacy and Robustness in Federated Learning: Attacks and Defenses," authored by Lingjuan Lyu, Han Yu, Xingjun Ma, Chen Chen, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu, provides a comprehensive survey of the current landscape in federated learning (FL) concerning its privacy and robustness challenges. The survey encompasses a broad overview of the privacy and robustness threats in federated learning, categorizes numerous attack methodologies, and evaluates existing defensive mechanisms.
Overview of Federated Learning
Federated learning is an emerging paradigm for decentralized model training: multiple participants collaboratively train a shared model without exchanging their raw data. This addresses the privacy risks of centralized data aggregation, which matter especially under stringent regulations such as GDPR. The survey categorizes FL into horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (FTL), according to how data distributions and feature spaces align across the participating entities.
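To make the training protocol concrete, here is a minimal sketch of one round of FedAvg-style aggregation in the HFL setting. The least-squares model, local training loop, and hyperparameters are illustrative stand-ins, not the paper's own formulation.

```python
import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """Plain least-squares gradient descent, standing in for local training."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_weights, clients):
    """One server round: collect local models, average weighted by data size."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_weights, X, y))
        sizes.append(len(y))
    weights = np.asarray(sizes, dtype=float)
    return np.average(updates, axis=0, weights=weights / weights.sum())

# Example: two clients with random data, one aggregation round.
rng = np.random.default_rng(0)
clients = [(rng.standard_normal((20, 3)), rng.standard_normal(20)) for _ in range(2)]
w_global = fedavg_round(np.zeros(3), clients)
```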
Threats to Federated Learning
The survey identifies two primary threats to FL: privacy attacks and robustness attacks. Privacy attacks aim to infer sensitive data from model updates, while robustness attacks, including both untargeted and targeted poisoning, seek to compromise the integrity of the FL models.
- Privacy Attacks: These include generative adversarial network (GAN) attacks that infer class representatives, membership inference attacks (MIA), property inference attacks, and direct reconstruction techniques such as Deep Leakage from Gradients (DLG). Each method exploits the information leaked by shared gradients or model updates to deduce private training data (a gradient-matching sketch follows this list).
- Robustness Attacks: These include untargeted Byzantine attacks, which inject arbitrary faulty updates to destabilize training, and targeted attacks such as label flipping and backdoor attacks, which implant specific malicious behaviors into the model (a poisoning sketch follows this list).
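The following is a minimal, hypothetical sketch of the DLG idea in PyTorch: the attacker optimizes a dummy input and soft label so that their gradients match the gradients a victim shared. The linear model, dimensions, and iteration count are illustrative, not the attack's original configuration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(784, 10)  # stand-in for the shared global model

# Victim computes gradients on a private example (never seen by the attacker).
x_true, y_true = torch.randn(1, 784), torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

# Attacker initializes a dummy input and a soft dummy label, then matches gradients.
x_dummy = torch.randn(1, 784, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    opt.zero_grad()
    log_probs = F.log_softmax(model(x_dummy), dim=-1)
    dummy_loss = -(F.softmax(y_dummy, dim=-1) * log_probs).sum()
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    # Distance between dummy gradients and the victim's shared gradients.
    diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    diff.backward()
    return diff

for _ in range(50):
    opt.step(closure)  # x_dummy drifts toward x_true when gradients are informative
```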
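And a sketch of the two poisoning styles, with illustrative class ids and noise scale; a real attacker would train locally on the flipped labels rather than submit them directly.

```python
import numpy as np

def byzantine_update(global_weights, scale=10.0, seed=0):
    """Untargeted attack: report an arbitrary large random 'update'."""
    rng = np.random.default_rng(seed)
    return global_weights + scale * rng.standard_normal(global_weights.shape)

def flip_labels(y, source_class=1, target_class=7):
    """Targeted label flipping: relabel one class before local training,
    steering the aggregated model to misclassify it."""
    y_poisoned = y.copy()
    y_poisoned[y == source_class] = target_class
    return y_poisoned
```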
Defenses Against Threats
The authors describe various methodologies to mitigate these threats:
- Cryptographic Techniques: Methods such as homomorphic encryption and secure multiparty computation keep individual gradients and model updates confidential during aggregation, although they introduce significant computational overhead (a masking sketch follows this list).
- Differential Privacy (DP): Centralized, local, and distributed DP variants offer formal privacy guarantees by perturbing model updates, albeit often at the cost of model utility (a clip-and-noise sketch follows this list).
- Robust Aggregation: Techniques such as Krum, Multi-Krum, and geometric-median aggregation withstand Byzantine participants by filtering or down-weighting anomalous updates (a Krum sketch follows this list).
- Backdoor and Sybil Attack Containment: The survey also addresses detection and mitigation approaches for these sophisticated targeted attacks (a norm-clipping sketch, one such mitigation, follows this list).
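As a hedged illustration of the cryptographic direction, here is the pairwise additive-masking idea underlying secure aggregation protocols (e.g., Bonawitz et al.), sketched in plain floating point; real protocols use key agreement, finite-field arithmetic, and dropout recovery, all omitted here.

```python
import numpy as np

def masked_updates(updates, seed=42):
    """Add a shared random mask to one client of each pair and subtract it
    from the other; the masks cancel in the sum, hiding individual updates."""
    rng = np.random.default_rng(seed)
    n = len(updates)
    masked = [u.astype(float).copy() for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.standard_normal(updates[i].shape)  # pairwise shared secret
            masked[i] += mask
            masked[j] -= mask
    return masked

updates = [np.ones(3), 2 * np.ones(3), 3 * np.ones(3)]
assert np.allclose(sum(masked_updates(updates)), sum(updates))  # aggregate survives
```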
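For DP, a hypothetical sketch of the standard clip-and-noise recipe applied to client updates (in the spirit of DP-FedAvg); the clipping norm and noise multiplier are placeholders that a real system would derive from a target (epsilon, delta) budget via a privacy accountant.

```python
import numpy as np

def dp_aggregate(updates, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """Clip each client update to a fixed L2 norm, bounding its influence,
    then add Gaussian noise scaled to that bound before averaging."""
    rng = np.random.default_rng(seed)
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12)) for u in updates]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=updates[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(updates)
```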
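A minimal sketch of Krum as proposed by Blanchard et al.: score each update by its summed squared distance to its n - f - 2 nearest neighbors and select the lowest-scoring update, tolerating up to f Byzantine clients.

```python
import numpy as np

def krum(updates, f):
    """Return the single update whose summed squared distance to its
    n - f - 2 nearest neighbors is smallest (requires n > f + 2)."""
    U = np.stack(updates)
    n = len(U)
    dists = np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)  # pairwise distances
    scores = []
    for i in range(n):
        nearest = np.sort(np.delete(dists[i], i))[: n - f - 2]  # closest n-f-2 others
        scores.append(nearest.sum())
    return updates[int(np.argmin(scores))]
```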
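Finally, a hypothetical sketch of update-norm clipping, one common backdoor mitigation from this literature: boosted backdoor updates tend to have outsized norms, so bounding every update to a common threshold (here, the median client norm, an assumption of this sketch) blunts their effect.

```python
import numpy as np

def clip_to_median_norm(updates):
    """Scale every update down to at most the median client norm, limiting
    how much any single (possibly boosted) update can move the global model."""
    norms = [np.linalg.norm(u) for u in updates]
    bound = np.median(norms)
    return [u * min(1.0, bound / (n + 1e-12)) for u, n in zip(updates, norms)]
```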
Implications and Future Directions
The paper highlights the difficulty of reconciling federated learning's competing objectives: privacy, robustness, model utility, and communication efficiency. Existing defenses can address specific attack vectors, but frequently at the cost of degraded learning performance or increased computation. A core tension is that rigorous privacy mechanisms hide individual updates, while robust defenses against poisoning need to inspect them; this conflict presents a formidable barrier to fully secure FL systems.
Future research could refine one-shot federated learning to avoid extensive communication overhead, develop hybrid privacy-preserving models suited to varying compute and trust scenarios, and build more resilient, scalable defenses against the evolving spectrum of adversarial threats.
The concerted efforts outlined in this survey mark a pivotal step towards securing federated learning deployments, underscoring the need for interdisciplinary collaboration to anticipate and counteract emerging vulnerabilities effectively.