Privacy and Robustness in Federated Learning: Attacks and Defenses
The paper "Privacy and Robustness in Federated Learning: Attacks and Defenses," authored by Lingjuan Lyu, Han Yu, Xingjun Ma, Chen Chen, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu, provides a comprehensive survey of the current landscape in federated learning (FL) concerning its privacy and robustness challenges. The survey encompasses a broad overview of the privacy and robustness threats in federated learning, categorizes numerous attack methodologies, and evaluates existing defensive mechanisms.
Overview of Federated Learning
Federated learning is an emerging paradigm for decentralized model training: multiple participants collaboratively train a shared model without exchanging their raw data. This addresses the privacy risks of centralized data aggregation, which matter especially under stringent regulations such as GDPR. The survey categorizes FL into horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (FTL), according to how data distributions and feature spaces align across the participating entities.
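To make the training protocol concrete, here is a minimal sketch of one round of FedAvg-style aggregation in the HFL setting. The least-squares model, local training loop, and hyperparameters are illustrative stand-ins, not the paper's own formulation.

```python
import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """Plain least-squares gradient descent, standing in for local training."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_weights, clients):
    """One server round: collect local models, average weighted by data size."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_weights, X, y))
        sizes.append(len(y))
    weights = np.asarray(sizes, dtype=float)
    return np.average(updates, axis=0, weights=weights / weights.sum())

# Example: two clients with random data, one aggregation round.
rng = np.random.default_rng(0)
clients = [(rng.standard_normal((20, 3)), rng.standard_normal(20)) for _ in range(2)]
w_global = fedavg_round(np.zeros(3), clients)
```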
Threats to Federated Learning
The survey identifies two primary threats to FL: privacy attacks and robustness attacks. Privacy attacks aim to infer sensitive data from model updates, while robustness attacks, including both untargeted and targeted poisoning, seek to compromise the integrity of the FL models.
- Privacy Attacks: These include generative adversarial network (GAN) attacks that infer class representatives, membership inference attacks (MIA), property inference attacks, and direct reconstruction techniques such as Deep Leakage from Gradients (DLG). Each method exploits the information leaked by shared gradients or model updates to deduce private training data (a gradient-matching sketch follows this list).
- Robustness Attacks: These include untargeted Byzantine attacks, which inject arbitrary faulty updates to destabilize training, and targeted attacks such as label flipping and backdoor attacks, which implant specific malicious behaviors into the model (a poisoning sketch follows this list).
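The following is a minimal, hypothetical sketch of the DLG idea in PyTorch: the attacker optimizes a dummy input and soft label so that their gradients match the gradients a victim shared. The linear model, dimensions, and iteration count are illustrative, not the attack's original configuration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(784, 10)  # stand-in for the shared global model

# Victim computes gradients on a private example (never seen by the attacker).
x_true, y_true = torch.randn(1, 784), torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

# Attacker initializes a dummy input and a soft dummy label, then matches gradients.
x_dummy = torch.randn(1, 784, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    opt.zero_grad()
    log_probs = F.log_softmax(model(x_dummy), dim=-1)
    dummy_loss = -(F.softmax(y_dummy, dim=-1) * log_probs).sum()
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    # Distance between dummy gradients and the victim's shared gradients.
    diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    diff.backward()
    return diff

for _ in range(50):
    opt.step(closure)  # x_dummy drifts toward x_true when gradients are informative
```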
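And a sketch of the two poisoning styles, with illustrative class ids and noise scale; a real attacker would train locally on the flipped labels rather than submit them directly.

```python
import numpy as np

def byzantine_update(global_weights, scale=10.0, seed=0):
    """Untargeted attack: report an arbitrary large random 'update'."""
    rng = np.random.default_rng(seed)
    return global_weights + scale * rng.standard_normal(global_weights.shape)

def flip_labels(y, source_class=1, target_class=7):
    """Targeted label flipping: relabel one class before local training,
    steering the aggregated model to misclassify it."""
    y_poisoned = y.copy()
    y_poisoned[y == source_class] = target_class
    return y_poisoned
```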
Defenses Against Threats
The authors describe various methodologies to mitigate these threats:
- Cryptographic Techniques: Methods such as homomorphic encryption and secure multiparty computation keep individual gradients and model updates confidential during aggregation, although they introduce significant computational overhead (a masking sketch follows this list).
- Differential Privacy (DP): Centralized, local, and distributed DP variants offer formal privacy guarantees by perturbing model updates, albeit often at the cost of model utility (a clip-and-noise sketch follows this list).
- Robust Aggregation: Techniques such as Krum, Multi-Krum, and geometric-median aggregation withstand Byzantine participants by filtering or down-weighting anomalous updates (a Krum sketch follows this list).
- Backdoor and Sybil Attack Containment: The survey also addresses detection and mitigation approaches for these sophisticated targeted attacks (a norm-clipping sketch, one such mitigation, follows this list).
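As a hedged illustration of the cryptographic direction, here is the pairwise additive-masking idea underlying secure aggregation protocols (e.g., Bonawitz et al.), sketched in plain floating point; real protocols use key agreement, finite-field arithmetic, and dropout recovery, all omitted here.

```python
import numpy as np

def masked_updates(updates, seed=42):
    """Add a shared random mask to one client of each pair and subtract it
    from the other; the masks cancel in the sum, hiding individual updates."""
    rng = np.random.default_rng(seed)
    n = len(updates)
    masked = [u.astype(float).copy() for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.standard_normal(updates[i].shape)  # pairwise shared secret
            masked[i] += mask
            masked[j] -= mask
    return masked

updates = [np.ones(3), 2 * np.ones(3), 3 * np.ones(3)]
assert np.allclose(sum(masked_updates(updates)), sum(updates))  # aggregate survives
```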
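For DP, a hypothetical sketch of the standard clip-and-noise recipe applied to client updates (in the spirit of DP-FedAvg); the clipping norm and noise multiplier are placeholders that a real system would derive from a target (epsilon, delta) budget via a privacy accountant.

```python
import numpy as np

def dp_aggregate(updates, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """Clip each client update to a fixed L2 norm, bounding its influence,
    then add Gaussian noise scaled to that bound before averaging."""
    rng = np.random.default_rng(seed)
    clipped = [u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12)) for u in updates]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=updates[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(updates)
```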
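A minimal sketch of Krum as proposed by Blanchard et al.: score each update by its summed squared distance to its n - f - 2 nearest neighbors and select the lowest-scoring update, tolerating up to f Byzantine clients.

```python
import numpy as np

def krum(updates, f):
    """Return the single update whose summed squared distance to its
    n - f - 2 nearest neighbors is smallest (requires n > f + 2)."""
    U = np.stack(updates)
    n = len(U)
    dists = np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)  # pairwise distances
    scores = []
    for i in range(n):
        nearest = np.sort(np.delete(dists[i], i))[: n - f - 2]  # closest n-f-2 others
        scores.append(nearest.sum())
    return updates[int(np.argmin(scores))]
```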
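Finally, a hypothetical sketch of update-norm clipping, one common backdoor mitigation from this literature: boosted backdoor updates tend to have outsized norms, so bounding every update to a common threshold (here, the median client norm, an assumption of this sketch) blunts their effect.

```python
import numpy as np

def clip_to_median_norm(updates):
    """Scale every update down to at most the median client norm, limiting
    how much any single (possibly boosted) update can move the global model."""
    norms = [np.linalg.norm(u) for u in updates]
    bound = np.median(norms)
    return [u * min(1.0, bound / (n + 1e-12)) for u, n in zip(updates, norms)]
```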
Implications and Future Directions
The paper highlights the difficulty of reconciling federated learning's competing objectives: privacy, robustness, model utility, and communication efficiency. Existing defenses can address specific attack vectors, but frequently at the cost of degraded learning performance or increased computation. A core tension is that rigorous privacy mechanisms hide individual updates, while robust defenses against poisoning need to inspect them; this conflict presents a formidable barrier to fully secure FL systems.
Future research could refine one-shot federated learning to avoid extensive communication overhead, develop hybrid privacy-preserving models suited to varying compute and trust scenarios, and build more resilient, scalable defenses against the evolving spectrum of adversarial threats.
The concerted efforts outlined in this survey mark a pivotal step towards securing federated learning deployments, underscoring the need for interdisciplinary collaboration to anticipate and counteract emerging vulnerabilities effectively.