- The paper introduces a novel client-level privacy attack using mGAN-AI, effectively reconstructing individual user data in federated learning systems.
- It employs a multi-task GAN discriminator to simultaneously assess reality, category, and client identity, enhancing the specificity of data recovery.
- Experimental results on MNIST and AT&T datasets validate the attack's superior performance over traditional models, underscoring critical privacy risks.
Overview of User-Level Privacy Leakage from Federated Learning
The paper "Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning" by Zhibo Wang et al. explores the privacy risks associated with Federated Learning (FL), particularly the potential for user-level privacy leakage. This critical examination employs a well-crafted attack methodology leveraging Generative Adversarial Networks (GANs) to reconstruct individual user's data, even within the federated learning paradigm, which is designed to safeguard user privacy.
The authors present a novel attack framework named mGAN-AI (multi-task GAN with Auxiliary Identification). The approach is designed to compromise client-level privacy in FL without interfering with the standard training procedure. The paper marks a significant advance over existing models that focus mainly on class-level privacy attacks, and it underscores the threat posed by a malicious server in an FL setup.
Key Contributions
The contributions of this research are multi-fold and significant:
- Introduction of Client-Level Privacy Attacks: The paper pivots from the traditional class-level attacks to a more granular client-level attack, focusing on recovering the private data of a specific user in an FL system. This shift highlights a more severe privacy threat, indicating the potential for more targeted data breaches.
- mGAN-AI Framework: A critical innovation in this paper is the multi-task GAN framework incorporating client identity as a discriminator task alongside the reality and category. This structure supports more precise and targeted recovery of user-specific data.
- Experimental Validation: The effectiveness of the proposed mGAN-AI is rigorously validated through experimental results on benchmark datasets like MNIST and AT&T. The empirical findings demonstrate superior data recovery fidelity compared to state-of-the-art attacks.
Methodology
mGAN-AI builds on a standard GAN architecture but replaces the usual discriminator with a multi-task discriminator that simultaneously judges the reality, category, and client identity of its input. The authors articulate how both the discriminator and the generator of a GAN can be adapted to this multi-task setting, improving the quality and specificity of the generated samples. A minimal sketch of such a discriminator follows.
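To make the multi-task design concrete, here is a minimal PyTorch sketch of a discriminator with a shared trunk and three heads for reality, category, and client identity. The layer sizes assume MNIST-sized 28x28 grayscale inputs and are illustrative only; this is not the paper's exact architecture.

```python
import torch.nn as nn

class MultiTaskDiscriminator(nn.Module):
    """Sketch of a multi-task discriminator in the spirit of mGAN-AI.

    A shared convolutional trunk feeds three heads that score
    (1) reality (real vs. generated), (2) category (class label),
    and (3) identity (which client produced the sample).
    Layer sizes are illustrative, not the paper's architecture.
    """

    def __init__(self, num_classes: int, num_clients: int):
        super().__init__()
        self.trunk = nn.Sequential(                      # shared features
            nn.Conv2d(1, 64, 4, stride=2, padding=1),    # 28x28 -> 14x14
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 14x14 -> 7x7
            nn.LeakyReLU(0.2),
            nn.Flatten(),
        )
        feat_dim = 128 * 7 * 7
        self.real_head = nn.Linear(feat_dim, 1)             # real vs. fake
        self.class_head = nn.Linear(feat_dim, num_classes)  # category
        self.id_head = nn.Linear(feat_dim, num_clients)     # client identity

    def forward(self, x):
        h = self.trunk(x)
        return self.real_head(h), self.class_head(h), self.id_head(h)
```

Sharing one trunk across the three tasks is what lets the identity head steer the generator toward samples that look like a specific client's data rather than generic class representatives.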
Attack Types
The paper details two primary attack vectors that a malicious server can employ within an FL framework:
- Passive Attack: Here, the server acts as an honest-but-curious adversary, subtly analyzing the periodic updates shared by clients. Because it stays within the bounds of expected behavior, the attack is effectively 'invisible' and avoids compromising the FL process.
- Active Attack: A more aggressive approach in which the server isolates and manipulates updates from a specific client. Although it introduces some interference with the FL process, it significantly improves the fidelity of the recovered private data. Both modes are sketched after this list.
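The following schematic contrasts the two modes from the server's side. The `server` object and its methods (`aggregate`, `apply_update`, `snapshot`, `train_generator_on`) are hypothetical placeholders standing in for a generic FedAvg-style server; they are not the paper's actual implementation.

```python
# Schematic of the two attack modes from a malicious server's perspective.
# All server methods below are hypothetical placeholders for illustration.

def passive_round(server, client_updates, victim_id):
    # Honest-but-curious: aggregate exactly as the FL protocol prescribes ...
    global_model = server.aggregate(client_updates)
    # ... but quietly keep the victim's shared update and use it as the
    # "real" training signal for the mGAN-AI discriminator.
    victim_update = client_updates[victim_id]
    server.train_generator_on(victim_update)
    return global_model  # FL training proceeds unmodified

def active_round(server, client_updates, victim_id):
    # Isolation: apply the victim's update to a dedicated model copy,
    # so their contribution is not diluted by the other clients.
    victim_update = client_updates[victim_id]
    isolated_model = server.apply_update(server.snapshot(), victim_update)
    server.train_generator_on(victim_update, model=isolated_model)
    # Interferes with training, but yields higher-fidelity reconstructions.
    return server.aggregate(client_updates)
```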
Experimental Results
The paper vividly demonstrates the attacks' effectiveness through extensive experiments. Notably:
- MNIST Dataset: The results reveal that mGAN-AI successfully reconstructs rotated digit images specific to a victim client, illustrating how the framework can capture user-specific data nuances.
- AT&T Dataset: The reconstructed images demonstrate client-specific features like the presence of glasses, deviating visibly from images of other clients, emphasizing the attack frameworkâs efficacy.
Quantitative measures such as the Inception Score further substantiate mGAN-AI's advantage, with scores distinctly higher than those of leading prior attacks such as the GAN-based attack and the model inversion attack.
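For reference, the Inception Score rewards sample sets that a classifier labels confidently while covering many classes overall: IS = exp(E_x[KL(p(y|x) || p(y))]). Below is a minimal sketch of computing it from classifier logits; the choice of classifier (e.g., one suited to digits or faces rather than ImageNet's Inception net) is an assumption, as the paper's exact evaluation setup is not reproduced here.

```python
import torch
import torch.nn.functional as F

def inception_score(logits: torch.Tensor) -> float:
    """Inception Score from classifier logits over generated samples.

    IS = exp( mean_x KL( p(y|x) || p(y) ) ); higher means samples are
    both confidently classified (sharp p(y|x)) and diverse (broad p(y)).
    """
    p_yx = F.softmax(logits, dim=1)        # per-sample class posteriors
    p_y = p_yx.mean(dim=0, keepdim=True)   # marginal class distribution
    kl = (p_yx * (p_yx.log() - p_y.log())).sum(dim=1)
    return kl.mean().exp().item()
```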
Implications and Future Directions
The implications of this research are profound, emphasizing the need for robust privacy-preserving measures in FL systems. Given the demonstrated ability to breach client-level privacy, the paper makes a compelling case for reevaluating current security frameworks in FL and enhancing defense mechanisms to mitigate such risks.
Future research could expand on the mGAN-AI framework's adaptability to more complex and varied FL environments. There is also a potential avenue to explore countermeasures and improved encryption protocols that could thwart such sophisticated attacks without significantly impacting the learning efficiency and effectiveness.
In conclusion, the paper by Zhibo Wang et al. offers a critical lens on the vulnerabilities of federated learning, extends the discourse on privacy threats, and presents actionable insights for bolstering security in decentralized learning frameworks.