- The paper introduces novel white-box inference attacks that leverage gradient information to accurately detect training data membership.
- It shows that both centralized and federated learning models, even those with strong generalization, remain vulnerable to privacy breaches.
- Active attacks in federated learning are detailed, in which adversarial participants or the central server craft their parameter updates (for example, via gradient ascent on target records) to push membership inference accuracy well beyond what passive observation achieves.
Summary of "Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning"
The paper "Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning" by Milad Nasr, Reza Shokri, and Amir Houmansadr provides an exhaustive examination of privacy vulnerabilities in deep learning frameworks. It focuses on inference attacks, particularly membership inference, which discern whether a specific data point was part of the training dataset for a given machine learning model. These attacks pose significant privacy risks, as they could potentially reveal sensitive user information from trained models.
Main Contributions
- White-box Membership Inference Attacks: The authors introduce new white-box membership inference attacks that outperform traditional black-box approaches. The attacks exploit the gradients of the loss with respect to the model parameters, the same quantities that the stochastic gradient descent (SGD) algorithm computes throughout training and that therefore retain a fine-grained imprint of the training data. These white-box attacks identify training-data membership more accurately across various model architectures and configurations (a simplified feature-extraction sketch follows this list).
- Privacy Vulnerabilities in Centralized and Federated Learning: The research covers both centralized and federated learning setups and shows that even well-generalized models are susceptible to privacy leakage. In federated learning, the attacker observes successive snapshots of the model parameters over many training rounds and can aggregate the gradient information across them, so the collaborative setting inadvertently amplifies the privacy risk.
- Active Attacks in Federated Learning: Beyond passive observation, the paper introduces active attacks in which an adversarial participant or the central server manipulates the shared parameter updates to amplify membership signals. This is particularly concerning in federated learning, where such adversaries can steer the global model's parameters in ways that make the membership of specific records far easier to infer.
- Comprehensive Evaluation: The paper evaluates the proposed attacks on real-world datasets and publicly available pre-trained models, demonstrating that the threat is practical. Notably, on CIFAR100 models the white-box attack accuracy significantly exceeds that of black-box attacks, reaching up to 74.3% against a pre-trained DenseNet model.
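To make the gradient-based attack concrete, the following is a minimal, hypothetical sketch of the white-box feature-extraction step in PyTorch. The paper's attack model actually consumes full per-layer gradient tensors, hidden-layer activations, and the loss, feeding them to a learned attack classifier; this sketch simplifies the gradients to per-layer norms, and the function name `whitebox_features`, the use of `CrossEntropyLoss`, and the single-example input convention are illustrative assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn

def whitebox_features(model: nn.Module, x: torch.Tensor, y: torch.Tensor):
    """Compute per-example white-box features for a membership-inference
    attack: the loss value plus the norm of the loss gradient with respect
    to each parameter tensor (layer)."""
    loss_fn = nn.CrossEntropyLoss()
    model.zero_grad()
    logits = model(x.unsqueeze(0))            # batch of one target example
    loss = loss_fn(logits, y.unsqueeze(0))
    loss.backward()                           # gradients of the kind SGD uses during training
    grad_norms = [p.grad.norm().item()
                  for p in model.parameters() if p.grad is not None]
    return [loss.item()] + grad_norms         # input to a binary member / non-member classifier
```

Features extracted this way for examples whose membership status is known would then be used to train the binary attack classifier.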
Key Findings and Implications
- The analysis confirms that full access to the model parameters (the white-box setting) makes membership inference attacks substantially more effective than black-box access alone.
- Well-generalized models, typically characterized by high test accuracy and a small train-test gap, are not inherently safe from privacy risks: membership inference accuracy does not simply track a model's apparent overfitting.
- Federated learning, although designed to keep raw data distributed and private, is particularly exposed to these white-box attacks. The repeated exchange of parameter updates that the protocol requires can be actively exploited by adversarial participants or an adversarial server to mount precise membership inference attacks (a minimal sketch of such an active attack follows this list).
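As a rough illustration of the active gradient-ascent idea, here is a hypothetical PyTorch sketch of what an adversarial federated-learning participant could do before submitting its update; the function name `ascent_update`, the `gamma` step size, and the loss choice are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

def ascent_update(model: nn.Module, x_target: torch.Tensor,
                  y_target: torch.Tensor, gamma: float = 1.0) -> None:
    """Adversarial participant: move the shared parameters in the direction
    that INCREASES the loss on a target record (gradient ascent). Honest
    participants who hold that record will drive its loss back down via
    local SGD; non-members provoke no such reaction, and the difference in
    the subsequent global updates leaks membership."""
    loss = nn.CrossEntropyLoss()(model(x_target.unsqueeze(0)),
                                 y_target.unsqueeze(0))
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.add_(gamma * p.grad)   # ascend, then submit these parameters as the local "update"
```

The attacker then watches how strongly the next global model reduces the loss on the target record to decide member versus non-member.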
Future Outlook
The paper's insights emphasize an urgent need for enhanced privacy-preservation strategies that counter the vulnerabilities identified. Future research must address the balance between model utility and privacy, focusing on robust privacy defenses at the algorithmic level. Techniques like differential privacy, secure multiparty computation, and robust aggregation mechanisms in federated learning will be crucial in mitigating these identified risks.
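Among the defenses mentioned above, differential privacy is the most directly aimed at the gradient leakage these attacks exploit. The sketch below shows, under illustrative assumptions (the function name `dp_sgd_step`, flattened per-example gradients, and placeholder hyperparameters), the core DP-SGD recipe of clipping each example's gradient and adding Gaussian noise; a real deployment would rely on a vetted differential-privacy library and a proper privacy accountant.

```python
import torch

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    """One differentially private SGD step in the DP-SGD style: clip each
    example's (flattened) gradient to an L2 norm of at most `clip_norm`,
    average, add Gaussian noise, and apply the update. Bounding and
    randomizing every example's influence is what limits membership leakage."""
    with torch.no_grad():
        clipped = []
        for g in per_example_grads:                       # one flat gradient tensor per example
            scale = torch.clamp(clip_norm / (g.norm() + 1e-12), max=1.0)
            clipped.append(g * scale)
        batch = len(clipped)
        noise = torch.randn_like(clipped[0]) * (noise_mult * clip_norm / batch)
        update = torch.stack(clipped).mean(dim=0) + noise
        offset = 0
        for p in params:                                  # unflatten and apply to each parameter tensor
            n = p.numel()
            p.add_(-lr * update[offset:offset + n].view_as(p))
            offset += n
```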
In conclusion, this work highlights the critical and timely concern of privacy in deep learning, particularly within white-box scenarios, and sets a directive for ongoing advancements in safeguarding user data without undermining the potential of collaborative machine learning efforts.