The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks (1911.07135v2)

Published 17 Nov 2019 in cs.LG and stat.ML

Abstract: This paper studies model-inversion attacks, in which the access to a model is abused to infer information about the training data. Since its first introduction, such attacks have raised serious concerns given that training data usually contain privacy-sensitive information. Thus far, successful model-inversion attacks have only been demonstrated on simple models, such as linear regression and logistic regression. Previous attempts to invert neural networks, even the ones with simple architectures, have failed to produce convincing results. We present a novel attack method, termed the generative model-inversion attack, which can invert deep neural networks with high success rates. Rather than reconstructing private training data from scratch, we leverage partial public information, which can be very generic, to learn a distributional prior via generative adversarial networks (GANs) and use it to guide the inversion process. Moreover, we theoretically prove that a model's predictive power and its vulnerability to inversion attacks are indeed two sides of the same coin: highly predictive models are able to establish a strong correlation between features and labels, which coincides exactly with what an adversary exploits to mount the attacks. Our extensive experiments demonstrate that the proposed attack improves identification accuracy over the existing work by about 75% for reconstructing face images from a state-of-the-art face recognition classifier. We also show that differential privacy, in its canonical form, is of little avail to defend against our attacks.

Citations (355)

Summary

  • The paper introduces a novel GAN-based inversion attack that uses public data as a regularizer to reconstruct private training data.
  • Experiments on CelebA, MNIST, and chest X-ray data show roughly a 75% improvement in identification accuracy over prior attacks when reconstructing sensitive training data.
  • The study reveals a critical trade-off between model predictive power and privacy, questioning the effectiveness of standard defenses like differential privacy.

Insightful Overview of "The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks"

The paper "The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks" addresses a prevalent privacy concern in the deployment of deep neural networks (DNNs), specifically the vulnerability of such models to model-inversion (MI) attacks. These attacks exploit the model's output to infer sensitive characteristics of the training data, thus posing significant privacy risks when the data is sensitive or proprietary.

Key Contributions and Methodology

The authors propose a novel generative model-inversion (GMI) attack that leverages generative adversarial networks (GANs) to invert DNNs. The core idea is to use partial public information to learn a distributional prior with a GAN, which then guides the inversion process toward plausible reconstructions of private training data. The key innovation is that the data distribution pre-learned from public datasets acts as a regularizer for the otherwise ill-posed inversion problem.
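
The following is a minimal sketch of the inversion stage under stated assumptions: a generator G and discriminator D have already been trained on generic public face images, the target classifier is available white-box and returns logits, and the loss weighting and optimizer settings are illustrative placeholders rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def gmi_attack(G, D, target_model, target_label, latent_dim=100,
               steps=1500, lr=0.02, lam=100.0, device="cuda"):
    """Reconstruct a representative image for `target_label` by searching the
    latent space of a GAN trained on public data (all values illustrative)."""
    z = torch.randn(1, latent_dim, device=device, requires_grad=True)
    opt = torch.optim.SGD([z], lr=lr, momentum=0.9)
    label = torch.tensor([target_label], device=device)

    for _ in range(steps):
        opt.zero_grad()
        x = G(z)                                  # candidate image drawn from the public prior
        prior_loss = -D(x).mean()                 # keep the sample on the GAN's manifold
        identity_loss = F.cross_entropy(target_model(x), label)  # match the target class
        loss = prior_loss + lam * identity_loss
        loss.backward()
        opt.step()

    return G(z).detach()                          # reconstruction for the target identity
```

Restricting the search to the generator's latent space is what regularizes the otherwise ill-posed pixel-space inversion: every candidate the optimizer considers is, by construction, a plausible image under the public prior.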

The authors also conduct a theoretical analysis connecting a model's predictive power to its susceptibility to MI attacks: highly predictive models inherently establish stronger correlations between input features and labels, and that correlation is precisely the signal an adversary exploits for reconstruction. The paper's proofs formalize this trade-off, showing that greater predictive efficacy entails greater vulnerability to inversion.
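
One purely illustrative way to read this argument in information-theoretic terms (not the paper's formal statement) is that a strong feature-label correlation means high mutual information I(X; Y), which directly reduces the adversary's residual uncertainty about the sensitive features X once the label Y and the model's behavior on it are known:

```latex
% Residual uncertainty about features X after conditioning on label Y
H(X \mid Y) = H(X) - I(X; Y)
```

A model that owes its predictive power to a large I(X; Y), and that faithfully encodes that dependency, therefore leaves less entropy for the inversion attack to resolve.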

Experimental Validation

The empirical studies underscore the effectiveness of the GMI method across several datasets, notably CelebA, MNIST, and chest X-ray data, achieving significant improvements in attack success rates over previous methods. For instance, when reconstructing face images from a state-of-the-art face recognition classifier, the GMI attack improves identification accuracy over existing attacks by approximately 75%.
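
Identification accuracy in this setting is typically scored with a classifier that is independent of the attacked model; the sketch below assumes such an evaluation network (eval_model, a hypothetical name) trained on the same identities, and simply reports top-1 agreement between its predictions on the reconstructions and the intended target labels.

```python
import torch

@torch.no_grad()
def identification_accuracy(eval_model, reconstructions, target_labels):
    """Fraction of reconstructed images that an independent evaluation
    classifier assigns to the intended identity (top-1 agreement)."""
    logits = eval_model(reconstructions)          # shape: (N, num_identities)
    preds = logits.argmax(dim=1)
    return (preds == target_labels).float().mean().item()
```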

Furthermore, the research examines how the characteristics of the public data affect the GMI attack's success, demonstrating robustness even when the public data distribution differs substantially from the private training data.

Implications and Future Directions

Practically, this paper highlights the persistent threat to data privacy posed by model-inversion attacks and challenges the efficacy of current defenses like differential privacy (DP). The authors show that traditional DP mechanisms, although effective against membership inference, offer little protection against attribute-specific privacy leaks that MI attacks target.
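
For context, the "canonical form" of DP evaluated here is typically DP-SGD: clip each example's gradient and add calibrated Gaussian noise before updating. The sketch below is a deliberately simple per-example loop (not the efficient vectorized form), with illustrative clip_norm and noise_mult values; it shows the mechanism whose protection the paper finds insufficient against MI attacks.

```python
import torch

def dp_sgd_step(model, loss_fn, xb, yb, optimizer, clip_norm=1.0, noise_mult=1.0):
    """One DP-SGD update: per-example gradient clipping plus Gaussian noise."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(xb, yb):                      # per-example gradients (slow but clear)
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)  # bound each example's influence
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    optimizer.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_mult * clip_norm      # Gaussian mechanism
        p.grad = (s + noise) / xb.shape[0]
    optimizer.step()
```

The point made above is that bounding each individual example's influence in this way does not prevent class-level attribute recovery, which is exactly what the GAN-guided inversion exploits.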

Theoretically, the paper bridges a crucial gap in understanding model vulnerability, emphasizing the need for predictive power-aware defenses. This work paves the way for a research trajectory that not only scrutinizes defensive approaches under this attack paradigm but also reimagines privacy models to holistically address attribute-specific leakage.

Looking ahead, a promising line of inquiry is extending GMI attack methodologies to black-box settings, where the adversary's access to model internals is restricted. Moreover, developing new privacy-preserving training paradigms that maintain model utility while minimizing attribute leakage would be of immense value.

In conclusion, this paper provides critical insights into the dual-edged nature of model predictive power and privacy, framing a comprehensive approach to understanding and mitigating MI attacks in deep learning.