Holistic Risk Assessment of Inference Attacks on Machine Learning Models
The paper "ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models" provides a comprehensive evaluation of potential privacy and security vulnerabilities associated with ML models. The focus of this research is on four prominent inference attacks: membership inference, model inversion, attribute inference, and model stealing. These attacks, when successful, can lead to significant breaches of privacy by extracting sensitive information from ML models.
Threat Models and Analysis
The authors propose a taxonomy for assessing the risk of inference attacks along two dimensions: the adversary's access to the target model (white-box or black-box) and the auxiliary data available to the adversary (partial training data, shadow data, or no data). This framework supports a systematic investigation of how different threat models affect the effectiveness of each inference attack.
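To make the taxonomy concrete, the sketch below simply enumerates the resulting grid of threat models. The class and member names are illustrative and are not taken from the ML-Doctor codebase.

```python
# A minimal sketch of the two-dimensional threat-model taxonomy described above.
# Names are illustrative, not from the ML-Doctor implementation.
from dataclasses import dataclass
from enum import Enum, auto


class ModelAccess(Enum):
    BLACK_BOX = auto()   # adversary only observes model outputs (e.g., posteriors)
    WHITE_BOX = auto()   # adversary sees parameters, gradients, and architecture


class AuxiliaryData(Enum):
    PARTIAL_TRAINING_DATA = auto()  # a subset of the target's training set
    SHADOW_DATA = auto()            # same-distribution data, disjoint from training
    NO_DATA = auto()                # no auxiliary dataset at all


@dataclass(frozen=True)
class ThreatModel:
    access: ModelAccess
    aux_data: AuxiliaryData


# Enumerating the grid yields the 2 x 3 = 6 threat models the taxonomy spans.
ALL_THREAT_MODELS = [ThreatModel(a, d) for a in ModelAccess for d in AuxiliaryData]
```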
Experimental Evaluation
To substantiate this analysis, the paper evaluates the attacks across five neural network architectures (AlexNet, ResNet18, VGG19, Xception, and SimpleCNN) trained on four datasets: CelebA, Fashion-MNIST, STL10, and UTKFace. The results show that dataset complexity strongly influences attack efficacy and that the effectiveness of model stealing and membership inference is negatively correlated: complex datasets encourage overfitting, which benefits membership inference, but they also make it harder for an adversary to assemble data rich enough to train a well-performing stolen model.
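To illustrate why overfitting drives membership inference, the following sketch implements a simple confidence-threshold baseline; the paper itself evaluates stronger, learning-based attacks. The target model, data loaders, and threshold value are assumed inputs.

```python
# A minimal confidence-threshold membership inference baseline, included only to
# illustrate the link to overfitting. Assumes a PyTorch classifier and two
# labeled DataLoaders: one over training members, one over non-members.
import torch
import torch.nn.functional as F


@torch.no_grad()
def max_posterior(model, loader, device="cpu"):
    """Return the highest softmax score the model assigns to each sample."""
    model.eval()
    scores = []
    for x, _ in loader:
        probs = F.softmax(model(x.to(device)), dim=1)
        scores.append(probs.max(dim=1).values.cpu())
    return torch.cat(scores)


def membership_inference_accuracy(model, member_loader, nonmember_loader, threshold=0.9):
    """Guess 'member' whenever the top posterior exceeds the threshold.

    An overfitted model is far more confident on its own training samples, so
    the gap between the two score distributions (and the attack accuracy) grows
    with the level of overfitting.
    """
    member_scores = max_posterior(model, member_loader)
    nonmember_scores = max_posterior(model, nonmember_loader)
    correct = (member_scores > threshold).sum() + (nonmember_scores <= threshold).sum()
    total = len(member_scores) + len(nonmember_scores)
    return correct.item() / total
```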
Defense Mechanisms
The paper also explores two mitigation strategies: Differentially Private Stochastic Gradient Descent (DP-SGD) and Knowledge Distillation (KD). The findings show that while DP-SGD can significantly reduce membership inference risk, it offers limited protection against model inversion and model stealing. KD, by contrast, provides only modest mitigation across the attacks but degrades model utility less.
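For reference, the snippet below sketches the standard knowledge distillation objective that a KD-based defense builds on: a student model is trained against the teacher's softened outputs rather than the raw training labels alone. The temperature and mixing weight are illustrative defaults, not values taken from the paper.

```python
# A minimal sketch of the standard knowledge distillation (KD) objective:
# the released student matches the teacher's softened posteriors, so the raw
# training signal is only seen indirectly. T and alpha are illustrative defaults.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of soft-target KL divergence and ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```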
ML-Doctor: A Modular Framework
To facilitate further research and practical application of these findings, the authors introduce ML-Doctor, a modular software framework for evaluating a wide array of inference attacks and countermeasures. ML-Doctor is intended both to help ML model developers assess risks before deploying a model and to serve as a benchmarking tool for academic research on attack methodologies and defensive strategies.
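The actual ML-Doctor interface is not reproduced here; the snippet below is only a hypothetical sketch of what a modular, plug-in style attack benchmark could look like, and every name in it (AttackModule, evaluate_all) is invented for illustration.

```python
# Hypothetical sketch of a modular attack-benchmarking interface. These names
# are NOT the ML-Doctor API; they only illustrate the plug-in design idea.
from typing import Callable, Dict

import torch.nn as nn


class AttackModule:
    """Wraps one inference attack behind a uniform run() interface."""

    def __init__(self, name: str, run_fn: Callable[[nn.Module], float]):
        self.name = name
        self.run_fn = run_fn

    def run(self, target_model: nn.Module) -> float:
        return self.run_fn(target_model)


def evaluate_all(target_model: nn.Module, attacks: Dict[str, AttackModule]) -> Dict[str, float]:
    """Run every registered attack against the same target and collect scores."""
    return {name: attack.run(target_model) for name, attack in attacks.items()}
```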
Implications and Future Directions
This paper underscores the importance of understanding and mitigating the risks of inference attacks on ML models—a crucial step toward safeguarding sensitive data and intellectual property. The results indicate that comprehensive protection mechanisms remain underdeveloped. Future work could explore the integration of ML-Doctor into different domains beyond image classification, such as text and audio, and the development of more robust, general-purpose defenses against a wider array of inference attacks. The work raises pertinent questions about how AI models can be better secured throughout their deployment lifecycle without significant impact on performance or utility.