Systematic Evaluation of Privacy Risks of Machine Learning Models
In this paper, the authors present a systematic examination of the privacy risks inherent to machine learning models, focusing on membership inference attacks. These attacks pose a significant privacy threat: they aim to determine whether a specific data point was part of a model's training set, which directly compromises the confidentiality of the training data.
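As a minimal illustration of the threat model, the sketch below implements the simplest membership baseline considered in this line of work: guess "member" whenever the target model classifies a point correctly, exploiting the fact that models typically fit their training data better than unseen data. The function name and the NumPy interface are illustrative assumptions, not the authors' code.

```python
import numpy as np

def correctness_attack(probs: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Baseline membership inference: guess 'member' (1) iff the target model's
    top-1 prediction matches the ground-truth label, 'non-member' (0) otherwise.

    probs:  (n, num_classes) softmax outputs of the target model
    labels: (n,) ground-truth class indices
    """
    return (probs.argmax(axis=1) == labels).astype(int)
```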
The authors critique existing work on this topic for relying heavily on neural network (NN)-based attack classifiers to gauge these privacy risks, which can lead to an underestimation of the actual threat. They note that such methods often report only aggregate attack accuracy, which does not provide a complete picture of the risk landscape. Instead, they propose a comprehensive set of benchmark attacks that do not depend on NN predictors, many of which are based on carefully calibrated thresholds over prediction metrics such as prediction confidence and entropy.
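The sketch below shows what such metric-based benchmark attacks can look like: membership is guessed by thresholding the model's confidence in the true class, or its prediction entropy, with a separate threshold per class chosen on data from attacker-trained shadow models. The helper names, the shadow-data interface, and the threshold-selection heuristic are assumptions for illustration rather than the paper's reference implementation.

```python
import numpy as np

def prediction_entropy(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy of each softmax vector; low entropy suggests 'member'."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def per_class_thresholds(scores, labels, is_member, num_classes):
    """For every class, pick the threshold on a member-like score that best
    separates shadow members (is_member == 1) from shadow non-members (0)."""
    thresholds = np.zeros(num_classes)
    for c in range(num_classes):
        s, m = scores[labels == c], is_member[labels == c]
        candidates = np.unique(s)
        accs = [np.mean((s >= t).astype(int) == m) for t in candidates]
        thresholds[c] = candidates[int(np.argmax(accs))]
    return thresholds

def confidence_attack(probs, labels, thresholds):
    """Guess 'member' iff the confidence assigned to the true class
    exceeds that class's threshold."""
    conf = probs[np.arange(len(labels)), labels]
    return (conf >= thresholds[labels]).astype(int)
```

For the entropy-based variant, the negated entropy can be fed to the same threshold routine, since lower entropy indicates membership.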
Notably, the paper also introduces a new attack based on a modified prediction entropy that incorporates the ground-truth label, providing a more accurate signal of a model's susceptibility to membership inference. Using these benchmark attacks, the authors show that existing defenses such as adversarial regularization and MemGuard are less effective than previously reported.
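The key idea is that ordinary entropy cannot distinguish a confidently correct prediction from a confidently wrong one, whereas a label-aware variant can. A sketch of such a modified entropy, reconstructed from the paper's description of weighting the true-class and wrong-class probabilities asymmetrically, is given below; treat the exact expression as a reconstruction rather than a verbatim copy of the authors' formula.

```python
import numpy as np

def modified_entropy(probs: np.ndarray, labels: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Label-aware entropy in the spirit of the paper's modified prediction entropy:
        Mentr = -(1 - p_y) * log(p_y) - sum_{i != y} p_i * log(1 - p_i)
    A confidently correct prediction gives a value near 0 (member-like), while a
    confidently wrong prediction gives a large value (non-member-like)."""
    n = probs.shape[0]
    p_true = probs[np.arange(n), labels]
    true_term = -(1.0 - p_true) * np.log(p_true + eps)
    other_terms = -probs * np.log(1.0 - probs + eps)  # elementwise p_i * -log(1 - p_i)
    other_terms[np.arange(n), labels] = 0.0           # drop the true-class entry
    return true_term + other_terms.sum(axis=1)
```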
Further, the paper introduces a new metric, the privacy risk score, which measures for each individual sample the likelihood that it was part of the target model's training set, given the model's behavior on that sample. This allows for a more fine-grained analysis, recognizing that privacy vulnerability can vary substantially across individual data points. The paper's experimental results highlight the heterogeneity of privacy risk scores, showing that aggregate assessments may overlook high-risk individual samples.
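A back-of-the-envelope version of such a per-sample score can be computed with Bayes' rule: estimate the distribution of a membership signal (for instance, the modified entropy above) separately over shadow-model training and non-training data, then convert each target sample's signal into a posterior probability of membership. The histogram estimator, bin count, and 0.5 prior below are illustrative choices, not necessarily the paper's exact estimator.

```python
import numpy as np

def privacy_risk_scores(signal_target, signal_shadow_in, signal_shadow_out,
                        num_bins: int = 20, prior_member: float = 0.5) -> np.ndarray:
    """Per-sample posterior P(member | observed signal) via Bayes' rule.

    signal_target:     membership signal (e.g. modified entropy) for target samples
    signal_shadow_in:  same signal on shadow-model *training* data
    signal_shadow_out: same signal on shadow-model *non-training* data
    """
    # Shared histogram bins so the two conditional densities are comparable.
    all_vals = np.concatenate([signal_shadow_in, signal_shadow_out])
    bins = np.linspace(all_vals.min(), all_vals.max(), num_bins + 1)

    p_in, _ = np.histogram(signal_shadow_in, bins=bins, density=True)
    p_out, _ = np.histogram(signal_shadow_out, bins=bins, density=True)

    # Locate each target sample's bin and read off the two likelihoods.
    idx = np.clip(np.digitize(signal_target, bins) - 1, 0, num_bins - 1)
    like_in, like_out = p_in[idx], p_out[idx]

    num = like_in * prior_member
    den = num + like_out * (1.0 - prior_member)
    return np.where(den > 0, num / den, prior_member)  # fall back to prior on empty bins
```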
The implications of this research extend both to the development of more robust privacy defenses and to the theoretical understanding of model vulnerabilities. By employing a rigorous evaluation approach that considers fine-grained, per-sample risks, researchers can gain better insight into machine learning privacy. This work sets a precedent for future research, stressing the importance of systematic privacy risk evaluation.
In conclusion, the research emphasizes the need for stronger defenses that account for adaptive adversaries and balance privacy risk against model accuracy. The detailed experiments illustrate the intricacies of privacy risks, urging the research community to recognize and address these differences in order to advance secure and privacy-preserving machine learning.