Sensitive Value Inference in Machine Learning Models
The paper "Are Attribute Inference Attacks Just Imputation?" authored by Bargav Jayaraman and David Evans explores the capacity of machine learning models to inadvertently leak sensitive information concerning their training data through attribute inference attacks. The research explores whether such attacks are indistinguishable from statistical data imputation under various conditions. It broadens the scope of traditional attribute inference by introducing and analyzing what they term as "sensitive value inference."
Key Findings
- Comparison with Imputation: Standard black-box attribute inference attacks do not reveal meaningfully more about a record's sensitive attribute than imputation based on the same knowledge of the data distribution the adversary needs in order to mount the attack. This challenges the assumption that such attacks expose substantial information specific to the training data.
- White-Box Attack Viability: The paper introduces novel white-box attacks that exploit the internal structure of neural networks, specifically neuron activation levels, and shows that when the adversary has only limited prior knowledge of the training distribution, these attacks can surpass imputation at identifying sensitive attributes (a sketch contrasting the two strategies follows this list).
- Implications for Distribution vs. Dataset Inference: The privacy threat stems from distribution inference rather than training dataset inference. Models can expose hidden statistical correlations in the training distribution without leaking specific training records, which means the risk extends to any individual drawn from that distribution, not only those whose records were used for training. This is a broader privacy implication than previously acknowledged.
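To make the contrast in these findings concrete, here is a minimal, self-contained Python sketch comparing an imputation baseline (predicting the sensitive attribute from the other attributes alone) with a white-box attack that instead uses the target model's hidden-layer activations as features. The data is synthetic and every name (`Target`, `activation_features`, `aux_x`, and so on) is an illustrative assumption; this is not the authors' exact attack construction, only the general idea of exploiting neuron activations.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n):
    # x: non-sensitive features, s: binary sensitive attribute correlated
    # with x, y: task label correlated with both x and s.
    x = rng.normal(size=(n, 8)).astype(np.float32)
    s = (x[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(np.float32)
    y = (x[:, 1] + s + 0.5 * rng.normal(size=n) > 0.5).astype(np.float32)
    return x, s, y

train_x, train_s, train_y = make_data(2000)  # trains the target model
aux_x, aux_s, _ = make_data(200)             # adversary's limited auxiliary data
test_x, test_s, _ = make_data(1000)          # records under attack (s unknown)

class Target(nn.Module):
    """Target model trained on (x, s) -> y; a white-box adversary can also
    observe its hidden activations."""
    def __init__(self):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(9, 16), nn.ReLU())
        self.out = nn.Linear(16, 1)

    def forward(self, xs):
        h = self.hidden(xs)
        return self.out(h), h

model = Target()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()
xs = torch.tensor(np.hstack([train_x, train_s[:, None]]))
yy = torch.tensor(train_y)[:, None]
for _ in range(200):
    opt.zero_grad()
    logits, _ = model(xs)
    loss_fn(logits, yy).backward()
    opt.step()

def activation_features(x):
    """Query the model with both candidate sensitive values and concatenate
    the hidden activations: these are the white-box attack's features."""
    feats = []
    with torch.no_grad():
        for s_guess in (0.0, 1.0):
            col = np.full((len(x), 1), s_guess, dtype=np.float32)
            _, h = model(torch.tensor(np.hstack([x, col])))
            feats.append(h.numpy())
    return np.hstack(feats)

# Imputation baseline: predict s from the non-sensitive attributes alone,
# using only the small auxiliary dataset.
imputer = LogisticRegression(max_iter=1000).fit(aux_x, aux_s)

# White-box attack: predict s from the target model's internal activations,
# trained on the same auxiliary dataset.
attack = LogisticRegression(max_iter=1000).fit(activation_features(aux_x), aux_s)

print("imputation accuracy:      ", imputer.score(test_x, test_s))
print("white-box attack accuracy:", attack.score(activation_features(test_x), test_s))
```

The exact numbers in this toy setting are not meaningful; the point is the structural difference between the two adversaries: one learns only from the data distribution, the other additionally observes the model's internals.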
Methodology and Contributions
- Sensitive Value Inference: The authors propose a new evaluation focus, sensitive value inference, which measures how well an attack identifies, with high confidence, records holding a particular sensitive attribute value. This better reflects realistic settings with asymmetric risk, where the harm comes from confidently and correctly labeling individuals with the sensitive value rather than from overall prediction accuracy (see the metric sketch after this list).
- Simulating Threat Models: By varying the adversary's access to data and to the model, the paper examines a range of realistic threat models and shows that the effectiveness of an attack depends heavily on the adversary's prior knowledge of the data distribution.
- Defensive Measures Evaluation: Two mitigations were evaluated: differentially private training and selectively removing training records. Neither meaningfully reduced the risk, consistent with the leakage being a property of the distribution rather than of individual training records, which suggests that more robust defenses are needed.
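As a small illustration of the evaluation idea described under "Sensitive Value Inference" above, the sketch below scores an attack by the precision of its k most confident positive predictions rather than by overall accuracy. The function name, the synthetic scores, and the choices of k are illustrative assumptions, not the paper's evaluation code.

```python
# Illustrative sensitive-value-inference style metric: precision among the
# k records the attack is most confident have the sensitive value.
import numpy as np

def top_k_precision(scores, true_s, k):
    """Fraction of true positives among the k highest-scoring records."""
    top = np.argsort(scores)[::-1][:k]
    return float(np.mean(true_s[top] == 1))

# Toy example: 1000 records, 10% sensitive-value base rate, and attack scores
# that are only mildly informative.
rng = np.random.default_rng(0)
true_s = (rng.random(1000) < 0.1).astype(int)
scores = 0.3 * true_s + rng.random(1000)

for k in (10, 50, 100):
    print(f"precision in top {k}: {top_k_precision(scores, true_s, k):.2f}")
```

Under asymmetric risk, high precision on a small top-k set is what matters to the adversary, even if accuracy over all records stays close to the base rate.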
Practical and Theoretical Implications
- Model Distribution Privacy Concerns: The distinction between dataset and distribution inference raises questions about releasing models trained on sensitive data. The concern is especially acute when the underlying distribution itself is not public, since distributional leakage can then cause harms comparable to those of a dataset-level privacy breach.
- Evaluation Metrics in Privacy Research: The introduction of sensitive value inference could prompt a re-evaluation of performance metrics used in privacy research, framing them in the context of asymmetric risks.
- Future Directions: The findings call for technical measures that limit distributional leakage from models without significantly harming utility. Future research might investigate whether models can be designed to obscure sensitive correlations in the training distribution without compromising their primary task.
In essence, this research dismantles several assumptions about the threat posed by attribute inference attacks, urging a reconsideration of both how privacy loss is defined and how it is mitigated. Because distributional leakage is easy to underestimate in threat evaluations, protective measures refined to address it are particularly pertinent.