- The paper develops a framework using adversary instantiation to derive empirical lower bounds on differential privacy in machine learning.
- It introduces a range of adversarial models—from API access to gradient-based attacks—to quantify varying levels of privacy risk.
- Findings reveal that practical privacy leakage is often lower than theoretical estimates, informing more accurate DP-SGD parameter selection.
Analyzing Adversarial Constraints: Lower Bounds for Differentially Private Machine Learning
This paper examines how to establish lower bounds on the differential privacy (DP) guarantees of machine learning systems, focusing on differentially private stochastic gradient descent (DP-SGD). The authors propose a methodology that instantiates a concrete adversary and measures how much privacy that adversary can actually compromise under various conditions. By systematically expanding the adversary's capabilities, the paper bridges the gap between the theoretical upper bound on leakage and the empirical risk observed in practice.
Core Methodology
Differential privacy in machine learning quantifies and limits data leakage via probabilistic bounds: the presence or absence of a single data point in the training set should not significantly affect the distribution of any output. To test how tight DP-SGD's guarantees are, the authors construct several adversarial scenarios in which the adversary is granted access to different amounts of information during training, ranging from black-box API queries, to statically and adaptively poisoned inputs, to direct manipulation of training gradients.
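To ground these scenarios, the minimal sketch below (not the authors' code; the names `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are illustrative) shows the DP-SGD update the adversaries target: each example's gradient is clipped to a fixed L2 norm, and Gaussian noise calibrated to that norm is added before the averaged step is applied.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=np.random.default_rng(0)):
    """One DP-SGD update: clip each example's gradient, then add Gaussian noise.

    Minimal sketch; `per_example_grads` has shape (batch_size, num_params).
    """
    # Clip each per-example gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Sum the clipped gradients and add noise calibrated to the clip norm.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=params.shape)

    # Average over the batch and take a gradient step.
    return params - lr * noisy_sum / per_example_grads.shape[0]
```

The clipping bound caps any single example's influence and the noise masks what remains; every adversary in the paper is ultimately trying to detect that bounded influence through the noise.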
Experimental Setup and Results
The experimental section of the paper proceeds through a series of adversary models, each incrementally increasing in complexity and capability:
- API Access Adversary: This baseline adversary has standard black-box API access, as in many real-world MLaaS deployments. The empirically measured privacy leakage is substantially lower than the theoretical bound, suggesting that the standard DP analysis is overly conservative in this setting (see the sketch after this list for how attack success rates translate into an empirical lower bound on ε).
- Static Poison Input Adversary: Here, the adversary crafts a worst-case input that maximally influences the model if included in training. This markedly improves the adversary's ability to infer whether the point was present, though the measured leakage still remains below the theoretical maximum.
- Intermediate Poison and Adaptive Attacks: These models extend the adversary's power by allowing the poisoned inputs to be adapted across training iterations. Adapting in real time to intermediate model outputs yields a tighter empirical lower bound on privacy leakage.
- Gradient-Based Attacks: Targeting federated learning, where gradient sharing is intrinsic, the adversary supplies malicious gradients directly. With this capability, the measured privacy leakage approaches the theoretical maximum.
- Pathological Dataset Creation: The most powerful adversary constructs the entire training dataset so that the remaining examples do not mask the influence of the target point. The resulting empirical privacy leakage closely matches the theoretical bound, confirming that the DP-SGD upper-bound analysis is essentially tight in this worst case.
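Across all of these adversaries, the empirical lower bound follows from the hypothesis-testing view of DP: an (ε, δ)-DP mechanism forces any attacker's false positive rate α and false negative rate β to satisfy α + e^ε·β ≥ 1 − δ, so an observed (α, β) implies ε ≥ ln((1 − δ − α)/β). The sketch below is a reconstruction of that general recipe rather than the paper's exact auditing code; it assumes attack outcomes are tallied as counts and uses Clopper-Pearson upper confidence bounds on both error rates so the estimate remains a conservative lower bound.

```python
import numpy as np
from scipy.stats import beta

def clopper_pearson_upper(successes, trials, confidence=0.95):
    """One-sided upper confidence bound on a binomial proportion (Clopper-Pearson)."""
    if successes == trials:
        return 1.0
    return beta.ppf(confidence, successes + 1, trials - successes)

def empirical_epsilon_lower_bound(fp, fn, n_negative, n_positive, delta=1e-5):
    """Conservative lower bound on epsilon from observed attack errors.

    fp out of n_negative trials: false positives (target absent, attack said present).
    fn out of n_positive trials: false negatives (target present, attack said absent).
    """
    # Upper-bound both error rates so the resulting epsilon is a valid
    # high-confidence lower bound on the true privacy leakage.
    fpr = clopper_pearson_upper(fp, n_negative)
    fnr = clopper_pearson_upper(fn, n_positive)
    if fpr + fnr >= 1 - delta:
        return 0.0  # attack too weak to certify any leakage
    return np.log((1 - delta - fpr) / fnr)

# Example: an attack with 5% FPR and 10% FNR over 1000 trials each.
print(empirical_epsilon_lower_bound(fp=50, fn=100,
                                    n_negative=1000, n_positive=1000))
```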
Implications and Future Directions
The research has significant implications for both the theory and practice of differential privacy. Theoretically, it substantiates the tightness of current DP-SGD analyses when the adversary has maximal capabilities. Under more practical constraints, however, the gap between theory and empirical measurements suggests room for methodological improvements, for instance by incorporating assumptions about dataset 'naturalness' or limits on the adversary.
For applied machine learning, the results clarify the practical efficacy of DP-SGD and advise care in selecting differential privacy parameters. In particular, while the theoretical guarantees are conservative, real deployments facing weaker adversaries may carry significantly lower privacy risk than the theoretical analysis suggests.
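As a rough aid for reasoning about these parameters, the sketch below uses the classical Gaussian-mechanism calibration σ = Δ·√(2 ln(1.25/δ))/ε (valid only for ε < 1) to relate a target (ε, δ) to a noise scale. Production DP-SGD deployments rely on tighter composition accountants over many noisy steps, so this is a back-of-the-envelope reference point, not the paper's method.

```python
import math

def gaussian_mechanism_sigma(epsilon, delta, sensitivity=1.0):
    """Noise scale for the classical Gaussian mechanism (single release).

    sigma >= sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon guarantees
    (epsilon, delta)-DP for one query with the given L2 sensitivity; DP-SGD
    uses tighter accountants across many iterations.
    """
    if not (0 < epsilon < 1):
        raise ValueError("classical bound only applies for 0 < epsilon < 1")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

# Example: noise scale for a (0.5, 1e-5)-DP single release with unit clipping.
print(gaussian_mechanism_sigma(epsilon=0.5, delta=1e-5))
```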
In conclusion, this paper enables more nuanced assessments of privacy in machine learning, using instantiated adversaries to critically evaluate differentially private algorithms. Future work could explore additional adversary capabilities or realistic constraints, aiming for a framework that narrows the gap between theoretical guarantees and empirically measured leakage across the privacy-utility trade-off.