Comparing privacy notions for protection against reconstruction attacks in machine learning

Published 6 Feb 2025 in cs.LG, cs.CR, cs.IT, and math.IT | (2502.04045v1)

Abstract: Within the machine learning community, reconstruction attacks are a principal concern and have been identified even in federated learning (FL), which was designed with privacy preservation in mind. In response to these threats, the privacy community recommends the use of differential privacy (DP) in the stochastic gradient descent algorithm, termed DP-SGD. However, the proliferation of variants of DP in recent years, such as metric privacy, has made it challenging to conduct a fair comparison between different mechanisms due to the different meanings of the privacy parameters $\epsilon$ and $\delta$ across different variants. Thus, interpreting the practical implications of $\epsilon$ and $\delta$ in the FL context and amongst variants of DP remains ambiguous. In this paper, we lay a foundational framework for comparing mechanisms with differing notions of privacy guarantees, namely $(\epsilon,\delta)$-DP and metric privacy. We provide two foundational means of comparison: firstly, via the well-established $(\epsilon,\delta)$-DP guarantees, made possible through the Rényi differential privacy framework; and secondly, via Bayes' capacity, which we identify as an appropriate measure for reconstruction threats.

Summary

  • The paper introduces a unified framework leveraging Rényi Differential Privacy to compare and evaluate diverse privacy mechanisms against reconstruction attacks.
  • It demonstrates that while the Gaussian mechanism maintains higher accuracy, the VMF mechanism offers superior protection within the same privacy budget.
  • The study advocates using Bayes’ capacity as a robust metric to quantify privacy risks and guide safer deployment of machine learning systems.

Comparing Privacy Notions for Protection Against Reconstruction Attacks in Machine Learning

The paper, "Comparing Privacy Notions for Protection Against Reconstruction Attacks in Machine Learning," addresses a significant challenge in privacy-preserving machine learning: how different privacy mechanisms protect against reconstruction attacks. The study focuses on the landscape of differential privacy (DP), particularly the interpretation and comparison of the privacy parameters (ε, δ) across these frameworks, and introduces an alternative measure, Bayes' capacity, for evaluating privacy risks in machine learning contexts.

Major Contributions and Frameworks

The paper introduces a comprehensive framework to compare and evaluate different privacy mechanisms, especially targeting reconstruction attacks—a primary concern in federated learning (FL) scenarios. To this end, it integrates several key approaches:

  1. Rényi Differential Privacy (RDP): Used as common ground to unify the diverse landscape of differential privacy. RDP guarantees can be converted into the more traditional (ε, δ)-DP, enabling a fair comparison between mechanisms such as the Gaussian mechanism used in traditional DP and the von Mises-Fisher (VMF) mechanism, which arises from metric privacy.
  2. Bayes' Capacity: Proposed as a measure of privacy risk against reconstruction attacks. Bayes' capacity upper-bounds the information leakage relevant to the success of a reconstruction adversary, and is therefore well suited to modelling the actual privacy threats faced by deployed machine learning systems.
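To illustrate the first route, the standard conversion from RDP to (ε, δ)-DP (due to Mironov) can be sketched for the Gaussian mechanism. This is a generic sketch, not code from the paper; the noise scale, sensitivity, and grid of Rényi orders below are illustrative choices:

```python
import math

def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    """Renyi DP of the Gaussian mechanism at order alpha:
    eps(alpha) = alpha * sensitivity^2 / (2 * sigma^2)."""
    return alpha * sensitivity**2 / (2 * sigma**2)

def rdp_to_dp(sigma, delta, sensitivity=1.0):
    """Convert the Gaussian mechanism's RDP curve to an (eps, delta)-DP
    guarantee via eps = eps_RDP(alpha) + log(1/delta) / (alpha - 1),
    minimised over a grid of orders alpha > 1."""
    best_eps = float("inf")
    for alpha in (1 + k / 10 for k in range(1, 1000)):  # alpha in (1, 101)
        eps = gaussian_rdp(alpha, sigma, sensitivity) + math.log(1 / delta) / (alpha - 1)
        best_eps = min(best_eps, eps)
    return best_eps

# e.g. noise sigma = 1.0 at delta = 1e-5 yields eps around 5.3
print(rdp_to_dp(1.0, 1e-5))
```

The same recipe applies to any mechanism whose RDP curve is known, which is what makes RDP a useful common currency: compute each mechanism's curve, convert both to (ε, δ), and compare at matched δ.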

Practical Experiments and Results

Empirical evaluations in the study reveal distinct outcomes: when the VMF mechanism's parameters are tuned to match the Gaussian mechanism under the RDP framework, the Gaussian mechanism achieves higher utility in terms of accuracy. However, VMF provides superior protection against reconstruction attacks within the same privacy budget range, even when the two mechanisms are calibrated to comparable utility.

These results underscore the inadequacy of simply comparing ε-values as a sufficient measure for understanding privacy risks in reconstruction attacks, supporting the paper's assertion that Bayes' capacity offers a more robust metric for evaluation in such contexts.
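The Bayes' capacity of a mechanism viewed as an information-theoretic channel has a simple closed form: for a channel matrix C with entries p(y | x), the multiplicative Bayes capacity is the sum over outputs of the column-wise maxima. A minimal sketch follows; the 2x2 toy channel in the usage example is invented for illustration and does not come from the paper:

```python
import math

def bayes_capacity(channel):
    """Multiplicative Bayes capacity of a channel given as a list of rows,
    where channel[x][y] = p(y | secret x). It equals the sum over outputs y
    of max_x p(y | x), and upper-bounds the multiplicative leakage to a
    Bayes-optimal adversary over all priors."""
    n_outputs = len(channel[0])
    return sum(max(row[y] for row in channel) for y in range(n_outputs))

# Toy channel: two secrets, two observable outputs.
toy = [[0.8, 0.2],
       [0.3, 0.7]]
cap = bayes_capacity(toy)          # 0.8 + 0.7 = 1.5
bits = math.log2(cap)              # min-capacity in bits, about 0.585
```

A capacity of 1 means the outputs reveal nothing to a guessing adversary; larger values quantify how much a Bayes-optimal attacker's chance of reconstructing the secret can improve, which is the adversarial model the paper argues is the right one for reconstruction threats.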

Implications and Future Directions

From a theoretical standpoint, this work contributes to a deeper understanding of the nuanced capabilities and limitations of various privacy-preserving mechanisms in machine learning, especially against reconstruction attacks that threaten model integrity and data privacy. The practical implications lie in the possibility of leveraging these findings to inform safer, more privacy-sensitive deployment of machine learning systems, particularly when balancing the need for model accuracy against the risks of personal data exposure.

Going forward, the paper suggests the importance of considering adversarial goals when selecting privacy measures under different computational settings. As machine learning systems increasingly integrate into privacy-sensitive domains, these insights hold substantial potential to guide future research and development in robust privacy-preserving technologies.