- The paper shows that aggregate evaluation metrics mask the true privacy risks of individual, vulnerable data points.
- It demonstrates that non-adaptive attack models yield overly optimistic results and advocates for adaptive, context-specific attacks.
- The study recommends fair comparisons using stronger DP baselines tuned for high utility to better assess privacy-utility trade-offs.
Evaluations of Machine Learning Privacy Defenses are Misleading
Introduction: Challenges in Current Privacy Evaluations
The paper critically examines current empirical evaluation practices for privacy defenses in machine learning and identifies significant methodological flaws. It emphasizes that existing practices, which rely on membership inference attacks evaluated in aggregate, do not capture the true privacy risk, particularly for the most vulnerable samples in a dataset. The authors argue that these evaluations tend to use suboptimal attack models, overlook the privacy of individual data points, and compare against unfairly weak differential privacy (DP) baselines.
Flaws in Existing Evaluation Techniques
Aggregate Metrics Fail to Reflect Individual Privacy
Current evaluation methods average privacy metrics over the entire dataset, which hides the leakage suffered by the most vulnerable data points. This practice paints an overly optimistic picture of a method's privacy-preserving capabilities: the paper's analysis shows that a defense can heavily expose a handful of individual records and still pass such evaluations, because those records barely move the aggregate metric.
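To make the contrast concrete, the following sketch (illustrative only, not the paper's code) generates hypothetical membership-inference scores in which a small group of records is heavily exposed: the aggregate balanced accuracy looks close to chance, while the true-positive rate at a very low false-positive rate reveals the leakage.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)

# Hypothetical attack scores: 1,000 members and 1,000 non-members. Most
# members look like non-members, but 20 highly vulnerable members receive
# clearly higher scores.
non_members = rng.normal(0.0, 1.0, 1000)
members = rng.normal(0.1, 1.0, 1000)
members[:20] += 4.0  # the few heavily exposed records

scores = np.concatenate([members, non_members])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])
fpr, tpr, _ = roc_curve(labels, scores)

# Aggregate view: best achievable balanced accuracy, barely above chance.
print("best balanced accuracy:", 0.5 * (tpr + (1 - fpr)).max())

# Worst-case view: TPR at 0.1% FPR, driven almost entirely by the 20 records.
print("TPR @ 0.1% FPR:", tpr[np.searchsorted(fpr, 0.001)])
```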
Weak and Non-adaptive Attack Models
Empirical evaluations frequently rely on non-adaptive, outdated attacks that pose little real challenge to the defense under test. Such attacks are inadequate because they do not mimic a realistic adversary who exploits the specific weaknesses of a given defense. The paper draws a parallel to evaluation flaws in adversarial machine learning, where non-adaptive attacks similarly overestimate model robustness.
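As an illustration of the kind of weak, non-adaptive attack the paper criticizes, the sketch below implements a simple global loss-threshold membership test in PyTorch; `model`, `x`, `y`, and the threshold are placeholders, and nothing about the attack is tailored to the defense under evaluation.

```python
import torch
import torch.nn.functional as F

def loss_threshold_attack(model, x, y, threshold=1.0):
    """Guess 'member' whenever the model's loss on (x, y) falls below a
    fixed, defense-agnostic threshold."""
    model.eval()
    with torch.no_grad():
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    return loss.item() < threshold
```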
Comparisons with Inappropriately Weak DP Baselines
A common practice in empirical evaluations is to compare new methods against DP baselines configured with excessively strict privacy budgets, which sharply reduces their utility. The new method then appears to strike a better privacy-utility balance almost by construction. The paper argues instead for fair comparisons against state-of-the-art DP methods tuned for the same high-utility regime.
Proposed Solutions for More Accurate Evaluations
Targeted Evaluation of the Most Vulnerable Samples
A central improvement proposed by the authors is to evaluate privacy defenses by how well they protect the most vulnerable samples rather than by averaging over the entire dataset. They suggest inserting 'canary' records, such as mislabeled or otherwise atypical examples, as proxies for the most at-risk samples, yielding a more realistic audit of the privacy risk.
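A minimal sketch of how such canaries might be constructed, assuming mislabeled records are used as the proxy; the function name and interface are hypothetical, not taken from the paper.

```python
import numpy as np

def make_canaries(y_train, num_canaries, num_classes, seed=0):
    """Pick random training indices and assign each a deliberately wrong label.
    The audit then asks how reliably an attack detects these records' membership
    across many training runs, e.g. via TPR at a very low FPR."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y_train), size=num_canaries, replace=False)
    y_canary = np.array(y_train, copy=True)
    for i in idx:
        wrong = [c for c in range(num_classes) if c != y_canary[i]]
        y_canary[i] = rng.choice(wrong)
    return idx, y_canary
```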
Adaptive and Context-Specific Attacks
To simulate a more realistic adversarial environment, the paper recommends using adaptive attacks specifically designed to exploit the unique weaknesses of each evaluated defense method. This approach would yield a more accurate depiction of how methods might perform against real-world adversaries.
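One concrete possibility, sketched below under the assumption that per-example confidences have already been collected from shadow models trained with the defense itself, is a likelihood-ratio (LiRA-style) membership score; the function name and arguments are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

def lira_score(target_conf, in_confs, out_confs, eps=1e-6):
    """Log likelihood ratio that the target record was a training member,
    computed from confidences of shadow models trained with ('in') and
    without ('out') the record, using the same defense as the evaluated model."""
    logit = lambda p: np.log(p + eps) - np.log(1 - p + eps)
    t = logit(target_conf)
    i = logit(np.asarray(in_confs))
    o = logit(np.asarray(out_confs))
    log_p_in = norm.logpdf(t, loc=i.mean(), scale=i.std() + eps)
    log_p_out = norm.logpdf(t, loc=o.mean(), scale=o.std() + eps)
    return log_p_in - log_p_out  # higher => more likely a member
```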
Stronger DP Baselines for Fairer Comparisons
The authors advise comparing privacy-preserving methods against appropriately tuned DP-SGD baselines that prioritize utility rather than rigid formal privacy budgets. Both methods are then optimized for high utility, and differential privacy remains a competitive baseline even when its provable guarantees are relaxed.
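A minimal sketch of such a baseline, assuming PyTorch with the Opacus library: DP-SGD is run with a deliberately low noise multiplier, trading away a strict formal epsilon for accuracy while keeping per-sample clipping and noise. The model and data below are placeholders.

```python
import torch
from opacus import PrivacyEngine

model = torch.nn.Linear(32, 10)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data_loader = torch.utils.data.DataLoader(  # placeholder data
    torch.utils.data.TensorDataset(torch.randn(256, 32), torch.randint(0, 10, (256,))),
    batch_size=64,
)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=0.4,  # low noise: weak formal guarantee, high utility
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)
```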
Conclusions and Implications for Future Research
The research takes a critical view of current empirical evaluation practices for privacy defenses in machine learning and proposes methodologically rigorous alternatives that address the overlooked vulnerabilities. It sets a higher bar for future defenses, emphasizing in particular the need to protect the most vulnerable data points. The suggested practices move the field toward more realistic and demanding tests of privacy-preserving methods, and toward defenses that hold up against sophisticated attacks.
The authors encourage the adoption of their evaluation framework to foster more accurate, reproducible, and challenging research in machine learning privacy, which could profoundly influence how future privacy-preserving models are developed and assessed.