When Machine Unlearning Jeopardizes Privacy (2005.02205v2)

Published 5 May 2020 in cs.CR, cs.LG, and stat.ML

Abstract: The right to be forgotten states that a data owner has the right to erase their data from an entity storing it. In the context of ML, the right to be forgotten requires an ML model owner to remove the data owner's data from the training set used to build the ML model, a process known as machine unlearning. While originally designed to protect the privacy of the data owner, we argue that machine unlearning may leave some imprint of the data in the ML model and thus create unintended privacy risks. In this paper, we perform the first study on investigating the unintended information leakage caused by machine unlearning. We propose a novel membership inference attack that leverages the different outputs of an ML model's two versions to infer whether a target sample is part of the training set of the original model but out of the training set of the corresponding unlearned model. Our experiments demonstrate that the proposed membership inference attack achieves strong performance. More importantly, we show that our attack in multiple cases outperforms the classical membership inference attack on the original ML model, which indicates that machine unlearning can have counterproductive effects on privacy. We notice that the privacy degradation is especially significant for well-generalized ML models where classical membership inference does not perform well. We further investigate four mechanisms to mitigate the newly discovered privacy risks and show that releasing the predicted label only, temperature scaling, and differential privacy are effective. We believe that our results can help improve privacy protection in practical implementations of machine unlearning. Our code is available at https://github.com/MinChen00/UnlearningLeaks.

Authors (6)
  1. Min Chen (200 papers)
  2. Zhikun Zhang (39 papers)
  3. Tianhao Wang (98 papers)
  4. Michael Backes (157 papers)
  5. Mathias Humbert (19 papers)
  6. Yang Zhang (1129 papers)
Citations (190)

Summary

Machine Unlearning and Privacy Risks: A New Perspective

The paper "When Machine Unlearning Jeopardizes Privacy" offers a critical assessment of the impact of machine unlearning on data privacy, particularly focusing on unintended information leakage. Machine unlearning is a method employed to comply with the legal requirement of data deletion, as encapsulated in regulations like the General Data Protection Regulation (GDPR). It involves removing any traces of specified data from a trained ML model. While the primary purpose of unlearning is to protect individual privacy, the paper argues that this process might inadvertently compromise privacy instead.

Key Contributions and Findings

  1. Novel Membership Inference Attack: The authors introduce a membership inference attack that contrasts the outputs of two versions of an ML model, the original and the unlearned model. This enables an adversary to infer whether a target sample was part of the original model's training set but removed from the training set of the unlearned model. Experimental results indicate that this attack often surpasses the classical membership inference attack, particularly against well-generalized ML models (a feature-construction sketch follows this list).
  2. Privacy Degradation Metrics: To quantify the privacy risks associated with machine unlearning, two metrics, Degradation Count and Degradation Rate, are proposed. They measure the additional privacy loss attributable to unlearning, providing a quantifiable basis for how much privacy is degraded (see the hedged sketch after this list).
  3. Defense Mechanisms: The paper evaluates four possible defenses against the new attack: publishing only the predicted label, temperature scaling, differential privacy, and limiting the output to the top-k confidence scores. Releasing the predicted label only, temperature scaling, and differential privacy are found to be effective at preserving privacy; illustrative implementations of the output-side defenses also follow this list.
  4. Broader Implications: The research highlights the paradoxical situation where an effort to enhance privacy via unlearning can inadvertently expose sensitive information. This underlines the complexity and the need for rigorous privacy assessments in deploying machine unlearning practices.
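The attack pipeline implied by item 1 can be pictured compactly: query the original and the unlearned model on the target sample, aggregate the two posterior vectors into a feature, and classify that feature with an attack model trained on shadow models. The snippet below is a minimal sketch under assumed interfaces (NumPy posteriors, a scikit-learn attack classifier); the exact feature construction and attack architecture used in the paper may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def attack_features(p_original: np.ndarray, p_unlearned: np.ndarray) -> np.ndarray:
    """Aggregate the two posteriors of a target sample into one feature vector.

    Concatenating both posteriors with their element-wise difference is one
    plausible construction; the paper studies several aggregations.
    """
    return np.concatenate([p_original, p_unlearned, p_original - p_unlearned])

def train_attack_model(shadow_pairs, shadow_labels):
    """shadow_pairs: list of (p_original, p_unlearned) posteriors from shadow
    models; shadow_labels: 1 if the sample was deleted by unlearning, else 0."""
    X = np.stack([attack_features(po, pu) for po, pu in shadow_pairs])
    return RandomForestClassifier(n_estimators=100).fit(X, np.asarray(shadow_labels))

def infer_membership(attack_model, p_original, p_unlearned) -> float:
    """Confidence that the target sample was in the original training set and
    was removed by unlearning."""
    x = attack_features(p_original, p_unlearned).reshape(1, -1)
    return float(attack_model.predict_proba(x)[0, 1])
```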
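How Degradation Count and Degradation Rate are computed is not spelled out in this summary. One plausible reading, offered purely as an assumption to make the metrics concrete, is that they compare the membership confidence of the unlearning-based attack against the classical attack over a set of target samples:

```python
import numpy as np

def degradation_count(conf_unlearning_attack, conf_classical_attack) -> float:
    """Assumed reading of Degradation Count: the fraction of target samples on
    which the unlearning-based attack is more confident than the classical
    membership inference attack."""
    u = np.asarray(conf_unlearning_attack)
    c = np.asarray(conf_classical_attack)
    return float(np.mean(u > c))

def degradation_rate(conf_unlearning_attack, conf_classical_attack) -> float:
    """Assumed reading of Degradation Rate: the average increase in attack
    confidence caused by unlearning."""
    u = np.asarray(conf_unlearning_attack)
    c = np.asarray(conf_classical_attack)
    return float(np.mean(u - c))
```

The exact definitions should be taken from the paper itself; these functions only illustrate the intent of measuring how much worse off a data owner is after unlearning.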
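Three of the four defenses act on the model's released output and are easy to illustrate. The snippet below is a minimal sketch, not the paper's implementation: the temperature T and the cut-off k are arbitrary example values, and the differential privacy defense, which operates at training time (e.g., via DP-SGD), is not shown.

```python
import numpy as np

def temperature_scale(logits: np.ndarray, T: float = 4.0) -> np.ndarray:
    """Soften the released posterior by dividing the logits by T > 1 before the
    softmax, reducing the information carried by confidence scores."""
    z = logits / T
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def label_only(posterior: np.ndarray) -> int:
    """Release only the predicted label instead of the full posterior."""
    return int(np.argmax(posterior))

def top_k(posterior: np.ndarray, k: int = 2) -> np.ndarray:
    """Release only the k largest confidence scores, renormalized; the rest are
    zeroed out."""
    out = np.zeros_like(posterior)
    idx = np.argsort(posterior)[-k:]
    out[idx] = posterior[idx]
    return out / out.sum()
```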

Experimental Evaluation

The paper conducts extensive experiments on both categorical and image datasets, across various ML models, including logistic regression, decision trees, random forests, and convolutional neural networks such as DenseNet and ResNet50. The findings demonstrate that although unlearning removes the data itself, its residual effect on the model's parameters and outputs can still allow inferences about the deleted training data, thus jeopardizing privacy.

Theoretical and Practical Implications

From a theoretical perspective, the findings necessitate a reevaluation of machine unlearning techniques to better understand their limitations and vulnerabilities. Practically, the insights call for developing more robust mechanisms that achieve the intended privacy objectives without backfiring.

Future Directions

The paper opens several avenues for future exploration, including:

  • Refinement of Defense Mechanisms: Further refinement and validation of defense strategies can provide additional robustness against inference attacks in unlearning setups.
  • Evaluation Across Domains: Extending the empirical evaluations to broader contexts and datasets can validate the findings' generalizability.
  • Impact on Other AI Developments: Investigating how these privacy risks carry over to other AI applications, and how to address them without compromising model performance or utility.

Conclusion

This research presents a critical reevaluation of machine unlearning, highlighting a significant and previously underexplored privacy risk. By providing a novel adversarial perspective on unlearning and proposing actionable mitigations, the paper makes a substantial contribution to the discourse on data privacy in machine learning, prompting both researchers and practitioners to revisit existing strategies and implementations.