Dataset Condensation Driven Machine Unlearning (2402.00195v2)
Abstract: Current trends in data-regulation requirements and privacy-preserving machine learning have emphasized the importance of machine unlearning. The naive approach of unlearning training data by retraining on the complement of the forget samples is computationally prohibitive. This cost has been addressed by a collection of techniques falling under the umbrella of machine unlearning, yet existing methods still fail to reconcile the persistent computational burden with the utility and privacy of the unlearned model. We attribute this to the lack of work on reducing the computational complexity of approximate unlearning from the perspective of the training dataset. In this paper, we aim to fill this gap by introducing dataset condensation as an essential component of machine unlearning in the context of image classification. To achieve this goal, we propose new dataset condensation techniques and an unlearning scheme that strikes a balance between the privacy, utility, and efficiency of machine unlearning. Furthermore, we present a novel and effective approach to instrumenting machine unlearning and propose its application in defending against membership inference and model inversion attacks. Additionally, we explore a new application of our approach: removing data from a `condensed model', which can then be used to quickly train any arbitrary model without it being influenced by the unlearned samples. The corresponding code is available at \href{https://github.com/algebraicdianuj/DC_U}{URL}.
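The core idea can be illustrated with a deliberately minimal sketch: drop the forget samples first, then condense the retained data, so that any model trained from scratch on the condensed set is never influenced by the forgotten samples. Note this is an assumption-laden toy, not the paper's actual method: real dataset condensation learns synthetic images by gradient or distribution matching, whereas here condensation is reduced to a per-class feature mean, and the function names (`condense_by_class_mean`, `unlearn_via_condensation`) are hypothetical.

```python
def condense_by_class_mean(X, y, n_classes):
    """Toy stand-in for dataset condensation: one synthetic point per class,
    taken as the coordinate-wise mean of that class's real samples."""
    synth = []
    for c in range(n_classes):
        pts = [x for x, label in zip(X, y) if label == c]
        dim = len(pts[0])
        synth.append([sum(p[d] for p in pts) / len(pts) for d in range(dim)])
    return synth, list(range(n_classes))

def unlearn_via_condensation(X, y, forget_idx, n_classes):
    """Remove the forget samples, then condense the retained data.
    A model retrained on the condensed set carries no influence from
    the forgotten samples, and retraining on it is cheap because the
    condensed set is tiny (here, one point per class)."""
    forget = set(forget_idx)
    kept = [(x, label) for i, (x, label) in enumerate(zip(X, y)) if i not in forget]
    Xr = [x for x, _ in kept]
    yr = [label for _, label in kept]
    return condense_by_class_mean(Xr, yr, n_classes)
```

The design point this sketch captures is the one argued in the abstract: once unlearning is pushed into the (small) condensed dataset, the expensive step, retraining an arbitrary downstream model, scales with the condensed set rather than the full training set.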