Data Selection for Transfer Unlearning (2405.10425v1)
Abstract: As deep learning models become larger and more data-hungry, there are growing ethical, legal, and technical concerns over the use of data: in practice, agreements on data use may change over time, rendering previously used training data impermissible for training purposes. These issues have driven increased attention to machine unlearning: removing "the influence of" a subset of training data from a trained model. In this work, we advocate for a relaxed definition of unlearning that does not address privacy applications but targets a scenario where a data owner withdraws permission to use their data for training purposes. In this context, we consider the important problem of *transfer unlearning*, where a pretrained model is transferred to a target dataset that contains some "non-static" data that may need to be unlearned in the future. We propose a new method that uses a mechanism for selecting relevant examples from an auxiliary "static" dataset, and finetunes on the selected data instead of the "non-static" target data, addressing all unlearning requests ahead of time. We also adapt a recent relaxed definition of unlearning to our problem setting and demonstrate that our approach is an exact transfer unlearner according to it, while being highly efficient (amortized). We find that our method outperforms the gold standard "exact unlearning" (finetuning on only the "static" portion of the target dataset) on several datasets, especially for small "static" sets, sometimes approaching an upper bound on test accuracy. We also analyze factors influencing the accuracy boost obtained by data selection.
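The abstract does not specify the selection mechanism itself. As a rough illustration of the general idea only, the sketch below selects auxiliary "static" examples by cosine similarity to the "non-static" target examples in a pretrained feature space, then forms a training set from the static target data plus the selected auxiliary data. All function names, the similarity criterion, and the hyperparameter `k_per_query` are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch of transfer unlearning via auxiliary data selection.
# The selection rule (nearest neighbours by cosine similarity in a pretrained
# feature space) is an assumption, not necessarily the paper's mechanism.
import torch
import torch.nn.functional as F


@torch.no_grad()
def embed(backbone, loader, device="cpu"):
    """Extract L2-normalized features for every example in a DataLoader.
    Assumes `backbone` maps a batch of inputs to a (batch, dim) feature tensor."""
    backbone.eval().to(device)
    feats = []
    for x, _ in loader:
        f = backbone(x.to(device))
        feats.append(F.normalize(f, dim=1).cpu())
    return torch.cat(feats)


def select_auxiliary(aux_feats, nonstatic_feats, k_per_query=10):
    """For each non-static target example, pick its k most similar auxiliary
    examples (cosine similarity of normalized features); return the union of
    selected auxiliary indices."""
    sims = nonstatic_feats @ aux_feats.T              # (n_nonstatic, n_aux)
    topk = sims.topk(k_per_query, dim=1).indices      # nearest auxiliary examples
    return torch.unique(topk.flatten())


# Usage (datasets and loaders are placeholders):
#   aux_idx = select_auxiliary(embed(backbone, aux_loader),
#                              embed(backbone, nonstatic_loader))
#   train_set = torch.utils.data.ConcatDataset(
#       [static_target_set, torch.utils.data.Subset(aux_set, aux_idx.tolist())])
#   # Finetune on `train_set`; since no non-static data is ever used for
#   # training, future unlearning requests for it are handled ahead of time.
```

Because the non-static data only guides selection and never enters the finetuning set, removing it later requires no retraining, which is what makes the approach an exact (and amortized-efficient) transfer unlearner under the relaxed definition described in the abstract.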
Authors: Nazanin Mohammadi Sepahvand, Vincent Dumoulin, Eleni Triantafillou, Gintare Karolina Dziugaite