Machine Unlearning for Image-to-Image Generative Models (2402.00351v2)
Abstract: Machine unlearning has emerged as a new paradigm to deliberately forget data samples from a given model in order to adhere to stringent regulations. However, existing machine unlearning methods have primarily focused on classification models, leaving the landscape of unlearning for generative models relatively unexplored. This paper serves as a bridge, addressing the gap by providing a unifying framework of machine unlearning for image-to-image generative models. Within this framework, we propose a computationally efficient algorithm, underpinned by rigorous theoretical analysis, that demonstrates negligible performance degradation on the retain samples while effectively removing the information from the forget samples. Empirical studies on two large-scale datasets, ImageNet-1K and Places-365, further show that our algorithm does not rely on the availability of the retain samples, which also makes it compliant with data retention policies. To the best of our knowledge, this work is the first to present systematic, theoretical, and empirical explorations of machine unlearning specifically tailored for image-to-image generative models. Our code is available at https://github.com/jpmorganchase/l2l-generator-unlearning.