Continual Learning of Diffusion Models with Generative Distillation (2311.14028v2)
Abstract: Diffusion models are powerful generative models that achieve state-of-the-art performance in image synthesis. However, training them demands substantial amounts of data and computational resources. Continual learning would allow for incrementally learning new tasks and accumulating knowledge, thus enabling the reuse of trained models for further learning. One potentially suitable continual learning approach is generative replay, where a copy of a generative model trained on previous tasks produces synthetic data that are interleaved with data from the current task. However, standard generative replay applied to diffusion models results in a catastrophic loss in denoising capabilities. In this paper, we propose generative distillation, an approach that distils the entire reverse process of a diffusion model. We demonstrate that our approach substantially improves the continual learning performance of generative replay with only a modest increase in the computational costs.
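For intuition, the sketch below shows one way such a training objective could be set up in PyTorch: the student is trained with the usual denoising loss on current-task data, while a frozen teacher (a copy of the model trained on previous tasks) runs its reverse process from noise and the student is regressed onto the teacher's predictions at every visited timestep, i.e. along the entire reverse trajectory. This is a minimal illustrative sketch, not the paper's implementation: the names (`Denoiser`, `q_sample`, `ddpm_step`, `train_task`), the toy MLP denoiser, the linear noise schedule, and the loss weighting are all assumptions, and the number of diffusion steps is kept small so the example stays cheap.

```python
# Minimal sketch of generative distillation for continual diffusion training.
# Assumptions (not from the paper): an epsilon-prediction MLP on flattened
# 28x28 inputs, a linear beta schedule, and DDPM ancestral sampling for the
# teacher's reverse process. All names are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 100  # small number of diffusion steps for illustration (DDPM often uses 1000)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class Denoiser(nn.Module):
    """Toy epsilon-prediction network; conditions on a normalised timestep."""
    def __init__(self, dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 512), nn.SiLU(),
                                 nn.Linear(512, 512), nn.SiLU(),
                                 nn.Linear(512, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t.float().unsqueeze(-1) / T], dim=-1))

def q_sample(x0, t, noise):
    """Forward diffusion: noise clean data x0 to timestep t."""
    ab = alpha_bars[t].unsqueeze(-1)
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise

@torch.no_grad()
def ddpm_step(model, x_t, t):
    """One ancestral reverse step x_t -> x_{t-1}; also returns the eps prediction."""
    eps = model(x_t, torch.full((x_t.size(0),), t, dtype=torch.long))
    mean = (x_t - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
    if t == 0:
        return mean, eps
    return mean + betas[t].sqrt() * torch.randn_like(x_t), eps

def train_task(student, teacher, loader, lam=1.0, lr=2e-4):
    """Train on the current task; if a teacher is given, also distil its reverse process."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for x0 in loader:  # loader is assumed to yield flattened image batches
        b = x0.size(0)
        # (1) Standard denoising loss on current-task data.
        t = torch.randint(0, T, (b,))
        noise = torch.randn_like(x0)
        loss = F.mse_loss(student(q_sample(x0, t, noise), t), noise)
        if teacher is not None:
            # (2) Distillation along the teacher's full reverse trajectory:
            # start from pure noise, let the frozen teacher denoise step by step,
            # and make the student match the teacher's eps prediction at each step.
            x_t = torch.randn_like(x0)
            distill = 0.0
            for step in reversed(range(T)):
                t_vec = torch.full((b,), step, dtype=torch.long)
                x_prev, teacher_eps = ddpm_step(teacher, x_t, step)
                distill = distill + F.mse_loss(student(x_t, t_vec), teacher_eps)
                x_t = x_prev
            loss = loss + lam * distill / T
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student
```

In a continual learning run under these assumptions, the first task would be trained with `teacher=None`; after each task, a frozen copy of the trained model (e.g. `copy.deepcopy(student).eval()`) would serve as the teacher for the next task. Distilling at every step of a full trajectory, as written here for clarity, is the expensive extreme; a practical variant would distil over far fewer reverse steps to keep the overhead modest, in line with the abstract's claim of only a modest increase in computational cost.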