AdjointDEIS: Efficient Gradients for Diffusion Models (2405.15020v2)
Abstract: Optimizing the latents and parameters of diffusion models with respect to a differentiable metric defined on the model's output is a challenging problem. Sampling from a diffusion model is done by solving either the probability flow ODE or the diffusion SDE, wherein a neural network approximates the score function, allowing a numerical ODE/SDE solver to be used. However, naive backpropagation techniques are memory intensive, requiring the storage of all intermediate states, and face additional complexity in handling the injected noise from the diffusion term of the diffusion SDE. We propose a novel family of bespoke ODE solvers for the continuous adjoint equations of diffusion models, which we call AdjointDEIS. We exploit the unique construction of diffusion SDEs to further simplify the formulation of the continuous adjoint equations using exponential integrators. Moreover, we provide convergence order guarantees for our bespoke solvers. Significantly, we show that the continuous adjoint equations for diffusion SDEs simplify to an ODE. Lastly, we demonstrate the effectiveness of AdjointDEIS for guided generation with an adversarial attack in the form of the face morphing problem. Our code will be released on our project page: https://zblasingame.github.io/AdjointDEIS/
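To make the memory argument in the abstract concrete, the following is a minimal sketch of the generic continuous adjoint method on a toy ODE. It is not the AdjointDEIS solver: it uses plain Euler steps where the paper uses exponential integrators, and the vector field `f`, the matrix `W`, the loss, and all function names are assumptions made up for illustration. The adjoint a(t) = dL/dx(t) obeys da/dt = -(df/dx)^T a, so the gradient with respect to the initial state can be recovered by integrating backward from the final state, without storing the forward trajectory.

```python
import numpy as np

# Hypothetical toy "network": a fixed 2x2 weight matrix standing in for a
# trained score model. Nothing here comes from the paper.
W = np.array([[0.5, -1.0], [1.0, 0.3]])


def f(x):
    """Toy vector field dx/dt = tanh(W x)."""
    return np.tanh(W @ x)


def vjp(x, a):
    """Vector-Jacobian product (df/dx)^T a for the toy field."""
    return W.T @ ((1.0 - np.tanh(W @ x) ** 2) * a)


def loss(x_final, target):
    """Differentiable metric on the solver output."""
    return 0.5 * np.sum((x_final - target) ** 2)


def solve_forward(x0, T=1.0, n_steps=2000):
    """Euler solve from t=0 to t=T, keeping only the current state."""
    h = T / n_steps
    x = x0.copy()
    for _ in range(n_steps):
        x = x + h * f(x)
    return x


def adjoint_gradient(x_final, target, T=1.0, n_steps=2000):
    """Integrate state and adjoint backward from t=T to t=0.

    Returns an approximation of dL/dx(0). Memory use is constant in
    n_steps because the state is re-solved backward, not stored.
    """
    h = T / n_steps
    x = x_final.copy()
    a = x_final - target          # a(T) = dL/dx(T)
    for _ in range(n_steps):
        a = a + h * vjp(x, a)     # da/dt = -(df/dx)^T a, stepped backward
        x = x - h * f(x)          # reconstruct the earlier state
    return a


def finite_difference_gradient(x0, target, eps=1e-6):
    """Central finite differences through the forward solver, for checking."""
    g = np.zeros_like(x0)
    for i in range(x0.size):
        e = np.zeros_like(x0)
        e[i] = eps
        plus = loss(solve_forward(x0 + e), target)
        minus = loss(solve_forward(x0 - e), target)
        g[i] = (plus - minus) / (2.0 * eps)
    return g


x0 = np.array([1.0, -0.5])
target = np.zeros(2)
grad_adj = adjoint_gradient(solve_forward(x0), target)
grad_fd = finite_difference_gradient(x0, target)
print("adjoint:", grad_adj)
print("finite difference:", grad_fd)
```

The two gradients agree only up to the discretization error of the backward solve, which is the kind of error the convergence order guarantees mentioned in the abstract are meant to control for the authors' solvers.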
Authors: Zander W. Blasingame, Chen Liu