
Neural Diffusion Models (2310.08337v3)

Published 12 Oct 2023 in cs.LG and stat.ML

Abstract: Diffusion models have shown remarkable performance on many generative tasks. Despite recent success, most diffusion models are restricted in that they only allow linear transformation of the data distribution. In contrast, a broader family of transformations can potentially help train generative distributions more efficiently, simplifying the reverse process and closing the gap between the true negative log-likelihood and the variational approximation. In this paper, we present Neural Diffusion Models (NDMs), a generalization of conventional diffusion models that enables defining and learning time-dependent non-linear transformations of data. We show how to optimise NDMs using a variational bound in a simulation-free setting. Moreover, we derive a time-continuous formulation of NDMs, which allows fast and reliable inference using off-the-shelf numerical ODE and SDE solvers. Finally, we demonstrate the utility of NDMs with learnable transformations through experiments on standard image generation benchmarks, including CIFAR-10, downsampled versions of ImageNet and CelebA-HQ. NDMs outperform conventional diffusion models in terms of likelihood and produce high-quality samples.


Summary

  • The paper introduces Neural Diffusion Models with learnable non-linear transformations that narrow the gap between true and approximated data distributions.
  • It employs simulation-free variational optimization and a continuous time formulation using ODE/SDE solvers for efficient training and inference.
  • Experiments on MNIST, CIFAR-10, ImageNet, and CelebA-HQ confirm enhanced likelihood estimation and superior sample generation over traditional methods.

An Overview of Neural Diffusion Models

The paper "Neural Diffusion Models" introduces a novel approach to generative modeling by extending the conventional diffusion models, known for their iterative noise reduction processes, into a more flexible framework called Neural Diffusion Models (NDMs). The primary innovation in this framework is the introduction of time-dependent and learnable non-linear transformations of data during the diffusion process, significantly enhancing the model's ability to close the gap between true and approximated distributions in terms of negative log-likelihood (NLL).

Technical Contributions

  1. Generalization of Diffusion Processes: Traditional diffusion models typically apply linear transformations, whereas NDMs introduce a non-linear, time-dependent transformation process. This is realized through a neural network parameterization that adapts the transformation to the specific characteristics of the data at each diffusion step.
  2. Simulation-Free Variational Optimization: The authors extend the variational objective used in traditional diffusion models to accommodate the learnable transformations. This objective is optimized in a simulation-free setting, much like the training of conventional diffusion models, which keeps computation efficient and scalable (a schematic training step covering points 1 and 2 is sketched after this list).
  3. Continuous Time Formulation: The paper derives a continuous-time analogue of the model, which allows ordinary differential equation (ODE) and stochastic differential equation (SDE) solvers to be used at inference time. This continuous formulation supports faster and more reliable inference, providing a significant computational advantage (a minimal ODE-solver sampling sketch also follows the list).
  4. Experimental Validation: Through experiments on established image generation tasks such as MNIST, CIFAR-10, and downsampled versions of ImageNet and CelebA-HQ, the paper demonstrates that NDMs not only improve likelihood estimation, attaining state-of-the-art results on ImageNet and CelebA-HQ, but also provide high-quality sample generation. The results underline the model's capability to outperform existing diffusion frameworks significantly.
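
To make the first two contributions concrete, the sketch below shows how a DDPM-style forward marginal changes once the clean data x is first routed through a learnable, time-dependent transformation F_phi(x, t), and what a single simulation-free training step looks like. This is a minimal illustration, not the paper's implementation: the MLP architecture, the schedule callables `alpha`/`sigma`, and the plain noise-regression loss (used here in place of the paper's exact variational weighting) are all assumptions.

```python
# Hypothetical sketch of an NDM-style forward process and one training step.
# Assumptions (not from the paper): the MLP architecture, the schedule
# callables alpha/sigma, and the epsilon-regression loss used as a simplified
# stand-in for the exact variational bound.
import torch
import torch.nn as nn


class TimeDependentTransform(nn.Module):
    """Learnable non-linear transformation F_phi(x, t) (placeholder architecture)."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on time by concatenating the scalar t to each data vector.
        return self.net(torch.cat([x, t[:, None]], dim=-1))


def ndm_training_step(x, transform, denoiser, alpha, sigma):
    """One simulation-free step: sample t, form z_t in closed form, regress the noise.

    The forward marginal generalizes the linear DDPM one by transforming x first:
        q(z_t | x) = N(z_t; alpha_t * F_phi(x, t), sigma_t^2 * I)
    """
    b = x.shape[0]
    t = torch.rand(b, device=x.device)          # t ~ U[0, 1]
    eps = torch.randn_like(x)
    z_t = alpha(t)[:, None] * transform(x, t) + sigma(t)[:, None] * eps
    eps_hat = denoiser(z_t, t)                  # denoiser predicts the injected noise
    return ((eps_hat - eps) ** 2).mean()        # simplified surrogate for the ELBO term
```

Setting `transform(x, t) = x` recovers the usual linear forward process, which is the sense in which NDMs generalize conventional diffusion models.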

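For the continuous-time formulation, sampling amounts to integrating a reverse-time ODE with a standard numerical solver. The sketch below assumes the reverse dynamics have already been wrapped into a callable `drift_fn(z, t)` (a placeholder; in the paper the drift involves the learned transformation and the denoising network) and simply hands it to SciPy's `solve_ivp`.

```python
# Hypothetical sketch: sampling by integrating a reverse-time ODE with an
# off-the-shelf solver. `drift_fn` is a placeholder for the model-specific
# reverse dynamics; it is not the paper's exact formula.
import numpy as np
from scipy.integrate import solve_ivp


def sample_via_ode(drift_fn, dim, rng=None, rtol=1e-5, atol=1e-5):
    """Integrate the reverse ODE from t=1 (Gaussian prior) down to t=0 (data)."""
    rng = np.random.default_rng() if rng is None else rng
    z1 = rng.standard_normal(dim)               # draw the starting point from the prior

    def rhs(t, z):
        # solve_ivp works on flat vectors; reshape inside drift_fn if needed.
        return drift_fn(z, t)

    sol = solve_ivp(rhs, t_span=(1.0, 0.0), y0=z1,
                    method="RK45", rtol=rtol, atol=atol)
    return sol.y[:, -1]                         # the state at t=0, i.e. the sample
```

Swapping the ODE integrator for an SDE integrator yields the stochastic sampler; the point is only that, once the continuous-time dynamics are written down, inference reuses standard numerical solvers.
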
Implications and Future Directions

NDMs represent a substantial advancement in the capability of generative models, particularly in how they can be adapted and fine-tuned to different data distributions through learnable transformations. This flexibility in modeling can lead to better understanding and reconstruction of complex data distributions, which is crucial for applications in data augmentation and semi-supervised learning. Moreover, by enabling improved estimation of data likelihood, NDMs have potential applications in fields such as data compression and adversarial purification.

The findings suggest avenues for future exploration, including the explicit study of the dynamics and properties of the learned transformations and optimizations of neural architectures tailored to specific types of data or tasks. Furthermore, exploring the integration of NDMs with existing generative frameworks, such as variational autoencoders (VAEs) or generative adversarial networks (GANs), could produce even more powerful models. Additionally, investigating how NDMs can incorporate conditional information to perform conditional generation tasks is a promising area for future research.

In conclusion, the advent of Neural Diffusion Models opens new possibilities and challenges in generative modeling. The flexibility of learnable transformations holds promise for modeling complex data distributions more effectively, setting a precedent for refinements and innovations in the architecture and applications of generative models.