
Disentangling Total-Variance and Signal-to-Noise-Ratio Improves Diffusion Models (2502.08598v2)

Published 12 Feb 2025 in cs.LG and stat.ML

Abstract: The long sampling time of diffusion models remains a significant bottleneck, which can be mitigated by reducing the number of diffusion time steps. However, the quality of samples with fewer steps is highly dependent on the noise schedule, i.e., the specific manner in which noise is introduced and the signal is reduced at each step. Although prior work has improved upon the original variance-preserving and variance-exploding schedules, these approaches passively adjust the total variance, without direct control over it. In this work, we propose a novel total-variance/signal-to-noise-ratio disentangled (TV/SNR) framework, where TV and SNR can be controlled independently. Our approach reveals that schedules where the TV explodes exponentially can often be improved by adopting a constant TV schedule while preserving the same SNR schedule. Furthermore, generalizing the SNR schedule of the optimal transport flow matching significantly improves the generation performance. Our findings hold across various reverse diffusion solvers and diverse applications, including molecular structure and image generation.
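
The idea of controlling TV and SNR independently can be illustrated with a minimal sketch. It assumes a forward process of the form x_t = a_t * x_0 + b_t * eps with unit-variance data and noise, and it assumes TV is defined as a_t^2 + b_t^2 and SNR as a_t^2 / b_t^2. These definitions and the function names are illustrative assumptions, not taken from the abstract. Under them, any (TV, SNR) pair determines the scaling coefficients, so a variance-exploding-style step can be mapped to a constant-TV step with the same SNR:

```python
import math


def coeffs_from_tv_snr(tv, snr):
    """Return (a, b) such that a**2 + b**2 == tv and a**2 / b**2 == snr.

    Assumed definitions (hypothetical, for illustration only):
    TV = a^2 + b^2 and SNR = a^2 / b^2.
    """
    if tv <= 0 or snr <= 0:
        raise ValueError("tv and snr must be positive")
    a = math.sqrt(tv * snr / (1.0 + snr))  # signal scale
    b = math.sqrt(tv / (1.0 + snr))        # noise scale
    return a, b


def tv_snr_from_coeffs(a, b):
    """Inverse map: recover (TV, SNR) from the scaling coefficients."""
    return a * a + b * b, (a * a) / (b * b)


# A variance-exploding-style step: signal scale 1, noise scale sigma.
sigma = 3.0
tv_ve, snr_ve = tv_snr_from_coeffs(1.0, sigma)

# Same SNR, but with the total variance held constant at 1.
a_c, b_c = coeffs_from_tv_snr(1.0, snr_ve)
```

Here `a_c` and `b_c` are the original coefficients rescaled by 1 / sqrt(1 + sigma^2), which leaves the SNR unchanged while fixing the TV at 1.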

Authors (6)
  1. Khaled Kahouli (2 papers)
  2. Winfried Ripken (6 papers)
  3. Stefan Gugler (8 papers)
  4. Oliver T. Unke (24 papers)
  5. Klaus-Robert Müller (167 papers)
  6. Shinichi Nakajima (44 papers)
