Applicability of spectrally guided per‑instance noise schedules to multi‑stage models
Investigate whether per‑instance diffusion noise schedules derived from an image’s radially averaged power spectral density (RAPSD)—as used to guide training and sampling in single‑stage pixel diffusion—can be effectively applied within multi‑stage generative pipelines, specifically latent diffusion models and distilled diffusion models.
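As a point of reference for the quantity involved, the following is a minimal sketch of how an RAPSD could be computed for a single-channel image with NumPy. It is an illustration only, not the cited work's implementation: the function name `rapsd`, the parameter `num_bins`, the power normalization, and the choice to discard frequencies at or beyond Nyquist are all assumptions. How a noise schedule is then derived from the spectrum is not specified here and is not sketched.

```python
import numpy as np


def rapsd(image, num_bins=None):
    """Radially averaged power spectral density of a 2D array.

    Hypothetical helper for illustration; returns (bin_centers, mean_power),
    with bin centers in cycles per pixel.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape

    # 2D power spectrum, zero frequency shifted to the center.
    # Dividing by h * w is one common normalization (an assumption here).
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2 / (h * w)

    # Radial frequency of every FFT coefficient, in cycles per pixel.
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)

    if num_bins is None:
        num_bins = min(h, w) // 2

    # Equal-width radial bins from 0 up to Nyquist (0.5 cycles per pixel).
    edges = np.linspace(0.0, 0.5, num_bins + 1)
    idx = np.digitize(radius.ravel(), edges) - 1

    # Coefficients at or beyond Nyquist fall outside the last bin and are dropped.
    valid = (idx >= 0) & (idx < num_bins)
    sums = np.bincount(idx[valid], weights=power.ravel()[valid], minlength=num_bins)
    counts = np.bincount(idx[valid], minlength=num_bins)

    centers = 0.5 * (edges[:-1] + edges[1:])
    # Guard against empty bins; their mean power is reported as 0.
    return centers, sums / np.maximum(counts, 1)
```

Under these assumptions, the same function could be applied to one channel of an RGB image or to one channel of an autoencoder latent, which would be one way to compare the two kinds of spectra before attempting to transfer a spectrally guided schedule to a latent or distilled model.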
References
Our results show improved quality over strictly single-stage pixel diffusion models while requiring fewer denoising steps, though they generally lag behind state-of-the-art latent diffusion and distilled models. We leave to future work the question of whether similar techniques apply to these multi-stage models, noting that \citet{skorokhodov2025improving} investigated the differences between latent and RGB spectra.