
Boosting Diffusion Models with Moving Average Sampling in Frequency Domain (2403.17870v1)

Published 26 Mar 2024 in cs.CV and cs.MM

Abstract: Diffusion models have recently brought a powerful revolution in image generation. Despite showing impressive generative capabilities, most of these models rely only on the current sample to denoise the next one, which can make denoising unstable. In this paper, we reinterpret the iterative denoising process as model optimization and leverage a moving average mechanism to ensemble all the prior samples. Instead of simply applying a moving average to the denoised samples at different timesteps, we first map the denoised samples to data space and then perform the moving average there, to avoid distribution shift across timesteps. Given that diffusion models recover low-frequency components first and high-frequency details later, we further decompose the samples into different frequency components and apply the moving average separately to each component. We name the complete approach "Moving Average Sampling in Frequency domain (MASF)". MASF can be seamlessly integrated into mainstream pre-trained diffusion models and sampling schedules. Extensive experiments on both unconditional and conditional diffusion models demonstrate that MASF achieves superior performance compared to the baselines, with almost negligible additional complexity cost.
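
The three ideas in the abstract (map each sample to data space, average across timesteps, and average each frequency component separately) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the standard DDPM relation for estimating the clean image from a noisy sample, a single-level 2D Haar decomposition as the frequency split, and fixed per-band averaging weights; the abstract does not specify any of these, and the function names `predict_x0`, `haar_dwt2`, `haar_idwt2`, and `masf_like_update` are hypothetical.

```python
import numpy as np

def predict_x0(x_t, eps_pred, alpha_bar_t):
    """Map a noisy sample to data space with the standard DDPM identity
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def haar_dwt2(x):
    """Single-level 2D Haar split of an even-sized (H, W) array
    into (LL, LH, HL, HH) bands. Assumed frequency decomposition."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)

def haar_idwt2(bands):
    """Exact inverse of haar_dwt2."""
    ll, lh, hl, hh = bands
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]), dtype=ll.dtype)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return out

def masf_like_update(avg_bands, x0_pred, band_weights):
    """Blend the new data-space estimate into the running average, one
    frequency band at a time. band_weights[i] is the weight given to the
    new estimate in band i (illustrative values, not from the paper)."""
    new_bands = haar_dwt2(x0_pred)
    if avg_bands is None:  # first step: nothing to average with yet
        return new_bands, haar_idwt2(new_bands)
    merged = tuple(w * new + (1.0 - w) * old
                   for w, new, old in zip(band_weights, new_bands, avg_bands))
    return merged, haar_idwt2(merged)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(8, 8))
    avg = None
    # Toy loop: each "step" yields a noisy estimate of the clean image.
    for alpha_bar in (0.2, 0.5, 0.9):
        eps = rng.normal(size=clean.shape)
        x_t = np.sqrt(alpha_bar) * clean + np.sqrt(1 - alpha_bar) * eps
        eps_pred = eps + 0.1 * rng.normal(size=clean.shape)  # imperfect model
        x0_hat = predict_x0(x_t, eps_pred, alpha_bar)
        avg, x0_smoothed = masf_like_update(avg, x0_hat, (0.3, 0.6, 0.6, 0.6))
    print(x0_smoothed.shape)
```

In a real sampler, the smoothed data-space estimate would replace the raw one when computing the next noisy sample, and the per-band weights would likely vary with the timestep; both choices are left open here because the abstract does not describe them.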
