
Easing Color Shifts in Score-Based Diffusion Models (2306.15832v2)

Published 27 Jun 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Images generated by score-based models can suffer from errors in their spatial means, an effect referred to as a color shift, which grows for larger images. This paper investigates a previously introduced approach to mitigate color shifts in score-based diffusion models. We quantify the performance of a nonlinear bypass connection in the score network, designed to process the spatial mean of the input and to predict the mean of the score function. We show that this network architecture substantially improves the quality of the generated images, and that this improvement is approximately independent of the size of the generated images. As a result, this modified architecture offers a simple solution to the color shift problem across image sizes. We additionally discuss the origin of color shifts in an idealized setting in order to motivate the approach.
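
The bypass idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the toy stand-in networks, the omission of time conditioning, and the choice to subtract the spatial mean from the main network's output are all assumptions made for illustration. The sketch only shows one way a separate branch could take over prediction of the score's spatial mean.

```python
import numpy as np

def spatial_mean(x):
    """Mean over the spatial axes of a (batch, channels, height, width) array."""
    return x.mean(axis=(2, 3), keepdims=True)

def score_with_mean_bypass(x, spatial_net, mean_net):
    """Combine a spatial network with a bypass that predicts the score's mean.

    spatial_net and mean_net are hypothetical callables (assumptions):
      spatial_net maps (B, C, H, W) -> (B, C, H, W)
      mean_net    maps (B, C)       -> (B, C)
    """
    x_mean = spatial_mean(x)                    # (B, C, 1, 1)
    fluct = spatial_net(x)
    fluct = fluct - spatial_mean(fluct)         # force zero spatial mean
    mean_pred = mean_net(x_mean[:, :, 0, 0])    # bypass acts on the mean only
    return fluct + mean_pred[:, :, None, None]  # broadcast mean over H and W

# Toy stand-ins for the two networks (assumptions, fixed random weights).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 3))

def toy_spatial_net(x):
    return np.tanh(x)

def toy_mean_net(m):
    # A small nonlinear map from per-channel means to per-channel means.
    return np.tanh(m @ W1) @ W2

x = rng.normal(size=(2, 3, 8, 8))
score = score_with_mean_bypass(x, toy_spatial_net, toy_mean_net)
```

In this sketch the spatial mean of the output is set entirely by the bypass branch, whose input size does not depend on the image height or width.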

