
Misalignment-Robust Frequency Distribution Loss for Image Transformation

Published 28 Feb 2024 in cs.CV and eess.IV | (2402.18192v1)

Abstract: Deep learning-based image transformation methods, such as image enhancement and super-resolution, rely heavily on paired datasets with precise pixel-level alignment. Creating such precisely aligned image pairs is difficult, which hinders the advancement of methods trained on them. To overcome this challenge, this paper introduces a novel and simple Frequency Distribution Loss (FDL) that computes a distribution distance in the frequency domain. Specifically, we transform image features into the frequency domain using the discrete Fourier transform (DFT). The frequency components (amplitude and phase) are then processed separately to form the FDL loss function. Our method proves empirically effective as a training constraint because it exploits global information in the frequency domain. Extensive experiments on image enhancement and super-resolution demonstrate that FDL outperforms existing misalignment-robust loss functions. Furthermore, we explore the potential of FDL for image style transfer that relies solely on completely misaligned data. Our code is available at: https://github.com/eezkni/FDL
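
The pipeline described in the abstract (DFT of features, then separate distribution distances on amplitude and phase) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is in the repository linked above. The abstract does not say which feature extractor or which distribution distance is used, so the sketch makes two assumptions: raw arrays of shape (C, H, W) stand in for image features, and a simple sorted-value (1D Wasserstein-style) distance per channel stands in for the distribution distance. The names `sorted_distance`, `frequency_terms`, `fdl_sketch`, and `phase_weight` are hypothetical.

```python
import numpy as np


def sorted_distance(a, b):
    """Assumed stand-in for a distribution distance: mean absolute
    difference between the sorted values of each channel. Sorting
    discards spatial position, so only the value distribution matters."""
    return float(np.mean(np.abs(np.sort(a, axis=-1) - np.sort(b, axis=-1))))


def frequency_terms(feat_x, feat_y):
    """Return (amplitude_term, phase_term) for two (C, H, W) arrays."""
    # 2D DFT over the spatial axes of every channel.
    fx = np.fft.fft2(feat_x, axes=(-2, -1))
    fy = np.fft.fft2(feat_y, axes=(-2, -1))
    c = feat_x.shape[0]
    # Split each spectrum into amplitude and phase, one row per channel.
    amp_x, amp_y = np.abs(fx).reshape(c, -1), np.abs(fy).reshape(c, -1)
    pha_x, pha_y = np.angle(fx).reshape(c, -1), np.angle(fy).reshape(c, -1)
    return sorted_distance(amp_x, amp_y), sorted_distance(pha_x, pha_y)


def fdl_sketch(feat_x, feat_y, phase_weight=1.0):
    """Hypothetical combination: amplitude term plus weighted phase term."""
    amp_term, pha_term = frequency_terms(feat_x, feat_y)
    return amp_term + phase_weight * pha_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 8, 8))
    # A circularly shifted copy simulates a misaligned pair.
    shifted = np.roll(x, shift=(2, 3), axis=(-2, -1))
    print("aligned pair:", fdl_sketch(x, x))
    print("shifted pair (amplitude, phase):", frequency_terms(x, shifted))
```

In this sketch, a circular shift leaves the amplitude term essentially unchanged, because the DFT magnitude is invariant to circular translation and only the phase changes. That property gives some intuition for why a frequency-domain loss might tolerate misalignment, although real misalignments are rarely pure circular shifts.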
