Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement (2403.01497v2)
Abstract: Underwater visuals undergo various complex degradations, which inevitably degrade the performance of underwater vision tasks. Recently, diffusion models have been applied to underwater image enhancement (UIE) tasks and have achieved state-of-the-art (SOTA) performance. However, these methods fail to consider the physical properties and underwater imaging mechanisms in the diffusion process, which limits the information completion capacity of diffusion models. In this paper, we introduce a novel UIE framework, named PA-Diff, designed to exploit knowledge of physics to guide the diffusion process. PA-Diff consists of a Physics Prior Generation (PPG) branch, an Implicit Neural Reconstruction (INR) branch, and a Physics-aware Diffusion Transformer (PDT) branch. The PPG branch produces the physics prior knowledge. By using this physics prior to guide the diffusion process, the PDT branch gains underwater-aware ability and can model the complex distribution of real-world underwater scenes. The INR branch learns robust feature representations from diverse underwater images via implicit neural representation, which reduces the difficulty of restoration for the PDT branch. Extensive experiments demonstrate that our method achieves the best performance on UIE tasks.
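The abstract does not spell out which imaging mechanism the physics prior encodes. As a rough illustration only, the sketch below uses the simplified underwater image formation model that is common in the UIE literature, I = J * t + B * (1 - t), where J is the clean scene, t a per-pixel transmission map, and B the global background light. The function names `degrade` and `restore`, the `t_min` clamp, and all numeric values are assumptions made for this example; they are not taken from PA-Diff.

```python
import numpy as np

def degrade(clean, transmission, background):
    """Apply the simplified formation model I = J * t + B * (1 - t)."""
    t = transmission[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
    return clean * t + background * (1.0 - t)

def restore(observed, transmission, background, t_min=0.1):
    """Invert the model: J = (I - B) / max(t, t_min) + B."""
    # Clamping t avoids dividing by values close to zero (hypothetical safeguard).
    t = np.maximum(transmission, t_min)[..., None]
    return (observed - background) / t + background

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(4, 4, 3))       # toy clean image J
transmission = rng.uniform(0.3, 0.9, size=(4, 4))   # toy transmission map t
background = np.array([0.1, 0.5, 0.6])              # illustrative blue-green light B

observed = degrade(clean, transmission, background)
recovered = restore(observed, transmission, background)
```

With the true transmission map and background light, the inversion recovers the clean image exactly. In practice both quantities must be estimated from the degraded image, which is the kind of physical knowledge a prior-generation branch could plausibly supply to a restoration network.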