IDF-CR: Iterative Diffusion Process for Divide-and-Conquer Cloud Removal in Remote-sensing Images (2403.11870v1)
Abstract: Deep learning has proven effective at removing cloud cover from optical remote-sensing images, and Convolutional Neural Networks (CNNs) dominate cloud removal tasks. However, constrained by the inherent limitations of convolutional operations, CNNs can address only a modest fraction of cloud occlusion. In recent years, diffusion models have achieved state-of-the-art (SOTA) results in image generation and reconstruction owing to their formidable generative capabilities. Motivated by this rapid progress, we present IDF-CR, an iterative diffusion process for cloud removal with strong generative capability that performs component divide-and-conquer cloud removal. IDF-CR consists of a pixel-space cloud removal module (Pixel-CR) and a latent-space iterative noise diffusion network (IND); that is, it is divided into two stages that operate in pixel space and latent space, respectively, enabling a strategic transition from preliminary cloud reduction to meticulous detail refinement. In the pixel-space stage, Pixel-CR processes the cloudy image and produces a coarse, suboptimal cloud-removal result that supplies the diffusion model with prior cloud-removal knowledge. In the latent-space stage, the diffusion model transforms this low-quality result into high-quality clean output; we fine-tune Stable Diffusion using ControlNet for this purpose. In addition, an unsupervised iterative noise refinement (INR) module is introduced for the diffusion model to optimize the distribution of the predicted noise, thereby enhancing fine detail recovery. Our model outperforms other SOTA methods, including image reconstruction and optical remote-sensing cloud removal approaches, on optical remote-sensing datasets.
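To make the two-stage flow described in the abstract concrete, here is a minimal PyTorch sketch of the inference pipeline: a pixel-space network produces a coarse cloud-removal prior, a latent-space reverse-diffusion loop conditioned on that prior denoises toward a clean image, and the predicted noise is passed through an iterative refinement step before each update. All module names (`PixelCR`, `EpsNet`, `INRNet`) and the toy encoder/decoder are illustrative assumptions, not the authors' implementation; the actual system fine-tunes Stable Diffusion with ControlNet and uses the paper's INR module.

```python
import torch
import torch.nn as nn

class PixelCR(nn.Module):
    """Stage 1 (pixel space): coarse cloud removal, sketched as a residual CNN."""
    def __init__(self, ch=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )
    def forward(self, cloudy):
        return cloudy + self.body(cloudy)  # suboptimal cloud-removal prior

class EpsNet(nn.Module):
    """Stand-in for the ControlNet-conditioned denoising U-Net.
    A real model would also embed the timestep t; omitted for brevity."""
    def __init__(self, ch=4):
        super().__init__()
        self.conv = nn.Conv2d(2 * ch, ch, 3, padding=1)
    def forward(self, z, t, cond):
        return self.conv(torch.cat([z, cond], dim=1))

class INRNet(nn.Module):
    """Stand-in for the iterative noise refinement (INR) module:
    predicts a residual correction to the estimated noise."""
    def __init__(self, ch=4):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
    def forward(self, eps, t):
        return self.conv(eps)

@torch.no_grad()
def idf_cr_inference(cloudy, pixel_cr, encoder, decoder, eps_net, inr_net,
                     betas, inr_steps=2):
    """Two-stage inference: pixel-space prior, then latent-space diffusion."""
    coarse = pixel_cr(cloudy)                  # stage 1: coarse cloud removal
    cond = encoder(coarse)                     # latent condition from the prior
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    z = torch.randn_like(cond)                 # reverse diffusion starts from noise
    for t in reversed(range(len(betas))):
        eps = eps_net(z, t, cond)              # predict noise, conditioned on prior
        for _ in range(inr_steps):             # refine the predicted noise (INR)
            eps = eps + inr_net(eps, t)
        # Standard DDPM posterior mean; the noise term is skipped at t == 0.
        z = (z - betas[t] / torch.sqrt(1.0 - alpha_bar[t]) * eps) \
            / torch.sqrt(alphas[t])
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)
    return decoder(z)                          # stage 2 output: refined clean image

if __name__ == "__main__":
    # Toy encoder/decoder standing in for the Stable Diffusion VAE (8x downsample).
    encoder = nn.Conv2d(3, 4, 8, stride=8)
    decoder = nn.ConvTranspose2d(4, 3, 8, stride=8)
    betas = torch.linspace(1e-4, 2e-2, 50)     # short noise schedule for the demo
    cloudy = torch.rand(1, 3, 64, 64)
    clean = idf_cr_inference(cloudy, PixelCR(), encoder, decoder,
                             EpsNet(), INRNet(), betas)
    print(clean.shape)                         # torch.Size([1, 3, 64, 64])
```

The key design point the sketch mirrors is the divide-and-conquer split: the coarse stage only has to get the image close to cloud-free, while the diffusion stage, anchored to that prior through the conditioning latent, is free to hallucinate high-frequency detail without re-solving the occlusion problem.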
Authors: Meilin Wang, Yexing Song, Pengxu Wei, Xiaoyu Xian, Yukai Shi, Liang Lin