
Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution (2403.11078v1)

Published 17 Mar 2024 in eess.IV and cs.CV

Abstract: Remote sensing image super-resolution (SR) is a crucial task that restores high-resolution (HR) images from low-resolution (LR) observations. Recently, the Denoising Diffusion Probabilistic Model (DDPM) has shown promising performance in image reconstruction by overcoming problems inherent in generative models, such as over-smoothing and mode collapse. However, the high-frequency details generated by DDPM are often misaligned with the HR images because the model tends to overlook long-range semantic context. This is attributed to the widely used U-Net decoder in the conditional noise predictor, which tends to overemphasize local information and therefore predicts noise with significant variance. To address these issues, an adaptive semantic-enhanced DDPM (ASDDPM) is proposed to enhance the detail-preserving capability of the DDPM by incorporating low-frequency semantic information provided by the Transformer. Specifically, a novel adaptive diffusion Transformer decoder (ADTD) is developed to bridge the semantic gap between the encoder and decoder by regulating the noise prediction with global contextual relationships and long-range dependencies in the diffusion process. Additionally, a residual feature fusion strategy establishes information exchange between the two decoders at multiple levels. As a result, the noise predicted by our approach closely approximates the real noise distribution. Extensive experiments on two SR and two semantic segmentation datasets confirm the superior performance of the proposed ASDDPM in both SR and the subsequent downstream applications. The source code will be available at https://github.com/littlebeen/ASDDPM-Adaptive-Semantic-Enhanced-DDPM.
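
For orientation, the "conditional noise predictor" mentioned in the abstract can be read against the generic conditional DDPM training objective. The sketch below is the standard formulation from the DDPM literature, not the specific loss used by ASDDPM, which the abstract does not state. The symbols are assumptions introduced here for illustration: $x_0$ is the HR image, $y$ the LR condition, $\bar{\alpha}_t$ the cumulative noise schedule, and $\epsilon_\theta$ the noise-prediction network.

```latex
% Forward (noising) process: x_t is a noisy version of the HR image x_0.
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\right),
\qquad
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,
\quad \epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I})

% Simplified training objective: the network, conditioned on the LR image y,
% is trained to predict the noise \epsilon that was added at step t.
\mathcal{L}_{\text{simple}}
  = \mathbb{E}_{x_0,\,y,\,t,\,\epsilon}
    \left[\ \left\lVert \epsilon - \epsilon_\theta(x_t,\, t,\, y) \right\rVert_2^2\ \right]
```

Under this reading, a predictor that matches the real noise closely yields a small $\mathcal{L}_{\text{simple}}$, which is the sense in which the abstract speaks of the predicted noise approximating the real noise distribution.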

Authors (4)
  1. Jialu Sui
  2. Xianping Ma
  3. Xiaokang Zhang
  4. Man-On Pun