Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images (2410.22830v1)

Published 30 Oct 2024 in eess.IV and cs.CV

Abstract: Recent advancements in diffusion models have significantly improved performance in super-resolution (SR) tasks. However, previous research often overlooks the fundamental differences between SR and general image generation. General image generation involves creating images from scratch, while SR focuses specifically on enhancing existing low-resolution (LR) images by adding typically missing high-frequency details. This oversight not only increases the training difficulty but also limits their inference efficiency. Furthermore, previous diffusion-based SR methods are typically trained and inferred at fixed integer scale factors, lacking flexibility to meet the needs of up-sampling with non-integer scale factors. To address these issues, this paper proposes an efficient and elastic diffusion-based SR model (E$^2$DiffSR), specially designed for continuous-scale SR in remote sensing imagery. E$^2$DiffSR employs a two-stage latent diffusion paradigm. During the first stage, an autoencoder is trained to capture the differential priors between high-resolution (HR) and LR images. The encoder intentionally ignores the existing LR content to alleviate the encoding burden, while the decoder introduces an SR branch equipped with a continuous scale upsampling module to accomplish the reconstruction under the guidance of the differential prior. In the second stage, a conditional diffusion model is learned within the latent space to predict the true differential prior encoding. Experimental results demonstrate that E$^2$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods. Additionally, it reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.
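The continuous-scale upsampling module described in the abstract is a learned component of E$^2$DiffSR and is not specified here. As a minimal, self-contained illustration of the coordinate-mapping idea that makes arbitrary (including non-integer) scale factors possible, the sketch below implements plain bilinear resampling: each output pixel centre is mapped back into input coordinates and interpolated. The function name and structure are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bilinear_resize(img, scale):
    """Resample a 2-D array by an arbitrary (possibly non-integer) scale factor.

    Illustrative stand-in for a continuous-scale upsampler: output pixel
    centres are mapped back to input coordinates, then bilinearly interpolated.
    """
    h, w = img.shape
    out_h, out_w = int(round(h * scale)), int(round(w * scale))
    # Map output pixel centres back to (fractional) input coordinates.
    ys = np.clip((np.arange(out_h) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) / scale - 0.5, 0, w - 1)
    # Integer neighbours and interpolation weights.
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    # Blend the four surrounding input pixels.
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

In E$^2$DiffSR this resampling role is played by a learned module inside the decoder's SR branch, conditioned on the differential prior; the fixed bilinear kernel above only shows why a single model can serve any requested scale: the scale factor enters only through the coordinate mapping, not through the network architecture.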
