
SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image (2401.05093v1)

Published 10 Jan 2024 in cs.CV

Abstract: With recent advancements in aerospace technology, the volume of unlabeled remote sensing image (RSI) data has increased dramatically. Effectively leveraging this data through self-supervised learning (SSL) is vital in the field of remote sensing. However, current methodologies, particularly contrastive learning (CL), a leading SSL method, encounter specific challenges in this domain. First, CL often mistakenly treats geographically adjacent samples with similar semantic content as negative pairs, confusing the model during training. Second, as an instance-level discriminative task, it tends to neglect the essential fine-grained features and complex details inherent in unstructured RSIs. To overcome these obstacles, we introduce SwiMDiff, a novel self-supervised pre-training framework designed for RSIs. SwiMDiff employs a scene-wide matching approach that recalibrates labels so that data from the same scene are recognized as false negatives. This adjustment makes CL more applicable to the nuances of remote sensing. Additionally, SwiMDiff seamlessly integrates CL with a diffusion model: through pixel-level diffusion constraints, the encoder learns to capture both the global semantic information and the fine-grained features of the images. Our proposed framework significantly enriches the information available for downstream tasks in remote sensing. Demonstrating strong performance in change detection and land-cover classification, SwiMDiff proves its substantial utility and value in the field of remote sensing.
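The scene-wide matching idea described above — excluding same-scene views from the negative set instead of contrasting against them — can be sketched as a masked InfoNCE loss. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the function name, the two-views-per-image batch layout, and the temperature value are all hypothetical.

```python
import numpy as np

def scene_masked_infonce(z, scene_ids, tau=0.1):
    """InfoNCE over 2N views (two augmentations per image).

    Views i and i+N are the designated positive pair. Any other view
    sharing a scene ID with the anchor is masked out of the negatives,
    i.e. treated as a false negative rather than pushed apart.

    z:         (2N, d) L2-normalised embeddings.
    scene_ids: (2N,) integer scene label per view.
    """
    n2 = z.shape[0]
    n = n2 // 2
    sim = z @ z.T / tau                        # cosine similarities (z is unit-norm)
    np.fill_diagonal(sim, -np.inf)             # never contrast a view with itself
    # index of each anchor's positive: view i pairs with view i+N (mod 2N)
    pos_idx = np.concatenate([np.arange(n, n2), np.arange(0, n)])
    # mask same-scene views out of the negative set, keeping the true positive
    same_scene = scene_ids[:, None] == scene_ids[None, :]
    neg_mask = same_scene.copy()
    neg_mask[np.arange(n2), pos_idx] = False
    sim[neg_mask] = -np.inf
    # standard InfoNCE: negative log-softmax at the positive index
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(n2), pos_idx] - logsumexp)
    return loss.mean()
```

Per the abstract, SwiMDiff additionally imposes a pixel-level diffusion (denoising) constraint on the encoder on top of a contrastive term like this one; that reconstruction branch is not sketched here.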

Authors (5)
  1. Jiayuan Tian
  2. Jie Lei
  3. Jiaqing Zhang
  4. Weiying Xie
  5. Yunsong Li
Citations (7)
