
SARATR-X: Towards Building A Foundation Model for SAR Target Recognition (2405.09365v3)

Published 15 May 2024 in cs.CV

Abstract: Despite the remarkable progress in synthetic aperture radar automatic target recognition (SAR ATR), recent efforts have concentrated on detecting and classifying a specific category, e.g., vehicles, ships, airplanes, or buildings. One of the fundamental limitations of the top-performing SAR ATR methods is that the learning paradigm is supervised, task-specific, limited-category, closed-world learning, which depends on massive amounts of accurately annotated samples that are expensively labeled by expert SAR analysts and has limited generalization capability and scalability. In this work, we make the first attempt towards building a foundation model for SAR ATR, termed SARATR-X. SARATR-X learns generalizable representations via self-supervised learning (SSL) and provides a cornerstone for label-efficient model adaptation to generic SAR target detection and classification tasks. Specifically, SARATR-X is trained on 0.18 M unlabelled SAR target samples, which are curated by combining contemporary benchmarks and constitute the largest publicly available dataset to date. Considering the characteristics of SAR images, a backbone tailored for SAR ATR is carefully designed, and a two-step SSL method endowed with multi-scale gradient features is applied to ensure the feature diversity and model scalability of SARATR-X. The capabilities of SARATR-X are evaluated on classification under few-shot and robustness settings and detection across various categories and scenes, and impressive performance is achieved, often competitive with or even superior to prior fully supervised, semi-supervised, or self-supervised algorithms. Our SARATR-X and the curated dataset are released at https://github.com/waterdisappear/SARATR-X to foster research into foundation models for SAR image interpretation.
