Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting

Published 11 Oct 2024 in cs.CV, cs.AI, and cs.LG | (2410.08612v1)

Abstract: Sonar image synthesis is crucial for advancing applications in underwater exploration, marine biology, and defence. Traditional methods often rely on extensive and costly data collection using sonar sensors, which jeopardizes data quality and diversity. To overcome these limitations, this study proposes a new sonar image synthesis framework, Synth-SONAR, which leverages diffusion models and GPT prompting. The key novelties of Synth-SONAR are threefold. First, Generative AI-based style injection techniques are integrated with publicly available real and simulated data, producing one of the largest sonar data corpora for sonar research. Second, a dual text-conditioning sonar diffusion model hierarchy synthesizes coarse and fine-grained sonar images with enhanced quality and diversity. Third, high-level (coarse) and low-level (detailed) text-based sonar generation methods leverage advanced semantic information available in visual LLMs (VLMs) and GPT prompting. During inference, the method generates diverse and realistic sonar images from textual prompts, bridging the gap between textual descriptions and sonar image generation. To the best of our knowledge, this is the first application of GPT prompting to sonar imagery. Synth-SONAR achieves state-of-the-art results in producing high-quality synthetic sonar datasets, significantly enhancing their diversity and realism.

