
MARS: Multi-Scale Adaptive Robotics Vision for Underwater Object Detection and Domain Generalization (2312.15275v1)

Published 23 Dec 2023 in cs.RO

Abstract: Underwater robotic vision encounters significant challenges, necessitating advanced solutions to enhance performance and adaptability. This paper presents MARS (Multi-Scale Adaptive Robotics Vision), a novel approach to underwater object detection tailored for diverse underwater scenarios. MARS integrates Residual Attention YOLOv3 with Domain-Adaptive Multi-Scale Attention (DAMSA) to enhance detection accuracy and adapt to different domains. During training, DAMSA introduces domain class-based attention, enabling the model to emphasize domain-specific features. Our comprehensive evaluation across various underwater datasets demonstrates MARS's performance. On the original dataset, MARS achieves a mean Average Precision (mAP) of 58.57%, showcasing its proficiency in detecting critical underwater objects like echinus, starfish, holothurian, scallop, and waterweeds. This capability holds promise for applications in marine robotics, marine biology research, and environmental monitoring. Furthermore, MARS excels at mitigating domain shifts. On the augmented dataset, which incorporates all enhancements (+Domain, +Residual, +Channel Attention, +Multi-Scale Attention), MARS achieves an mAP of 36.16%. This result underscores its robustness and adaptability in recognizing objects and performing well across a range of underwater conditions. The source code for MARS is publicly available on GitHub at https://github.com/LyesSaadSaoud/MARS-Object-Detection/

