Underwater object detection in sonar imagery with detection transformer and Zero-shot neural architecture search (2505.06694v1)

Published 10 May 2025 in cs.CV and cs.AI

Abstract: Underwater object detection using sonar imagery has become a critical and rapidly evolving research domain within marine technology. However, sonar images are characterized by lower resolution and sparser features compared to optical images, which seriously degrades the performance of object detection. To address these challenges, we propose a Detection Transformer (DETR) architecture optimized with a Neural Architecture Search (NAS) approach, called NAS-DETR, for object detection in sonar images. First, an improved Zero-shot Neural Architecture Search (NAS) method based on the maximum entropy principle is proposed to identify a real-time, high-representational-capacity CNN-Transformer backbone for sonar image detection. This method enables the efficient discovery of high-performance network architectures with low computational and time overhead. Subsequently, the backbone is combined with a Feature Pyramid Network (FPN) and a deformable attention-based Transformer decoder to construct a complete network architecture. This architecture integrates various advanced components and training schemes to enhance overall performance. Extensive experiments demonstrate that this architecture achieves state-of-the-art performance on two representative datasets, while maintaining minimal overhead in real-time efficiency and computational complexity. Furthermore, correlation analysis between the key parameters and the differential entropy-based fitness function is performed to enhance the interpretability of the proposed framework. To the best of our knowledge, this is the first work in the field of sonar object detection to integrate the DETR architecture with a NAS search mechanism.

Summary

The paper presents a state-of-the-art approach to underwater object detection in sonar imagery, a critical domain in marine technology. The authors tackle the inherent challenges posed by sonar images, which typically possess lower resolution and sparser features compared to their optical counterparts. This paper introduces a Detection Transformer (DETR) architecture optimized through a novel Neural Architecture Search (NAS) strategy, dubbed NAS-DETR, aimed at enhancing object detection in sonar images.

Problem Addressed

Sonar imagery, characterized by severe noise interference, low resolution, small target sizes, and high inter-target similarity, presents significant obstacles to accurate underwater object detection. Conventional optical imaging techniques fail under high turbidity and low light conditions, necessitating advancements in sonar-based detection methods.

Methodology

The primary contribution of this paper is NAS-DETR, an underwater object detection framework that integrates the DETR architecture with Neural Architecture Search. The authors employ a Zero-shot NAS method based on the maximum entropy principle to identify a CNN-Transformer hybrid backbone suitable for sonar image detection. The search scores candidate networks by their differential entropy without training them and selects the architecture that maximizes it, which allows robust architectures to be discovered with minimal computational overhead.
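To make the maximum-entropy idea concrete, here is a minimal sketch of how a training-free score can rank candidate backbones. It assumes a simplified proxy in which the entropy of a plain convolutional stack grows with the sum of log(in_channels * kernel_size^2) over its layers, a form used in earlier maximum-entropy zero-shot NAS work. The paper's actual fitness function, search space, and budget constraints are not given in this summary and may differ. All names here (ConvLayer, entropy_score, param_count, build_plain_backbone, select_best) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ConvLayer:
    """One convolution layer in a candidate backbone (hypothetical description)."""
    in_channels: int
    out_channels: int
    kernel_size: int


def entropy_score(layers: Sequence[ConvLayer]) -> float:
    # Assumed proxy: sum of log(in_channels * kernel_size^2) over layers.
    # No weights are trained or even instantiated, hence "zero-shot".
    return sum(math.log(l.in_channels * l.kernel_size ** 2) for l in layers)


def param_count(layers: Sequence[ConvLayer]) -> int:
    # Convolution weights only; biases and normalization layers are ignored.
    return sum(l.in_channels * l.out_channels * l.kernel_size ** 2 for l in layers)


def build_plain_backbone(widths: Sequence[int], kernel_size: int,
                         in_channels: int = 1) -> List[ConvLayer]:
    # in_channels=1 assumes single-channel sonar input.
    layers, prev = [], in_channels
    for width in widths:
        layers.append(ConvLayer(prev, width, kernel_size))
        prev = width
    return layers


def select_best(candidates: Sequence[List[ConvLayer]],
                max_params: int) -> List[ConvLayer]:
    # Keep candidates within the parameter budget, then pick the highest score.
    feasible = [c for c in candidates if param_count(c) <= max_params]
    if not feasible:
        raise ValueError("no candidate fits the parameter budget")
    return max(feasible, key=entropy_score)


shallow = build_plain_backbone([16, 32], kernel_size=3)
deeper = build_plain_backbone([16, 32, 64], kernel_size=3)
too_large = build_plain_backbone([64, 128, 256], kernel_size=5)
best = select_best([shallow, deeper, too_large], max_params=50_000)
```

In this toy setup the largest candidate exceeds the budget, and the deeper of the two remaining stacks wins because each additional layer adds a positive term to the score. A real search would typically mutate candidates iteratively rather than enumerate a fixed list.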

The proposed CNN-Transformer hybrid backbone incorporates CNN blocks and Transformer modules to leverage both local feature extraction and global contextual modeling capabilities. The backbone is integrated with a Feature Pyramid Network (FPN) and a deformable attention-based Transformer decoder. This complete network architecture, featuring a content-position decoupled query initialization strategy and a multi-task optimized hybrid loss function, significantly enhances detection performance, particularly under low-resolution and high-noise conditions.

Experimental Results

Experiments on two representative datasets show that NAS-DETR achieves state-of-the-art performance with minimal overhead in real-time efficiency and computational complexity. The framework outperforms contemporary methods such as RT-DETR and DDOD across multiple object detection metrics (mmAP, mAP50, mAP75) while remaining computationally efficient.
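For readers unfamiliar with the metric names, mAP50 and mAP75 conventionally count a predicted box as correct only if its intersection-over-union (IoU) with a ground-truth box reaches 0.50 or 0.75, respectively. The sketch below illustrates that thresholding step only, assuming boxes in (x1, y1, x2, y2) format. It is not the paper's evaluation code, the function names are hypothetical, and the full metric additionally involves confidence ranking and averaging precision over classes.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, gt: Box, iou_threshold: float) -> bool:
    # A prediction matches a ground-truth box when IoU meets the threshold.
    return box_iou(pred, gt) >= iou_threshold


pred = (0.0, 0.0, 10.0, 10.0)
gt = (0.0, 0.0, 10.0, 6.0)  # overlaps the prediction with IoU of about 0.6
hit_at_50 = is_true_positive(pred, gt, 0.50)
hit_at_75 = is_true_positive(pred, gt, 0.75)
```

The same prediction can therefore count as a hit under the looser threshold and a miss under the stricter one, which is why mAP75 is the more demanding measure of localization quality.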

Analysis and Implications

The paper presents a correlation analysis between key architectural parameters and differential entropy, which improves the interpretability of the proposed NAS framework. The analysis reveals significant positive correlations between differential entropy and each of backbone depth, channel count, and convolution kernel size, providing theoretical insight into the architecture's feature modeling capabilities.
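A correlation analysis of this kind can be sketched as follows. The example computes the Pearson coefficient between one architectural parameter and an entropy score while the other parameters are held fixed. The scoring function proxy_entropy is a hypothetical stand-in (depth times log(channels * kernel_size^2)), the parameter values are synthetic, and none of the numbers come from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def proxy_entropy(depth: int, channels: int, kernel_size: int) -> float:
    # Hypothetical stand-in for an entropy-based fitness score.
    return depth * math.log(channels * kernel_size ** 2)


# Vary one parameter at a time (synthetic values), holding the others fixed.
depths = [4, 8, 12, 16]
kernels = [1, 3, 5, 7]
r_depth = pearson(depths, [proxy_entropy(d, 64, 3) for d in depths])
r_kernel = pearson(kernels, [proxy_entropy(8, 64, k) for k in kernels])
```

Under this stand-in the score is linear in depth, so the depth correlation is essentially perfect, while kernel size enters through a logarithm and correlates positively but less than perfectly. Real measurements would of course show weaker and noisier relationships.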

By introducing a Zero-shot NAS for efficient object detection in sonar imagery, this research not only advances the technological capabilities of underwater object detection but also lays the groundwork for further exploration in AI-driven sonar analysis. The adaptive decoder mechanism, coupled with dynamic architectural search strategies, suggests promising future directions, including lightweight search strategies and multi-modal data fusion, critical for deployment in complex underwater environments.

Through comprehensive experimentation and theoretical validation, this paper establishes NAS-DETR as a pivotal development in sonar image-based object detection, bridging the gap between traditional CNN-based methods and emerging transformer-based architectures. The proposed model's scalability and robustness render it suitable for deployment across diverse maritime applications, from underwater archaeology to marine environmental monitoring.
