
SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing (2004.13316v2)

Published 28 Apr 2020 in cs.CV, cs.LG, and eess.IV

Abstract: Small and cluttered objects are common in the real world and are challenging to detect. The difficulty is further pronounced when the objects are rotated, as traditional detectors routinely locate objects with horizontal bounding boxes, so that the region of interest is contaminated with background or nearby interleaved objects. In this paper, we first innovatively introduce the idea of denoising to object detection. Instance-level denoising on the feature map is performed to enhance the detection of small and cluttered objects. To handle the rotation variation, we also add a novel IoU constant factor to the smooth L1 loss to address the long-standing boundary problem, which, according to our analysis, is mainly caused by the periodicity of angle (PoA) and the exchangeability of edges (EoE). By combining these two features, our proposed detector is termed SCRDet++. Extensive experiments are performed on the large public aerial image datasets DOTA, DIOR, and UCAS-AOD, as well as the natural image dataset COCO, the scene text dataset ICDAR2015, the small traffic light dataset BSTLD, and S2TLD, which is released with this paper. The results show the effectiveness of our approach. The released dataset S2TLD is made publicly available and contains 5,786 images with 14,130 traffic light instances across five categories.

Authors (6)
  1. Xue Yang (141 papers)
  2. Junchi Yan (241 papers)
  3. Wenlong Liao (18 papers)
  4. Xiaokang Yang (207 papers)
  5. Jin Tang (139 papers)
  6. Tao He (62 papers)
Citations (239)

Summary

Overview of "SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing"

The paper "SCRDet++" addresses the detection of small, cluttered, and rotated objects in complex images such as aerial imagery. The authors introduce SCRDet++, an enhanced version of their original SCRDet framework, which combines instance-level feature denoising (InLD) with an IoU-Smooth L1 loss function for rotation detection.

Core Innovations

The paper presents two primary contributions:

  1. Instance-Level Feature Denoising (InLD): This module targets three problems: non-object noise, inter-class feature coupling, and intra-class boundary blurring. InLD denoises the feature map by decoupling the features of different object categories into their respective channels, amplifying object features and suppressing background interference. It is especially beneficial for the small and densely packed objects found in datasets such as DOTA and DIOR.
  2. IoU-Smooth L1 Loss: To handle the boundary problem inherent in rotated object detection, the paper introduces a modified loss function. An IoU-based factor is added to the smooth L1 loss to remove the loss discontinuities caused by angular periodicity and edge exchangeability, so that a prediction that nearly overlaps the ground truth is no longer penalized heavily just because its parameters fall on the other side of the angle boundary. According to the summary, the approach uses both direct and indirect regression strategies to mitigate boundary-related regression problems.
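The effect of the IoU factor can be illustrated with a minimal sketch. It assumes one commonly cited form of the loss, in which the smooth L1 regression loss is normalized to unit magnitude and rescaled by |-log(IoU)|; the exact form in the paper may differ. The function names, the five-parameter box layout (cx, cy, w, h, angle in degrees), and the IoU value of 0.9 are hypothetical choices for illustration, and the rotated IoU is passed in as a number rather than computed.

```python
import math

def smooth_l1(pred, target, beta=1.0):
    """Sum of element-wise smooth L1 terms over the box parameters."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

def iou_smooth_l1(pred, target, iou, eps=1e-9):
    """Assumed form: smooth L1 scaled to unit magnitude, times |-log(IoU)|.

    In an autodiff framework the denominator would typically be treated as a
    constant, so the gradient direction would follow smooth L1 while the loss
    magnitude follows the IoU term. This sketch only evaluates the loss value.
    """
    base = smooth_l1(pred, target)
    if base == 0.0:
        return 0.0
    magnitude = abs(-math.log(max(iou, eps)))
    return base / abs(base) * magnitude

# Two parameterizations of almost the same rectangle near the angle boundary:
# width and height are swapped and the angle jumps by 88 degrees.
target = (0.0, 0.0, 10.0, 4.0, -89.0)
pred = (0.0, 0.0, 4.0, 10.0, -1.0)
assumed_iou = 0.9  # hypothetical overlap; not computed from the boxes

print("plain smooth L1:", smooth_l1(pred, target))
print("IoU-smooth L1  :", iou_smooth_l1(pred, target, assumed_iou))
```

In this boundary case the plain smooth L1 value is large even though the two boxes nearly coincide, while the IoU-scaled value stays small because it depends only on the overlap.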

Performance and Experiments

The paper reports experiments on several datasets, including DOTA, DIOR, COCO, and the newly introduced S2TLD, and the results support the efficacy of SCRDet++. The quantitative metrics show improvements over established baselines. For example, on the Oriented Bounding Box (OBB) task of the DOTA dataset, SCRDet++ achieved significant gains, which indicates that it handles complex scenes with diverse object orientations well. The experiments also show that InLD notably boosts detection performance even when image-level denoising (ImLD) is removed, which suggests that the gain comes from the instance-level module itself.

Implications and Future Directions

The paper addresses three fundamental issues in object detection: scale, clutter, and orientation. Practically, the improvements shown by SCRDet++ may translate into more accurate and reliable applications in remote sensing, autonomous vehicles, and urban monitoring systems, where detecting small and rotated objects is critical.

Theoretically, feature map denoising and loss functions tailored to orientation open new avenues for improving the robustness and accuracy of neural networks in object detection. Future research might integrate these methods into broader applications, for example by deploying them in real-time systems and optimizing them further for computational efficiency.

In conclusion, SCRDet++ is a notable advance in object detection for complex imagery, built on instance-level denoising and a refined rotation loss. The proposed enhancements provide a basis for future work on the remaining challenges in visual recognition.