QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection (2103.09136v2)

Published 16 Mar 2021 in cs.CV

Abstract: While general object detection with deep learning has achieved great success in the past few years, the performance and efficiency of detecting small objects are far from satisfactory. The most common and effective way to promote small object detection is to use high-resolution images or feature maps. However, both approaches induce costly computation since the computational cost grows squarely as the size of images and features increases. To get the best of two worlds, we propose QueryDet that uses a novel query mechanism to accelerate the inference speed of feature-pyramid based object detectors. The pipeline composes two steps: it first predicts the coarse locations of small objects on low-resolution features and then computes the accurate detection results using high-resolution features sparsely guided by those coarse positions. In this way, we can not only harvest the benefit of high-resolution feature maps but also avoid useless computation for the background area. On the popular COCO dataset, the proposed method improves the detection mAP by 1.0 and mAP-small by 2.0, and the high-resolution inference speed is improved to 3.0x on average. On VisDrone dataset, which contains more small objects, we create a new state-of-the-art while gaining a 2.3x high-resolution acceleration on average. Code is available at https://github.com/ChenhongyiYang/QueryDet-PyTorch.

Authors (3)

Chenhongyi Yang
Zehao Huang
Naiyan Wang

Summary

The paper introduces QueryDet, which employs a coarse-to-fine strategy and sparse convolution to efficiently detect small objects in high-resolution images.
It reports an average high-resolution inference speedup of 3.0× on COCO and 2.3× on VisDrone, along with gains of 1.0 mAP and 2.0 mAP-small on COCO.
This method provides a practical solution for resource-efficient detection in fields like autonomous driving and UAV surveillance.

The paper "QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection" presents an approach for detecting small objects in high-resolution images more accurately and more efficiently. It does so through a mechanism called Cascaded Sparse Query (CSQ), which reduces the computation that feature-pyramid-based object detectors spend on high-resolution feature maps.

Methodology and Contributions

The primary contribution of the paper is the QueryDet framework, which targets a central challenge in object detection: recognizing small objects accurately and efficiently. Conventional approaches rely on high-resolution images or feature maps, which is expensive because the computational cost grows quadratically with the side length of the image or feature map. QueryDet addresses this by combining a coarse-to-fine detection strategy with sparse computation.
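The quadratic growth can be checked with a short calculation. The sketch below counts multiply-accumulate operations for a single dense convolution layer; the function name `dense_conv_macs`, the channel count of 256, and the map sizes are illustrative assumptions, not values taken from the paper.

```python
def dense_conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulate count for one dense, stride-1 convolution
    that produces an output of size height x width."""
    return height * width * c_in * c_out * kernel * kernel


# Hypothetical feature-map sizes: doubling each side quadruples the cost.
base = dense_conv_macs(100, 100, 256, 256)
doubled = dense_conv_macs(200, 200, 256, 256)
ratio = doubled / base
print(ratio)  # 4.0
```

Doubling the side length quadruples the number of positions, so a dense head pays four times as much even when most of those positions are background.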

The method first predicts coarse locations of small objects on low-resolution features. These locations then guide computation on higher-resolution features, so that work is spent only where small objects are likely to be present. By using sparse convolution, QueryDet avoids most of the cost of dense computation over background regions of the high-resolution feature maps.

The CSQ mechanism operates as a cascade. At each level of the feature pyramid, a query mechanism predicts which areas are likely to contain small objects, and only those areas are processed at the next, higher-resolution level. This contrasts with dense detection heads, which convolve over every position of every feature map. According to the paper, CSQ maintains detection accuracy while speeding up high-resolution inference by 3.0× on COCO and 2.3× on VisDrone on average.
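The cascade can be illustrated with plain position bookkeeping. The following is a minimal sketch, not the authors' implementation: it assumes each pyramid level has twice the resolution of the previous one, that a coarse position maps to the 2×2 block beneath it, and that a score threshold of 0.5 selects queries. All function names and score values are hypothetical, and the real method runs detection heads with sparse convolution at the selected positions rather than only tracking them.

```python
def select_queries(score_map, threshold):
    """Return the (row, col) positions whose score exceeds the threshold."""
    return {
        (r, c)
        for r, row in enumerate(score_map)
        for c, score in enumerate(row)
        if score > threshold
    }


def expand_to_finer_level(queries, fine_height, fine_width):
    """Map each coarse position to its 2x2 block on a map with twice the resolution."""
    keys = set()
    for r, c in queries:
        for dr in (0, 1):
            for dc in (0, 1):
                fr, fc = 2 * r + dr, 2 * c + dc
                if fr < fine_height and fc < fine_width:
                    keys.add((fr, fc))
    return keys


def cascaded_sparse_query(score_maps, threshold):
    """score_maps is ordered coarse to fine. Returns, per level, the set of
    positions that would be evaluated. The finest map is used only for its size."""
    coarsest = score_maps[0]
    active = {(r, c) for r in range(len(coarsest)) for c in range(len(coarsest[0]))}
    visited = [active]
    for coarse, fine in zip(score_maps, score_maps[1:]):
        # Scores exist only where the coarse level was actually evaluated.
        queries = select_queries(coarse, threshold) & active
        active = expand_to_finer_level(queries, len(fine), len(fine[0]))
        visited.append(active)
    return visited


coarse = [[0.1, 0.9], [0.2, 0.1]]
mid = [[0.0] * 4 for _ in range(4)]
mid[1][3] = 0.7   # inside the region selected by the coarse level
mid[3][0] = 0.95  # outside that region, so it is never evaluated
fine = [[0.0] * 8 for _ in range(8)]

visited = cascaded_sparse_query([coarse, mid, fine], threshold=0.5)
fraction = len(visited[2]) / (8 * 8)
print(sorted(visited[2]), fraction)
```

In this toy example only 4 of the 64 finest-level positions are evaluated. The high score at `mid[3][0]` is ignored because no coarser query pointed to it, which shows how a miss at a coarse level propagates down the cascade.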

Experimental Validation

The experiments support these claims. On COCO, a standard benchmark, QueryDet improves overall mAP by 1.0 and mAP-small by 2.0 points, and high-resolution inference runs 3.0× faster on average. On VisDrone, a dataset that contains more small objects, it sets a new state of the art while running high-resolution inference 2.3× faster on average.

Implications and Future Directions

The implications of this work span both theoretical and practical dimensions within AI-driven image analysis. The proposed QueryDet framework not only provides a mechanism to significantly reduce computational demands but also sets a precedent for resource-efficient small object detection in real-world applications, such as autonomous driving and UAV-based surveillance.

The theoretical underpinning, rooted in efficiently leveraging feature pyramids via sparse queries, could catalyze further research into optimizing neural network architectures for efficiency without compromising detection capabilities. Future explorations might involve extending the CSQ paradigm to three-dimensional object detection in point cloud data, where computational costs are even more pronounced.

This paper represents an essential step forward in the quest for balancing computational efficiency and detection performance in complex visual environments, potentially influencing a wide range of applications in AI-driven vision tasks.

GitHub

GitHub - ChenhongyiYang/QueryDet-PyTorch: [CVPR 2022 Oral] QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection (423 stars)