Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation Exploitation (2208.00219v1)

Published 30 Jul 2022 in cs.CV, cs.AI, cs.LG, and cs.MM

Abstract: Few-shot object detection has been extensively investigated by incorporating meta-learning into region-based detection frameworks. Despite its success, the said paradigm is still constrained by several factors, such as (i) low-quality region proposals for novel classes and (ii) negligence of the inter-class correlation among different classes. Such limitations hinder the generalization of base-class knowledge for the detection of novel-class objects. In this work, we design Meta-DETR, which (i) is the first image-level few-shot detector, and (ii) introduces a novel inter-class correlational meta-learning strategy to capture and leverage the correlation among different classes for robust and accurate few-shot object detection. Meta-DETR works entirely at image level without any region proposals, which circumvents the constraint of inaccurate proposals in prevalent few-shot detection frameworks. In addition, the introduced correlational meta-learning enables Meta-DETR to simultaneously attend to multiple support classes within a single feedforward, which allows to capture the inter-class correlation among different classes, thus significantly reducing the misclassification over similar classes and enhancing knowledge generalization to novel classes. Experiments over multiple few-shot object detection benchmarks show that the proposed Meta-DETR outperforms state-of-the-art methods by large margins. The implementation codes are available at https://github.com/ZhangGongjie/Meta-DETR.

The paper presents Meta-DETR, a few-shot object detector that differs from prior methods mainly by operating at the image level, without region proposals. Earlier few-shot detectors typically build on region-based frameworks such as Faster R-CNN, whose region proposals tend to be of low quality for novel classes. Meta-DETR instead builds on a DETR-based framework that makes predictions purely at the image level. This lets it sidestep the errors that inaccurate proposals introduce and makes detection of novel-class objects more robust.
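To make "image-level prediction without region proposals" more concrete: DETR-style detectors emit a fixed set of predictions per image and are trained by matching that set one-to-one against the ground-truth objects. The toy sketch below shows that matching step only. It is not taken from the Meta-DETR code; the function name `match_predictions`, the cost terms, and the `box_weight` value are illustrative assumptions, and real implementations use the Hungarian algorithm rather than the brute-force search used here.

```python
from itertools import permutations


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two (cx, cy, w, h) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_predictions(preds, targets, box_weight=5.0):
    """Toy one-to-one matching between predictions and ground-truth objects.

    preds:   list of (class_probs: dict, box: tuple)
    targets: list of (label: str, box: tuple), with len(targets) <= len(preds)
    Returns a list of (pred_index, target_index) pairs with minimal total cost.
    box_weight is an illustrative value, not a setting taken from the paper.
    """
    def cost(pred_idx, tgt_idx):
        probs, pred_box = preds[pred_idx]
        label, tgt_box = targets[tgt_idx]
        # Lower cost for a high probability on the right class and a close box.
        return -probs.get(label, 0.0) + box_weight * l1_distance(pred_box, tgt_box)

    best_perm, best_cost = None, float("inf")
    # Brute force over assignments; fine for a handful of objects only.
    for perm in permutations(range(len(preds)), len(targets)):
        total = sum(cost(p, t) for t, p in enumerate(perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return [(p, t) for t, p in enumerate(best_perm)]


preds = [
    ({"dog": 0.1, "cat": 0.8}, (0.5, 0.5, 0.2, 0.2)),
    ({"dog": 0.7, "cat": 0.2}, (0.2, 0.3, 0.1, 0.1)),
    ({"dog": 0.3, "cat": 0.3}, (0.9, 0.9, 0.1, 0.1)),
]
targets = [("dog", (0.2, 0.3, 0.1, 0.1)), ("cat", (0.5, 0.5, 0.2, 0.2))]
matches = match_predictions(preds, targets)
```

Here the third prediction is left unmatched, which in a DETR-style detector would correspond to a "no object" slot. No proposal stage appears anywhere in the pipeline, which is the property the summary highlights.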

A critical aspect of Meta-DETR is the incorporation of an inter-class correlational meta-learning strategy. This element allows the model to effectively discern and leverage correlations among different classes during training. Unlike previous approaches that treat each support class independently, Meta-DETR processes multiple support classes simultaneously. This strategy not only enhances generalization capabilities by recognizing cross-class relationships but also significantly reduces misclassification among similar classes.
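The difference between scoring each support class in isolation and scoring several classes jointly can be illustrated with a small sketch. This is a simplified stand-in, not the paper's actual aggregation module: the prototype vectors, the query feature, and the helper names `independent_scores` and `joint_scores` are all hypothetical, and a scaled dot product with a softmax is used here only as a familiar way to make classes compete.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def independent_scores(query_feat, prototypes):
    """Each support class is matched on its own (sigmoid per class)."""
    return {
        name: 1.0 / (1.0 + math.exp(-dot(query_feat, proto)))
        for name, proto in prototypes.items()
    }


def joint_scores(query_feat, prototypes):
    """All support classes are matched in one pass and compete in a softmax."""
    names = list(prototypes)
    scale = math.sqrt(len(query_feat))
    logits = [dot(query_feat, prototypes[n]) / scale for n in names]
    return dict(zip(names, softmax(logits)))


# Hypothetical class prototypes: "cow" and "horse" are deliberately similar.
prototypes = {
    "cow": [1.0, 0.9, 0.0],
    "horse": [0.9, 1.0, 0.0],
    "boat": [0.0, 0.0, 1.0],
}
query_feat = [2.0, 1.5, 0.0]  # hypothetical query feature, closest to "cow"

ind = independent_scores(query_feat, prototypes)
joint = joint_scores(query_feat, prototypes)
```

With independent matching, both "cow" and "horse" receive a high score because neither is aware of the other. With joint matching, the scores sum to one, so the similar classes are compared against each other directly. The paper's claim is that letting the model see multiple support classes in a single feedforward pass helps it resolve exactly this kind of confusion.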

The paper reports that Meta-DETR outperforms state-of-the-art methods by substantial margins on several few-shot object detection benchmarks, including Pascal VOC and MS COCO. The reported gains in detection mAP hold across varying shot settings, which supports the claim that the model learns effectively from minimal labeled data.

Practically, removing the dependency on region proposals means that generalization to novel classes no longer hinges on proposal quality, even when very few samples are available. The ability to exploit inter-class correlations could also help in real-world applications where novel object categories appear frequently and annotations are scarce.

Theoretically, the success of Meta-DETR emphasizes the potential of image-level frameworks in few-shot detection and the utility of correlational learning strategies. As the research community continues to explore few-shot and zero-shot learning paradigms, Meta-DETR sets a precedent for future models to leverage holistic image features and class relationships to improve learning efficiency and accuracy.

Future research may delve into integrating multi-scale features into the Meta-DETR framework, potentially improving detection capabilities for small or occluded objects. Furthermore, extending the correlational meta-learning strategy to other vision tasks, such as segmentation or tracking, provides a promising avenue for expanding the framework's applicability.

In conclusion, Meta-DETR advances few-shot object detection by forgoing traditional region-based methodologies in favor of an image-level approach combined with inter-class correlation exploitation. The framework improves the adaptability and accuracy of object detectors and provides a foundation for further research in the area.

Authors (5)

Gongjie Zhang (20 papers)
Zhipeng Luo (37 papers)
Kaiwen Cui (13 papers)
Shijian Lu (151 papers)
Eric P. Xing (192 papers)

Citations (73)

GitHub

GitHub - ZhangGongjie/Meta-DETR: [T-PAMI 2022] Meta-DETR for Few-Shot Object Detection: Official PyTorch Implementation (393 stars)