
Few-Shot Object Detection via Association and DIscrimination (2111.11656v2)

Published 23 Nov 2021 in cs.CV

Abstract: Object detection has achieved substantial progress in the last decade. However, detecting novel classes with only a few samples remains challenging, since deep learning under a low-data regime usually leads to a degraded feature space. Existing works employ a holistic fine-tuning paradigm to tackle this problem, where the model is first pre-trained on all base classes with abundant samples, and then it is used to carve the novel class feature space. Nonetheless, this paradigm is still imperfect. During fine-tuning, a novel class may implicitly leverage the knowledge of multiple base classes to construct its feature space, which induces a scattered feature space and hence violates inter-class separability. To overcome these obstacles, we propose a two-step fine-tuning framework, Few-shot object detection via Association and DIscrimination (FADI), which builds up a discriminative feature space for each novel class with two integral steps. 1) In the association step, in contrast to implicitly leveraging multiple base classes, we construct a compact novel class feature space via explicitly imitating a specific base class feature space. Specifically, we associate each novel class with a base class according to their semantic similarity. After that, the feature space of a novel class can readily imitate the well-trained feature space of the associated base class. 2) In the discrimination step, to ensure the separability between the novel classes and associated base classes, we disentangle the classification branches for base and novel classes. To further enlarge the inter-class separability between all classes, a set-specialized margin loss is imposed. Extensive experiments on the Pascal VOC and MS-COCO datasets demonstrate that FADI achieves new SOTA performance, significantly improving the baseline in any shot/split by up to +18.7. Notably, the advantage is most pronounced in extremely few-shot scenarios.

Authors (7)
  1. Yuhang Cao (41 papers)
  2. Jiaqi Wang (218 papers)
  3. Ying Jin (57 papers)
  4. Tong Wu (228 papers)
  5. Kai Chen (512 papers)
  6. Ziwei Liu (368 papers)
  7. Dahua Lin (336 papers)
Citations (73)

Summary

Few-Shot Object Detection via Association and Discrimination

Recent advancements in object detection have yielded impressive results, yet the challenge of few-shot object detection (FSOD) persists due to the inherent difficulty of detecting novel classes with limited data. This paper addresses the shortcomings of current methodologies and presents a novel framework named Few-Shot Object Detection via Association and Discrimination (FADI). This framework aims to establish discriminative feature spaces for novel classes through a two-step fine-tuning approach, thereby enhancing class separability without exhaustively retraining network structures.

Methodology Overview

The paper critiques existing FSOD approaches, which primarily rely on holistic fine-tuning paradigms. These paradigms first pre-train models using abundant data from known base classes before fine-tuning on novel class data. Although effective to some extent, this method often results in a scattered feature space for novel classes, compromising inter-class separability and leading to classification confusion.
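
For reference, this generic two-stage recipe can be summarized in a few lines. The sketch below is schematic and assumes a torchvision-style detector that returns a dict of losses in training mode; the freezing policy and hyperparameters are placeholders, not the paper's training configuration.

```python
import torch

def two_stage_finetune(detector, base_loader, fewshot_loader, lr=0.01):
    # Stage 1: pre-train the full detector on abundant base-class data.
    opt = torch.optim.SGD(detector.parameters(), lr=lr, momentum=0.9)
    for images, targets in base_loader:
        opt.zero_grad()
        loss = sum(detector(images, targets).values())  # detection losses
        loss.backward()
        opt.step()

    # Stage 2: freeze the backbone; fine-tune only the remaining
    # (classifier/regressor) parameters on a small few-shot set.
    # The exact freezing policy here is a placeholder assumption.
    detector.backbone.requires_grad_(False)
    head_params = [p for p in detector.parameters() if p.requires_grad]
    opt = torch.optim.SGD(head_params, lr=lr * 0.1, momentum=0.9)
    for images, targets in fewshot_loader:
        opt.zero_grad()
        loss = sum(detector(images, targets).values())
        loss.backward()
        opt.step()
```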

The proposed FADI framework introduces a two-step fine-tuning process:

  1. Association Step: This step constructs a compact feature space for each novel class by imitating the feature space of a semantically similar base class. Each novel class is explicitly associated with a base class according to semantic similarity, measured with lexical resources such as WordNet. Aligning the novel class feature distribution with that of a well-trained base class lets the novel class inherit a compact, well-separated feature space (see the first sketch after this list).
  2. Discrimination Step: To resolve the confusion between each novel class and its associated base class introduced by the association step, the classification branches for base and novel classes are disentangled, keeping their feature spaces distinct post-association. Additionally, a set-specialized margin loss is applied to further enlarge inter-class separability across all classes (see the second sketch after this list).

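To make the association step concrete, the sketch below pairs each novel class with its most semantically similar base class via WordNet. It is a minimal sketch under stated assumptions: the summary only says similarity is measured with resources like WordNet, so NLTK's path_similarity, the first-synset lookup, and the greedy argmax pairing are illustrative choices, not the authors' exact procedure.

```python
# Hypothetical association step: pair each novel class with the base
# class of highest WordNet similarity.
# Requires: pip install nltk, then nltk.download('wordnet').
from nltk.corpus import wordnet as wn

def first_synset(name):
    # Take the first noun synset as a stand-in for the class concept.
    synsets = wn.synsets(name, pos=wn.NOUN)
    return synsets[0] if synsets else None

def associate(novel_classes, base_classes):
    # Greedy argmax pairing by path similarity (an assumption here,
    # not necessarily the paper's similarity measure).
    pairs = {}
    for novel in novel_classes:
        n_syn = first_synset(novel)
        best, best_sim = None, -1.0
        for base in base_classes:
            b_syn = first_synset(base)
            if n_syn is None or b_syn is None:
                continue
            sim = n_syn.path_similarity(b_syn) or 0.0
            if sim > best_sim:
                best, best_sim = base, sim
        pairs[novel] = best
    return pairs

# Illustrative Pascal VOC-style class names:
print(associate(["sofa", "cow"], ["chair", "horse", "car"]))
```

For the discrimination step, the following sketch shows one plausible shape for a margin-augmented classification loss with per-class margins, in the spirit of the set-specialized margin loss. The cosine-similarity formulation, the scale factor, and the idea of assigning larger margins to novel classes than to base classes are assumptions for illustration; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def set_margin_loss(features, weights, labels, margins, scale=20.0):
    """features: (N, D) RoI features; weights: (C, D) classifier weights;
    margins: (C,) per-class margins, e.g. larger for novel classes
    (a hypothetical grouping)."""
    # Cosine similarity between L2-normalized features and class weights.
    logits = F.normalize(features, dim=1) @ F.normalize(weights, dim=1).t()
    # Subtract each sample's target-class margin from its target logit,
    # enlarging the required decision gap for that class.
    target_margin = margins[labels]                       # (N,)
    logits = logits.scatter_add(
        1, labels.unsqueeze(1), -target_margin.unsqueeze(1))
    return F.cross_entropy(scale * logits, labels)
```

Disentangling the base and novel branches would then amount to training separate classifier heads over shared features, with the margin vector encoding each class's set membership.
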
Experimental Results

Extensive experiments on standard datasets like Pascal VOC and MS-COCO demonstrate the efficacy of the FADI framework. Notably, FADI improves baseline performance significantly, achieving gains of up to +18.7 mAP across few-shot settings. The framework shows remarkable advantages in extremely few-shot settings, particularly in 1- and 3-shot conditions. Such performance gains underscore the framework's ability to learn robust feature representations even with scarce data, surpassing existing FSOD methods on standard benchmarks.

Implications and Future Directions

Practically, the FADI framework suggests promising applications in scenarios where novel class detection is needed but training data is scarce, such as wildlife monitoring or security systems. Theoretically, it contributes to the ongoing discourse on feature space optimization and class separability, advocating for the use of semantic similarity in feature alignment processes.

Future research could explore incorporating additional auxiliary data sources to define class similarity more comprehensively, potentially improving the association step. Furthermore, deepening the investigation into margin losses tailored for specific few-shot settings could lead to more refined methodologies in enlarging inter-class distances.

In conclusion, the FADI framework represents a significant step forward in few-shot object detection, providing an effective approach to feature space construction that improves class separability and detection accuracy for novel classes across limited-data scenarios.
