Few-shot Object Detection via Feature Reweighting (1812.01866v2)

Published 5 Dec 2018 in cs.CV

Abstract: Conventional training of a deep CNN based object detector demands a large number of bounding box annotations, which may be unavailable for rare categories. In this work we develop a few-shot object detector that can learn to detect novel objects from only a few annotated examples. Our proposed model leverages fully labeled base classes and quickly adapts to novel classes, using a meta feature learner and a reweighting module within a one-stage detection architecture. The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples. The reweighting module transforms a few support examples from the novel classes to a global vector that indicates the importance or relevance of meta features for detecting the corresponding objects. These two modules, together with a detection prediction module, are trained end-to-end based on an episodic few-shot learning scheme and a carefully designed loss function. Through extensive experiments we demonstrate that our model outperforms well-established baselines by a large margin for few-shot object detection, on multiple datasets and settings. We also present analysis on various aspects of our proposed model, aiming to provide some inspiration for future few-shot detection works.

Citations (645)


Summary

The paper presents a novel few-shot detection methodology that uses a meta feature learner and reweighting module to effectively address limited annotated data.
The approach, built on a YOLOv2 framework with episodic training, converges quickly and achieves substantially higher mAP than the baselines it is compared against.
The study sets the stage for practical applications in domains like medical imaging and wildlife monitoring, highlighting its potential in resource-constrained environments.

Few-Shot Object Detection via Feature Reweighting

The paper "Few-shot Object Detection via Feature Reweighting" by Kang et al. proposes a model for few-shot object detection, a setting in which annotated data for novel object categories is scarce. The work combines meta-learning with a feature reweighting mechanism so that a detector based on deep convolutional neural networks (CNNs) can learn to detect new classes from only a few annotated examples.

Overview

Object detection traditionally requires large amounts of data with precise bounding box annotations, which are often unavailable for rare or novel categories. This work enables rapid adaptation to new classes by leveraging fully annotated base classes. The proposed architecture integrates a meta feature learner and a reweighting module within a one-stage detector, and the whole model is trained end-to-end with an episodic few-shot learning scheme.

Methodology

Meta Feature Learner: This component extracts meta features that generalize across object classes. It is trained on base classes with ample examples so that the learned features remain useful for detecting unseen categories.
Reweighting Module: This module transforms a few support examples of a class into a global vector that indicates the importance or relevance of each meta feature for detecting objects of that class. The idea parallels the adaptation mechanisms used in few-shot classification.
Detection Architecture: The model builds on the one-stage YOLOv2 detector, which predicts class scores and bounding boxes directly from the reweighted features without a separate proposal generation step.
Training Scheme: The model is trained episodically, first meta-training on base classes with abundant annotations and then fine-tuning with the few available examples of the novel classes.
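
How the reweighting module and the episodic scheme fit together can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes reweighting is a channel-wise multiplication of one shared feature map by one vector per class, and that an episode is formed by sampling a few support examples for each of several classes. The names `reweight_features` and `sample_episode`, the array shapes, and the toy data are all hypothetical.

```python
import random

import numpy as np


def reweight_features(meta_features, class_vectors):
    """Scale each channel of a shared feature map by a per-class weight.

    meta_features: array of shape (C, H, W), output of a feature extractor.
    class_vectors: array of shape (N, C), one reweighting vector per class.
    Returns an array of shape (N, C, H, W): one reweighted map per class.
    """
    meta_features = np.asarray(meta_features, dtype=float)
    class_vectors = np.asarray(class_vectors, dtype=float)
    if meta_features.ndim != 3 or class_vectors.ndim != 2:
        raise ValueError("expected shapes (C, H, W) and (N, C)")
    if class_vectors.shape[1] != meta_features.shape[0]:
        raise ValueError("channel counts of features and vectors differ")
    # Broadcast (N, C, 1, 1) against (1, C, H, W): channel-wise multiplication.
    return class_vectors[:, :, None, None] * meta_features[None, :, :, :]


def sample_episode(annotations, n_way, k_shot, seed=None):
    """Pick n_way classes and k_shot support examples for each of them.

    annotations: dict mapping a class name to a list of example identifiers.
    Returns a dict mapping each sampled class to its support examples.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(annotations), n_way)
    return {c: rng.sample(annotations[c], k_shot) for c in classes}


# Toy usage: 2 channels, a 3x3 feature map, and 2 classes (assumed values).
features = np.ones((2, 3, 3))
vectors = np.array([[1.0, 0.0], [0.5, 2.0]])
per_class_maps = reweight_features(features, vectors)

toy_annotations = {
    "bird": ["img1", "img2", "img3"],
    "bus": ["img4", "img5", "img6"],
    "cow": ["img7", "img8", "img9"],
}
episode = sample_episode(toy_annotations, n_way=2, k_shot=2, seed=0)
```

In this sketch each class gets its own copy of the feature map, with channels that matter for that class amplified and others suppressed. A detection head would then run on every per-class map, a step omitted here.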

Results and Analysis

Extensive experiments demonstrate the superiority of the proposed model over multiple baselines for few-shot object detection. Noteworthy findings include:

Performance: The authors report significant improvements in mean average precision (mAP) across multiple datasets and experimental settings. Their model outperforms well-established baselines by a large margin, especially in extremely low-shot settings with only 1 to 3 examples per class.
Learning Speed: The architecture exhibits rapid convergence, allowing faster adaptation compared to conventional methods, which is vital for practical deployment in dynamic environments.
Robust Feature Representation: The model also maintains high performance on base classes, which suggests that the learned representations stay robust when the detector is adapted to novel categories.
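
Because the results above are reported in mAP, a small worked example of the metric may help. The sketch below computes average precision for one class as the area under its precision-recall curve and then averages over classes. It assumes detections have already been matched to ground-truth boxes, and it uses all-point interpolation, which is one common convention and not necessarily the exact protocol used in the paper. The function names and the toy detections are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scores: confidence score of each detection.
    is_true_positive: whether each detection matched a ground-truth box.
    num_ground_truth: number of ground-truth boxes of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    # Visit detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """Mean of the per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy detections: one class with a false positive ranked second,
# and one class detected perfectly.
ap_a = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
ap_b = average_precision([0.6], [True], 1)
map_score = mean_average_precision([ap_a, ap_b])
```

A false positive ranked above a true positive lowers precision at the higher recall level, which is why the first toy class scores below 1 even though both of its objects are eventually found.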

Future Implications

The paper's implications extend to domains where data scarcity is prevalent, such as medical imaging and wildlife monitoring. Although the results are promising, detection performance leaves room for improvement, especially on complex datasets such as MS-COCO. Future research could explore integration with lifelong learning frameworks to keep improving detection as more data becomes available.

In summary, this work is a significant contribution to few-shot learning for object detection and provides a foundation for adapting detection models to settings where annotated data is limited.
