
FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation (1907.12347v2)

Published 29 Jul 2019 in cs.CV

Abstract: Over the past few years, we have witnessed the success of deep learning in image recognition thanks to the availability of large-scale human-annotated datasets such as PASCAL VOC, ImageNet, and COCO. Although these datasets have covered a wide range of object categories, there are still a significant number of objects that are not included. Can we perform the same task without a lot of human annotations? In this paper, we are interested in few-shot object segmentation where the number of annotated training examples are limited to 5 only. To evaluate and validate the performance of our approach, we have built a few-shot segmentation dataset, FSS-1000, which consists of 1000 object classes with pixelwise annotation of ground-truth segmentation. Unique in FSS-1000, our dataset contains significant number of objects that have never been seen or annotated in previous datasets, such as tiny daily objects, merchandise, cartoon characters, logos, etc. We build our baseline model using standard backbone networks such as VGG-16, ResNet-101, and Inception. To our surprise, we found that training our model from scratch using FSS-1000 achieves comparable and even better results than training with weights pre-trained by ImageNet which is more than 100 times larger than FSS-1000. Both our approach and dataset are simple, effective, and easily extensible to learn segmentation of new object classes given very few annotated training examples. Dataset is available at https://github.com/HKUSTCV/FSS-1000.

Citations (218)

Summary

  • The paper demonstrates that models trained with alternative datasets like fsCOCO and FSS can surpass those using ImageNet pre-trained weights.
  • It details multiple experimental configurations, emphasizing the role of learning rate and dataset combinations in optimizing segmentation performance.
  • Quantitative results reaching up to 82.66% performance highlight the effectiveness of tailored dataset selection in few-shot learning.

Evaluation of Few-Shot Learning Models on Diverse Image Datasets

This paper presents a comparative analysis of models trained on different combinations of several prominent image datasets, specifically ImageNet, fsPASCAL, fsCOCO, and FSS. The research primarily focuses on assessing the performance of these models on the FSS test set, thereby examining the adaptability and efficacy of few-shot learning techniques trained on various benchmark datasets.
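
The evaluation follows a k-shot protocol in which a handful of annotated support images (five in the paper) guide segmentation of a query image from the same class. As an illustration only, the following sketch samples such an episode from a class folder; the paired image/mask file layout and paths are assumptions mirroring the FSS-1000 release, not code from the paper.

```python
import os
import random
from glob import glob

def sample_episode(class_dir, k_shot=5):
    """Sample a k-shot support set and one query pair from a class folder.

    Assumes one mask (.png) per image (.jpg) with matching basenames,
    which mirrors the FSS-1000 layout but is only an assumption here.
    """
    images = sorted(glob(os.path.join(class_dir, "*.jpg")))
    pairs = [(img, img.replace(".jpg", ".png")) for img in images]
    random.shuffle(pairs)  # random support/query split per episode
    support, query = pairs[:k_shot], pairs[k_shot]
    return support, query

# Hypothetical usage on one of the 1000 class folders:
# support, query = sample_episode("FSS-1000/fewshot_data/abacus", k_shot=5)
```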

The paper details several experimental setups, each training on a different combination of these datasets. All models are evaluated quantitatively on the FSS test set, and the resulting table of scores enables a direct comparison. The setup also highlights the importance of choosing an appropriate initial learning rate: $10^{-4}$ for models initialized with ImageNet pre-trained weights, and $10^{-3}$ otherwise.
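
To make the learning-rate choice concrete, here is a minimal, hypothetical sketch of how such a configuration might look in PyTorch. The backbone constructor, weight-loading call, and optimizer (Adam) are assumptions for illustration and are not taken from the paper's code.

```python
import torch
from torchvision import models

# Illustrative only: the paper's actual backbone construction and optimizer
# are not reproduced here; Adam and this VGG-16 call are assumptions.
use_imagenet_weights = False  # False trains the backbone from scratch

backbone = models.vgg16(weights="IMAGENET1K_V1" if use_imagenet_weights else None)

# Lower initial LR when fine-tuning pre-trained weights, higher when training
# from scratch, following the choice described above.
lr = 1e-4 if use_imagenet_weights else 1e-3
optimizer = torch.optim.Adam(backbone.parameters(), lr=lr)
```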

Results Summary

The table of results within the paper offers insightful quantitative data:

  1. Model I achieved a test-set performance of 66.45% using only ImageNet pre-trained weights and fsPASCAL for training.
  2. Model II improved on this with 71.34% using ImageNet and fsCOCO.
  3. Models III and IV showed further improvements at 79.30% and 80.12% respectively, with Model IV dispensing with ImageNet entirely in favor of the fsCOCO and FSS datasets.
  4. Models V and VI, at 81.97% and 82.66%, highlight fsPASCAL's utility in conjunction with FSS, yielding the highest performance without ImageNet pre-training (the metric behind these percentages is sketched below).
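
The summary does not state which metric these percentages represent; assuming they are mean intersection-over-union (IoU) scores on the FSS test set, as is standard in few-shot segmentation, a minimal sketch of the per-mask computation is:

```python
import numpy as np

def binary_iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection-over-union for one binary segmentation mask."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

# Mean IoU over the test episodes, given lists of predicted and ground-truth masks:
# miou = np.mean([binary_iou(p, g) for p, g in zip(predictions, ground_truths)])
```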

Implications and Future Directions

The findings assert that models trained without ImageNet pre-trained weights can achieve competitive or superior performance, thus challenging the conventional reliance on pre-trained weights from large datasets like ImageNet in transfer learning. The results achieved using fsCOCO and FSS datasets underscore the viability and perhaps necessity of tailoring dataset choices to the specific characteristics of the task at hand in few-shot learning scenarios.

The paper implicitly suggests fertile ground for future research into combinations of datasets and learning configurations, which could further elucidate strategies for improving model performance in data-limited environments. Moreover, examining such configurations has theoretical significance for understanding how knowledge transfers between disparate datasets in few-shot learning contexts.

In summary, this research contributes valuable experimental evidence towards optimizing few-shot learning model performance, advocating for a strategic approach in dataset and learning rate selection. Future investigations could expand on these configurations, including cross-domain image recognition and more diversified dataset applications, to enhance the robustness and universality of few-shot learning methodologies.
