Mining Latent Classes for Few-shot Segmentation (2103.15402v3)

Published 29 Mar 2021 in cs.CV

Abstract: Few-shot segmentation (FSS) aims to segment unseen classes given only a few annotated samples. Existing methods suffer the problem of feature undermining, i.e. potential novel classes are treated as background during training phase. Our method aims to alleviate this problem and enhance the feature embedding on latent novel classes. In our work, we propose a novel joint-training framework. Based on conventional episodic training on support-query pairs, we add an additional mining branch that exploits latent novel classes via transferable sub-clusters, and a new rectification technique on both background and foreground categories to enforce more stable prototypes. Over and above that, our transferable sub-cluster has the ability to leverage extra unlabeled data for further feature enhancement. Extensive experiments on two FSS benchmarks demonstrate that our method outperforms previous state-of-the-art by a large margin of 3.7% mIOU on PASCAL-5i and 7.0% mIOU on COCO-20i at the cost of 74% fewer parameters and 2.5x faster inference speed. The source code is available at https://github.com/LiheYoung/MiningFSS.

Authors (5)
  1. Lihe Yang (12 papers)
  2. Wei Zhuo (24 papers)
  3. Lei Qi (84 papers)
  4. Yinghuan Shi (79 papers)
  5. Yang Gao (761 papers)
Citations (117)

Summary

Mining Latent Classes for Few-shot Segmentation

The paper "Mining Latent Classes for Few-shot Segmentation" introduces an innovative framework addressing the problem of few-shot segmentation (FSS), which involves segmenting unseen classes with limited annotated data. Current methods in FSS often misclassify potential novel classes as background categories during training, leading to feature undermining. This research proposes a joint-training framework incorporating a novel mining branch aimed at identifying latent novel classes via transferable sub-clusters, alongside a rectification technique to stabilize class prototypes. The efficacy of this approach is demonstrated through substantial improvements in performance benchmarks, as well as reductions in model size and inference time.

Key Contributions

  1. Joint-training Framework: The method combines conventional episodic training on support-query pairs with an auxiliary mining branch. The mining branch enhances the feature embeddings by exploiting latent novel classes identified through transferable sub-clusters derived from the base classes (a minimal illustrative sketch of this step follows the list). This dual strategy enables effective generalization to unseen classes without further training or fine-tuning, directly addressing the problem of feature undermining.
  2. Rectification of Prototypes: The paper introduces a prototype rectification technique to mitigate prototype bias, a common issue in FSS caused by the limited number of support samples. Both foreground and background prototypes are refined: a global background prototype is maintained as a moving average over all training-set backgrounds, broadening its context, while foreground prototypes are rectified with region-level prototypes computed from additional samples of the same class (see the second sketch after this list).
  3. Empirical Validation: The framework was validated on the PASCAL-5^i and COCO-20^i benchmarks, where it outperformed existing state-of-the-art models by 3.7% mIOU on PASCAL-5^i and 7.0% mIOU on COCO-20^i. Notably, these gains were achieved with 74% fewer parameters and 2.5x faster inference.
  4. Exploitation of Unlabeled Data: Beyond the improvements derived from the labeled data, the proposed methodology can integrate additional unlabeled data, further enhancing performance. This is a significant step towards more realistic few-shot learning settings where class labels are frequently sparse or entirely absent.
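
The mining step in item 1 is described above only at a high level, so here is a minimal sketch of one plausible realization: pixel features of each base class are grouped into sub-clusters with a few K-means iterations, and background pixels that match a sub-cluster centroid closely enough receive a pseudo label. The helper names, the cluster count `k`, and the confidence threshold `tau` are illustrative assumptions made here, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def build_subcluster_centroids(base_feats, base_labels, num_base_classes, k=3):
    """Cluster the pixel features of each base class into k sub-clusters.

    base_feats:  [N, C] pixel-level feature vectors from base-class images
    base_labels: [N]    base-class index of each pixel
    Returns a [num_base_classes * k, C] tensor of sub-cluster centroids.
    """
    centroids = []
    for c in range(num_base_classes):
        feats_c = base_feats[base_labels == c]            # pixels of class c
        idx = torch.randperm(feats_c.size(0))[:k]         # naive k-means init
        centers = feats_c[idx].clone()
        for _ in range(10):                               # a few Lloyd iterations
            assign = torch.cdist(feats_c, centers).argmin(dim=1)
            for j in range(centers.size(0)):
                sel = feats_c[assign == j]
                if sel.numel() > 0:
                    centers[j] = sel.mean(dim=0)
        centroids.append(centers)
    return torch.cat(centroids, dim=0)

def pseudo_label_background(query_feats, bg_mask, centroids, tau=0.8):
    """Assign confidently matched background pixels to the nearest sub-cluster.

    query_feats: [C, H, W] feature map of a training (or unlabeled) image
    bg_mask:     [H, W]    1 where the pixel is currently labeled as background
    Returns a [H, W] map of pseudo sub-cluster labels (-1 = stays background).
    """
    C, H, W = query_feats.shape
    flat = F.normalize(query_feats.permute(1, 2, 0).reshape(-1, C), dim=1)
    sim = flat @ F.normalize(centroids, dim=1).t()        # cosine similarity
    conf, assign = sim.max(dim=1)
    pseudo = torch.full((H * W,), -1, dtype=torch.long)
    keep = (conf > tau) & bg_mask.reshape(-1).bool()
    pseudo[keep] = assign[keep]
    return pseudo.reshape(H, W)
```

In this reading, the resulting pseudo labels supervise the mining branch jointly with the episodic loss, and the same labeling step can be applied to extra unlabeled images as in item 4.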

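The prototype rectification in item 2 can likewise be pictured with a short sketch. It assumes masked average pooling for prototype extraction, an exponential moving average for the global background prototype, and a simple weighted blend between the support prototype and region-level prototypes from extra samples of the same class; the class name `PrototypeRectifier`, the `momentum`, and the blending weight `alpha` are assumptions for illustration, not values taken from the paper.

```python
import torch

def masked_average_pooling(feats, mask):
    """Average a [C, H, W] feature map over pixels where the [H, W] mask is 1."""
    mask = mask.float()
    return (feats * mask).sum(dim=(1, 2)) / (mask.sum() + 1e-6)

class PrototypeRectifier:
    """Keeps a global background prototype as a moving average over episodes
    and blends foreground prototypes with region-level prototypes from
    additional samples. `momentum` and `alpha` are illustrative values."""

    def __init__(self, feat_dim, momentum=0.99, alpha=0.5):
        self.global_bg = torch.zeros(feat_dim)
        self.momentum = momentum
        self.alpha = alpha

    def update_background(self, feats, bg_mask):
        """EMA update of the global background prototype with this episode's background."""
        episode_bg = masked_average_pooling(feats, bg_mask)
        self.global_bg = self.momentum * self.global_bg + (1 - self.momentum) * episode_bg
        return self.global_bg

    def rectify_foreground(self, support_proto, region_protos):
        """Blend the support prototype with region-level prototypes of the same class."""
        extra = torch.stack(region_protos).mean(dim=0)
        return self.alpha * support_proto + (1 - self.alpha) * extra
```

In a full pipeline, `update_background` would be called once per training episode, while `rectify_foreground` would be applied when forming the final class prototypes.
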
Implications

The paper offers both theoretical advances and practical improvements within few-shot segmentation. Mining latent novel classes through pseudo labeling provides an innovative way to leverage unannotated or partially annotated data, which aligns well with the constant influx of unlabeled data in real-world applications. Practically, the reduced model complexity and faster inference can facilitate deployment of segmentation models in resource-constrained environments such as mobile devices and edge computing.

Future Directions

Future work could explore further automation in the selection of sub-clusters, potentially utilizing more sophisticated clustering algorithms or data-driven techniques to adaptively determine the number and characteristics of sub-clusters. Additionally, with advancements in other areas of meta-learning and semi-supervised learning, the potential integration of cross-domain knowledge could further extend the applicability of FSS methods.

In conclusion, this research contributes significantly to few-shot learning by introducing a method that not only improves performance but is also designed with efficiency and scalability in mind, making it well suited to the increasingly demanding applications of computer vision.
