Part-aware Prototype Network for Few-shot Semantic Segmentation (2007.06309v3)

Published 13 Jul 2020 in cs.CV

Abstract: Few-shot semantic segmentation aims to learn to segment new object classes with only a few annotated examples, which has a wide range of real-world applications. Most existing methods either focus on the restrictive setting of one-way few-shot segmentation or suffer from incomplete coverage of object regions. In this paper, we propose a novel few-shot semantic segmentation framework based on the prototype representation. Our key idea is to decompose the holistic class representation into a set of part-aware prototypes, capable of capturing diverse and fine-grained object features. In addition, we propose to leverage unlabeled data to enrich our part-aware prototypes, resulting in better modeling of intra-class variations of semantic objects. We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes based on labeled and unlabeled images. Extensive experimental evaluations on two benchmarks show that our method outperforms the prior art with a sizable margin.

Authors

Yongfei Liu
Xiangyi Zhang
Songyang Zhang
Xuming He

Summary

Insights into "Part-aware Prototype Network for Few-shot Semantic Segmentation"

The paper "Part-aware Prototype Network for Few-shot Semantic Segmentation" addresses few-shot semantic segmentation, where the objective is to segment novel object classes from only a few annotated examples. It targets two limitations of earlier methods: many are restricted to the one-way setting, in which a single novel class is segmented at a time, and others cover object regions incompletely.

Core Contributions

The authors propose a few-shot segmentation framework built on prototype representations. The key idea is to decompose the holistic class representation into a set of "part-aware prototypes," each capturing a distinct, fine-grained portion of an object's features. Because several prototypes describe one class, the model can cover object regions whose appearance varies across the object, which a single averaged prototype tends to miss. The framework also uses unlabeled data to enrich these prototypes, so that intra-class variation is modeled from more images than the small support set provides.

Methodological Advancements

The method is built around a graph neural network that generates and refines the part-aware prototypes. The network processes both labeled and unlabeled images to produce a representation of each semantic class. The prototype generation network consists of two modules:

Part Generation Module: Initially clusters features from labeled support images to form part-aware prototypes, which are subsequently augmented with a global semantic context.
Part Refinement Module: Enhances prototypes through additional context derived from unlabeled data, employing a graph attention mechanism for feature augmentation and pruning.
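The two modules can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the clustering step, a fixed-weight blend stands in for the global-context augmentation, and a single softmax attention pass over unlabeled features stands in for the graph attention mechanism. All function names, the flat list-of-vectors feature layout, and the `weight` and `mix` parameters are assumptions made for illustration.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def part_prototypes(features, mask, num_parts, iters=10):
    """Cluster foreground support features into part prototypes (plain k-means).

    features: one feature vector per spatial location (hypothetical layout)
    mask:     0/1 label per location, 1 marks the support object's foreground
    Returns (part prototypes, holistic foreground mean).
    """
    fg = [f for f, m in zip(features, mask) if m == 1]
    if len(fg) < num_parts:
        raise ValueError("need at least num_parts foreground features")
    # Deterministic initialisation: evenly spaced foreground features.
    step = len(fg) / num_parts
    centers = [list(fg[int(i * step)]) for i in range(num_parts)]
    for _ in range(iters):
        groups = [[] for _ in range(num_parts)]
        for f in fg:
            nearest = min(range(num_parts), key=lambda k: sq_dist(f, centers[k]))
            groups[nearest].append(f)
        # Keep the previous centre if a cluster ends up empty.
        centers = [mean_vector(g) if g else c for g, c in zip(groups, centers)]
    return centers, mean_vector(fg)

def add_global_context(prototypes, global_proto, weight=0.5):
    """Blend each part prototype with the holistic class mean (assumed fusion rule)."""
    return [[(1 - weight) * p + weight * g for p, g in zip(proto, global_proto)]
            for proto in prototypes]

def refine_with_unlabeled(prototypes, unlabeled_features, mix=0.5):
    """Augment each prototype with an attention-weighted mean of unlabeled features.

    Attention weights are a softmax over dot-product similarities; this is a
    simplified stand-in for the graph attention described in the summary.
    """
    refined = []
    for proto in prototypes:
        scores = [sum(p * u for p, u in zip(proto, feat)) for feat in unlabeled_features]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # subtract max for stability
        total = sum(exps)
        weights = [e / total for e in exps]
        context = [sum(w * feat[d] for w, feat in zip(weights, unlabeled_features))
                   for d in range(len(proto))]
        refined.append([(1 - mix) * p + mix * c for p, c in zip(proto, context)])
    return refined
```

In this sketch, each prototype attends only to unlabeled features and not to other prototypes, and nothing is pruned; a fuller version would add both, as the summary indicates the paper's refinement module does.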

Segmentation is performed by a part-aware mask generation network, which uses a straightforward matching strategy to predict the mask of each query image from the prototypes. The resulting segmentation therefore draws on both labeled and unlabeled data.
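One common form of prototype matching, shown here as a hedged sketch and not as the paper's exact procedure, scores every query location against every part prototype by cosine similarity and assigns the class whose best-matching part scores highest. The function names and the dictionary layout of `class_prototypes` are illustrative assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_labels(query_features, class_prototypes):
    """Label each query location with the class of its best-matching part.

    query_features:   one feature vector per query location (hypothetical layout)
    class_prototypes: dict mapping class label -> list of part prototypes
    """
    labels = []
    for feat in query_features:
        best_label, best_score = None, -math.inf
        for label, parts in class_prototypes.items():
            # A class scores as well as its most similar part prototype.
            score = max(cosine(feat, part) for part in parts)
            if score > best_score:
                best_label, best_score = label, score
        labels.append(best_label)
    return labels
```

Taking the maximum over parts means a location only needs to resemble one part of an object to be assigned to that class, which is the intuition behind the improved coverage of object regions. Adding more entries to `class_prototypes` extends the same routine to the multi-way setting.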

Empirical Evaluation

The framework is evaluated across two benchmark datasets, PASCAL-$5^i$ and COCO-$20^i$, demonstrating superior performance over existing methods, particularly in the challenging multi-way few-shot setting. Quantitative results indicate a significant margin of improvement in mean-IoU scores, underscoring the efficacy of the proposed part-aware prototypes. Ablation studies further validate the utility of different components of the model, such as the importance of unlabeled data in refining prototype representations.
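For reference, mean-IoU averages the per-class intersection-over-union between predicted and ground-truth labels. The sketch below computes it over flat label lists; it is a simplified illustration, and the benchmarks' actual protocols (for example, how intersections and unions are accumulated across images and folds) may differ. The function name and arguments are assumptions.

```python
def mean_iou(pred, target, classes):
    """Average intersection-over-union across the given foreground classes.

    pred, target: equal-length sequences of per-pixel class labels
    classes:      labels to evaluate (background is typically excluded)
    Classes absent from both pred and target are skipped.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in classes:
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("none of the classes appear in pred or target")
    return sum(ious) / len(ious)
```

As a worked case, with `pred = [1, 1, 0, 2, 2, 0]` and `target = [1, 0, 0, 2, 1, 0]`, class 1 has IoU 1/3 and class 2 has IoU 1/2, so the mean-IoU over the two classes is 5/12.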

Implications and Future Directions

The approach is relevant to applications that must adapt to new classes with minimal labeled data, such as autonomous navigation or medical imaging. The paper introduces semi-supervised learning into few-shot semantic segmentation, which opens a direction for further work. Future work could focus on scaling the method to more complex object classes and on extending the graph neural network framework to handle larger pools of unlabeled data, potentially improving performance in real-time applications.

In summary, the introduction of part-aware prototypes coupled with unlabeled data utilization marks a progression in few-shot learning paradigms, fostering enhanced segmentation capabilities with limited data resources.
