
Generalized Few-shot Semantic Segmentation (2010.05210v4)

Published 11 Oct 2020 in cs.CV

Abstract: Training semantic segmentation models requires a large amount of finely annotated data, making it hard to quickly adapt to novel classes not satisfying this condition. Few-Shot Segmentation (FS-Seg) tackles this problem with many constraints. In this paper, we introduce a new benchmark, called Generalized Few-Shot Semantic Segmentation (GFS-Seg), to analyze the generalization ability of simultaneously segmenting the novel categories with very few examples and the base categories with sufficient examples. It is the first study showing that previous representative state-of-the-art FS-Seg methods fall short in GFS-Seg and the performance discrepancy mainly comes from the constrained setting of FS-Seg. To make GFS-Seg tractable, we set up a GFS-Seg baseline that achieves decent performance without structural change on the original model. Then, since context is essential for semantic segmentation, we propose the Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image. Both two contributions are experimentally shown to have substantial practical merit. Extensive experiments on Pascal-VOC and COCO manifest the effectiveness of CAPL, and CAPL generalizes well to FS-Seg by achieving competitive performance. Code is available at https://github.com/dvlab-research/GFS-Seg.

Citations (72)

Summary

  • The paper introduces GFS-Seg, a new benchmark that evaluates simultaneous segmentation of both base and novel categories.
  • It presents the Context-Aware Prototype Learning (CAPL) method that dynamically adjusts classifier weights using context from support and query images.
  • Experimental results on Pascal-VOC and COCO demonstrate significant improvements in mIoU over conventional few-shot segmentation approaches.

Generalized Few-shot Semantic Segmentation

The paper "Generalized Few-shot Semantic Segmentation" introduces a new benchmark, Generalized Few-Shot Semantic Segmentation (GFS-Seg). The benchmark addresses a limitation of Few-Shot Segmentation (FS-Seg) methods, which segment only novel categories and do so under constrained settings. GFS-Seg instead requires a model to segment novel categories (with very few examples) and base categories (with sufficient examples) at the same time. The authors show that previous FS-Seg methods fall short in this setting and propose Context-Aware Prototype Learning (CAPL) to address it.

Benchmark Introduction

GFS-Seg extends the typical FS-Seg framework by enabling the segmentation of both novel and base categories. FS-Seg frameworks have traditionally required support samples to contain target classes present in the query samples, limiting their practicality. The proposed GFS-Seg benchmark eliminates this constraint, allowing simultaneous evaluation of base and novel classes during the inference stage without needing support samples to contain identical target classes.
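The generalized setting can be pictured as one classifier that covers every class at once. The following is a minimal sketch, assuming a prototype-based classifier with cosine similarity; the function names, array shapes, and toy features are illustrative assumptions, not the authors' code. Novel prototypes are registered once from support samples, stacked with the base prototypes, and every query pixel is then labeled without any query-specific support.

```python
import numpy as np

def masked_average_pool(features, mask):
    """Average the feature vectors of the pixels where mask is True.

    features: (H, W, C) float array; mask: (H, W) boolean array.
    """
    selected = features[mask]                      # (N, C)
    if selected.shape[0] == 0:
        raise ValueError("mask selects no pixels")
    return selected.mean(axis=0)

def register_novel_prototypes(support_sets):
    """One prototype per novel class, averaged over its (features, mask) shots."""
    return np.stack([
        np.mean([masked_average_pool(f, m) for f, m in shots], axis=0)
        for shots in support_sets
    ])

def cosine_classify(query_features, prototypes, eps=1e-8):
    """Label each query pixel with the index of its most similar prototype."""
    q = query_features / (np.linalg.norm(query_features, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    return np.argmax(q @ p.T, axis=-1)             # (H, W)

# Toy data (hypothetical): two base classes and one novel class, C = 3.
base_prototypes = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0]])
support_features = np.array([[[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
                             [[1.0, 0.0, 0.0], [0.0, 0.0, 3.0]]])
support_mask = np.array([[True, False],
                         [False, True]])
novel_prototypes = register_novel_prototypes([[(support_features, support_mask)]])

# A single classifier over base and novel classes together.
all_prototypes = np.concatenate([base_prototypes, novel_prototypes], axis=0)
query_features = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                           [[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]]])
predictions = cosine_classify(query_features, all_prototypes)
```

In this toy case the query contains both base classes and the novel class, and each pixel is assigned to one of the three indices; a conventional FS-Seg model would only separate the support's target class from the background.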

Methodology

The paper first sets up a baseline for the GFS-Seg task that achieves reasonable performance without structural changes to the original model. CAPL then improves substantially on this baseline by drawing contextual information from both support and query samples into the classifier. The contributions of CAPL include:

  1. Co-occurrence Mining: Uses support samples to incorporate prior knowledge about the co-occurrence of base categories, enriching the prototypes used during the inference phase.
  2. Dynamic Contextual Information Enrichment: Adapts to various contexts in query images by adjusting classifier weights conditioned on each query sample's content.
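The two ideas above can be illustrated with a simplified sketch. It assumes that "enriching" a prototype means blending it with the average feature of the pixels currently attributed to that class, and it uses a fixed blend weight `gamma`; both are assumptions made for illustration, and the paper's actual update rule and weighting may differ. The same helper serves both steps: for co-occurrence mining the scores would come from base-class annotations in the support images, and for query-conditioned enrichment they would come from a temporary prediction on the query image.

```python
import numpy as np

def soft_class_features(features, scores):
    """Per-class weighted average of pixel features.

    features: (H, W, C); scores: (H, W, N) non-negative weights per pixel and class.
    Returns (N, C) context features and the (N,) total weight of each class.
    """
    flat_f = features.reshape(-1, features.shape[-1])    # (HW, C)
    flat_s = scores.reshape(-1, scores.shape[-1])        # (HW, N)
    mass = flat_s.sum(axis=0)                            # (N,)
    context = (flat_s.T @ flat_f) / np.maximum(mass, 1e-8)[:, None]
    return context, mass

def enrich_prototypes(prototypes, context, mass, gamma=0.5):
    """Blend prototypes with context; classes with no evidence stay unchanged.

    gamma is a hypothetical fixed weight, used here only for illustration.
    """
    blended = gamma * prototypes + (1.0 - gamma) * context
    return np.where(mass[:, None] > 0, blended, prototypes)

# Toy example (hypothetical values): two classes, C = 2, a 1x2 image.
prototypes = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
features = np.array([[[1.0, 1.0], [3.0, 1.0]]])
# One-hot scores: both pixels are attributed to class 0, none to class 1.
scores = np.array([[[1.0, 0.0], [1.0, 0.0]]])

context, mass = soft_class_features(features, scores)
enriched = enrich_prototypes(prototypes, context, mass, gamma=0.5)
```

Here the class 0 prototype moves toward the features observed for that class in the image, while class 1, which has no pixels attributed to it, keeps its original prototype.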

Experimental Results

Experiments on Pascal-VOC and COCO demonstrate the effectiveness of CAPL. It yields substantial improvements over the GFS-Seg baseline and also generalizes well to FS-Seg, where it achieves competitive performance. In the generalized setting, CAPL improves mean Intersection over Union (mIoU) on both base and novel classes relative to standard FS-Seg methods.
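For reference, mIoU averages the per-class ratio of intersection to union between predicted and ground-truth pixels. The sketch below computes it from a confusion matrix and then reports it separately for base and novel classes. The class indices and the split into base and novel groups are made up for the example, and the paper's exact averaging convention (for instance, how background is handled) is not specified here.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """IoU for each class; NaN for classes absent from both pred and target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[i, j] counts pixels with true class i predicted as class j
    confusion = np.bincount(target * num_classes + pred,
                            minlength=num_classes ** 2
                            ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    iou = np.full(num_classes, np.nan)
    valid = union > 0
    iou[valid] = intersection[valid] / union[valid]
    return iou

# Toy labels (hypothetical): classes 0 and 1 are "base", class 2 is "novel".
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
iou = per_class_iou(pred, target, num_classes=3)

miou_total = np.nanmean(iou)          # over all classes
miou_base = np.nanmean(iou[[0, 1]])   # over base classes only
miou_novel = np.nanmean(iou[[2]])     # over novel classes only
```

With these labels the per-class IoUs are 1/3, 2/3, and 1/2, so the total, base, and novel mIoU all equal 0.5.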

Implications and Future Work

This paper shows that considering the integrated context within query images can significantly enhance the ability of segmentation models to generalize across both seen and unseen categories. The GFS-Seg benchmark opens pathways for future research aimed at optimizing semantic segmentation models for broader real-world applications where prior categorical knowledge is limited.

More broadly, the approach underscores the value of contextual learning and dynamic model adaptation, which may carry over to related tasks such as object detection and recognition.

The paper provides insight into the shortcomings of current few-shot methodologies and suggests innovative solutions that may set the stage for improved machine learning models in scenarios where data annotation is sparse or limited.