Feature Weighting and Boosting for Few-Shot Segmentation (1909.13140v1)

Published 28 Sep 2019 in cs.CV

Abstract: This paper is about few-shot segmentation of foreground objects in images. We train a CNN on small subsets of training images, each mimicking the few-shot setting. In each subset, one image serves as the query and the other(s) as support image(s) with ground-truth segmentation. The CNN first extracts feature maps from the query and support images. Then, a class feature vector is computed as an average of the support's feature maps over the known foreground. Finally, the target object is segmented in the query image by using a cosine similarity between the class feature vector and the query's feature map. We make two contributions by: (1) Improving discriminativeness of features so their activations are high on the foreground and low elsewhere; and (2) Boosting inference with an ensemble of experts guided with the gradient of loss incurred when segmenting the support images in testing. Our evaluations on the PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate that we significantly outperform existing approaches.

Citations (302)

Summary

The paper introduces enhanced feature weighting that optimizes feature responses to improve segmentation accuracy under limited data conditions.
The paper presents a boosted inference method using ensemble learning to iteratively refine features and mitigate overfitting.
Evaluation on PASCAL and COCO datasets demonstrates significant performance gains, underscoring the method's robustness in few-shot scenarios.

An Expert Overview of: Feature Weighting and Boosting for Few-Shot Segmentation

In their paper, Nguyen and Todorovic address few-shot segmentation in computer vision: segmenting a target object class in a query image given only a few annotated examples, known as support images. The difficulty stems from large intra-class variability and the scarcity of annotated instances, so a model must generalize robustly from very limited data.

Methodology

The authors train a convolutional neural network (CNN) on small subsets of the training images, each mimicking the few-shot setting: a subset comprises one query image and one or more support images with ground-truth segmentation masks. The CNN extracts feature maps from both, averages the support features over the known foreground to form a class feature vector, and segments the query using the cosine similarity between that vector and the query's feature map.
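The baseline pipeline (class feature vector from the support foreground, then cosine similarity against the query features) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, the `(C, H, W)` feature layout, and the final thresholding at a hypothetical value of 0.5 are assumptions made here, and the features are taken as already extracted by some CNN.

```python
import numpy as np

def class_feature_vector(support_feats, support_mask):
    """Average the support features over foreground pixels.

    support_feats: (C, H, W) feature map; support_mask: (H, W) binary mask.
    """
    mask = support_mask.astype(support_feats.dtype)
    fg_count = mask.sum()
    if fg_count == 0:
        raise ValueError("support mask has no foreground pixels")
    return (support_feats * mask).sum(axis=(1, 2)) / fg_count

def cosine_similarity_map(class_vec, query_feats, eps=1e-8):
    """Cosine similarity between class_vec (C,) and each query pixel, shape (H, W)."""
    dots = np.einsum("c,chw->hw", class_vec, query_feats)
    norms = np.linalg.norm(class_vec) * np.linalg.norm(query_feats, axis=0)
    return dots / (norms + eps)

def segment_query(support_feats, support_mask, query_feats, threshold=0.5):
    """Predict a binary query mask; the threshold is an illustrative assumption."""
    class_vec = class_feature_vector(support_feats, support_mask)
    return cosine_similarity_map(class_vec, query_feats) > threshold
```

With more than one support image, one simple option would be to average the per-image class vectors before calling `cosine_similarity_map`; that choice is likewise an assumption of this sketch.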

Contribution 1: Enhanced Feature Discriminativeness

The first major contribution is improving feature discriminativeness. The authors argue that a conventionally trained CNN may learn non-discriminative features that activate on several classes or on the background. To mitigate this, they propose a feature relevance estimation step that weights features so that their activations are high on the target class and low elsewhere. This is formulated as an optimization problem over the difference between foreground and background feature activations, and it is solved in closed form.
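One way to picture a closed-form relevance estimate is sketched below. It assumes, for illustration only, a linear objective: choose a unit-norm weight vector `r` that maximizes the gap between the mean foreground activation and the mean background activation. Under that assumption the maximizer is the normalized gap vector itself (by the Cauchy-Schwarz inequality). The exact objective in the paper may differ, and all names here are hypothetical.

```python
import numpy as np

def relevance_vector(support_feats, support_mask, eps=1e-8):
    """Unit-norm relevance r maximizing r . (mean_fg - mean_bg).

    support_feats: (C, H, W); support_mask: (H, W) binary.
    Assumed objective, not necessarily the paper's exact formulation.
    """
    mask = support_mask.astype(support_feats.dtype)
    n_fg = mask.sum()
    n_bg = mask.size - n_fg
    if n_fg == 0 or n_bg == 0:
        raise ValueError("mask needs both foreground and background pixels")
    fg_mean = (support_feats * mask).sum(axis=(1, 2)) / n_fg
    bg_mean = (support_feats * (1.0 - mask)).sum(axis=(1, 2)) / n_bg
    phi = fg_mean - bg_mean              # per-channel foreground/background gap
    return phi / (np.linalg.norm(phi) + eps)

def weighted_cosine_map(class_vec, query_feats, relevance, eps=1e-8):
    """Cosine similarity after scaling every channel by its relevance."""
    c = class_vec * relevance
    q = query_feats * relevance[:, None, None]
    dots = np.einsum("c,chw->hw", c, q)
    norms = np.linalg.norm(c) * np.linalg.norm(q, axis=0)
    return dots / (norms + eps)
```

A channel that responds equally on foreground and background receives a relevance of zero, so it no longer contributes to the similarity between the class vector and a query pixel.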

Contribution 2: Boosted Inference with Ensemble Learning

Their second contribution makes inference more robust through boosting. At test time the support masks are known, so the loss incurred when segmenting the support image(s) can be computed and its gradient used to guide an ensemble of experts, each of which iteratively adjusts the feature representation. Combining the experts' predictions is intended to reduce overfitting to the support image(s) and to improve generalization to the query image, in line with the usual benefits of ensemble methods.
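The idea of gradient-guided experts can be sketched as below. Several details are assumptions made for illustration, not taken from the paper: the loss is a mean squared error between the cosine-similarity map and the support mask, each expert is a class vector obtained by one more gradient step, and experts are combined with weights equal to their IoU on the support image. The names `boosted_segment`, `n_experts`, and `step` are hypothetical.

```python
import numpy as np

def cosine_map(class_vec, feats, eps=1e-8):
    """Cosine similarity between class_vec (C,) and each pixel of feats (C, H, W)."""
    dots = np.einsum("c,chw->hw", class_vec, feats)
    norms = np.linalg.norm(class_vec) * np.linalg.norm(feats, axis=0)
    return dots / (norms + eps)

def support_loss_grad(class_vec, feats, mask, eps=1e-8):
    """Gradient of mean((cosine_map - mask)^2) with respect to class_vec."""
    q = feats.reshape(feats.shape[0], -1)        # (C, N) pixel features
    m = mask.reshape(-1).astype(float)           # (N,) targets
    c_norm = np.linalg.norm(class_vec) + eps
    q_norm = np.linalg.norm(q, axis=0) + eps
    sim = (class_vec @ q) / (c_norm * q_norm)
    # d(sim)/d(class_vec) for every pixel, shape (C, N)
    d_sim = q / (c_norm * q_norm) - np.outer(class_vec, sim) / c_norm**2
    return (d_sim * 2.0 * (sim - m)).mean(axis=1)

def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    return np.logical_and(pred, target).sum() / union if union > 0 else 0.0

def boosted_segment(support_feats, support_mask, query_feats,
                    n_experts=5, step=1.0, threshold=0.5, eps=1e-8):
    """Combine experts whose class vectors follow the support-loss gradient."""
    mask = support_mask.astype(float)
    class_vec = (support_feats * mask).sum(axis=(1, 2)) / mask.sum()
    weighted_sum = np.zeros(query_feats.shape[1:])
    weight_total = 0.0
    for _ in range(n_experts):
        # Score the current expert on the support image, where the mask is known.
        support_pred = cosine_map(class_vec, support_feats) > threshold
        rho = iou(support_pred, support_mask)
        weighted_sum += rho * cosine_map(class_vec, query_feats)
        weight_total += rho
        # The next expert takes one gradient step on the support loss.
        class_vec = class_vec - step * support_loss_grad(
            class_vec, support_feats, support_mask)
    return weighted_sum / max(weight_total, eps) > threshold
```

Weighting by support IoU means that an expert which segments the support image poorly contributes little to the fused query prediction.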

Results and Implications

Evaluation on the PASCAL-$5^i$ and COCO-$20^i$ datasets shows that the method outperforms existing approaches. The gains are attributed to more discriminative features and to the more robust generalization provided by boosted ensemble inference. These findings suggest that the proposed methods advance the state of the art in few-shot segmentation under data-scarce conditions.

The implications of this research extend beyond current few-shot applications and may influence future architectures and strategies for settings with few available samples. The demonstrated efficacy on COCO-$20^i$, a dataset with more complex scenes and lower-quality segmentation masks than PASCAL, further underscores the robustness of the approach.

Future Directions

The work invites future exploration into more sophisticated ensemble and relevance estimation techniques, possibly integrating other meta-learning frameworks and semi-supervised learning paradigms. Additionally, given the efficiency of the closed-form solutions in feature relevance determination, this strategy might be investigated for other learning contexts beyond segmentation.

In summary, Nguyen and Todorovic's contribution to few-shot segmentation lies in combining feature weighting with boosted inference, offering an efficient and effective approach for low-data scenarios in computer vision.
