Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement (2312.15731v4)
Abstract: Few-Shot Segmentation (FSS) aims to segment novel classes given only a few annotated images. Current meta-learning-based FSS research focuses on designing complex interaction mechanisms between query and support features. However, unlike humans, who can rapidly learn new things from limited samples, existing approaches rely solely on fixed feature matching to tackle new tasks and therefore lack adaptability. In this paper, we propose a novel framework based on the adapter mechanism, namely Adaptive FSS, which can efficiently adapt existing FSS models to novel classes. In detail, we design the Prototype Adaptive Module (PAM), which utilizes the accurate category information provided by the support set to derive class prototypes, enhancing class-specific information in the multi-stage representation. In addition, our approach is compatible with diverse FSS methods with different backbones by simply inserting PAM between the layers of the encoder. Experiments demonstrate that our method effectively improves the performance of FSS models (e.g., MSANet, HDMNet, FPTrans, and DCAMA) and achieves new state-of-the-art (SOTA) results (i.e., 72.4\% and 79.1\% mIoU on PASCAL-5$^i$ 1-shot and 5-shot settings, 52.7\% and 60.0\% mIoU on COCO-20$^i$ 1-shot and 5-shot settings). Our code is available at https://github.com/jingw193/AdaptiveFSS.
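The abstract's core idea, deriving a class prototype from the annotated support set and using it to strengthen class-specific information in encoder features, can be illustrated with a minimal sketch. This is not the authors' PAM: the function names, the use of masked average pooling, and the cosine-weighted residual update are all assumptions chosen for illustration, and features are represented as plain Python lists (one vector per spatial position) rather than tensors.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where mask == 1.

    Assumption: masked average pooling stands in for the paper's
    prototype derivation, which the abstract does not specify.
    """
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no foreground positions")
    channels = len(selected[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(channels)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def enhance_with_prototype(features, prototype, scale=1.0):
    """Add the prototype to each feature, weighted by non-negative similarity.

    Hypothetical adapter-style residual update: positions that already
    resemble the class prototype are pushed further toward it, while
    dissimilar positions are left unchanged.
    """
    enhanced = []
    for f in features:
        weight = scale * max(cosine_similarity(f, prototype), 0.0)
        enhanced.append([x + weight * p for x, p in zip(f, prototype)])
    return enhanced


# Toy support set: four spatial positions with two channels each.
support_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 2.0]]
support_mask = [1, 0, 1, 0]  # positions 0 and 2 belong to the novel class

prototype = masked_average_prototype(support_features, support_mask)
query_features = [[1.0, 0.0], [0.0, 1.0]]
print(enhance_with_prototype(query_features, prototype))
```

In a setup like the one the abstract describes, an update of this kind would be applied between encoder layers at several stages, so that each stage's representation is conditioned on the support-derived prototype; here it is applied once to keep the sketch short.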
Authors: Jing Wang, Jinagyun Li, Chen Chen, Yisi Zhang, Haoran Shen, Tianxiang Zhang