
Less is More: A Closer Look at Semantic-based Few-Shot Learning (2401.05010v2)

Published 10 Jan 2024 in cs.CV and cs.AI

Abstract: Few-shot learning aims to learn and distinguish new categories from a very limited number of available images, presenting a significant challenge in deep learning. Recent work has sought to leverage additional textual or linguistic information about these rare categories via a pre-trained LLM, partially alleviating the problem of insufficient supervision signals. However, the full potential of textual information and pre-trained LLMs has so far been underestimated in few-shot learning, resulting in limited performance gains. To address this, we propose a simple but effective framework for few-shot learning tasks, specifically designed to exploit textual information and the LLM. In more detail, we explicitly exploit the zero-shot capability of the pre-trained LLM with a learnable prompt, and we simply add the visual feature to the textual feature for inference, without the intricately designed fusion modules of previous works. Additionally, we apply self-ensemble and distillation to further enhance these components. Extensive experiments across four widely used few-shot datasets demonstrate that our simple framework achieves impressive results. Particularly noteworthy is its performance on the 1-shot learning task, surpassing state-of-the-art methods by an average of 3.0% in classification accuracy. (The source code of the proposed framework will be made publicly available upon acceptance.)
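The core inference step described in the abstract — fusing modalities by direct addition rather than a learned fusion module, then classifying queries against the fused prototypes — can be sketched as follows. This is a minimal illustration, not the authors' released code; the function names, the use of L2 normalization before addition, and cosine-similarity nearest-prototype classification are assumptions consistent with common practice in metric-based few-shot learning.

```python
import numpy as np

def l2_normalize(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def fuse_prototypes(visual_protos: np.ndarray, text_feats: np.ndarray) -> np.ndarray:
    """Fuse per-class visual prototypes with per-class textual features
    by direct addition (no learned fusion module), then re-normalize.

    visual_protos, text_feats: (num_classes, dim) arrays.
    """
    return l2_normalize(l2_normalize(visual_protos) + l2_normalize(text_feats))

def classify(query_feats: np.ndarray, protos: np.ndarray) -> np.ndarray:
    """Assign each query to the class whose fused prototype has the
    highest cosine similarity.

    query_feats: (num_queries, dim); protos: (num_classes, dim).
    """
    sims = l2_normalize(query_feats) @ protos.T
    return sims.argmax(axis=1)
```

In a real pipeline, `visual_protos` would be the mean of support-image embeddings per class and `text_feats` the LLM/text-encoder output for a (learnable) prompt naming each class; both assumptions follow the abstract's description rather than a specified API.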

Authors (5)
  1. Chunpeng Zhou (4 papers)
  2. Haishuai Wang (26 papers)
  3. Xilu Yuan (1 paper)
  4. Zhi Yu (33 papers)
  5. Jiajun Bu (52 papers)
