Constrained Few-Shot Learning: Human-Like Low Sample Complexity Learning and Non-Episodic Text Classification (2208.08089v2)

Published 17 Aug 2022 in cs.LG and cs.CL

Abstract: Few-shot learning (FSL) is an emergent learning paradigm that attempts to reason with low sample complexity, mimicking the way humans learn, generalise and extrapolate from only a few seen examples. While FSL attempts to mimic these human characteristics, the task as conventionally formulated, using meta-learning with episodic training, does not in actuality align with how humans acquire and reason with knowledge. Although FSL with episodic training requires only $K$ instances of each test class, it still requires a large number of labelled training instances from disjoint classes. In this paper, we introduce the novel task of constrained few-shot learning (CFSL), a special case of FSL in which $M$, the number of instances of each training class, is constrained such that $M \leq K$, thus applying a similar restriction during both FSL training and testing. We propose a method for CFSL leveraging Cat2Vec with a novel categorical contrastive loss inspired by cognitive theories such as fuzzy trace theory and prototype theory.
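
The abstract pins CFSL down through a single formal condition: the cap $M \leq K$ on the number of labelled instances available per training class. As a concrete illustration, the sketch below samples one training episode under that cap. It is a minimal sketch under stated assumptions: the function name, the `data_by_class` structure, and the one-query-item-per-class split are hypothetical choices made here, since the abstract specifies neither the paper's episode construction nor its categorical contrastive loss.

```python
import random

def make_cfsl_episode(data_by_class, n_way, m_limit, k_shot):
    """Sample one N-way episode under the CFSL constraint M <= K.

    data_by_class: dict mapping a class label to its instances.
    Hypothetical sketch: the abstract only fixes the cap of at most
    m_limit (= M) labelled instances per training class, so support
    and query items must be drawn from that same per-class budget.
    """
    assert m_limit <= k_shot, "CFSL is the special case M <= K"
    classes = random.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        pool = data_by_class[label]
        picked = random.sample(pool, min(m_limit, len(pool)))
        # Hold one instance out per class as a query item; the rest
        # (at most M - 1 instances) form this class's support set.
        query.append((picked[-1], label))
        support.extend((x, label) for x in picked[:-1])
    return support, query

# Usage: three text classes with two instances each, so M = 2 <= K = 5.
data = {"refund": ["txt a", "txt b"],
        "billing": ["txt c", "txt d"],
        "shipping": ["txt e", "txt f"]}
support, query = make_cfsl_episode(data, n_way=2, m_limit=2, k_shot=5)
```

Note how tight this regime is: with $M = 2$ each class contributes one support and one query item, and with $M = 1$ a class cannot populate both sets at once. This is exactly the low-sample setting that standard episodic FSL sidesteps by drawing on many labelled training instances from disjoint classes.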
