Constrained Few-Shot Learning: Human-Like Low Sample Complexity Learning and Non-Episodic Text Classification (2208.08089v2)
Abstract: Few-shot learning (FSL) is an emerging learning paradigm that aims to reason with low sample complexity, mimicking the way humans learn, generalise, and extrapolate from only a few seen examples. However, as conventionally formulated using meta-learning with episodic training, FSL does not actually align with how humans acquire and reason with knowledge: while episodic training requires only $K$ instances of each test class, it still requires a large number of labelled training instances from disjoint classes. In this paper, we introduce the novel task of constrained few-shot learning (CFSL), a special case of FSL in which $M$, the number of instances of each training class, is constrained such that $M \leq K$, thereby applying a similar restriction during both training and testing. We propose a CFSL method that leverages Cat2Vec with a novel categorical contrastive loss inspired by cognitive theories such as fuzzy trace theory and prototype theory.
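To make the CFSL data constraint concrete, below is a minimal Python sketch of capping every training class at $M \leq K$ labelled instances before any episodic sampling. The function name `cap_training_pool` and the surrounding details are illustrative assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch of the CFSL data constraint (M <= K); the function name
# and episode details are illustrative, not the paper's released code.
import random
from collections import defaultdict

def cap_training_pool(labelled_data, m_per_class, k_shot, seed=0):
    """Keep at most M labelled instances per training class, enforcing the
    CFSL constraint M <= K that standard episodic FSL training lacks."""
    assert m_per_class <= k_shot, "CFSL requires M <= K"
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in labelled_data:
        by_class[y].append(x)
    capped = []
    for y, xs in by_class.items():
        rng.shuffle(xs)
        capped.extend((x, y) for x in xs[:m_per_class])
    return capped

# Example: a 5-shot task (K = 5) whose training classes each expose only
# M = 3 labelled instances, unlike the abundantly labelled base classes
# assumed by conventional episodic meta-learning.
data = [(f"doc_{c}_{i}", c) for c in range(10) for i in range(100)]
pool = cap_training_pool(data, m_per_class=3, k_shot=5)
assert all(sum(1 for _, y in pool if y == c) == 3 for c in range(10))
```

Note that under this cap a training class cannot supply a full $K$-shot support set plus disjoint query instances, which is precisely the restriction that conventional episodic training sidesteps by drawing on a large pool of labelled instances from disjoint classes.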