
Unsupervised Meta-Learning via In-Context Learning (2405.16124v2)

Published 25 May 2024 in cs.LG

Abstract: Unsupervised meta-learning aims to learn feature representations from unsupervised datasets that can transfer to downstream tasks with limited labeled data. In this paper, we propose a novel approach to unsupervised meta-learning that leverages the generalization abilities of in-context learning observed in transformer architectures. Our method reframes meta-learning as a sequence modeling problem, enabling the transformer encoder to learn task context from support images and utilize it to predict query images. At the core of our approach lies the creation of diverse tasks generated using a combination of data augmentations and a mixing strategy that challenges the model during training while fostering generalization to unseen tasks at test time. Experimental results on benchmark datasets showcase the superiority of our approach over existing unsupervised meta-learning baselines, establishing it as the new state-of-the-art in the field. Remarkably, our method achieves competitive results with supervised and self-supervised approaches, underscoring the efficacy of the model in leveraging generalization over memorization.
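To make the task-construction idea concrete, below is a minimal sketch of how diverse few-shot tasks might be built from unlabeled images using augmentations plus a mixup-style mixing strategy, as the abstract describes. This is an illustrative reconstruction, not the authors' code: the function `make_episode`, the parameters `mix_prob` and `alpha`, and the specific transforms are all assumptions.

```python
import random

import torch
import torchvision.transforms as T

# Hypothetical augmentation pipeline; the paper's exact transforms may differ.
augment = T.Compose([
    T.RandomResizedCrop(84),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4),
    T.ToTensor(),
])

def make_episode(images, n_way=5, k_shot=1, n_query=3, mix_prob=0.5, alpha=0.5):
    """Build one N-way episode from a pool of unlabeled PIL images.

    Each sampled image defines a pseudo-class whose support and query
    examples are augmented views of it. With probability `mix_prob`, the
    views are convexly mixed with a view of another pool image, a
    mixup-style strategy that makes tasks harder during training.
    """
    base = random.sample(images, n_way)
    support, query, y_s, y_q = [], [], [], []
    for label, img in enumerate(base):
        views = [augment(img) for _ in range(k_shot + n_query)]
        if random.random() < mix_prob:
            other = augment(random.choice(images))
            lam = random.betavariate(alpha, alpha)
            views = [lam * v + (1.0 - lam) * other for v in views]
        support += views[:k_shot]
        y_s += [label] * k_shot
        query += views[k_shot:]
        y_q += [label] * n_query
    return (torch.stack(support), torch.tensor(y_s),
            torch.stack(query), torch.tensor(y_q))
```

Because every pseudo-class is derived from a single image, no human labels are needed, and the mixing step prevents the model from solving episodes by trivial instance matching.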

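The abstract's second key idea, reframing meta-learning as sequence modeling, can likewise be sketched in a few lines: support images are encoded as tokens carrying both an image embedding and a label embedding, query images get a learned placeholder in place of a label, and a transformer encoder predicts each query's class in context. The class name `InContextFewShot`, the `query_tok` placeholder, and the architecture hyperparameters below are assumptions for illustration, not the paper's reported design.

```python
import torch
import torch.nn as nn

class InContextFewShot(nn.Module):
    """Treats a few-shot episode as one sequence for a transformer encoder."""

    def __init__(self, backbone, dim=256, n_way=5, depth=4, heads=8):
        super().__init__()
        self.backbone = backbone                   # images -> (B, dim) features
        self.label_emb = nn.Embedding(n_way, dim)  # label token added to support
        self.query_tok = nn.Parameter(torch.zeros(dim))  # "unknown label" marker
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, n_way)

    def forward(self, support, support_y, query):
        # support: (S, C, H, W), support_y: (S,), query: (Q, C, H, W)
        s = self.backbone(support) + self.label_emb(support_y)  # labeled tokens
        q = self.backbone(query) + self.query_tok               # unlabeled tokens
        seq = torch.cat([s, q], dim=0).unsqueeze(0)  # (1, S+Q, dim) sequence
        out = self.encoder(seq).squeeze(0)
        return self.head(out[s.size(0):])            # logits at query positions
```

Training would minimize cross-entropy between these query logits and the episode's pseudo-labels. Because labels are reassigned per episode, the encoder cannot memorize fixed classes and must infer label semantics from the support tokens in context, which matches the abstract's emphasis on generalization over memorization.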
