
Exploring intra-task relations to improve meta-learning algorithms (2312.16612v1)

Published 27 Dec 2023 in cs.LG

Abstract: Meta-learning has emerged as an effective methodology for modeling several real-world tasks and problems because of its effectiveness in the low-data regime. In many scenarios, ranging from the classification of rare diseases to language modeling of uncommon languages, large datasets are rarely available. Similarly, in broader scenarios such as self-driving, an autonomous vehicle must be trained to handle every situation well, which requires training the ML model on a variety of tasks with good-quality data. Often, however, the data distribution across tasks is skewed, i.e., the data follows a long-tail distribution. As a result, the model performs well on some tasks and poorly on others, which leads to robustness issues. Meta-learning has recently emerged as a learning paradigm that can effectively learn from one task and generalize that learning to unseen tasks. In this study, we aim to exploit external knowledge of task relations to improve training stability via effective mini-batching of tasks. We hypothesize that selecting a diverse set of tasks in a mini-batch will lead to a better estimate of the full gradient and hence to a reduction of noise in training.
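The abstract does not say how a diverse mini-batch of tasks is chosen, so the following is only an illustrative sketch and not the authors' procedure. It assumes that the external task relations are available as a symmetric pairwise distance matrix (for example, derived from a class hierarchy), and it uses greedy farthest-point selection as a stand-in diversity heuristic. The function name `select_diverse_batch` and its parameters are hypothetical.

```python
import random


def select_diverse_batch(distance, batch_size, seed=0):
    """Pick task indices that are mutually far apart.

    Assumption: distance[i][j] is a symmetric, non-negative distance
    between tasks i and j, obtained from external task relations.
    Greedy farthest-point selection is an illustrative heuristic,
    not the method described in the paper.
    """
    n = len(distance)
    if not 0 < batch_size <= n:
        raise ValueError("batch_size must be between 1 and the number of tasks")

    rng = random.Random(seed)
    chosen = [rng.randrange(n)]  # start from a random task

    # min_dist[i] is the distance from task i to its nearest chosen task.
    min_dist = list(distance[chosen[0]])

    while len(chosen) < batch_size:
        candidates = [i for i in range(n) if i not in chosen]
        # Add the task that is farthest from everything chosen so far.
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(a, b) for a, b in zip(min_dist, distance[nxt])]

    return chosen
```

With four tasks that form two tight clusters, a batch of two drawn this way contains one task from each cluster, whereas uniform sampling would pick two tasks from the same cluster about a third of the time. Under the abstract's hypothesis, the more diverse batch would give a less noisy estimate of the full gradient.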
