When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications (2005.11700v2)

Published 24 May 2020 in cs.CL

Abstract: Model-Agnostic Meta-Learning (MAML) is a meta-learning method that has been successfully employed in NLP applications, including few-shot text classification and multi-domain low-resource language generation. Many factors, including data quantity, similarity among tasks, and the balance between the general language model and task-specific adaptation, can affect the performance of MAML in NLP, but few works have studied them thoroughly. In this paper, we conduct an empirical study of these factors and, based on the experimental results, conclude when MAML works best.
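For context, MAML alternates a task-specific inner-loop adaptation with an outer-loop meta-update of the shared initialization. The sketch below is a minimal, hypothetical PyTorch illustration of that two-loop structure on toy sinusoid-regression tasks; the `adapt` helper, learning rates, and task sampler are assumptions for the example, not the paper's code or experimental setup.

```python
import torch
import torch.nn as nn

def adapt(model, loss_fn, support_x, support_y, inner_lr=0.01, steps=1):
    # Inner loop: a few SGD steps on the task's support set, kept differentiable
    # (create_graph=True) so the outer meta-update sees second-order terms.
    params = {name: p.clone() for name, p in model.named_parameters()}
    for _ in range(steps):
        preds = torch.func.functional_call(model, params, (support_x,))
        loss = loss_fn(preds, support_y)
        grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
        params = {name: p - inner_lr * g
                  for (name, p), g in zip(params.items(), grads)}
    return params

model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):                      # meta-training iterations
    meta_opt.zero_grad()
    for _ in range(4):                    # tasks per meta-batch (toy sinusoids)
        amp, phase = 1 + 4 * torch.rand(1), 3 * torch.rand(1)
        xs, xq = 10 * torch.rand(10, 1) - 5, 10 * torch.rand(10, 1) - 5
        ys, yq = amp * torch.sin(xs + phase), amp * torch.sin(xq + phase)
        fast_params = adapt(model, loss_fn, xs, ys)           # task adaptation
        query_preds = torch.func.functional_call(model, fast_params, (xq,))
        (loss_fn(query_preds, yq) / 4).backward()             # accumulate meta-gradient
    meta_opt.step()
```

In the NLP settings the paper studies, the inner loop would instead adapt a pretrained language model on a task's support set and the outer update would aggregate query-set losses across tasks; the toy regression here only keeps the example self-contained.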

Authors (5)
  1. Zequn Liu (14 papers)
  2. Ruiyi Zhang (98 papers)
  3. Yiping Song (14 papers)
  4. Wei Ju (46 papers)
  5. Ming Zhang (313 papers)
Citations (8)