Can Similarity-Based Domain-Ordering Reduce Catastrophic Forgetting for Intent Recognition? (2402.14155v1)

Published 21 Feb 2024 in cs.CL and cs.AI

Abstract: Task-oriented dialogue systems are expected to handle a constantly expanding set of intents and domains even after they have been deployed to support more and more functionalities. To live up to this expectation, it becomes critical to mitigate the catastrophic forgetting (CF) problem that arises in continual learning (CL) settings for tasks such as intent recognition. While existing dialogue systems research has explored replay-based and regularization-based methods to this end, the effect of domain ordering on the CL performance of intent recognition models remains unexplored. If understood well, domain ordering has the potential to be an orthogonal technique that can be leveraged alongside existing techniques such as experience replay. Our work fills this gap by comparing the impact of three domain-ordering strategies (min-sum path, max-sum path, random) on the CL performance of a generative intent recognition model. Our findings reveal that the min-sum path strategy outperforms the others in reducing catastrophic forgetting when training the 220M-parameter T5-Base model. However, this advantage diminishes with the larger 770M-parameter T5-Large model. These results underscore the potential of domain ordering as a complementary strategy for mitigating catastrophic forgetting in continually learning intent recognition models, particularly in resource-constrained scenarios.
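
The similarity-based ordering idea from the abstract can be sketched in code. The snippet below is a hypothetical illustration, not the authors' implementation: it assumes each domain is summarized by a single embedding vector (for instance, the mean Sentence-BERT embedding of its utterances), that "min-sum path" means the ordering minimizing the summed cosine distance between consecutive domains (i.e., keeping adjacent domains similar; if the paper instead sums similarities, min and max simply swap roles), and that the number of domains is small enough for brute-force search over permutations.

```python
# Hypothetical sketch of similarity-based domain ordering (not the paper's code).
from itertools import permutations

import numpy as np


def path_cost(order, dist):
    """Total distance along consecutive pairs of a candidate domain ordering."""
    return sum(dist[a, b] for a, b in zip(order, order[1:]))


def order_domains(embeddings, strategy="min_sum"):
    """Pick the ordering with the smallest (min-sum) or largest (max-sum)
    summed pairwise cosine distance between consecutive domains.

    Brute force over permutations, so only feasible for a handful of domains.
    """
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-normalize rows
    dist = 1.0 - X @ X.T                               # pairwise cosine distances

    pick = min if strategy == "min_sum" else max
    best = pick(permutations(range(len(X))), key=lambda o: path_cost(o, dist))
    return list(best)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    domain_embeddings = rng.normal(size=(5, 8))        # 5 toy "domains", 8-dim vectors
    print("min-sum order:", order_domains(domain_embeddings, "min_sum"))
    print("max-sum order:", order_domains(domain_embeddings, "max_sum"))
```

Once an ordering is chosen, the intent recognition model would be fine-tuned on the domains sequentially in that order, optionally combined with complementary techniques such as experience replay.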

Authors (4)
  1. Amogh Mannekote (6 papers)
  2. Xiaoyi Tian (11 papers)
  3. Kristy Elizabeth Boyer (7 papers)
  4. Bonnie J. Dorr (20 papers)