OmniDialog: An Omnipotent Pre-training Model for Task-Oriented Dialogue System (2312.16864v1)

Published 28 Dec 2023 in cs.CL

Abstract: Pre-trained conversation models (PCMs) have demonstrated remarkable results in task-oriented dialogue (TOD) systems. Many PCMs focus predominantly on dialogue management tasks like dialogue state tracking, dialogue generation tasks like response generation, or both. However, existing PCMs seldom consider dialogue comprehension tasks, such as dialogue question answering and summarization. These tasks allow PCMs to glean dialogue context from various angles. This observation naturally raises the question: Can the performance of downstream dialogue tasks be enhanced if a PCM is pre-trained on dialogue management, generation, and comprehension tasks? To investigate this, we propose an Omnipotent Dialogue pre-training model (OmniDialog). It unifies these three dialogue task families into a monolithic framework via multi-task learning, fostering inter-task communication. The pre-training corpus of OmniDialog spans $\mathbf{7}$ dialogue-focused tasks, drawing from $\mathbf{15}$ datasets and encompassing over $\mathbf{3.2}$ million dialogue utterances. To our knowledge, OmniDialog is a pioneering PCM pre-trained across the dialogue management, generation, and comprehension domains. We evaluated its performance across four tasks: dialogue summarization, end-to-end dialogue modeling, dialogue state tracking, and intent classification. The results underscore its efficacy in domain transfer learning, low-resource, and full-dataset scenarios. Furthermore, to glean a nuanced understanding of OmniDialog's strengths and potential pitfalls, we designed a fine-grained analysis framework for dialogue-centric tasks. Experimental results show that OmniDialog performs particularly well on hard samples, such as long dialogues and lengthy responses.
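The unification described in the abstract can be pictured as casting every task into one text-to-text format with task-specific prefixes and mixing them in training batches, in the spirit of T5-style multi-task learning. The sketch below is illustrative only: the prefix strings, special tokens, and field names are assumptions for exposition, not the authors' actual preprocessing pipeline.

```python
# Minimal sketch (not the authors' code): casting dialogue management,
# generation, and comprehension examples into one text-to-text format so a
# single seq2seq model can be pre-trained on all of them via multi-task
# learning. Prefixes and the "[turn]" separator are hypothetical choices.

from dataclasses import dataclass
from typing import List
import random

@dataclass
class Seq2SeqExample:
    source: str  # model input
    target: str  # expected output

def format_state_tracking(history: List[str], state: str) -> Seq2SeqExample:
    # Dialogue management: predict the belief state from the dialogue history.
    return Seq2SeqExample("track state: " + " [turn] ".join(history), state)

def format_response_generation(history: List[str], response: str) -> Seq2SeqExample:
    # Dialogue generation: produce the next system response.
    return Seq2SeqExample("generate response: " + " [turn] ".join(history), response)

def format_summarization(history: List[str], summary: str) -> Seq2SeqExample:
    # Dialogue comprehension: summarize the whole conversation.
    return Seq2SeqExample("summarize dialogue: " + " [turn] ".join(history), summary)

def build_multitask_batch(pool: List[Seq2SeqExample], batch_size: int) -> List[Seq2SeqExample]:
    # Multi-task learning here simply means drawing a mixed batch across tasks,
    # so gradients from all three task families update the same model.
    return random.sample(pool, k=min(batch_size, len(pool)))

if __name__ == "__main__":
    history = [
        "user: I need a cheap hotel in the centre.",
        "system: Alexander B&B is a cheap hotel in the centre.",
    ]
    pool = [
        format_state_tracking(history, "hotel-price=cheap; hotel-area=centre"),
        format_response_generation(history[:1], "Alexander B&B is a cheap hotel in the centre."),
        format_summarization(history, "The user books a cheap hotel in the city centre."),
    ]
    for ex in build_multitask_batch(pool, batch_size=3):
        print(ex.source, "->", ex.target)
```

In this framing, "inter-task communication" falls out of parameter sharing: one encoder-decoder sees management, generation, and comprehension examples side by side, so representations learned for one task are available to the others.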

Authors (3)
  1. Mingtao Yang (4 papers)
  2. See-Kiong Ng (103 papers)
  3. Jinlan Fu (36 papers)