Co-guiding for Multi-intent Spoken Language Understanding (2312.03716v1)

Published 22 Nov 2023 in cs.CL and cs.AI

Abstract: Recent graph-based models for multi-intent SLU have obtained promising results by modeling the guidance from predicted intents to the decoding of slot filling. However, existing methods (1) model only the unidirectional guidance from intent to slot, although the inter-correlations between intent and slot are bidirectional; and (2) adopt homogeneous graphs to model the interactions between slot semantics nodes and intent label nodes, which limits performance. In this paper, we propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving mutual guidance between the two tasks. In the first stage, initial estimated labels are produced for both tasks; in the second stage, these labels are leveraged to model the mutual guidance. Specifically, we propose two heterogeneous graph attention networks operating on two proposed heterogeneous semantics-label graphs, which effectively represent the relations among semantics nodes and label nodes. We further propose Co-guiding-SCL Net, which exploits single-task and dual-task semantics contrastive relations: for the first stage, we propose single-task supervised contrastive learning, and for the second stage, we propose co-guiding supervised contrastive learning, which incorporates the two tasks' mutual guidance into the contrastive learning procedure. Experimental results on multi-intent SLU show that our model outperforms existing models by a large margin, obtaining a relative improvement of 21.3% over the previous best model on the MixATIS dataset in overall accuracy. We also evaluate our model in the zero-shot cross-lingual scenario, where it relatively improves the state-of-the-art model by 33.5% on average in overall accuracy across all 9 languages.
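
The abstract names two reusable technical ingredients: heterogeneous (relation-aware) graph attention over semantics-label graphs, and supervised contrastive learning on the two-stage label estimates. To make the first concrete, below is a minimal PyTorch sketch of a relation-aware graph attention layer in which each edge type (e.g., word-to-word, label-to-word, label-to-label) has its own message and attention parameters. This is a generic sketch in the spirit of Co-guiding Net, not the authors' exact architecture; the class name `HeteroGATLayer` and the edge-triple format are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeteroGATLayer(nn.Module):
    """Relation-aware graph attention (hypothetical sketch): each edge type
    gets its own message and attention parameters, and attention is
    normalized over all of a node's incoming edges regardless of type."""
    def __init__(self, dim, num_relations):
        super().__init__()
        self.msg = nn.ModuleList([nn.Linear(dim, dim, bias=False)
                                  for _ in range(num_relations)])
        self.att = nn.ModuleList([nn.Linear(2 * dim, 1, bias=False)
                                  for _ in range(num_relations)])

    def forward(self, h, edges):
        # h: (N, dim) node states (e.g., token semantics nodes plus
        # intent/slot label nodes); edges: list of (src, dst, rel) triples
        dst = torch.tensor([d for _, d, _ in edges], device=h.device)
        msgs = torch.stack([self.msg[r](h[s]) for s, _, r in edges])
        scores = torch.stack([
            F.leaky_relu(self.att[r](torch.cat([h[d], h[s]], dim=-1)))
            for s, d, r in edges
        ]).squeeze(-1)
        # segment softmax: normalize scores over each destination's in-edges
        alpha = torch.exp(scores - scores.max())
        denom = torch.zeros(h.size(0), device=h.device).index_add_(0, dst, alpha)
        alpha = alpha / denom[dst].clamp_min(1e-9)
        out = torch.zeros_like(h).index_add_(0, dst, alpha.unsqueeze(-1) * msgs)
        return F.relu(out + h)  # residual update of node states
```

Stacking two such layers over different typed graphs (one in which estimated intent labels guide slot decoding, one in which estimated slot labels guide intent decoding) is one way the mutual guidance described above could be realized.

For the contrastive component, the following is a minimal sketch of the generic supervised contrastive objective that single-task SCL builds on: representations sharing a label are pulled together and all others pushed apart. It assumes one integer label per example; the paper's multi-intent setting instead derives positives from (estimated) label sets, which this sketch does not model.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    # features: (N, d) embeddings; labels: (N,) integer class ids
    z = F.normalize(features, dim=1)              # unit-norm embeddings
    sim = (z @ z.t()) / temperature               # (N, N) scaled similarities
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    # positives: other examples carrying the same label
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0                      # anchors with >= 1 positive
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_log_prob[has_pos] / pos_counts[has_pos]).mean()
```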

Authors (2)
  1. Bowen Xing (14 papers)
  2. Ivor W. Tsang (109 papers)
Citations (2)
