A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents (2410.22476v1)

Published 29 Oct 2024 in cs.CL and cs.IR

Abstract: In task-oriented dialogue systems, intent detection is crucial for interpreting user queries and providing appropriate responses. Existing research primarily addresses simple queries with a single intent, lacking effective systems for handling complex queries with multiple intents and extracting different intent spans. Additionally, there is a notable absence of multilingual, multi-intent datasets. This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual multi-label intent dataset. We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets. We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets. Comprehensive analysis demonstrates the superiority of our pointer network-based system over baseline approaches in terms of accuracy and F1-score across various datasets.

Summary

  • The paper introduces MLMCID, a novel Pointer Network that jointly extracts and detects multi-label and multi-class intents from complex user queries.
  • It utilizes an encoder-decoder framework with models like BERT and RoBERTa, achieving up to 89% accuracy in coarse intent detection.
  • The study extends existing datasets to multilingual settings, enhancing dialogue systems with more precise intent identification.

A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

The paper investigates a novel approach to multi-label, multi-class intent detection in task-oriented dialogue systems, focusing on the complexity of detecting and extracting multiple intents from a single user query. The authors note that existing research predominantly tackles simple queries with a single intent, motivating their work on systems that can handle complex queries involving multiple intents and extract the corresponding intent spans.

Central to this research is the introduction of a Pointer Network-based architecture termed MLMCID (Multi-Label Multi-Class Intent Detection) designed for efficient span extraction and detection of multiple intents. The paper also presents a newly curated dataset (MLMCID-dataset) that merges existing benchmark datasets and extends them to multilingual settings, particularly in English, Spanish, and Thai, to support the formation of both coarse and fine-grained intent labels.
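
The sextuplet output described above can be illustrated with a small record type. Note that the field names, example query, and labels below are illustrative assumptions, not the paper's actual schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of a sextuplet prediction: two intent spans,
# each paired with a coarse and a fine-grained label.
# Field names and label vocabulary are assumptions for illustration.
@dataclass(frozen=True)
class IntentSextuplet:
    span_1: str
    coarse_label_1: str
    fine_label_1: str
    span_2: str
    coarse_label_2: str
    fine_label_2: str

query = "Book a table for two and play some jazz"
pred = IntentSextuplet(
    span_1="Book a table for two",
    coarse_label_1="reservation",
    fine_label_1="restaurant_booking",
    span_2="play some jazz",
    coarse_label_2="media",
    fine_label_2="play_music",
)
```

A frozen dataclass keeps each prediction immutable and hashable, which is convenient when deduplicating model outputs during evaluation.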

The methodology integrates an encoder-decoder framework with Pointer Networks that identify precise intent spans and their corresponding intent labels within sentences, coupled with a feed-forward network for intent detection. For the encoder, models such as BERT and RoBERTa are employed for English data, while multilingual counterparts such as XLM-R handle the non-English datasets. This architecture enables end-to-end learning while addressing the nuances of multi-intent queries.
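
The span-pointing step can be sketched as follows. This is a minimal dot-product simplification in NumPy, not the paper's actual decoder; the scoring function, dimensions, and variable names are assumptions:

```python
import numpy as np

def pointer_span(encoder_states: np.ndarray,
                 start_query: np.ndarray,
                 end_query: np.ndarray) -> tuple[int, int]:
    """Select a (start, end) token span by pointing into the encoder states.

    Uses simple dot-product scoring as a stand-in for the paper's
    attention-based pointer mechanism.
    """
    start_scores = encoder_states @ start_query   # one score per token
    start = int(np.argmax(start_scores))          # start pointer
    end_scores = encoder_states @ end_query
    # Constrain the end pointer to lie at or after the start,
    # so the extracted span is always well-formed.
    end = start + int(np.argmax(end_scores[start:]))
    return start, end

rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 16))   # toy contextual encodings for 8 tokens
q_start = rng.normal(size=16)       # toy decoder state for the start pointer
q_end = rng.normal(size=16)         # toy decoder state for the end pointer
start, end = pointer_span(hidden, q_start, q_end)
```

In the full model, a decoder would emit one such span per intent, and a separate classification head would map each span representation to its coarse and fine-grained labels.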

Key Findings and Results

The empirical studies reveal that the proposed model surpasses several existing models, including state-of-the-art LLMs like Llama-2 and GPT variants, in terms of accuracy and macro F1-score across a variety of datasets. Notably, RoBERTa combined with Pointer Networks demonstrates superior performance, proving robust across all tested datasets for both primary and average intent detection.
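
Since results are reported in macro F1-score, a reference implementation of that metric may help ground the comparison; the label names and toy predictions below are illustrative:

```python
def macro_f1(y_true: list[str], y_pred: list[str], labels: list[str]) -> float:
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so rare intent classes count as much as frequent ones."""
    f1s = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = ["booking", "booking", "music", "music"]
y_pred = ["booking", "music", "music", "music"]
score = macro_f1(y_true, y_pred, labels=["booking", "music"])
```

Here "booking" scores F1 = 2/3 and "music" scores F1 = 4/5, giving a macro F1 of 11/15 ≈ 0.733.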

Numerically, the paper reports high accuracy when using RoBERTa for both coarse and fine-grained intent detection, outperforming the baseline counterparts. For example, the RoBERTa-based model reaches up to 89% accuracy for coarse labels on mixed datasets, while the baseline LLMs perform significantly worse under the same conditions.

Implications and Future Directions

The implications of this work are multifold. Practically, the development of such models can significantly enhance the user experience in interactive systems by providing more accurate and contextually relevant responses amidst complex user queries. Theoretically, this work contributes to advancements in natural language understanding models by proposing effective solutions for multi-intent, multi-span problems, laying the groundwork for future explorations in intent classification tasks.

For future developments, the authors suggest the exploration of even more sophisticated models capable of handling scenarios involving a greater number of intents. This includes considering non-linear dependencies among multiple intents and enhancing the model's ability to generalize across diverse linguistic and contextual scenarios.

In conclusion, this paper expands the frontier of intent detection within natural language processing by offering efficient techniques for handling complex, multi-intent conversational queries. It sets a strong precedent for subsequent research on advancing task-oriented dialogue systems.
