Efficient Intent Detection with Dual Sentence Encoders
This paper addresses intent detection for task-oriented conversational systems, focusing on resource efficiency and robust performance in low-data scenarios. It introduces intent detection models built on dual sentence encoders, namely the Universal Sentence Encoder (USE) and Conversational Representations from Transformers (ConveRT). Both encoders are pretrained on conversational tasks, so their representations capture the conversational nuance that intent classification depends on.
Overview of Intent Detection Challenges
Intent detection is a fundamental task in conversational systems: it interprets a user's goal by classifying each utterance into one of a set of predefined intents. Deploying intent detectors in new domains, often under few-shot conditions, calls for models that are efficient and easy to adapt. Traditional approaches that fine-tune large transformers such as BERT face two obstacles: adaptation demands substantial computing resources, and it requires labeled data that is typically scarce in practical settings.
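To make the task concrete, here is a minimal sketch of the few-shot setup: intent detection is plain multi-class classification over utterances, and a K-shot training set is built by sampling K labeled examples per intent. The utterances and intent names below are invented for illustration, not taken from the paper.

```python
import random
from collections import defaultdict

# Toy labeled data: each utterance is tagged with one predefined intent.
# These examples and intent names are illustrative, not from the paper.
labeled_data = [
    ("I lost my card, please block it", "lost_or_stolen_card"),
    ("Why was I charged twice for one purchase?", "duplicate_charge"),
    ("How do I top up my account?", "top_up"),
    # ... in practice, thousands of examples spanning dozens of intents
]

def k_shot_subset(examples, k, seed=0):
    """Sample at most k utterances per intent, mimicking few-shot training."""
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append(text)
    rng = random.Random(seed)
    return [
        (text, intent)
        for intent, texts in by_intent.items()
        for text in rng.sample(texts, min(k, len(texts)))
    ]

train_10_shot = k_shot_subset(labeled_data, k=10)
```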
Dual Sentence Encoders as a Solution
The key contribution of this work is the use of dual sentence encoders for intent detection. These models are pretrained with a conversational response selection objective, which aligns naturally with task-oriented dialog. USE and ConveRT produce fixed sentence representations that can be fed directly into a lightweight classifier, so adaptation only trains the classification head. The authors show that intent detectors built on fixed USE and ConveRT encodings, without fine-tuning the encoders, outperform BERT-based detectors on three intent detection datasets, with the largest accuracy gains in few-shot scenarios.
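A minimal sketch of this pipeline, assuming the public USE module on TensorFlow Hub and a scikit-learn MLP as the classification head (the paper trains an MLP over fixed encodings, but the exact head size and hyperparameters below are placeholder assumptions, not the authors' configuration; ConveRT would be used the same way, via its own published module):

```python
import tensorflow_hub as hub
from sklearn.neural_network import MLPClassifier

# Load the Universal Sentence Encoder as a frozen feature extractor.
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

train_texts = ["I lost my card, please block it", "How do I top up my account?"]
train_labels = ["lost_or_stolen_card", "top_up"]

# Encode utterances into fixed 512-dimensional vectors; no gradient
# ever flows back into the encoder, only the small head below is trained.
X_train = embed(train_texts).numpy()

# Lightweight classification head; the layer size is an assumption.
clf = MLPClassifier(hidden_layer_sizes=(512,), max_iter=500)
clf.fit(X_train, train_labels)

print(clf.predict(embed(["charged twice for one purchase"]).numpy()))
```

Because the encoder is frozen, each utterance is encoded once and only the small head is optimized, which is what makes CPU-only training practical.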
Empirical Validation and Performance
The empirical evaluation spans three benchmark datasets, including a newly introduced single-domain dataset, banking77, which contains 13,083 examples labeled with 77 fine-grained intents in the banking domain. The results consistently show that the proposed dual encoder-based models are more efficient and perform better, particularly in low-data regimes: in few-shot setups they adapt markedly better, underlining their potential for real-world deployment where labeled data is sparse.
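For readers who want to experiment with the dataset, banking77 is mirrored on the Hugging Face hub; the dataset id and field names below reflect that hosted copy (an assumption worth verifying against the official PolyAI release):

```python
from datasets import load_dataset

# banking77 as hosted on the Hugging Face hub; fields are "text"
# (the customer query) and "label" (an integer over the 77 intents).
ds = load_dataset("banking77")

print(ds["train"].num_rows, ds["test"].num_rows)   # ~10k train / ~3k test
intent_names = ds["train"].features["label"].names  # 77 intent strings
print(intent_names[:3])
```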
Moreover, the authors highlight several practical advantages: the dual encoder-based models are robust to hyperparameter variation, computationally efficient enough to train on a single CPU, and fast at inference, making them well suited to environments with limited computational resources.
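The efficiency claim is straightforward to sanity-check: once utterances are encoded, both training the head and running inference are cheap CPU operations. The timing sketch below uses random vectors as stand-ins for precomputed encodings, so it measures only the classifier head, not the encoder; actual numbers depend on the machine.

```python
import time
import numpy as np
from sklearn.neural_network import MLPClassifier

# Stand-in for precomputed USE/ConveRT encodings: 77 intents x 10 shots,
# 512-dim vectors (random here purely to time the head, not for accuracy).
rng = np.random.default_rng(0)
X = rng.standard_normal((770, 512)).astype(np.float32)
y = np.repeat(np.arange(77), 10)

t0 = time.perf_counter()
clf = MLPClassifier(hidden_layer_sizes=(512,), max_iter=200).fit(X, y)
print(f"head training: {time.perf_counter() - t0:.1f}s on a single CPU")

t0 = time.perf_counter()
clf.predict(X[:100])
print(f"inference over 100 encoded utterances: {time.perf_counter() - t0:.3f}s")
```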
Implications and Future Work
This paper has significant implications for building conversational AI systems. By reducing computational requirements and making intent detectors easier to adapt, the research paves the way for broader accessibility and faster deployment cycles in commercial applications. The comparative analysis against BERT further underscores the value of pretraining objectives aligned with the target task.
Future research could augment these dual encoder models with multilingual capabilities, extending their applicability across languages without substantial retraining. Another avenue is zero-shot cross-lingual transfer, leveraging the models' conversational strengths in diverse languages with minimal annotated data. Extending them to out-of-scope intent prediction could also make conversational systems more robust, rounding out a toolkit for building advanced task-oriented dialogue systems.
In conclusion, by addressing the computational and data challenges of intent detection, this research contributes significantly to conversational AI development, offering a scalable and efficient approach ready for practical application across domains. The release of the accompanying code and datasets further encourages progress and broad access within the field.