DIET: Lightweight Language Understanding for Dialogue Systems (2004.09936v3)

Published 21 Apr 2020 in cs.CL

Abstract: Large-scale pre-trained language models have shown impressive results on language understanding benchmarks like GLUE and SuperGLUE, improving considerably over other pre-training methods like distributed representations (GloVe) and purely supervised approaches. We introduce the Dual Intent and Entity Transformer (DIET) architecture, and study the effectiveness of different pre-trained representations on intent and entity prediction, two common dialogue language understanding tasks. DIET advances the state of the art on a complex multi-domain NLU dataset and achieves similarly high performance on other simpler datasets. Surprisingly, we show that there is no clear benefit to using large pre-trained models for this task, and in fact DIET improves upon the current state of the art even in a purely supervised setup without any pre-trained embeddings. Our best performing model outperforms fine-tuning BERT and is about six times faster to train.
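The abstract's core idea is a single transformer that jointly predicts an utterance-level intent and per-token entity tags. Below is a minimal, hypothetical PyTorch sketch of that joint setup; the class name, layer sizes, mean pooling, and plain softmax-style heads are illustrative assumptions and not the paper's exact architecture (DIET additionally uses techniques such as a sentence token and a CRF-style entity layer).

```python
# Illustrative sketch of a joint intent + entity transformer (not DIET's exact model).
import torch
import torch.nn as nn


class JointIntentEntityModel(nn.Module):
    def __init__(self, vocab_size, num_intents, num_entity_tags,
                 embed_dim=128, num_layers=2, num_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Entities are tagged per token; the intent is read off a pooled sentence vector.
        self.entity_head = nn.Linear(embed_dim, num_entity_tags)
        self.intent_head = nn.Linear(embed_dim, num_intents)

    def forward(self, token_ids):
        x = self.embed(token_ids)               # (batch, seq, dim)
        h = self.encoder(x)                     # contextualized token features
        entity_logits = self.entity_head(h)     # (batch, seq, num_entity_tags)
        sentence_repr = h.mean(dim=1)           # simple mean pooling (an assumption here)
        intent_logits = self.intent_head(sentence_repr)
        return intent_logits, entity_logits


# Example usage on dummy token ids.
model = JointIntentEntityModel(vocab_size=1000, num_intents=10, num_entity_tags=5)
intent_logits, entity_logits = model(torch.randint(0, 1000, (2, 12)))
```

Because both heads share the same encoder, intent and entity losses can simply be summed during training, which is what makes the "dual" objective cheap compared with running two separate models.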

Authors (4)
  1. Tanja Bunk
  2. Daksh Varshneya
  3. Vladimir Vlasov
  4. Alan Nichol
Citations (150)