
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade (2305.14751v1)

Published 24 May 2023 in cs.CL and cs.AI

Abstract: As product dialogue systems are continually updated, the natural language understanding (NLU) model must be retrained, because new data from real users is merged into the data accumulated in previous updates. The newly added data can contain new intents that are semantically entangled with existing ones: for example, a new intent that is too specific or too generic is actually a subset or superset of some existing intent in the semantic space, which impairs the robustness of the NLU model. As the first attempt to solve this problem, we set up a new benchmark consisting of 4 Dialogue Version Control dataSets (DialogVCS). We formulate intent detection with imperfect data during system updates as a multi-label classification task with positive but unlabeled intents, which requires models to recognize all proper intents, including the semantically entangled ones, at inference time. We also propose comprehensive baseline models and conduct in-depth analyses of the benchmark, showing that semantically entangled intents can be effectively recognized with an automatic workflow.
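
To make the task formulation concrete, the sketch below illustrates multi-label intent detection with positive but unlabeled intents. It is not the paper's baseline: the intent names, the logits, the 0.5 threshold, and the down-weighted negative term for unlabeled intents are all assumptions chosen for illustration. The idea it shows is that an utterance annotated only with a generic intent may also belong to a more specific one, so unlabeled intents should not be penalized as confidently as true negatives.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_intents(logits: dict[str, float], threshold: float = 0.5) -> set[str]:
    """Multi-label decision: keep every intent whose probability clears the threshold."""
    return {intent for intent, z in logits.items() if sigmoid(z) >= threshold}


def pu_bce_loss(
    logits: dict[str, float],
    labeled_positive: set[str],
    unlabeled_weight: float = 0.3,  # hypothetical value, not taken from the paper
) -> float:
    """Binary cross-entropy where unlabeled intents get a down-weighted negative term.

    Intents outside `labeled_positive` are unlabeled, not confirmed negatives,
    so their penalty is scaled by `unlabeled_weight` (1.0 recovers plain BCE).
    """
    loss = 0.0
    for intent, z in logits.items():
        p = sigmoid(z)
        if intent in labeled_positive:
            loss += -math.log(p)
        else:
            loss += -unlabeled_weight * math.log(1.0 - p)
    return loss / len(logits)


# Hypothetical model outputs for one utterance. Only "book_flight" was annotated,
# but the more specific "book_international_flight" is also plausible.
logits = {
    "book_flight": 2.0,
    "book_international_flight": 1.0,
    "cancel_order": -3.0,
}
annotated = {"book_flight"}

predicted = predict_intents(logits)
soft_loss = pu_bce_loss(logits, annotated, unlabeled_weight=0.3)
plain_loss = pu_bce_loss(logits, annotated, unlabeled_weight=1.0)
```

Here `predicted` contains both the generic and the specific flight intent, and `soft_loss` is smaller than `plain_loss` because the unlabeled but plausible intent is penalized less than it would be under ordinary binary cross-entropy.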

Authors (10)
  1. Zefan Cai (26 papers)
  2. Xin Zheng (57 papers)
  3. Tianyu Liu (177 papers)
  4. Xu Wang (319 papers)
  5. Haoran Meng (6 papers)
  6. Jiaqi Han (24 papers)
  7. Gang Yuan (1 paper)
  8. Binghuai Lin (20 papers)
  9. Baobao Chang (80 papers)
  10. Yunbo Cao (43 papers)
Citations (4)