
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG) (2405.13084v2)

Published 21 May 2024 in cs.CL and cs.AI

Abstract: Recently, increasing research interest has focused on retrieval augmented generation (RAG) to mitigate hallucination for LLMs. Following this trend, we launch the FutureDial-RAG challenge at SLT 2024, which aims to promote the study of RAG for dialog systems. The challenge builds upon the MobileCS2 dataset, a real-life customer service dataset with nearly 3000 high-quality dialogs annotated with knowledge base queries and the corresponding results. On this dataset, we define two tasks, track 1 for knowledge retrieval and track 2 for response generation, which are core research questions in dialog systems with RAG. We build baseline systems for the two tracks and design metrics to measure whether the systems can perform accurate retrieval and generate informative and coherent responses. The baseline results show that it is very challenging to perform well on the two tasks, which encourages the participating teams and the community to study how to make better use of RAG for real-life dialog systems.
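The abstract does not spell out the track 1 retrieval metric, so below is a minimal sketch of a recall@k-style evaluation, assuming each dialog turn is annotated with the gold knowledge-base entry it should retrieve; the function and variable names are illustrative and not taken from the challenge toolkit.

```python
# Minimal sketch (not the official baseline): recall@k for knowledge retrieval (track 1).
# Assumes each turn comes with a ranked list of retrieved KB entry ids and one gold id.

def recall_at_k(ranked_ids, gold_ids, k=5):
    """Fraction of turns whose gold KB entry appears in the top-k retrieved candidates."""
    hits = 0
    for ranked, gold in zip(ranked_ids, gold_ids):
        if gold in ranked[:k]:
            hits += 1
    return hits / max(len(gold_ids), 1)

# Toy usage: three turns, each with a ranked candidate list and one gold entry.
ranked = [["kb_3", "kb_1", "kb_7"], ["kb_2", "kb_9", "kb_4"], ["kb_5", "kb_6", "kb_8"]]
gold = ["kb_1", "kb_4", "kb_0"]
print(recall_at_k(ranked, gold, k=3))  # 2 of 3 turns retrieve the gold entry in the top 3
```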
