q2d: Turning Questions into Dialogs to Teach Models How to Search (2304.14318v2)

Published 27 Apr 2023 in cs.CL

Abstract: One of the exciting capabilities of recent LLMs for dialog is their ability to independently search for relevant information to ground a given dialog response. However, obtaining training data to teach models how to issue search queries is time- and resource-consuming. In this work, we propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions. We prompt an LLM (PaLM) to create conversational versions of question answering datasets, and use it to improve query generation models that communicate with external search APIs to ground dialog responses. Unlike previous approaches, which relied on human-written dialogs with search queries, our method allows us to automatically generate query-based grounded dialogs with better control and scale. Our experiments demonstrate that: (1) for query generation on the QReCC dataset, models trained on our synthetically generated data achieve 90%--97% of the performance of models trained on human-generated data; (2) we can successfully generate data for training dialog models in new domains without any existing dialog data, as demonstrated on the multi-hop MuSiQue and Bamboogle QA datasets; and (3) a thorough analysis of the generated dialogs shows that humans find them of high quality and struggle to distinguish them from human-written dialogs.
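The pipeline described in the abstract can be sketched in a few lines: few-shot prompt an LLM to rewrite a question from a QA dataset as an information-seeking dialog, then pair the generated dialog with the original question as the gold search query for training a query generation model. The function and variable names below (FEW_SHOT_EXAMPLES, build_q2d_prompt, to_training_example) and the prompt wording are illustrative assumptions, not the authors' actual code or prompts.

```python
# Hypothetical sketch of the q2d data-generation idea. The few-shot
# examples, prompt format, and helper names are assumptions for
# illustration; the paper uses PaLM with its own prompts.

FEW_SHOT_EXAMPLES = [
    (
        "who won the 2018 world cup",
        "User: I watched some football highlights yesterday.\n"
        "Assistant: Nice! Anything in particular?\n"
        "User: The 2018 World Cup final. Who won it again?",
    ),
]

def build_q2d_prompt(question: str) -> str:
    """Assemble a few-shot prompt asking an LLM to turn `question`
    into a natural dialog whose last user turn asks that question."""
    parts = []
    for q, dialog in FEW_SHOT_EXAMPLES:
        parts.append(f"Question: {q}\nDialog:\n{dialog}\n")
    # The LLM completes the dialog for the new question.
    parts.append(f"Question: {question}\nDialog:\n")
    return "\n".join(parts)

def to_training_example(question: str, generated_dialog: str) -> dict:
    """Pair the generated dialog (model input) with the original
    question (target search query) for query-generation training."""
    return {"dialog": generated_dialog.strip(), "query": question}

prompt = build_q2d_prompt("when was the eiffel tower built")
example = to_training_example(
    "when was the eiffel tower built",
    "User: I'm planning a trip to Paris.\n"
    "Assistant: Sounds fun! Any sights on your list?\n"
    "User: The Eiffel Tower. When was it built?",
)
```

Because the target query is known by construction (it is the source question), this yields supervised (dialog, query) pairs at scale and in any domain with a QA dataset, which is what lets the method cover MuSiQue and Bamboogle without existing dialog data.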

References (36)
  1. Open-domain question answering goes conversational via question rewriting. arXiv preprint arXiv:2010.04898.
  2. Improving language models by retrieving from trillions of tokens. arXiv preprint arXiv:2112.04426.
  3. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901.
  4. SemEval-2017 task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. arXiv preprint arXiv:1708.00055.
  5. Universal sentence encoder for English. In Proceedings of the 2018 conference on empirical methods in natural language processing: system demonstrations, pages 169–174.
  6. QuAC: Question answering in context. arXiv preprint arXiv:1808.07036.
  7. PaLM: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311.
  8. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416.
  9. The pascal recognising textual entailment challenge. In Machine learning challenges workshop, pages 177–190. Springer.
  10. Dialog inpainting: Turning documents into dialogs. In International Conference on Machine Learning, pages 4558–4586. PMLR.
  11. Promptagator: Few-shot dense retrieval from 8 examples. arXiv preprint arXiv:2209.11755.
  12. CAsT-19: A dataset for conversational information seeking. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1985–1988.
  13. Evaluating groundedness in dialogue systems: The begin benchmark. arXiv preprint arXiv:2105.00071.
  14. Can you unpack that? Learning to rewrite questions-in-context.
  15. Improving alignment of dialogue agents via targeted human judgements. arXiv preprint arXiv:2209.14375.
  16. Jeroen Antonius Gerardus Groenendijk and Martin Johan Bastiaan Stokhof. 1984. Studies on the Semantics of Questions and the Pragmatics of Answers. Ph.D. thesis, Univ. Amsterdam.
  17. DialFact: A benchmark for fact-checking in dialogue. arXiv preprint arXiv:2110.08222.
  18. TRUE: Re-evaluating factual consistency evaluation. arXiv preprint arXiv:2204.04991.
  19. q²: Evaluating factual consistency in knowledge-grounded dialogues via question generation and question answering. arXiv preprint arXiv:2104.08202.
  20. Nan-Jiang Jiang and Marie-Catherine de Marneffe. 2022. Investigating reasons for disagreement in natural language inference. arXiv preprint arXiv:2209.03392.
  21. Internet-augmented dialogue generation. arXiv preprint arXiv:2107.07566.
  22. Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466.
  23. Hallucinations in neural machine translation.
  24. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
  25. Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text summarization branches out, pages 74–81.
  26. On faithfulness and factuality in abstractive summarization. arXiv preprint arXiv:2005.00661.
  27. ParlAI: A dialog research software platform. arXiv preprint arXiv:1705.06476.
  28. I like fish, especially dolphins: Addressing contradictions in dialogue modeling. arXiv preprint arXiv:2012.13391.
  29. TALM: Tool augmented language models. arXiv preprint arXiv:2205.12255.
  30. Measuring and narrowing the compositionality gap in language models. arXiv preprint arXiv:2210.03350.
  31. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21(140):1–67.
  32. Language models that seek for knowledge: Modular search & generation for dialogue and prompt completion. arXiv preprint arXiv:2203.13224.
  33. Blenderbot 3: a deployed conversational agent that continually learns to responsibly engage. arXiv preprint arXiv:2208.03188.
  34. LaMDA: Language models for dialog applications. arXiv preprint arXiv:2201.08239.
  35. MuSiQue: Multi-hop questions via single-hop question composition. arXiv preprint arXiv:2108.00573.
  36. Reducing quantity hallucinations in abstractive summarization. arXiv preprint arXiv:2009.13312.
Authors (6)
  1. Yonatan Bitton (36 papers)
  2. Shlomi Cohen-Ganor (1 paper)
  3. Ido Hakimi (9 papers)
  4. Yoad Lewenberg (3 papers)
  5. Roee Aharoni (35 papers)
  6. Enav Weinreb (2 papers)
Citations (3)