Generating Multi-Aspect Queries for Conversational Search (2403.19302v3)
Abstract: Conversational information seeking (CIS) systems aim to model the user's information need within the conversational context and retrieve the relevant information. One major approach to modeling the conversational context is to rewrite the user utterance so that it expresses the information need independently of the rest of the conversation. Recent work has shown the benefit of expanding the rewritten utterance with relevant terms. In this work, we hypothesize that breaking down the information in an utterance into multi-aspect rewritten queries can lead to more effective retrieval. The effect is most evident for complex utterances that require gathering evidence from various information sources, where a single query rewrite or query representation cannot capture the complexity of the utterance. To test this hypothesis, we conduct extensive experiments on five widely used CIS datasets, leveraging LLMs to generate multi-aspect queries that represent the information need of each utterance as multiple query rewrites. We show that, for most utterances, the same retrieval model would perform better with more than one rewritten query, by 85% in terms of nDCG@3. We further propose a multi-aspect query generation and retrieval framework, called MQ4CS. Our extensive experiments show that MQ4CS outperforms state-of-the-art query rewriting methods. We make our code and our new dataset of generated multi-aspect queries publicly available.
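To make the idea of retrieving with several rewritten queries concrete, the following is a minimal sketch and not the MQ4CS implementation. It assumes each aspect query has already been run through some retriever to produce a ranked list of document ids, and it merges those lists with reciprocal rank fusion, which is a stand-in technique chosen for illustration; the abstract does not state how MQ4CS combines results. The function names `rrf_fuse` and `ndcg_at_k`, the document ids, and the relevance labels are all hypothetical toy values, not data from the paper.

```python
import math
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge ranked lists of doc ids with reciprocal rank fusion (assumed fusion method)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def ndcg_at_k(ranking, relevance, k=3):
    """nDCG@k with linear gain: DCG = sum over ranks i of rel_i / log2(i + 1)."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranking[:k], start=1)
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 1) for i, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy inputs (hypothetical): one ranked list for a single rewrite,
# and one ranked list per aspect query for the same utterance.
single_rewrite = ["d1", "d7", "d8"]
aspect_rankings = [
    ["d1", "d7", "d2"],  # aspect query 1
    ["d2", "d1", "d9"],  # aspect query 2
    ["d3", "d2", "d8"],  # aspect query 3
]
qrels = {"d1": 2, "d2": 2, "d3": 1}  # graded relevance labels (toy values)

fused = rrf_fuse(aspect_rankings)
single_score = ndcg_at_k(single_rewrite, qrels)
fused_score = ndcg_at_k(fused, qrels)
print(fused[:3], round(single_score, 3), round(fused_score, 3))
```

In this contrived example the fused list surfaces relevant documents that the single rewrite misses, which is the kind of per-utterance comparison the abstract describes; the numbers carry no empirical weight.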