ConvSDG: Session Data Generation for Conversational Search (2403.11335v1)
Abstract: Conversational search provides a more convenient interface for users to search by allowing multi-turn interaction with the search engine. However, the effectiveness of the conversational dense retrieval methods is limited by the scarcity of training data required for their fine-tuning. Thus, generating more training conversational sessions with relevant labels could potentially improve search performance. Based on the promising capabilities of LLMs on text generation, we propose ConvSDG, a simple yet effective framework to explore the feasibility of boosting conversational search by using LLM for session data generation. Within this framework, we design dialogue/session-level and query-level data generation with unsupervised and semi-supervised learning, according to the availability of relevance judgments. The generated data are used to fine-tune the conversational dense retriever. Extensive experiments on four widely used datasets demonstrate the effectiveness and broad applicability of our ConvSDG framework compared with several strong baselines.
- Fengran Mo (35 papers)
- Bole Yi (1 paper)
- Kelong Mao (23 papers)
- Chen Qu (37 papers)
- Kaiyu Huang (16 papers)
- Jian-Yun Nie (70 papers)