
How to Leverage Personal Textual Knowledge for Personalized Conversational Information Retrieval (2407.16192v1)

Published 23 Jul 2024 in cs.IR and cs.CL

Abstract: Personalized conversational information retrieval (CIR) combines conversational and personalizable elements to satisfy various users' complex information needs through multi-turn interaction based on their backgrounds. The key promise is that the personal textual knowledge base (PTKB) can improve CIR effectiveness because the retrieval results can be more closely related to the user's background. However, the PTKB is noisy: not every piece of knowledge in it is relevant to the specific query at hand. In this paper, we explore and test several ways to select knowledge from the PTKB and use it for query reformulation with an LLM. The experimental results show that the PTKB might not always improve search results when used alone, but an LLM can help generate a more appropriate personalized query when high-quality guidance is provided.
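
To make the idea in the abstract concrete, the following is a minimal sketch of selecting PTKB sentences for a query and assembling a reformulation prompt. It is not the authors' method: the Jaccard word-overlap scorer, the function names `select_ptkb` and `build_reformulation_prompt`, and the prompt wording are all assumptions made for illustration, and no LLM is actually called.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_ptkb(query: str, ptkb: list[str], top_k: int = 2) -> list[str]:
    """Keep the top_k PTKB sentences with non-zero word overlap with the query.

    Assumption: Jaccard overlap stands in for whatever selection strategy
    (human, automatic, or LLM-based) a real system would use.
    """
    q_tokens = tokenize(query)
    scored = []
    for sentence in ptkb:
        s_tokens = tokenize(sentence)
        union = q_tokens | s_tokens
        score = len(q_tokens & s_tokens) / len(union) if union else 0.0
        if score > 0.0:  # drop sentences that share no words with the query
            scored.append((score, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


def build_reformulation_prompt(query: str, history: list[str], selected: list[str]) -> str:
    """Assemble a hypothetical prompt asking an LLM to rewrite the query."""
    lines = ["Rewrite the last user question as a standalone, personalized search query."]
    lines.append("User background:")
    lines.extend(f"- {s}" for s in selected)
    lines.append("Conversation so far:")
    lines.extend(f"- {turn}" for turn in history)
    lines.append(f"Last question: {query}")
    return "\n".join(lines)


if __name__ == "__main__":
    ptkb = [
        "I am vegetarian.",
        "I live in Montreal.",
        "My daughter plays the violin.",
    ]
    query = "Which restaurants in Montreal have vegetarian options?"
    selected = select_ptkb(query, ptkb)
    print(build_reformulation_prompt(query, ["I want to eat out tonight."], selected))
```

The sentence about the violin shares no words with the query and is filtered out, which illustrates the noise problem the abstract describes: only part of the PTKB is relevant to a given turn.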

Authors (6)
  1. Fengran Mo
  2. Longxiang Zhao
  3. Kaiyu Huang
  4. Yue Dong
  5. Degen Huang
  6. Jian-Yun Nie