RAGSys: Item-Cold-Start Recommender as RAG System (2405.17587v2)
Abstract: Large Language Models (LLMs) hold immense promise for real-world applications, but their generic knowledge often falls short of domain-specific needs. Fine-tuning, a common approach, can suffer from catastrophic forgetting and hinder generalizability. In-Context Learning (ICL) offers an alternative, which can leverage Retrieval-Augmented Generation (RAG) to provide LLMs with relevant demonstrations for few-shot learning tasks. This paper explores the desired qualities of a demonstration retrieval system for ICL. We argue that ICL retrieval in this context resembles item-cold-start recommender systems, prioritizing discovery and maximizing information gain over strict relevance. We propose a novel evaluation method that measures the LLM's subsequent performance on NLP tasks, eliminating the need for subjective diversity scores. Our findings demonstrate the critical role of diversity and quality bias in retrieved demonstrations for effective ICL, and highlight the potential of recommender system techniques in this domain.
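To make the abstract's notions of "diversity" and "quality bias" concrete, here is a minimal sketch of a greedy demonstration selector. It is not the paper's implementation: it assumes a Maximal Marginal Relevance style trade-off with an additive quality term, and every name in it (`select_demonstrations`, `diversity_weight`, `quality_weight`, the toy vectors and quality scores) is hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_demonstrations(query_vec, candidates, k,
                          diversity_weight=0.5, quality_weight=0.2):
    """Greedily pick up to k demonstrations.

    candidates: list of (demo_id, embedding, quality) tuples, where quality is
    an assumed per-demonstration prior (for example, a curation score).
    Each step scores a candidate by relevance to the query, minus its
    similarity to the closest already-selected demonstration, plus a
    quality bonus. All weights are illustrative assumptions.
    """
    selected = []
    remaining = list(candidates)

    def score(candidate):
        _, vec, quality = candidate
        relevance = cosine(query_vec, vec)
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max((cosine(vec, s[1]) for s in selected), default=0.0)
        return ((1.0 - diversity_weight) * relevance
                - diversity_weight * redundancy
                + quality_weight * quality)

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [demo_id for demo_id, _, _ in selected]


# Toy data: two near-duplicate demonstrations and one dissimilar one.
QUERY = [0.8, 0.6]
DEMOS = [
    ("demo_a", [1.0, 0.0], 1.0),
    ("demo_a_paraphrase", [0.99, 0.1], 0.0),
    ("demo_b", [0.0, 1.0], 0.0),
]

if __name__ == "__main__":
    print(select_demonstrations(QUERY, DEMOS, k=2,
                                diversity_weight=0.5, quality_weight=0.0))
```

With `diversity_weight=0.0` the selector reduces to plain relevance ranking and returns both near-duplicates. Raising the weight swaps the second near-duplicate for the dissimilar demonstration, which is the behaviour the abstract describes as favouring information gain over strict relevance.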
Authors: Emile Contal, Garrin McGoldrick