Gen-IR @ SIGIR 2023: The First Workshop on Generative Information Retrieval (2306.02887v2)
Abstract: Generative information retrieval (IR) has experienced substantial growth across multiple research communities (e.g., information retrieval, computer vision, natural language processing, and machine learning), and has been highly visible in the popular press. Theoretical work, empirical systems, and actual user-facing products have been released that retrieve documents (via generation) or directly generate answers given an input request. We would like to investigate whether end-to-end generative models are just another trend or, as some claim, a paradigm change for IR. This necessitates new metrics, theoretical grounding, evaluation methods, task definitions, models, user interfaces, etc. The goal of this workshop (https://coda.io/@sigir/gen-ir) is to focus on previously explored Generative IR techniques like document retrieval and direct Grounded Answer Generation, while also offering a venue for the discussion and exploration of how Generative IR can be applied to new domains like recommender systems, summarization, etc. The format of the workshop is interactive, including roundtable and keynote sessions, and avoids the one-sided dialogue of a mini-conference.
Authors: Gabriel Bénédict, Ruqing Zhang, Donald Metzler