GenSERP: Large Language Models for Whole Page Presentation (2402.14301v2)
Abstract: The advent of LLMs brings an opportunity to minimize the effort in search engine result page (SERP) organization. In this paper, we propose GenSERP, a framework that leverages LLMs with vision in a few-shot setting to dynamically organize intermediate search results, including generated chat answers, website snippets, multimedia data, and knowledge panels, into a coherent SERP layout based on a user's query. Our approach has three main stages: (1) An information gathering phase, where the LLM continuously orchestrates API tools to retrieve different types of items and proposes candidate layouts based on the retrieved items, until it is confident enough to generate the final result. (2) An answer generation phase, where the LLM populates the layouts with the retrieved content. In this phase, the LLM adaptively optimizes the ranking of items and the UX configurations of the SERP. Consequently, it assigns each item a location on the page, along with its UX display details. (3) A scoring phase, where an LLM with vision scores all the generated SERPs based on how likely each is to satisfy the user. It then sends the one with the highest score to rendering. GenSERP features two generation paradigms: (1) coarse-to-fine generation, which allows it to approach the optimal layout in a more manageable way, and (2) beam search, which gives it a better chance of reaching the optimal solution than greedy decoding. Offline experimental results on real-world data demonstrate how LLMs can contextually organize heterogeneous search results on-the-fly and provide a promising user experience.
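To make the contrast between beam search and greedy decoding concrete, here is a minimal sketch in Python. It is not the paper's implementation: the item names, the `toy_score` function (a hand-written stand-in for the vision-LLM scorer), and the flat top-to-bottom layout are all assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: a layout is the items in page order, top to bottom.
Layout = Tuple[str, ...]


def beam_search_layouts(
    items: Sequence[str],
    score: Callable[[Layout], float],
    beam_width: int = 3,
) -> List[Tuple[float, Layout]]:
    """Return up to `beam_width` complete layouts with their scores, best first."""
    beam: List[Layout] = [()]
    for _ in range(len(items)):
        # Extend every partial layout in the beam by one not-yet-placed item.
        candidates = [
            layout + (item,)
            for layout in beam
            for item in items
            if item not in layout
        ]
        # Keep only the highest-scoring partial layouts.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return [(score(layout), layout) for layout in beam]


# Assumed toy preferences: how good each item looks in the top slot on its own.
FIRST_SLOT_VALUE = {"web_snippet": 3.0, "chat_answer": 2.0, "image_carousel": 1.0}


def toy_score(layout: Layout) -> float:
    """Stand-in scorer (assumption): rewards the top item, plus a bonus
    when a chat answer at the top is directly followed by images."""
    total = FIRST_SLOT_VALUE[layout[0]] if layout else 0.0
    if layout[:2] == ("chat_answer", "image_carousel"):
        total += 4.0
    return total


items = ["chat_answer", "web_snippet", "image_carousel"]
greedy_best = beam_search_layouts(items, toy_score, beam_width=1)[0]
beam_best = beam_search_layouts(items, toy_score, beam_width=2)[0]
print("greedy:", greedy_best)
print("beam:  ", beam_best)
```

With a beam width of 1 the search commits to the locally best top item and never finds the higher-scoring combination; widening the beam to 2 keeps the second-best prefix alive long enough for its bonus to appear. In a real system the scorer would be far more expensive than this toy function, so the beam width trades solution quality against the number of scoring calls.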