
Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation (2405.00981v2)

Published 2 May 2024 in cs.AI and cs.CL

Abstract: Designing preference elicitation (PE) methodologies that can quickly ascertain a user's top item preferences in a cold-start setting is a key challenge for building effective and personalized conversational recommendation (ConvRec) systems. While LLMs enable fully natural language (NL) PE dialogues, we hypothesize that monolithic LLM NL-PE approaches lack the multi-turn, decision-theoretic reasoning required to effectively balance the exploration and exploitation of user preferences towards an arbitrary item set. In contrast, traditional Bayesian optimization PE methods define theoretically optimal PE strategies, but cannot generate arbitrary NL queries or reason over content in NL item descriptions -- requiring users to express preferences via ratings or comparisons of unfamiliar items. To overcome the limitations of both approaches, we formulate NL-PE in a Bayesian Optimization (BO) framework that seeks to actively elicit NL feedback to identify the best recommendation. Key challenges in generalizing BO to deal with natural language feedback include determining: (a) how to leverage LLMs to model the likelihood of NL preference feedback as a function of item utilities, and (b) how to design an acquisition function for NL BO that can elicit preferences in the infinite space of language. We demonstrate our framework in a novel NL-PE algorithm, PEBOL, which uses: 1) Natural Language Inference (NLI) between user preference utterances and NL item descriptions to maintain Bayesian preference beliefs, and 2) BO strategies such as Thompson Sampling (TS) and Upper Confidence Bound (UCB) to steer LLM query generation. We numerically evaluate our methods in controlled simulations, finding that after 10 turns of dialogue, PEBOL can achieve an MRR@10 of up to 0.27 compared to the best monolithic LLM baseline's MRR@10 of 0.17, despite relying on earlier and smaller LLMs.

Understanding Bayesian Optimization and LLMs in Natural Language Preference Elicitation

Introduction to the Technique

The paper merges Bayesian Optimization (BO) with LLMs to improve Preference Elicitation (PE) in systems that engage users through natural language. The approach targets conversational recommendation (ConvRec) systems operating in "cold-start" settings, where the user's preferences are initially unknown.

BO traditionally excels at discovering user preferences by balancing exploration (learning new information) with exploitation (leveraging known information), but it cannot understand or generate natural language dialogue. LLMs, conversely, are adept at handling language but lack BO's strategic, decision-theoretic reasoning. The proposed method, PEBOL (Preference Elicitation with Bayesian Optimization augmented LLMs), integrates these complementary strengths for conversational recommendation.

Key Components of the Approach

Bayesian Optimization Framework

PEBOL applies BO to PE by maintaining a probabilistic model of user preferences in the form of Bayesian "beliefs" over item utilities. These beliefs are updated after each piece of user feedback, allowing the system to choose its next query based on both its current knowledge and its remaining uncertainty.
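As a concrete illustration, the belief state described above can be sketched as independent Beta distributions over item utilities, updated from soft (probability-valued) feedback. This is a minimal sketch under stated assumptions; the class name and the exact update rule are illustrative, not the paper's precise formulation.

```python
import numpy as np


class BetaBeliefs:
    """Per-item Beta(alpha, beta) beliefs over latent utilities in [0, 1].

    Hypothetical sketch: each item's utility gets an independent Beta
    prior that is updated from probability-valued preference evidence.
    """

    def __init__(self, n_items):
        self.alpha = np.ones(n_items)  # pseudo-counts of "liked" evidence
        self.beta = np.ones(n_items)   # pseudo-counts of "disliked" evidence

    def update(self, item_scores):
        # item_scores[i] in [0, 1]: e.g. an NLI entailment probability
        # that the user's utterance implies item i's description.
        self.alpha += item_scores
        self.beta += 1.0 - item_scores

    def mean(self):
        # Posterior mean utility of each item under the Beta beliefs.
        return self.alpha / (self.alpha + self.beta)
```

After an update, items whose descriptions the user's utterance entails accumulate "liked" evidence, shifting their posterior mean upward while leaving uncertainty information in the Beta parameters for the acquisition step.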

Natural Language Handling via LLMs

PEBOL uses LLMs both to generate natural language preference queries and to interpret the user's free-form responses, applying Natural Language Inference (NLI) between user utterances and NL item descriptions to update its preference beliefs. This makes the interaction more conversational and less rigid than traditional PE approaches, which require users to rate or compare unfamiliar items.
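A minimal sketch of how query generation might be prompted from a candidate item's description; the template wording and function name are hypothetical, not taken from the paper.

```python
def build_query_prompt(item_description: str) -> str:
    """Turn a candidate item's description into an LLM prompt that asks
    the user a short, item-anonymous preference question (hypothetical
    template; PEBOL's actual prompts may differ)."""
    return (
        f"Item description: {item_description}\n"
        "Write one short question asking whether the user would enjoy "
        "an item with these characteristics, without naming the item."
    )


# The resulting prompt would then be sent to an LLM to produce the
# actual question shown to the user.
prompt = build_query_prompt("a fast-paced, dialogue-driven courtroom drama")
```

The key design point is that the item to ask about is chosen by the BO acquisition strategy, while the LLM only handles the surface realization of the question.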

Observations from Experiments

The system was evaluated against monolithic GPT-3.5 baselines in controlled simulations, across multiple datasets and levels of simulated user-response noise. Notable findings include:

  1. Effectiveness in Cold-Start Scenarios: PEBOL displayed significant improvements in performance measures like MAP@10, achieving up to 131% better outcomes than GPT-3.5 alone in early interactions (10 dialogue turns).
  2. Robustness to Noise: PEBOL maintained superior performance even when users' responses included noise, illustrating its effectiveness in less-than-ideal real-world scenarios.
  3. Comparisons of Different Strategies: Various acquisition strategies were explored within PEBOL, such as Thompson Sampling and Upper Confidence Bound, which helped in understanding how different methods affect the exploration-exploitation balance.
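To make the exploration-exploitation trade-off in point 3 concrete, here is a self-contained sketch of Thompson Sampling and UCB selection over per-item Beta beliefs. The parameter values are illustrative, not results from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative per-item Beta(alpha, beta) belief parameters after a few
# dialogue turns (values chosen for the example, not from the paper).
alpha = np.array([3.0, 1.5, 2.0, 1.0])
beta = np.array([1.0, 2.5, 2.0, 3.0])


def thompson_sample(alpha, beta, rng):
    """Sample a utility for every item, then query about the argmax.
    Randomness in the sampling drives exploration."""
    return int(np.argmax(rng.beta(alpha, beta)))


def ucb_select(alpha, beta, c=1.0):
    """Query about the item with the highest mean-plus-uncertainty score."""
    mean = alpha / (alpha + beta)
    var = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    return int(np.argmax(mean + c * np.sqrt(var)))
```

Either rule picks the item whose description seeds the next LLM-generated query, so the dialogue systematically probes items that are promising, uncertain, or both.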

Implications and Future Directions

Practical Applications

For businesses, this research paves the way for more effective conversational recommendation systems that can start functioning effectively without needing extensive initial data about user preferences. This can be particularly valuable in e-commerce, content recommendation, and any service that benefits from personalized user engagement.

Theoretical Contributions

Theoretically, this work broadens the understanding of how to integrate decision-theoretic reasoning with LLMs, a crossover not often explored in machine learning. It lays groundwork for future studies on how these AI domains can be combined for richer user interaction.

Speculations on Future Developments

Looking ahead, the successful integration of BO and LLMs suggests potential extensions where more complex models of user behavior could be incorporated, potentially taking into account varying user moods, contexts, or even indirect preference signals within longer dialogues.

Conclusion

The intersection of Bayesian Optimization and LLMs through the PEBOL framework represents a significant step forward in making conversational systems more responsive and effective right from the start. This fusion not only amplifies the strengths of both methods but also opens new horizons in personalized AI interactions, enhancing both user experience and system performance.

Authors (4)
  1. David Eric Austin (1 paper)
  2. Anton Korikov (10 papers)
  3. Armin Toroghi (8 papers)
  4. Scott Sanner (70 papers)
Citations (7)