STaR-GATE: Teaching Language Models to Ask Clarifying Questions (2403.19154v3)

Published 28 Mar 2024 in cs.CL and cs.AI

Abstract: When prompting LLMs to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore an LLM's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained LLM (the Questioner) and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching an LLM to ask better questions leads to better personalized responses.

References (40)
  1. Thinking fast and slow with deep learning and tree search. Advances in Neural Information Processing Systems, 30, 2017.
  2. Constitutional AI: Harmlessness from AI feedback. arXiv preprint arXiv:2212.08073, 2022.
  3. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877--1901, 2020.
  4. Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30, 2017.
  5. KTO: Model alignment as prospect theoretic optimization. arXiv preprint arXiv:2402.01306, 2024.
  6. Probabilistic model-agnostic meta-learning. Advances in Neural Information Processing Systems, 31, 2018.
  7. Social Contract AI: Aligning AI assistants with implicit group norms. arXiv preprint arXiv:2310.17769, 2023.
  8. Strategic reasoning with language models. arXiv preprint arXiv:2305.19165, 2023.
  9. Bayesian preference elicitation with language models. arXiv preprint arXiv:2403.05534, 2024.
  10. Training chain-of-thought via latent-variable inference. Advances in Neural Information Processing Systems, 36, 2024.
  11. Zero-shot goal-directed dialogue via RL on imagined conversations. arXiv preprint arXiv:2311.05584, 2023.
  12. V-STaR: Training verifiers for self-taught reasoners. arXiv preprint arXiv:2402.06457, 2024.
  13. Mistral 7B. arXiv preprint arXiv:2310.06825, 2023.
  14. Mixtral of Experts. arXiv preprint arXiv:2401.04088, 2024.
  15. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103:102274, 2023.
  16. Eliciting human preferences with language models. arXiv preprint arXiv:2310.11589, 2023.
  17. Decision-oriented dialogue for human-AI collaboration. arXiv preprint arXiv:2305.20076, 2023.
  18. Gemma: Open models based on Gemini research and technology. arXiv preprint arXiv:2403.08295, 2024.
  19. PRODIGy: A profile-based dialogue generation dataset, 2023.
  20. OpenAI. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774, 2023.
  21. Active preference inference using language models and probabilistic reasoning. arXiv preprint arXiv:2312.12009, 2023.
  22. Certified deductive reasoning with language models. 2023.
  23. AutoAct: Automatic agent learning from scratch via self-planning. arXiv preprint arXiv:2401.05268, 2024.
  24. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019.
  25. Direct preference optimization: Your language model is secretly a reward model. Advances in Neural Information Processing Systems, 36, 2024.
  26. Explain yourself! Leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361, 2019.
  27. Grounding or guesswork? Large language models are presumptive grounders. arXiv preprint arXiv:2311.09144, 2023.
  28. Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36, 2024.
  29. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815, 2017.
  30. CommonsenseQA: A question answering challenge targeting commonsense knowledge. arXiv preprint arXiv:1811.00937, 2018.
  31. Task ambiguity in humans and language models. arXiv preprint arXiv:2212.10711, 2022.
  32. Large language models in medicine. Nature Medicine, 29(8):1930--1940, 2023.
  33. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
  34. Large language models are not fair evaluators. arXiv preprint arXiv:2305.17926, 2023.
  35. Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229--256, 1992.
  36. WebShop: Towards scalable real-world web interaction with grounded language agents. Advances in Neural Information Processing Systems, 35:20744--20757, 2022.
  37. STaR: Bootstrapping reasoning with reasoning. Advances in Neural Information Processing Systems, 35:15476--15488, 2022.
  38. Quiet-STaR: Language models can teach themselves to think before speaking. arXiv preprint arXiv:2403.09629, 2024.
  39. In-context principle learning from mistakes. arXiv preprint arXiv:2402.05403, 2024.
  40. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. Advances in Neural Information Processing Systems, 36, 2024.
Authors (4)
  1. Chinmaya Andukuri (1 paper)
  2. Jan-Philipp Fränken (12 papers)
  3. Tobias Gerstenberg (18 papers)
  4. Noah D. Goodman (83 papers)
Citations (16)

Summary

Teaching LLMs to Ask Clarifying Questions

Introduction

In conversational AI and natural language processing, the ability of large language models (LLMs) to interpret user intent accurately is paramount. Models often stumble when prompts are ambiguous or lack sufficient detail, leading to suboptimal responses. The paper "STaR-GATE: Teaching Language Models to Ask Clarifying Questions" introduces an iterative algorithm designed to enhance an LLM's questioning capability and thereby improve its responses to user prompts. By embedding an active elicitation loop into the model's training process, STaR-GATE, which combines STaR (Self-Taught Reasoner) with GATE (Generative Active Task Elicitation), achieves significant improvements in generating contextually relevant questions, which in turn yields more personalized and accurate responses.

Methodology

STaR-GATE Overview:

STaR-GATE combines the principles of active preference elicitation (GATE) with a self-improvement learning strategy (STaR). The paper constructs a synthetic dataset encompassing 25,500 unique persona-task prompts, simulating interactions between the model (termed the Questioner) and a Roleplayer. The Roleplayer represents users with undisclosed preferences, and the Questioner is tasked with eliciting these preferences through targeted questions. Success is measured by the Questioner's ability to generate responses that align with gold-standard answers produced by an Oracle, which has complete access to the Roleplayer's preferences.
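
To make the setup concrete, here is a minimal Python sketch of a single elicitation episode. The role models are passed in as simple `generate(prompt) -> str` callables, and the `Task:`/`Persona:`/`Questioner:` prompt formats are illustrative assumptions; the paper's actual prompt templates and models differ.

```python
from typing import Callable, List, Tuple

def run_episode(
    questioner: Callable[[str], str],  # sees only the dialogue, never the persona
    roleplayer: Callable[[str], str],  # answers in character from a hidden persona
    oracle: Callable[[str], str],      # sees the persona; writes the gold response
    task: str,
    persona: str,
    num_turns: int = 3,
) -> Tuple[List[str], str]:
    """Simulate one Questioner-Roleplayer conversation for a persona-task prompt."""
    dialogue = [f"Task: {task}"]
    for _ in range(num_turns):
        # The Questioner asks a clarifying question given the dialogue so far.
        question = questioner("\n".join(dialogue))
        dialogue.append(f"Questioner: {question}")
        # The Roleplayer replies, conditioned on its latent preferences.
        answer = roleplayer(f"Persona: {persona}\n" + "\n".join(dialogue))
        dialogue.append(f"Roleplayer: {answer}")
    # The Oracle has full access to the persona and produces the gold-standard answer.
    gold_response = oracle(f"Task: {task}\nPersona: {persona}")
    return dialogue, gold_response
```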

Key Techniques:

  1. Iterative Finetuning: The model undergoes repeated cycles of self-improvement, refining its questioning strategy in each cycle. The Questioner generates questions, elicits the Roleplayer's preferences, and is then finetuned on the questions that most increase the probability of the gold-standard responses (see the sketch after this list).
  2. Response Regularization: To keep the model from overfitting to question-asking at the expense of answering, the gold-standard responses are also included in the finetuning data, preserving balanced performance across both questioning and responding.
  3. Roleplayer and Oracle Models: Utilizing pre-trained LLMs as Roleplayers and Oracles introduces variability and depth to the training process, simulating real-world interactions and preferences more effectively.
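
As referenced in the list above, the following Python sketch shows how these pieces could fit into the outer self-improvement loop. It assumes questions are rewarded by how much they raise the likelihood of the Oracle's gold response; `log_prob` and `finetune` are hypothetical stand-ins for a real scoring and training pipeline, not the paper's implementation.

```python
def star_gate(questioner, roleplayer, oracle, log_prob, finetune,
              prompts, iterations=2, keep_frac=0.25):
    """Iteratively finetune the Questioner on its most useful questions."""
    for _ in range(iterations):
        scored = []
        for task, persona in prompts:
            dialogue, gold = run_episode(questioner, roleplayer, oracle, task, persona)
            # Reward: log-probability of the gold response given the elicited dialogue.
            score = log_prob(gold, context="\n".join(dialogue))
            scored.append((score, dialogue, gold))
        # Keep the dialogues whose questions best predicted the gold response.
        scored.sort(key=lambda item: item[0], reverse=True)
        best = scored[: max(1, int(keep_frac * len(scored)))]
        # Finetune on the winning questions; pairing each dialogue with its gold
        # response implements the response regularization described above.
        train_data = [(dialogue, gold) for _, dialogue, gold in best]
        questioner = finetune(questioner, train_data)  # returns an updated callable
    return questioner
```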

Results

After two iterations of STaR-GATE, the finetuned Questioner asks markedly better questions, producing responses that are preferred over those of the initial model on 72% of tasks. This outcome underscores the algorithm's efficacy in enhancing the model's elicitation skills. Moreover, both the win rates and the log probabilities of the gold-standard responses improve consistently across iterations, indicating a positive learning trajectory.
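
For concreteness, the headline win rate could be computed with a pairwise comparison like the Python sketch below. Here `elicit_and_respond` (run a conversation, then answer the task) and `judge_prefers` (a preference judge) are hypothetical helpers; the paper's concrete judge model and evaluation prompts may differ.

```python
def win_rate(finetuned, initial, roleplayer, eval_prompts,
             elicit_and_respond, judge_prefers):
    """Fraction of tasks where the finetuned model's response is preferred."""
    wins = 0
    for task, persona in eval_prompts:
        # Each model first elicits preferences, then responds to the task.
        resp_new = elicit_and_respond(finetuned, roleplayer, task, persona)
        resp_old = elicit_and_respond(initial, roleplayer, task, persona)
        wins += int(judge_prefers(resp_new, resp_old, task, persona))
    return wins / len(eval_prompts)  # the paper reports ~0.72 after two iterations
```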

Implications and Future Directions

The success of STaR-GATE in teaching LLMs to ask clarifying questions has both theoretical and practical implications:

  1. Enhanced Model Interactivity: The ability to ask pertinent questions can significantly improve user experience in conversational AI, making interactions more dynamic and context-aware.
  2. Personalized Responses: By effectively eliciting user preferences, models can offer more tailored responses, enhancing satisfaction and engagement across various applications, from virtual assistants to customer service bots.
  3. Future Work: While promising, the STaR-GATE approach points toward further research avenues, including exploring other modalities of elicitation, integrating with larger, more capable models, and adapting the methodology across diverse languages and cultural contexts.

Conclusion

The STaR-GATE algorithm represents a significant step forward in the development of conversational AI, empowering LLMs to interact more effectively with users by asking clarifying questions. Through iterative self-improvement and targeted response generation, the approach not only boosts the model's understanding of user intents but also fosters a more personalized and engaging conversational experience.