Eliciting Human Preferences with Language Models (2310.11589v1)
Abstract: Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts. But selecting examples or writing prompts can be challenging, especially in tasks that involve unusual edge cases, demand precise articulation of nebulous preferences, or require an accurate mental model of LM behavior. We propose to use LMs themselves to guide the task specification process. In this paper, we introduce Generative Active Task Elicitation (GATE): a learning framework in which models elicit and infer intended behavior through free-form, language-based interaction with users. We study GATE in three domains: email validation, content recommendation, and moral reasoning. In preregistered experiments, we show that LMs prompted to perform GATE (e.g., by generating open-ended questions or synthesizing informative edge cases) elicit responses that are often more informative than user-written prompts or labels. Users report that interactive task elicitation requires less effort than prompting or example labeling and surfaces novel considerations not initially anticipated by users. Our findings suggest that LM-driven elicitation can be a powerful tool for aligning models to complex human preferences and values.