
Prompt Risk Control: A Rigorous Framework for Responsible Deployment of Large Language Models (2311.13628v2)

Published 22 Nov 2023 in cs.LG, cs.AI, and cs.CL

Abstract: The recent explosion in the capabilities of LLMs has led to a wave of interest in how best to prompt a model to perform a given task. While it may be tempting to simply choose a prompt based on average performance on a validation set, this can lead to a deployment where unexpectedly poor responses are generated, especially for the worst-off users. To mitigate this prospect, we propose Prompt Risk Control, a lightweight framework for selecting a prompt based on rigorous upper bounds on families of informative risk measures. We offer methods for producing bounds on a diverse set of metrics, including quantities that measure worst-case responses and disparities in generation quality across the population of users. In addition, we extend the underlying statistical bounding techniques to accommodate the possibility of distribution shifts in deployment. Experiments on applications such as open-ended chat, medical question summarization, and code generation highlight how such a framework can foster responsible deployment by reducing the risk of the worst outcomes.
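
To make the selection rule concrete, here is a minimal Python sketch of the basic idea the abstract describes: score each candidate prompt by a high-probability upper bound on its mean loss rather than by its raw validation average, and deploy the prompt with the smallest bound. The bound used below is Hoeffding's inequality for losses in [0, 1]; the names prompts, validation_set, and loss_fn are hypothetical placeholders, and the paper itself bounds a richer family of risk measures (e.g., CVaR for worst-case responses, or the Gini coefficient for disparities across users) with correspondingly different statistical tools.

import math

def hoeffding_upper_bound(losses, delta):
    # One-sided Hoeffding bound: with probability >= 1 - delta over the
    # draw of the validation set, the true mean loss is at most this value.
    # Assumes each loss lies in [0, 1] and examples are i.i.d.
    n = len(losses)
    empirical_mean = sum(losses) / n
    return empirical_mean + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def select_prompt(prompts, validation_set, loss_fn, delta=0.05):
    # Pick the prompt whose certified upper bound on mean loss is smallest,
    # rather than the one with the best raw validation average.
    # `prompts`, `validation_set`, and `loss_fn` are hypothetical
    # placeholders: loss_fn(prompt, example) must return a loss in [0, 1].
    per_prompt_delta = delta / len(prompts)
    # Union bound: splitting delta across candidates makes the guarantee
    # hold simultaneously for all of them, so it remains valid for
    # whichever prompt ends up selected.
    best_prompt, best_bound = None, float("inf")
    for prompt in prompts:
        losses = [loss_fn(prompt, ex) for ex in validation_set]
        bound = hoeffding_upper_bound(losses, per_prompt_delta)
        if bound < best_bound:
            best_prompt, best_bound = prompt, bound
    return best_prompt, best_bound

Under these assumptions, the returned bound holds for the selected prompt with probability at least 1 - delta over an i.i.d. validation sample; the distribution-shift extensions mentioned in the abstract address exactly the case where deployment data violate that i.i.d. assumption.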

