Distinguishing the Knowable from the Unknowable with Language Models (2402.03563v2)

Published 5 Feb 2024 in cs.LG, cs.AI, and cs.CL

Abstract: We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of LLMs over free-form text. In the absence of ground-truth probabilities, we explore a setting where, in order to (approximately) disentangle a given LLM's uncertainty, a significantly larger model stands in as a proxy for the ground truth. We show that small linear probes trained on the embeddings of frozen, pretrained models accurately predict when larger models will be more confident at the token level and that probes trained on one text domain generalize to others. Going further, we propose a fully unsupervised method that achieves non-trivial accuracy on the same task. Taken together, we interpret these results as evidence that LLMs naturally contain internal representations of different types of uncertainty that could potentially be leveraged to devise more informative indicators of model confidence in diverse practical settings.
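The probing setup described in the abstract can be sketched in a few lines. Everything below is illustrative, not the paper's actual pipeline: the feature matrix stands in for a frozen small model's token embeddings, the labels stand in for whether a much larger "ground truth" model is confident at that token, and scikit-learn's `LogisticRegression` stands in for the small linear probe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in the paper, X would be hidden states from a
# frozen, pretrained small model at each token position, and y would
# indicate whether a significantly larger model assigns high confidence
# (low entropy) to the next token. Here both are synthetic.
d_model = 64      # embedding dimension of the small model (assumed)
n_tokens = 2000   # number of token positions (assumed)

X = rng.normal(size=(n_tokens, d_model))   # "frozen embeddings"
w_true = rng.normal(size=d_model)          # planted linear signal
y = (X @ w_true + 0.5 * rng.normal(size=n_tokens)) > 0  # "large model confident?"

# A small linear probe: logistic regression trained on the frozen features,
# evaluated on held-out token positions.
probe = LogisticRegression(max_iter=1000).fit(X[:1500], y[:1500])
acc = probe.score(X[1500:], y[1500:])
print(f"held-out probe accuracy: {acc:.2f}")
```

Because the synthetic labels are linear in the features, the probe recovers them easily; the paper's result is that something analogous holds for real embeddings and real large-model confidence, including across text domains.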

Authors (5)
  1. Gustaf Ahdritz
  2. Tian Qin
  3. Nikhil Vyas
  4. Boaz Barak
  5. Benjamin L. Edelman
Citations (14)