Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models (2405.16282v5)

Published 25 May 2024 in cs.CL, cs.AI, and cs.LG

Abstract: As the use of LLMs becomes more widespread, understanding how they evaluate their confidence in generated responses becomes increasingly important, as it is integral to the reliability of these models' outputs. We introduce the concept of Confidence-Probability Alignment, which connects an LLM's internal confidence, quantified by token probabilities, to the confidence conveyed in the model's response when it is explicitly asked about its certainty. Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence. These techniques encompass using structured evaluation scales to rate confidence, including answer options when prompting, and eliciting the model's confidence level for outputs it does not recognize as its own. Notably, among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment, with an average Spearman's $\hat{\rho}$ of 0.42 across a wide range of tasks. Our work contributes to ongoing efforts to facilitate risk assessment in the application of LLMs and to further our understanding of model trustworthiness.
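To make the alignment measure concrete, the following Python sketch (not the authors' code) shows one plausible way to compute it: for each question, pair an internal confidence derived from token probabilities of the generated answer with the confidence the model verbalizes when asked, then correlate the two series with Spearman's $\hat{\rho}$. The helpers `query_model` and `elicit_confidence`, and the use of geometric-mean token probability as the internal score, are assumptions for illustration, not the paper's exact pipeline.

    # Minimal sketch (assumptions noted above) of measuring Confidence-Probability Alignment.
    # `query_model(q)` is assumed to return (answer_text, per_token_logprobs);
    # `elicit_confidence(q, answer)` is assumed to return a numeric self-rated confidence.
    import math
    from scipy.stats import spearmanr

    def internal_confidence(token_logprobs):
        # Geometric-mean token probability of the answer span --
        # one common choice of probability-based (internal) confidence.
        avg_logprob = sum(token_logprobs) / len(token_logprobs)
        return math.exp(avg_logprob)

    def confidence_probability_alignment(questions, query_model, elicit_confidence):
        internal, expressed = [], []
        for q in questions:
            answer, token_logprobs = query_model(q)
            internal.append(internal_confidence(token_logprobs))
            expressed.append(elicit_confidence(q, answer))  # e.g. "Rate your confidence from 0 to 100"
        rho, _ = spearmanr(internal, expressed)  # Spearman's rank correlation
        return rho

A higher $\hat{\rho}$ under this kind of setup means the confidence the model verbalizes tracks the ordering implied by its token probabilities, which is the alignment the paper reports (e.g., GPT-4's average of 0.42).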

Authors (5)
  1. Abhishek Kumar (171 papers)
  2. Robert Morabito (4 papers)
  3. Sanzhar Umbet (1 paper)
  4. Jad Kabbara (13 papers)
  5. Ali Emami (36 papers)
Citations (3)