Robustness of decreasing calibration error near perfect accuracy in underconfident settings
Investigate whether the decrease in calibration error observed as semantic accuracy approaches 100% in systematically underconfident configurations is a robust phenomenon across models, datasets, and prompting setups.
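One way to make the question concrete: at exactly 100% accuracy, binned expected calibration error (ECE) reduces to one minus the mean stated confidence, so for an underconfident configuration the error can shrink near perfect accuracy only if confidence rises along with accuracy. Below is a minimal sketch of the measurement protocol, assuming per-configuration records of (stated confidence, semantic correctness); the synthetic data, the 0.1 underconfidence margin, and the 0.9 high-accuracy cutoff are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_calibration_error(conf, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |bin accuracy - bin mean confidence|."""
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each sample to one of n_bins equal-width confidence bins.
    bin_idx = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return ece

# Synthetic stand-in for real per-configuration model outputs: each
# configuration states confidences a fixed margin below its true accuracy
# (systematic underconfidence), with correctness drawn at that accuracy.
records = []
for acc in np.linspace(0.5, 0.99, 25):
    margin = 0.1  # illustrative underconfidence margin (placeholder value)
    conf = np.clip(rng.normal(acc - margin, 0.02, size=5000), 0.0, 1.0)
    correct = rng.random(5000) < acc
    records.append((acc, expected_calibration_error(conf, correct)))

accs, eces = map(np.asarray, zip(*records))
high = accs >= 0.9  # illustrative cutoff for the "high-accuracy regime"
print(f"corr(accuracy, ECE) overall:     {np.corrcoef(accs, eces)[0, 1]:+.3f}")
print(f"corr(accuracy, ECE), acc >= 0.9: {np.corrcoef(accs[high], eces[high])[0, 1]:+.3f}")
```

Running the same correlation on real (confidence, correctness) logs from each model, dataset, and prompting setup, rather than on this synthetic stand-in, would directly test whether the high-accuracy decrease survives across configurations.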
References
For the underconfident configurations, we also see little correlation overall, except in the high-accuracy regime, where calibration error tends to decrease as models approach perfect semantic accuracy. However, it is not clear whether this is a robust phenomenon.
— Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
(Nakkiran et al., arXiv:2511.04869, 6 Nov 2025), Section 5: Experiments — Model Scaling Effects