Top-label calibration and multiclass-to-binary reductions (2107.08353v4)

Published 18 Jul 2021 in cs.LG, cs.AI, stat.ME, and stat.ML

Abstract: A multiclass classifier is said to be top-label calibrated if the reported probability for the predicted class (the top-label) is calibrated, conditioned on the top-label. This conditioning on the top-label is absent in the closely related and popular notion of confidence calibration, which we argue makes confidence calibration difficult to interpret for decision-making. We propose top-label calibration as a rectification of confidence calibration. Further, we outline a multiclass-to-binary (M2B) reduction framework that unifies confidence, top-label, and class-wise calibration, among others. As its name suggests, M2B works by reducing multiclass calibration to numerous binary calibration problems, each of which can be solved using simple binary calibration routines. We instantiate the M2B framework with the well-studied histogram binning (HB) binary calibrator, and prove that the overall procedure is multiclass calibrated without making any assumptions on the underlying data distribution. In an empirical evaluation with four deep net architectures on CIFAR-10 and CIFAR-100, we find that the M2B + HB procedure achieves lower top-label and class-wise calibration error than other approaches such as temperature scaling. Code for this work is available at https://github.com/aigen/df-posthoc-calibration.
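
To make the reduction described in the abstract concrete, here is a minimal sketch of one way top-label calibration via histogram binning could look. It is not the authors' implementation (see the linked repository for that). The function names `top_label`, `fit_top_label_hb`, and `predict_top_label_hb`, the default `n_bins=2`, the quantile-style choice of bin edges, and the fallback for empty bins are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def top_label(probs):
    """Return (predicted class, reported confidence) for one probability vector."""
    conf = max(probs)
    return probs.index(conf), conf


def fit_top_label_hb(prob_rows, labels, n_bins=2):
    """Fit one binary histogram-binning calibrator per predicted class (illustrative)."""
    # Reduction step: split the calibration set by predicted (top) label, so each
    # group becomes a binary problem "was the predicted class correct?".
    groups = defaultdict(list)
    for probs, y in zip(prob_rows, labels):
        pred, conf = top_label(probs)
        groups[pred].append((conf, 1.0 if y == pred else 0.0))

    calibrators = {}
    for pred, pairs in groups.items():
        confs = sorted(c for c, _ in pairs)
        n, k = len(confs), min(n_bins, len(confs))
        # Interior bin edges chosen so bins hold roughly equal numbers of points
        # (an assumed binning rule; ties in confidence can unbalance the bins).
        edges = [confs[i * n // k] for i in range(1, k)]
        sums, counts = [0.0] * k, [0] * k
        for c, hit in pairs:
            b = bisect_right(edges, c)
            sums[b] += hit
            counts[b] += 1
        overall = sum(hit for _, hit in pairs) / n
        # Each bin reports its empirical accuracy; empty bins fall back to the
        # accuracy over the whole group (an assumption of this sketch).
        means = [s / m if m else overall for s, m in zip(sums, counts)]
        calibrators[pred] = (edges, means)
    return calibrators


def predict_top_label_hb(calibrators, probs):
    """Return (predicted class, recalibrated confidence) for one probability vector."""
    pred, conf = top_label(probs)
    if pred not in calibrators:
        # Class never predicted on the calibration set: keep the raw confidence.
        return pred, conf
    edges, means = calibrators[pred]
    return pred, means[bisect_right(edges, conf)]
```

In use, `fit_top_label_hb` would be called once on held-out calibration data (rows of class probabilities plus true labels), and `predict_top_label_hb` would then be applied to each new probability vector. The predicted class is left unchanged; only the confidence attached to it is replaced by the accuracy observed in the matching bin for that class.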
