
Applicability of temperature-invariance results to hierarchical and non-identifiable models

Determine for which hierarchical Bayesian models the asymptotic result holds that the choice of temperature τ in power posteriors has no meaningful impact on posterior predictive performance, and ascertain how this result extends to statistically non-identifiable models such as Bayesian neural networks, where standard identifiability and concentration assumptions may fail.


Background

The paper proves that, under broad posterior concentration conditions, posterior predictive distributions based on power posteriors become asymptotically indistinguishable from plug-in predictives, implying that the choice of temperature τ has vanishing impact on predictive accuracy in moderate-to-large samples. These results are established in total variation and Kullback–Leibler divergence, and extend to generalized Bayes posteriors under analogous concentration assumptions.
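The intuition behind this claim can be sketched in a conjugate Gaussian model (an illustration constructed here, not taken from the paper; conventions for tempering vary, and we take the power posterior to raise the likelihood to the power 1/τ). With known observation variance, the tempered posterior is still Gaussian, its variance shrinks at rate τ/n regardless of τ, and so the posterior predictive collapses onto the plug-in predictive N(θ̂, σ²) as n grows. The prior hyperparameters `mu0`, `s02` below are arbitrary choices for the sketch.

```python
import numpy as np

def power_posterior_predictive(x, tau, sigma2=1.0, mu0=0.0, s02=10.0):
    """Power posterior predictive in a conjugate Gaussian model.

    Likelihood tempered as L(theta)^(1/tau); prior N(mu0, s02).
    Returns mean and variance of the predictive for a new observation.
    """
    n = len(x)
    prec = n / (sigma2 * tau) + 1.0 / s02          # tempered posterior precision
    post_var = 1.0 / prec
    post_mean = post_var * (x.sum() / (sigma2 * tau) + mu0 / s02)
    # Predictive for a new point: N(post_mean, sigma2 + post_var)
    return post_mean, sigma2 + post_var

def kl_normal(m1, v1, m2, v2):
    """KL( N(m1, v1) || N(m2, v2) ) in nats."""
    return 0.5 * (np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

rng = np.random.default_rng(0)
for n in (10, 100, 10_000):
    x = rng.normal(0.5, 1.0, size=n)
    mle = x.mean()                                 # plug-in predictive N(mle, sigma2)
    for tau in (0.5, 1.0, 2.0):                    # cold, standard, warm
        m, v = power_posterior_predictive(x, tau)
        print(f"n={n:6d}  tau={tau}  KL to plug-in = {kl_normal(m, v, mle, 1.0):.2e}")
```

For every τ, the KL divergence to the plug-in predictive vanishes as n grows, which is the temperature-invariance phenomenon in miniature. The open question is precisely whether this picture survives in hierarchical or non-identifiable models, where the posterior need not concentrate this way.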

In the discussion, the authors note limits of their assumptions in certain modeling regimes. Coarsened posteriors intentionally prevent concentration and are therefore outside the current theory. The authors explicitly state uncertainty about the scope of their results for hierarchical models and for statistically non-identifiable models such as Bayesian neural networks, where the cold posterior effect has been observed empirically and where the paper’s assumptions are typically violated. Clarifying applicability in these settings is left as an open question.

References

It is also not clear for which hierarchical models our results hold, or how our results map to statistically non-identifiable models like Bayesian neural networks.

Predictive performance of power posteriors (2408.08806 - McLatchie et al., 16 Aug 2024) in Section 6 (Discussion)