
A deterministic and computable Bernstein-von Mises theorem (1904.02505v2)

Published 4 Apr 2019 in math.ST, cs.LG, and stat.TH

Abstract: Bernstein-von Mises (BvM) results establish that the Laplace approximation is asymptotically correct in the large-data limit. However, these results are ill-suited for computational purposes, since they hold only over most, not all, datasets and involve hard-to-estimate constants. In this article, I present a new BvM theorem which bounds the Kullback-Leibler (KL) divergence between a fixed log-concave density $f\left(\boldsymbol{\theta}\right)$ and its Laplace approximation. The bound goes to $0$ as the higher derivatives of $f\left(\boldsymbol{\theta}\right)$ tend to $0$ and $f\left(\boldsymbol{\theta}\right)$ becomes increasingly Gaussian. The classical BvM theorem in the IID large-data asymptote is recovered as a corollary. Critically, this theorem further suggests a number of computable approximations of the KL divergence, the most promising being

$$
KL\left(g_{LAP},f\right)\approx\frac{1}{2}\operatorname{Var}_{\boldsymbol{\theta}\sim g\left(\boldsymbol{\theta}\right)}\left(\log\left[f\left(\boldsymbol{\theta}\right)\right]-\log\left[g_{LAP}\left(\boldsymbol{\theta}\right)\right]\right).
$$

An empirical investigation of these bounds in the logistic classification model reveals that these approximations are accurate surrogates for the KL divergence. This result, and future results of a similar nature, could provide a path towards rigorously controlling the error due to the Laplace approximation and to more modern approximation methods.
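The variance approximation above is straightforward to estimate by Monte Carlo. The following is a minimal illustrative sketch, not the paper's implementation: it uses a hypothetical one-dimensional log-concave density $f(\theta)\propto\exp(-\theta^2/2-\epsilon\,\theta^4/4)$ (a small quartic perturbation of a Gaussian), builds its Laplace approximation $g_{LAP}$ at the mode, and estimates $\tfrac{1}{2}\mathrm{Var}(\log f - \log g_{LAP})$ by sampling from $g_{LAP}$ (the abstract writes the variance under $g$; sampling from the Laplace approximation is one natural reading). Normalising constants cancel inside the variance, so unnormalised log-densities suffice.

```python
import numpy as np

EPS = 0.1  # strength of the quartic perturbation (illustrative choice)

def log_f_unnorm(theta):
    """Unnormalised log-density: a log-concave perturbation of a Gaussian."""
    return -0.5 * theta**2 - EPS * theta**4 / 4.0

# Laplace approximation: mode at 0; precision = -d^2/dtheta^2 log f at the mode
# = 1 + 3*EPS*theta^2 evaluated at theta = 0, i.e. 1.
mode, precision = 0.0, 1.0
sigma = precision ** -0.5

rng = np.random.default_rng(0)
samples = rng.normal(mode, sigma, size=200_000)

# Unnormalised log g_LAP (its normalising constant also cancels in the variance).
log_g = -0.5 * precision * (samples - mode) ** 2

# KL(g_LAP, f) ~ (1/2) Var( log f(theta) - log g_LAP(theta) ), theta ~ g_LAP.
kl_approx = 0.5 * np.var(log_f_unnorm(samples) - log_g)
print(kl_approx)
```

For this toy density the difference $\log f-\log g_{LAP}$ is exactly $-\epsilon\,\theta^4/4$, so the estimate is small (about $0.03$ for $\epsilon=0.1$), consistent with the theorem's message that the KL divergence vanishes as the higher derivatives of $f$ shrink.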

Citations (14)
