
Statistically Optimal Uncertainty Quantification for Expensive Black-Box Models (2408.05887v1)

Published 12 Aug 2024 in stat.ME and stat.CO

Abstract: Uncertainty quantification, by means of confidence interval (CI) construction, is a fundamental problem in statistics and is also important in risk-aware decision-making. In this paper, we revisit the basic problem of CI construction, but in the setting of expensive black-box models. This means we are confined to a small number of model runs and cannot obtain auxiliary model information such as gradients. In this case, there exist classical methods based on data splitting, and newer methods based on suitable resampling. However, while all the resulting CIs have similarly accurate coverage in large samples, their efficiencies in terms of interval length differ, and a systematic understanding of which method and configuration attains the shortest interval appears open. Motivated by this, we create a theoretical framework to study statistical optimality of CI tightness under computational constraints. Our theory shows that standard batching, but also carefully constructed new formulas using uneven-size or overlapping batches, batched jackknife, and the so-called cheap bootstrap and its weighted generalizations, are statistically optimal. Our developments build on a new bridge between the classical notion of uniformly most accurate unbiasedness and batching and resampling, by viewing model runs as asymptotically Gaussian "data", as well as a suitable notion of homogeneity for CIs.
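For readers unfamiliar with the standard batching method named in the abstract, the following is a minimal sketch of the textbook construction, not the paper's own code. It assumes the simplest case, where the target quantity is the mean of i.i.d. model outputs. The names `batching_ci`, `expensive_model`, and `T_975` are hypothetical, and the Student-t critical values are hardcoded for a 95% interval only.

```python
import random
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def batching_ci(runs, num_batches):
    """95% CI for the mean of `runs` using equal-size, non-overlapping batches."""
    if not 2 <= num_batches <= 10:
        raise ValueError("this sketch supports 2 to 10 batches")
    batch_size = len(runs) // num_batches
    if batch_size < 1:
        raise ValueError("need at least one run per batch")
    # Runs left over after the last full batch are ignored.
    batch_means = [
        statistics.fmean(runs[i * batch_size:(i + 1) * batch_size])
        for i in range(num_batches)
    ]
    center = statistics.fmean(batch_means)
    # The sample standard deviation of the batch means estimates their spread;
    # the t quantile has num_batches - 1 degrees of freedom.
    half_width = (T_975[num_batches - 1]
                  * statistics.stdev(batch_means) / num_batches ** 0.5)
    return center - half_width, center + half_width


def expensive_model(rng):
    """Hypothetical stand-in for one costly black-box model run."""
    return rng.gauss(1.0, 2.0)


rng = random.Random(0)
runs = [expensive_model(rng) for _ in range(50)]
lo, hi = batching_ci(runs, num_batches=5)
print(f"95% CI from 5 batches of 10 runs: [{lo:.3f}, {hi:.3f}]")
```

The interval treats the batch estimates as approximately Gaussian, which matches the abstract's view of model runs as asymptotically Gaussian "data". Fewer batches mean fewer degrees of freedom and a larger t quantile, which is one reason interval length depends on the configuration.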
