L-C2ST: Local Diagnostics for Posterior Approximations in Simulation-Based Inference (2306.03580v2)
Abstract: Many recent works in simulation-based inference (SBI) rely on deep generative models to approximate complex, high-dimensional posterior distributions. However, evaluating whether these approximations can be trusted remains a challenge. Most approaches evaluate the posterior estimator only in expectation over the observation space. This limits their interpretability and is not sufficient to identify the observations for which the approximation can be trusted or should be improved. Building upon the well-known classifier two-sample test (C2ST), we introduce L-C2ST, a new method that allows for a local evaluation of the posterior estimator at any given observation. It offers theoretically grounded and easy-to-interpret (e.g. graphical) diagnostics and, unlike C2ST, does not require access to samples from the true posterior. In the case of normalizing flow-based posterior estimators, L-C2ST can be specialized to offer better statistical power while being computationally more efficient. On standard SBI benchmarks, L-C2ST provides results comparable to C2ST and outperforms alternative local approaches such as coverage tests based on highest predictive density (HPD). We further highlight the importance of local evaluation and the benefit of the interpretability of L-C2ST on a challenging application from computational neuroscience.
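For readers unfamiliar with the classifier two-sample test that the abstract builds on, the following is a minimal sketch of the vanilla C2ST idea only, not of L-C2ST and not of the authors' implementation. It assumes one-dimensional samples, a k-nearest-neighbour classifier, and a normal approximation of the held-out accuracy under the null hypothesis; the function names `c2st_accuracy` and `c2st_p_value` and the toy Gaussian data are hypothetical choices made for illustration.

```python
import math
import random


def c2st_accuracy(p_samples, q_samples, k=5, seed=0):
    """Held-out accuracy of a k-NN classifier separating P (label 0) from Q (label 1).

    Illustrative assumption: samples are scalars and k is odd (no voting ties).
    """
    rng = random.Random(seed)
    data = [(x, 0) for x in p_samples] + [(x, 1) for x in q_samples]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]

    correct = 0
    for x, label in test:
        # Majority vote among the k training points closest to x.
        neighbours = sorted(train, key=lambda t: abs(t[0] - x))[:k]
        votes = sum(lbl for _, lbl in neighbours)
        pred = 1 if 2 * votes > k else 0
        correct += pred == label
    return correct / len(test), len(test)


def c2st_p_value(accuracy, n_test):
    """One-sided p-value, approximating the null accuracy as Normal(0.5, 1 / (4 * n_test))."""
    z = (accuracy - 0.5) / math.sqrt(0.25 / n_test)
    return 0.5 * math.erfc(z / math.sqrt(2))


# Toy data (hypothetical): two identical Gaussians, then a shifted one.
rng = random.Random(1)
same_p = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_q = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted_q = [rng.gauss(2.0, 1.0) for _ in range(200)]

acc_same, n_same = c2st_accuracy(same_p, same_q)
acc_diff, n_diff = c2st_accuracy(same_p, shifted_q)

print("same distribution:    accuracy", acc_same, "p-value", c2st_p_value(acc_same, n_same))
print("shifted distribution: accuracy", acc_diff, "p-value", c2st_p_value(acc_diff, n_diff))
```

An accuracy close to 0.5 means the classifier cannot tell the two sample sets apart, while an accuracy well above 0.5 is evidence that the distributions differ. Note that this plain test needs samples from both distributions, which in the SBI setting would mean samples from the true posterior; the abstract states that L-C2ST removes that requirement.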