Likelihood-Free Parameter Estimation with Neural Bayes Estimators (2208.12942v5)
Abstract: Neural point estimators are neural networks that map data to parameter point estimates. They are fast, likelihood-free and, due to their amortised nature, amenable to fast bootstrap-based uncertainty quantification. In this paper, we aim to increase statisticians' awareness of this relatively new inferential tool, and to facilitate its adoption by providing user-friendly open-source software. We also give attention to the ubiquitous problem of making inference from replicated data, which we address in the neural setting using permutation-invariant neural networks. Through extensive simulation studies we show that these neural point estimators can quickly and optimally (in a Bayes sense) estimate parameters in weakly-identified and highly-parameterised models with relative ease. We demonstrate their applicability through an analysis of extreme sea-surface temperature in the Red Sea where, after training, we obtain parameter estimates and bootstrap-based confidence intervals from hundreds of spatial fields in a fraction of a second.
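To make the idea of a permutation-invariant estimator for replicated data concrete, the following is a minimal sketch in the Deep Sets style: an inner network is applied to each replicate, the results are averaged, and an outer network maps the pooled summary to a parameter estimate. It is an illustration under stated assumptions, not the paper's software or its API. The names `psi`, `phi` and `estimate`, the layer sizes, the choice of 16 observations per replicate and 2 parameters are all hypothetical, and the weights are random (untrained), so the output values themselves carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Random (untrained) weights and zero biases for one dense layer."""
    return rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out)

def dense(x, layer, activate=True):
    """Affine map followed by an optional ReLU."""
    w, b = layer
    z = x @ w + b
    return np.maximum(z, 0.0) if activate else z

# Hypothetical sizes: each replicate has 16 values; we estimate 2 parameters.
psi = [init_layer(16, 32), init_layer(32, 8)]   # inner network, applied per replicate
phi = [init_layer(8, 32), init_layer(32, 2)]    # outer network, applied to the pooled summary

def estimate(replicates):
    """Map an (m, 16) array of m replicates to a length-2 point estimate."""
    h = replicates
    for layer in psi:
        h = dense(h, layer)          # same inner network for every replicate
    pooled = h.mean(axis=0)          # averaging removes any dependence on replicate order
    out = dense(pooled, phi[0])
    return dense(out, phi[1], activate=False)

Z = rng.normal(size=(10, 16))                     # 10 simulated replicates (placeholder data)
theta_hat = estimate(Z)
theta_hat_shuffled = estimate(Z[rng.permutation(10)])
print(np.allclose(theta_hat, theta_hat_shuffled))  # reordering replicates leaves the estimate unchanged
```

Because the pooling step is an average, the same function accepts any number of replicates `m`. In practice the weights would be trained on simulated parameter-data pairs to minimise an empirical Bayes risk, a step this sketch omits.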