Bayesian Machine Learning meets Formal Methods: An application to spatio-temporal data (2110.01360v3)
Abstract: We propose an interdisciplinary framework that combines Bayesian predictive inference, a well-established tool in Machine Learning, with Formal Methods rooted in the computer science community. Bayesian predictive inference allows for coherently incorporating uncertainty about unknown quantities by making use of methods or models that produce predictive distributions, which in turn inform decision problems. By formalizing these decision problems into properties with the help of spatio-temporal logic, we can formulate and predict how likely such properties are to be satisfied in the future at a certain location. Moreover, we can leverage our methodology to evaluate and compare models directly on their ability to predict the satisfaction of application-driven properties. The approach is illustrated in an urban mobility application, where the crowdedness in the center of Milan is proxied by aggregated mobile phone traffic data. We specify several desirable spatio-temporal properties related to city crowdedness such as a fault-tolerant network or the reachability of hospitals. After verifying these properties on draws from the posterior predictive distributions, we compare several spatio-temporal Bayesian models based on their overall and property-based predictive performance.
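The core step described above, checking a property on each posterior predictive draw and reading the fraction of satisfying draws as a predicted satisfaction probability, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the array layout `(n_draws, n_times, n_locations)`, the function names, and the example property ("crowdedness at one location stays below a threshold over the whole forecast horizon") are hypothetical, and a hand-coded Boolean check stands in for a spatio-temporal logic monitor. The Brier score is used here as one possible property-based score; the abstract does not say which score the paper uses.

```python
import numpy as np


def satisfaction_probability(draws, location, threshold):
    """Monte Carlo estimate of the probability that a property holds.

    draws: posterior predictive draws with the assumed (hypothetical)
        shape (n_draws, n_times, n_locations).
    Illustrative property: the signal at `location` is below `threshold`
    at every time step of the forecast horizon.
    """
    draws = np.asarray(draws, dtype=float)
    # One Boolean verdict per draw: True if the property holds on that draw.
    satisfied = (draws[:, :, location] < threshold).all(axis=1)
    # The fraction of satisfying draws estimates the satisfaction probability.
    return float(satisfied.mean())


def brier_score(predicted_prob, observed_satisfied):
    """Squared error between a predicted probability and the 0/1 outcome."""
    return (predicted_prob - float(observed_satisfied)) ** 2


# Toy draws: 4 draws, 2 time steps, 1 location (made-up numbers).
toy_draws = np.array([
    [[1.0], [2.0]],
    [[1.0], [5.0]],
    [[0.0], [0.0]],
    [[9.0], [9.0]],
])
prob = satisfaction_probability(toy_draws, location=0, threshold=3.0)
score = brier_score(prob, observed_satisfied=True)
```

Computing such a score per model and per property, once the held-out data reveal whether the property was actually satisfied, gives one way to rank models on the application-driven properties rather than on overall fit alone; lower Brier scores are better.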