
Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications (2405.14892v1)

Published 18 May 2024 in cs.DC and stat.CO

Abstract: Addressing the statistical challenge of computing the multivariate normal (MVN) probability in high dimensions holds significant potential for enhancing various applications. One common way to compute high-dimensional MVN probabilities is the Separation-of-Variables (SOV) algorithm. This algorithm is known for its high computational complexity of $O(n^3)$ and space complexity of $O(n^2)$, mainly due to a Cholesky factorization of an $n \times n$ covariance matrix, where $n$ is the dimensionality of the MVN problem. This work proposes a high-performance computing framework that allows scaling the SOV algorithm and, subsequently, the confidence region detection algorithm. The framework leverages parallel linear algebra algorithms with a task-based programming model to achieve performance scalability in computing process probabilities, especially on large-scale systems. In addition, we enhance our implementation by incorporating Tile Low-Rank (TLR) approximation techniques to reduce algorithmic complexity without compromising the necessary accuracy. To evaluate the performance and accuracy of our framework, we conduct assessments using simulated data and a wind speed dataset. Our proposed implementation effectively handles high-dimensional MVN probability computations on shared- and distributed-memory systems using finite-precision arithmetic and TLR approximation. Performance results show a speedup of up to 20X in solving the MVN problem using TLR approximation compared to the reference dense solution, without sacrificing the application's accuracy. The qualitative results on synthetic and real datasets demonstrate that we maintain high accuracy in detecting confidence regions even when relying on TLR approximation to perform the underlying linear algebra operations.
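
To make the role of the Cholesky factor concrete, the following is a minimal sketch of a sequential Monte Carlo form of the SOV estimator for $P(a \le X \le b)$ with $X \sim N(0, \Sigma)$. It is a dense, serial, pure-Python illustration and not the paper's parallel or TLR implementation. The function names (`cholesky_lower`, `sov_mvn_probability`), the sample count, the seed, and the use of plain pseudo-random numbers are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: cdf and inverse cdf


def cholesky_lower(sigma):
    """Dense Cholesky factor L with sigma = L L^T (O(n^3) time, O(n^2) space)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def sov_mvn_probability(lower, upper, sigma, num_samples=20000, seed=0):
    """Estimate P(lower <= X <= upper) for X ~ N(0, sigma) by SOV sampling.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    L = cholesky_lower(sigma)
    n = len(lower)
    eps = 1e-12  # keeps the inverse-cdf argument strictly inside (0, 1)
    total = 0.0
    for _ in range(num_samples):
        y = []          # sampled values of the transformed variables
        weight = 1.0    # running product of the conditional interval masses
        for i in range(n):
            # Contribution of the already-sampled variables to row i.
            shift = sum(L[i][j] * y[j] for j in range(i))
            d = _STD_NORMAL.cdf((lower[i] - shift) / L[i][i])
            e = _STD_NORMAL.cdf((upper[i] - shift) / L[i][i])
            weight *= e - d
            if weight <= 0.0:
                weight = 0.0
                break
            # Draw y_i from the standard normal truncated to the interval.
            u = d + rng.random() * (e - d)
            u = min(max(u, eps), 1.0 - eps)
            y.append(_STD_NORMAL.inv_cdf(u))
        total += weight
    return total / num_samples
```

As a sanity check, calling `sov_mvn_probability([-math.inf, -math.inf], [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])` should land close to the closed-form bivariate orthant probability $1/4 + \arcsin(0.5)/(2\pi) = 1/3$. In this sketch the factorization is computed once and dominates the cost as $n$ grows, which is the step the abstract describes accelerating with parallel and TLR linear algebra.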

