
Efficient Training of Probabilistic Neural Networks for Survival Analysis (2404.06421v3)

Published 9 Apr 2024 in cs.LG

Abstract: Variational Inference (VI) is a commonly used technique for approximate Bayesian inference and uncertainty estimation in deep learning models, yet it comes at a computational cost, as it doubles the number of trainable parameters to represent uncertainty. This rapidly becomes challenging in high-dimensional settings and motivates the use of alternative techniques for inference, such as Monte Carlo Dropout (MCD) or Spectral-normalized Neural Gaussian Process (SNGP). However, such methods have seen little adoption in survival analysis, and VI remains the prevalent approach for training probabilistic neural networks. In this paper, we investigate how to train deep probabilistic survival models on large datasets without introducing additional overhead in model complexity. To achieve this, we adopt three probabilistic approaches, namely VI, MCD, and SNGP, and evaluate them in terms of their prediction performance, calibration performance, and model complexity. In the context of probabilistic survival analysis, we investigate whether non-VI techniques can offer comparable or possibly improved prediction performance and uncertainty calibration compared to VI. On the MIMIC-IV dataset, we find that MCD matches VI in terms of the concordance index (0.748 vs. 0.743) and mean absolute error (254.9 vs. 254.7) using hinge loss, while providing C-calibrated uncertainty estimates. Moreover, our SNGP implementation provides D-calibrated survival functions in all datasets compared to VI (4/4 vs. 2/4, respectively). Our work encourages the use of techniques alternative to VI for survival analysis in high-dimensional datasets, where computational efficiency and overhead are of concern.
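The abstract's central argument is that MCD delivers uncertainty estimates without the parameter doubling that VI incurs. As a rough illustration of that mechanism only (a minimal sketch, not the authors' implementation; the network shape, layer sizes, and all names below are hypothetical), MC Dropout keeps dropout active at inference time and treats repeated stochastic forward passes as approximate posterior samples, so a predictive mean and spread come from the same parameter count as the deterministic network:

```python
import torch
import torch.nn as nn

class MCDropoutSurvivalNet(nn.Module):
    """Hypothetical feed-forward survival model; dropout stays active at test time."""
    def __init__(self, n_features: int, hidden: int = 64, p_drop: float = 0.25):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),   # stochastic at inference under MC Dropout
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, 1), # e.g. a predicted survival time or log-risk score
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 100):
    """Run n_samples stochastic forward passes; return the predictive mean and
    standard deviation (the MC Dropout uncertainty estimate)."""
    model.train()  # keep dropout layers active; no weight updates are made
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

# Example: predictive mean and uncertainty for a batch of 8 patients.
model = MCDropoutSurvivalNet(n_features=20)
x = torch.randn(8, 20)
mean, std = mc_dropout_predict(model, x)
```

The point that connects back to the abstract's motivation is the parameter count: this network has exactly as many trainable parameters as its deterministic counterpart, whereas a mean-field VI layer would carry a second (variance) parameter for every weight.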
