
Interpretable Machine Learning for Survival Analysis (2403.10250v2)

Published 15 Mar 2024 in stat.ML, cs.LG, and stat.ME

Abstract: With the spread and rapid advancement of black-box machine learning models, the field of interpretable machine learning (IML), or explainable artificial intelligence (XAI), has become increasingly important over the last decade. This is particularly relevant for survival analysis, where the adoption of IML techniques promotes transparency, accountability, and fairness in sensitive areas such as clinical decision-making processes, the development of targeted therapies and interventions, and other medical or healthcare-related contexts. More specifically, explainability can uncover a survival model's potential biases and limitations and provide more mathematically sound ways to understand how and which features are influential for prediction or constitute risk factors. However, the lack of readily available IML methods may have deterred medical practitioners and policy makers in public health from leveraging the full potential of machine learning for predicting time-to-event data. We present a comprehensive review of the limited existing work on IML methods for survival analysis within the context of the general IML taxonomy. In addition, we formally detail how commonly used IML methods, such as individual conditional expectation (ICE), partial dependence plots (PDP), accumulated local effects (ALE), various feature importance measures, and Friedman's H-statistic for interactions, can be adapted to survival outcomes. An application of several IML methods to real data on under-5 mortality of Ghanaian children from the Demographic and Health Surveys (DHS) Program serves as a tutorial and guide for researchers on how to use these techniques in practice to facilitate understanding of model decisions and predictions.
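To make the adaptation of ICE and PDP to survival outcomes concrete, here is a minimal sketch. The key idea is that, unlike standard regression, a survival model predicts a survival probability S(t | x), so ICE/PDP curves are computed per time horizon: vary one feature over a grid, predict S(t | x) for each observation, and average for the PDP. The `predict_survival` function below is a hypothetical stand-in for any fitted black-box survival model (e.g. a random survival forest); its parametric form is an assumption for illustration only.

```python
import numpy as np

def predict_survival(X, t=5.0):
    """Hypothetical survival model: returns S(t | x) at horizon t.

    A toy exponential model whose hazard rate increases with the
    features; in practice this would be any fitted black-box model.
    """
    rate = 0.05 + 0.02 * X[:, 0] + 0.01 * X[:, 1]
    return np.exp(-rate * t)

def survival_ice_pdp(X, j, grid, t=5.0):
    """Survival ICE curves for feature j at horizon t; PDP is their mean.

    For each grid value v, feature j is set to v for every observation
    while all other features stay fixed, and S(t | x) is re-predicted.
    """
    ice = np.empty((X.shape[0], grid.size))
    for k, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, j] = v
        ice[:, k] = predict_survival(Xv, t)
    return ice, ice.mean(axis=0)

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
grid = np.linspace(0, 10, 25)
ice, pdp = survival_ice_pdp(X, j=0, grid=grid)
```

Plotting each row of `ice` (one curve per observation) alongside `pdp` over `grid` then shows how the predicted survival probability at the chosen horizon responds to the feature; repeating this over several horizons t recovers the time-dependent effect profile discussed in the paper.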
