Interpretable Machine Learning for Survival Analysis (2403.10250v2)
Abstract: With the spread and rapid advancement of black box machine learning models, the field of interpretable machine learning (IML) or explainable artificial intelligence (XAI) has become increasingly important over the last decade. This is particularly relevant for survival analysis, where the adoption of IML techniques promotes transparency, accountability and fairness in sensitive areas such as clinical decision making, the development of targeted therapies and interventions, and other medical or healthcare related contexts. More specifically, explainability can uncover a survival model's potential biases and limitations and provide more mathematically sound ways to understand how and which features influence predictions or constitute risk factors. However, the lack of readily available IML methods may have deterred medical practitioners and policy makers in public health from leveraging the full potential of machine learning for predicting time-to-event data. We present a comprehensive review of the limited existing work on IML methods for survival analysis within the context of the general IML taxonomy. In addition, we formally detail how commonly used IML methods, such as individual conditional expectation (ICE), partial dependence plots (PDP), accumulated local effects (ALE), various feature importance measures and Friedman's H-statistic for interactions, can be adapted to survival outcomes. An application of several IML methods to real data on under-5 mortality of Ghanaian children from the Demographic and Health Surveys (DHS) Program serves as a tutorial for researchers on how to utilize these techniques in practice to facilitate understanding of model decisions and predictions.
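The adaptation of ICE and PDP to survival outcomes mentioned above can be sketched as follows: instead of averaging a scalar prediction, one fixes a feature to each grid value for all observations (the ICE step) and averages the model's predicted survival probability at a chosen time point. This is a minimal illustration, not the authors' implementation; the toy exponential survival model and all function names here are hypothetical stand-ins for any fitted black-box survival learner.

```python
import numpy as np

def predict_survival(X, t):
    """Hypothetical black-box survival model: returns S(t | x) per row.
    Here a toy exponential model S(t|x) = exp(-lambda(x) * t) stands in
    for any fitted survival learner (e.g. a random survival forest)."""
    lam = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1])  # toy log-hazard
    return np.exp(-lam * t)

def survival_pdp(X, feature, grid, t, predict_fn=predict_survival):
    """Partial dependence of S(t | x) on one feature: for each grid
    value v, set that feature to v for every observation (ICE step),
    then average the predicted survival probabilities (PDP step)."""
    pdp = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v                      # intervene on the feature
        pdp.append(predict_fn(Xv, t).mean())    # average the ICE values
    return np.array(pdp)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-2.0, 2.0, 5)
pd_curve = survival_pdp(X, feature=0, grid=grid, t=1.0)
```

Because the toy hazard increases with the first feature, the resulting partial dependence curve for survival probability is decreasing over the grid; with a real fitted model, repeating this over a grid of time points yields partial dependence surfaces over time.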