Optimal Sparse Survival Trees (2401.15330v3)

Published 27 Jan 2024 in cs.LG

Abstract: Interpretability is crucial for doctors, hospitals, pharmaceutical companies, and biotechnology corporations to analyze and make decisions for high-stakes problems that involve human health. Tree-based methods have been widely adopted for survival analysis due to their appealing interpretability and their ability to capture complex relationships. However, most existing methods to produce survival trees rely on heuristic (or greedy) algorithms, which risk producing sub-optimal models. We present a dynamic-programming-with-bounds approach that finds provably optimal sparse survival tree models, frequently in only a few seconds.
