Comparison of Machine Learning Classification Algorithms and Application to the Framingham Heart Study (2402.15005v1)
Abstract: The use of machine learning algorithms in healthcare can amplify social injustices and health inequities. While biases can arise and compound during problem selection, data collection, and outcome definition, this research addresses generalizability impediments that occur during the development and post-deployment of machine learning classification algorithms. Using the Framingham coronary heart disease data as a case study, we show how to effectively select a probability cutoff to convert a regression model for a dichotomous variable into a classifier. We then compare the sampling distributions of the predictive performance of eight machine learning classification algorithms under four training/testing scenarios to test their generalizability and their potential to perpetuate biases. We show that both Extreme Gradient Boosting and Support Vector Machine are flawed when trained on an unbalanced dataset. We introduce the double discriminant scoring of type I and show that it is the most generalizable, as it consistently outperforms the other classification algorithms regardless of the training/testing scenario. Finally, we introduce a methodology to extract an optimal variable hierarchy for a classification algorithm, and illustrate it on the overall, male, and female Framingham coronary heart disease data.
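To make the idea of converting predicted probabilities into a classifier concrete, here is a minimal sketch of cutoff selection. The abstract does not say which criterion the authors use, so the sketch assumes one common choice, Youden's J statistic (sensitivity + specificity - 1), evaluated over a grid of candidate cutoffs. The function name `select_cutoff`, the grid, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_cutoff(y_true, y_prob, grid=None):
    """Return (cutoff, J) maximizing Youden's J over a grid of cutoffs.

    Assumption: Youden's J is used here only for illustration; it may
    differ from the criterion used in the paper.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    if grid is None:
        # Hypothetical default grid: 0.01, 0.02, ..., 0.99
        grid = np.linspace(0.01, 0.99, 99)

    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("y_true must contain both classes")

    best_cutoff, best_j = None, -np.inf
    for c in grid:
        pred = y_prob >= c                          # classify as positive at or above the cutoff
        sens = (pred & y_true).sum() / n_pos        # true positive rate
        spec = (~pred & ~y_true).sum() / n_neg      # true negative rate
        j = sens + spec - 1.0
        if j > best_j:                              # keep the first (lowest) maximizer
            best_cutoff, best_j = float(c), float(j)
    return best_cutoff, best_j


if __name__ == "__main__":
    # Toy, unbalanced example (made-up numbers, not Framingham data).
    labels = [0, 0, 0, 0, 1, 1]
    probs = [0.10, 0.20, 0.30, 0.32, 0.60, 0.90]
    print(select_cutoff(labels, probs))
```

With an unbalanced outcome, the default cutoff of 0.5 often classifies almost every observation as negative, which is why a data-driven cutoff is usually preferred. The cutoff should be chosen on training or validation data rather than on the test set.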