Application of machine learning for hematological diagnosis (1708.00253v1)

Published 1 Aug 2017 in stat.ML

Abstract: Quick and accurate medical diagnosis is crucial for the successful treatment of a disease. Using machine learning algorithms, we have built two models to predict a hematologic disease, based on laboratory blood test results. In one predictive model, we used all available blood test parameters and in the other a reduced set, which is usually measured upon patient admittance. Both models produced good results, with a prediction accuracy of 0.88 and 0.86, when considering the list of five most probable diseases, and 0.59 and 0.57, when considering only the most probable disease. Models did not differ significantly from each other, which indicates that a reduced set of parameters contains a relevant fingerprint of a disease, expanding the utility of the model for general practitioner's use and indicating that there is more information in the blood test results than physicians recognize. In the clinical test we showed that the accuracy of our predictive models was on a par with the ability of hematology specialists. Our study is the first to show that a machine learning predictive model based on blood tests alone, can be successfully applied to predict hematologic diseases and could open up unprecedented possibilities in medical diagnosis.

Summary

The paper develops two random forest models (SBA-HEM181 and SBA-HEM061) using 8233 cases to predict 43 hematologic diseases.
It demonstrates that while single disease predictions achieve 57–59% accuracy, top-5 predictions improve to 86–88% accuracy.
The study shows that machine learning models perform comparably to hematology specialists, highlighting their potential as diagnostic aids.

Application of Machine Learning for Hematological Diagnosis: A Critical Evaluation

The paper "Application of Machine Learning for Hematological Diagnosis" by Gunčar et al. investigates the use of machine learning algorithms to predict hematologic diseases from laboratory blood test data. The research centers on two predictive models, one built from all available blood test parameters and one from a reduced set, and represents a notable step toward integrating machine learning into medical diagnostics, particularly hematology.

Methodology and Model Performance

The paper uses a random forest algorithm to develop two models, SBA-HEM181 and SBA-HEM061, which use 181 and 61 blood parameters respectively. Both were built from 8233 cases collected over a decade at the University Medical Centre Ljubljana, covering 43 hematological disease categories. Model performance was estimated with stratified ten-fold cross-validation: when only the most probable disease was counted, SBA-HEM181 reached a prediction accuracy of 59% and SBA-HEM061 reached 57%. When the five most probable diseases were considered, accuracy rose to 88% and 86% respectively. The small gap between the two models suggests that the reduced parameter set still captures the relevant "fingerprint" of a disease.
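The difference between the single-disease and top-five figures comes down to how a prediction is scored. The following is a minimal sketch of that scoring, not the authors' code: it assumes a classifier has already produced a probability for each disease class per case, and the function name `top_k_accuracy`, the class labels, and the toy numbers are all hypothetical.

```python
from typing import Dict, List, Sequence


def top_k_accuracy(
    probabilities: Sequence[Dict[str, float]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose true label is among the k most probable classes."""
    if len(probabilities) != len(true_labels):
        raise ValueError("probabilities and true_labels must have equal length")
    if not true_labels:
        raise ValueError("at least one case is required")
    hits = 0
    for class_probs, truth in zip(probabilities, true_labels):
        # Rank classes from most to least probable and keep the first k.
        ranked: List[str] = sorted(class_probs, key=class_probs.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical toy data: three cases, four made-up class names (not from the paper).
toy_probs = [
    {"A": 0.6, "B": 0.2, "C": 0.1, "D": 0.1},
    {"A": 0.3, "B": 0.4, "C": 0.2, "D": 0.1},
    {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4},
]
toy_truth = ["A", "A", "B"]

top1 = top_k_accuracy(toy_probs, toy_truth, k=1)  # only the first case is a hit
top2 = top_k_accuracy(toy_probs, toy_truth, k=2)  # the second case is now a hit too
```

With k=1 a case counts as correct only if the true disease is ranked first; with k=5 it counts if the true disease appears anywhere among the five highest-ranked classes, which is why the top-five accuracy can never be lower than the single-disease accuracy on the same cases.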

Comparison with Human Experts

A highlight of the paper is the comparison between the machine learning models and experienced clinicians. In the clinical test, the models performed on par with hematology specialists, reaching 60% accuracy against the specialists' 62% when only the single most probable disease was considered. The advantage was more pronounced against internal medicine specialists, which suggests that machine learning can outperform non-specialists at recognizing complex patterns in hematologic parameters.

Implications and Future Prospects

This research underscores a key implication: laboratory blood test data contain information that often goes unused in manual diagnosis, and machine learning can exploit it to improve diagnostic accuracy and speed. The findings suggest substantial redundancy and interdependency among blood parameters, which supports streamlined diagnostics based on fewer, well-chosen laboratory tests.

Machine learning models like those investigated in this paper offer a promising supplement to current medical practice. They could particularly help non-specialists identify diseases early, refining referral practices and easing diagnostic burdens. These models could also be integrated into clinical decision support systems, where they would serve as a tool for improving interpretative accuracy in routine diagnostics.

Conclusion

Gunčar et al.'s exploration of machine learning for hematological diagnosis makes a compelling case for using computational techniques to extract more diagnostic value from routine laboratory tests. By showing performance comparable to that of human specialists, the research points to a shift in medical diagnostics toward machine learning as a standard tool that augments human expertise. The paper's insights may inspire further work in other areas of internal medicine and broader adoption of machine learning methods in clinical settings.