Machine Learning-Based Heart Disease Diagnosis: A Systematic Literature Review (2112.06459v1)

Published 13 Dec 2021 in cs.LG

Abstract: Heart disease is one of the significant challenges in today's world and one of the leading causes of many deaths worldwide. Recent advancement of ML application demonstrates that using electrocardiogram (ECG) and patient data, detecting heart disease during the early stage is feasible. However, both ECG and patient data are often imbalanced, which ultimately raises a challenge for the traditional ML to perform unbiasedly. Over the years, several data level and algorithm level solutions have been exposed by many researchers and practitioners. To provide a broader view of the existing literature, this study takes a systematic literature review (SLR) approach to uncover the challenges associated with imbalanced data in heart diseases predictions. Before that, we conducted a meta-analysis using 451 referenced literature acquired from the reputed journals between 2012 and November 15, 2021. For in-depth analysis, 49 referenced literature has been considered and studied, taking into account the following factors: heart disease type, algorithms, applications, and solutions. Our SLR study revealed that the current approaches encounter various open problems/issues when dealing with imbalanced data, eventually hindering their practical applicability and functionality.

Systematic Review of Machine Learning-Based Heart Disease Diagnosis

The paper "Machine Learning-Based Heart Disease Diagnosis: A Systematic Literature Review" evaluates how machine learning (ML) has been applied to the diagnosis of heart disease, with a specific focus on the challenges posed by imbalanced datasets. Because heart disease remains a leading cause of mortality worldwide, particularly in low- and middle-income countries, early detection through effective and accessible diagnostic tools is critical. The authors conducted a systematic literature review (SLR) to scrutinize existing methodologies and identify potential improvements in diagnostic accuracy.

Key Contributions and Findings

The authors screened 451 references published between 2012 and November 15, 2021, and selected 49 of them for in-depth analysis. The investigation focused on heart disease types, classification algorithms, application scenarios, and methods for addressing data imbalance. The main contributions and observations are:

Heart Disease Types and Datasets:
- Predominantly studied heart conditions include arrhythmia, coronary artery disease, and myocardial infarction. The MIT-BIH arrhythmia and Cleveland datasets are frequently used due to their accessibility and comprehensive nature.
- Class imbalance is typical of medical datasets, where diseased cases are far rarer than healthy ones, and it can bias ML models toward the majority class.
ML and DL Algorithm Adoption:
- Deep learning (DL), particularly Convolutional Neural Networks (CNNs), has become the dominant approach due to its ability to handle complex signal and image data such as ECG recordings. For instance, CNN-based models have reported accuracies nearing 99%.
- Generative Adversarial Networks (GANs) have recently gained traction for their capability to generate synthetic datasets that balance class distributions, benefiting the training process of DL and ML models.
Addressing Imbalanced Datasets:
- The review identifies both data-level techniques, such as the Synthetic Minority Over-sampling Technique (SMOTE), and algorithm-level solutions for mitigating the effects of imbalanced data.
- Advanced techniques like focal loss in CNNs have been adopted to enhance model sensitivity to minority classes, demonstrating improved performance metrics.
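To make the focal loss idea concrete, here is a minimal sketch of the binary form, written in plain Python rather than in any particular DL framework. The function names and the default values of `gamma` and `alpha` are illustrative assumptions and are not taken from the reviewed studies. The loss multiplies the usual cross-entropy by a factor `(1 - p_t) ** gamma`, so confidently correct (typically majority-class) examples contribute very little, while misclassified examples keep most of their weight.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy for one example.

    p is the predicted probability of the positive class, y is 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one example (illustrative defaults).

    gamma controls how strongly easy examples are down-weighted;
    alpha is a class weight applied to the positive class.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy positive (p = 0.95) versus a hard positive (p = 0.10).
    for p in (0.95, 0.10):
        print(p, cross_entropy(p, 1), binary_focal_loss(p, 1))
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary cross-entropy, which is a convenient sanity check when experimenting with other settings.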
Evaluation Metrics:
- While accuracy is a common metric, the review highlights the importance of more informative measures like sensitivity, specificity, F1-score, and Area Under the Curve (AUC) to evaluate model performance comprehensively in imbalanced scenarios.
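The point about accuracy can be illustrated with a small sketch. The data below are synthetic and purely hypothetical (95 healthy cases, 5 diseased cases), and the helper name `imbalance_metrics` is an assumption made for this example. A classifier that always predicts "healthy" scores high accuracy while detecting no disease at all, which sensitivity and F1-score expose immediately. AUC is omitted because it requires ranked scores rather than hard labels.

```python
def imbalance_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and F1 for binary labels.

    Labels are 1 (diseased, the minority class) and 0 (healthy).
    Ratios with a zero denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on diseased cases
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on healthy cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical imbalanced test set: 95 healthy, 5 diseased.
    y_true = [0] * 95 + [1] * 5
    # A degenerate model that always predicts the majority class.
    y_pred = [0] * 100
    print(imbalance_metrics(y_true, y_pred))
```

On this toy set the degenerate model reaches 95% accuracy with zero sensitivity and zero F1-score, which is why the review favors reporting these metrics together.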

Practical and Theoretical Implications

This systematic literature review identifies critical gaps and opportunities within the field of ML-based heart disease diagnostics. Practically, it underscores the necessity for models that incorporate real-world patient data to increase clinical applicability and reliability. The paper advocates for the development of models that can effectively operate under heterogeneous and imbalanced data conditions, thus enhancing implementation potential for real-time diagnosis in clinical settings.

Theoretically, the review calls for deeper exploration of explainable AI to address the black-box nature of DL models and to improve their interpretability and trustworthiness for clinical practitioners. Explainability is vital not only for the adoption of AI in healthcare but also for compliance with regulatory standards.

Future Directions

Looking toward the future, the paper emphasizes several avenues for continued research, including:

- Developing multi-disease diagnostic models capable of operating under varying data distributions and clinical environments.
- Enhancing interpretability frameworks so that clinicians can understand and trust AI-driven diagnoses.
- Improving the robustness of models to noise and variance in ECG signals, thereby improving diagnostic accuracy and reliability.

In conclusion, this paper provides valuable insights into the current landscape of ML applications in heart disease diagnosis, addressing a pressing need for robust, interpretable, and scalable solutions amidst the challenges of imbalanced medical data. Future research efforts, as identified, have the potential to significantly advance AI capabilities in healthcare, contributing to improved patient outcomes globally.

Authors (2)

Md Manjurul Ahsan (27 papers)
Zahed Siddique (14 papers)

Citations (195)
