A Systematic Review of AI Techniques in Breast Cancer Detection
The paper, from the University of Sharjah, presents an extensive systematic literature review of the use of AI and ML for breast cancer detection and classification. Recognizing the prevalence and impact of breast cancer, it reviews both genetic sequencing and histopathological imaging approaches, focusing on how deep learning (DL) is used to improve diagnostic accuracy and efficacy.
Methodological Overview
The review identifies and analyzes approximately 80 peer-reviewed papers from the last decade that apply AI, specifically DL and ML, to breast cancer detection. It covers techniques including ANN, CNN, DNN, SVM, and various hybrid models, and reports performance metrics such as accuracy, specificity, and sensitivity. Notably, the paper singles out works that examine gene expression and imaging modalities, drawing on datasets from both private and academic institutions.
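As a reminder of how the three metrics named above relate to one another, the following minimal sketch computes them from binary predictions. The function name `binary_metrics` and the sample labels are hypothetical and chosen only for illustration; they are not taken from the reviewed paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = malignant)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # Sensitivity (recall): share of malignant cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: share of benign cases correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels: 3 malignant and 7 benign cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this made-up example the accuracy is 0.8 and the specificity is 1.0, yet only one of the three malignant cases is detected, which is why sensitivity and specificity are usually reported alongside accuracy.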
Key Findings
- Model Effectiveness: CNN models consistently demonstrate superior accuracy in both binary and multiclass classification. For genetic data, binary classification reaches a peak accuracy of 99.8%, while imaging data reaches 99.7% accuracy with sophisticated DL-ML hybrid models. Despite the overall success of CNNs, the paper notes that attention mechanisms and other DL architectures such as GANs and LSTMs remain largely unexplored, suggesting areas for future research.
- Datasets and Feature Selection: The review emphasizes the availability of public datasets such as The Cancer Genome Atlas (TCGA) and METABRIC for genetic sequencing, and resources such as the Wisconsin Breast Cancer Dataset and DDSM for imaging data, many of which are freely accessible. Feature selection techniques identified include PCA, XGBoost, and CNN, reflecting the differences between genetic and image features.
- Comparative Analysis: A crucial part of the paper compares genetic sequencing with imaging data and addresses the inherent trade-offs. Genetic data offer greater precision and fewer but more informative features, but are costly to obtain and computationally intensive to process. Imaging data are more accessible and well suited to CNNs, though they often require rigorous preprocessing to remove irrelevant features.
Implications and Future Directions
The paper posits that advancing DL applications in breast cancer detection is not merely a matter of adopting a single sophisticated model, but involves integrating multiple data types and improving model generalization. Future research could benefit from focusing on feature selection to refine model predictions, employing newer DL architectures, and merging gene sequencing datasets for robust multiclass classification. The review also points to the need to evaluate models on metrics beyond accuracy, such as AUC and confusion matrix parameters, to ensure clinical reliability and applicability.
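To make the point about metrics beyond accuracy concrete, the sketch below computes ROC AUC from predicted scores and tabulates a confusion matrix at a chosen threshold. It uses the standard pairwise (Mann-Whitney) formulation of AUC; the function names, the 0.5 threshold, and the scores are illustrative assumptions, not values or methods from the reviewed paper.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def confusion_counts(y_true: Sequence[int], scores: Sequence[float],
                     threshold: float = 0.5) -> Dict[str, int]:
    """Confusion-matrix cells when scores >= threshold are predicted positive."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if t == 1:
            counts["tp" if predicted == 1 else "fn"] += 1
        else:
            counts["fp" if predicted == 1 else "tn"] += 1
    return counts


# Hypothetical model scores for 2 malignant (1) and 3 benign (0) cases.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(y_true, scores)
cells = confusion_counts(y_true, scores, threshold=0.5)
print(round(auc, 4), cells)
```

Unlike accuracy, AUC does not depend on a single decision threshold, while the confusion matrix shows which kind of error (missed malignancy or false alarm) a given threshold produces.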
Conclusion
While the review does not claim any groundbreaking revelations, it presents a comprehensive synthesis of recent advancements, barriers, and future opportunities in breast cancer research via AI. This structured approach provides a foundational dataset, methodological strategies for emerging researchers, and clarity about the challenges and trajectories of breast cancer AI applications. The integration of genomic data and imaging, supported by a structured DL methodology, offers promising avenues for enhancing breast cancer diagnostics through AI.