A Multi-Stage Framework for Joint Chest X-Ray Diagnosis and Visual Attention Prediction Using Deep Learning (2403.16970v4)

Published 25 Mar 2024 in eess.IV, cs.CV, and cs.LG

Abstract: Purpose: As visual inspection is an inherent part of radiological screening, the associated eye gaze data can provide valuable insights into relevant clinical decisions. With deep learning now the state of the art for computer-assisted diagnosis, integrating human behavior, such as eye gaze data, into these systems is instrumental in aligning machine predictions with clinical diagnostic criteria, thus enhancing the quality of automatic radiological diagnosis. Methods: We propose a novel deep learning framework for joint disease diagnosis and prediction of the corresponding clinical visual attention maps for chest X-ray scans. Specifically, we introduce a new dual-encoder multi-task UNet, which leverages both a DenseNet201 backbone and a Residual and Squeeze-and-Excitation block-based encoder to extract diverse features for visual attention map prediction, together with a multi-scale feature-fusion classifier for disease classification. To tackle the issue of asynchronous training schedules of the individual tasks in multi-task learning, we propose a multi-stage cooperative learning strategy, with contrastive learning for feature-encoder pretraining to boost performance. Results: Our proposed method significantly outperforms existing techniques in both chest X-ray diagnosis (AUC = 0.93) and the quality of visual attention map prediction (correlation coefficient = 0.58). Conclusion: Benefiting from the proposed multi-task, multi-stage cooperative learning, our technique demonstrates the value of integrating clinicians' eye gaze into clinical AI systems to boost performance and potentially explainability.
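
To make the dual-encoder, multi-task idea from the abstract more concrete, the sketch below shows one way such a model could be wired up in PyTorch: a DenseNet201 feature extractor and a lightweight Residual/SE encoder whose features are fused, then fed to a saliency-map decoder and a classification head. All module names, channel sizes, and the fusion scheme are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a dual-encoder multi-task network (assumed design, not the paper's code).
import torch
import torch.nn as nn
from torchvision import models


class SEResidualBlock(nn.Module):
    """Residual block gated by a Squeeze-and-Excitation branch (hypothetical)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        out = self.conv(x)
        return torch.relu(x + out * self.se(out))  # channel-wise recalibration + residual


class DualEncoderMultiTaskNet(nn.Module):
    """Two encoders feed a fused bottleneck; one head predicts the visual
    attention (saliency) map, the other predicts the disease label."""

    def __init__(self, num_classes=3):
        super().__init__()
        # Encoder 1: DenseNet201 feature extractor (1920 channels at 1/32 resolution).
        self.densenet = models.densenet201(weights=None).features
        # Encoder 2: lightweight Residual/SE path on the raw image (assumed layout).
        self.se_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=4, padding=3),
            SEResidualBlock(64),
            nn.Conv2d(64, 128, 3, stride=4, padding=1),
            SEResidualBlock(128),
            nn.Conv2d(128, 256, 3, stride=2, padding=1),
        )
        self.fuse = nn.Conv2d(1920 + 256, 256, 1)  # fuse the two feature streams
        # Decoder head for the attention map.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1),
            nn.Sigmoid(),
        )
        # Classification head on the fused features.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, num_classes)
        )

    def forward(self, x):
        f1 = self.densenet(x)      # (B, 1920, H/32, W/32)
        f2 = self.se_encoder(x)    # (B, 256, H/32, W/32)
        fused = self.fuse(torch.cat([f1, f2], dim=1))
        return self.decoder(fused), self.classifier(fused)


# Usage sketch: both outputs would be trained jointly (e.g., a saliency loss plus
# cross-entropy), with the paper's multi-stage schedule deciding which parts train when.
model = DualEncoderMultiTaskNet(num_classes=3)
x = torch.randn(2, 3, 224, 224)
saliency, logits = model(x)        # saliency: (2, 1, 224, 224), logits: (2, 3)
```
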

Authors (3)
  1. Zirui Qiu (2 papers)
  2. Hassan Rivaz (73 papers)
  3. Yiming Xiao (36 papers)