Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis (2312.06069v2)

Published 11 Dec 2023 in cs.CV

Abstract: Obtaining large-scale radiology reports can be difficult for medical images due to various reasons, limiting the effectiveness of contrastive pre-training in the medical image domain and underscoring the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows for the passive collection of gaze signals without disturbing radiologists' routine diagnostic process. By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning. When a radiologist exhibits similar gaze patterns on two medical images, this may indicate semantic similarity for diagnosis, and these images should be treated as positive pairs when pre-training a computer-assisted diagnosis (CAD) network through contrastive learning. Accordingly, we introduce Medical contrastive Gaze Image Pre-training (McGIP) as a plug-and-play module for contrastive learning frameworks. McGIP uses radiologists' gaze to guide contrastive pre-training. We evaluate our method using two representative types of medical images and two common types of gaze data. The experimental results demonstrate the practicality of McGIP, indicating its high potential for various clinical scenarios and applications.
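The core idea, per the abstract, is to mine positive pairs for contrastive learning from gaze similarity rather than from text reports. A minimal sketch of that pairing step is below, assuming gaze data has already been summarized as per-image fixation heatmaps; the function names, cosine-similarity measure, and threshold are illustrative choices, not the paper's exact implementation (the paper evaluates multiple gaze-similarity measures):

```python
import numpy as np

def heatmap_similarity(h1, h2):
    """Cosine similarity between two flattened gaze heatmaps.

    A stand-in for any gaze-similarity measure (e.g., heatmap moments
    or scanpath comparison could be substituted here).
    """
    a, b = h1.ravel(), h2.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def positive_pair_mask(heatmaps, threshold=0.95):
    """Boolean matrix: entry (i, j) is True when images i and j have
    similar enough gaze patterns to count as a positive pair during
    contrastive pre-training."""
    n = len(heatmaps)
    mask = np.eye(n, dtype=bool)  # each image is always its own positive
    for i in range(n):
        for j in range(i + 1, n):
            if heatmap_similarity(heatmaps[i], heatmaps[j]) >= threshold:
                mask[i, j] = mask[j, i] = True
    return mask
```

In a framework such as SimCLR or MoCo, this mask would replace the default "only the augmented view is positive" assumption: entries marked True are pulled together in the contrastive loss, which is what makes the module plug-and-play.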

Authors (4)
  1. Zihao Zhao
  2. Sheng Wang
  3. Qian Wang
  4. Dinggang Shen
Citations (3)