Artificial General Intelligence for Medical Imaging Analysis (2306.05480v4)

Published 8 Jun 2023 in cs.AI

Abstract: Large-scale AGI models, including LLMs such as ChatGPT/GPT-4, have achieved unprecedented success in a variety of general domain tasks. Yet, when applied directly to specialized domains like medical imaging, which require in-depth expertise, these models face notable challenges arising from the medical field's inherent complexities and unique characteristics. In this review, we delve into the potential applications of AGI models in medical imaging and healthcare, with a primary focus on LLMs, Large Vision Models, and Large Multimodal Models. We provide a thorough overview of the key features and enabling techniques of LLMs and AGI, and further examine the roadmaps guiding the evolution and implementation of AGI models in the medical sector, summarizing their present applications, potentialities, and associated challenges. In addition, we highlight potential future research directions, offering a holistic view on upcoming ventures. This comprehensive review aims to offer insights into the future implications of AGI in medical imaging, healthcare, and beyond.

Introduction to AGI in Medical Imaging

The integration of AGI into the healthcare sector is a profound advancement aimed at improving patient outcomes and the efficiency of care delivery. AGI models, particularly LLMs, Large Vision Models (LVMs), and Large Multimodal Models (LMMs), promise to reshape healthcare by combining clinical expertise, domain-specific knowledge, and multimodal data interpretation into potent tools for medical practice.

Adapting AGI to Healthcare Challenges

Efficiently adapting AGI to healthcare means confronting challenges unique to the domain: AGI must accommodate the detail-oriented and specialized nature of clinical data. Medical imaging illustrates this starkly: images from MRI and CT scans, among other modalities, require nuanced interpretation that merges anatomical, pathological, and radiological expertise. Shortages of high-quality annotated medical data, privacy concerns (accentuated by legislation such as HIPAA in the U.S.), and the sensitive nature of clinical information are significant hurdles to deploying these sophisticated AI models in medicine.
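
To make the privacy hurdle concrete, the following is a minimal sketch of de-identifying a DICOM file before it enters any model pipeline. It assumes the pydicom library; the tag list and file names are hypothetical, and any real deployment must follow a vetted, HIPAA-compliant de-identification protocol rather than this illustration.

```python
# Minimal DICOM de-identification sketch (assumes: pip install pydicom).
# Illustrative only -- a production pipeline needs a vetted HIPAA protocol.
import pydicom

def deidentify(in_path: str, out_path: str) -> None:
    ds = pydicom.dcmread(in_path)
    # Blank a few common identifying attributes (illustrative, not exhaustive).
    for tag in ("PatientName", "PatientID", "PatientBirthDate",
                "ReferringPhysicianName", "InstitutionName"):
        if hasattr(ds, tag):
            setattr(ds, tag, "")
    ds.remove_private_tags()      # drop vendor-specific private tags
    ds.save_as(out_path)

deidentify("scan.dcm", "scan_anon.dcm")  # hypothetical file names
```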

Applications and Developing Strategies

The roadmap to AGI application in healthcare is multi-faceted, involving expert-in-the-loop methodologies, domain-specific tailoring, and prompt tuning to refine AGI outputs. LLMs such as ChatGPT show promise for medical education, patient consultation, and easing clinical workloads. Fusing text, imaging, and potentially genomic data would be a major stride toward an integrated diagnostic approach, but it demands strategies that ensure data accessibility and quality if AGI is to align closely with detailed clinical diagnostics and procedures.
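
As a concrete instance of the prompt-tuning idea, the sketch below freezes a large model and trains only a small matrix of "soft prompt" embeddings prepended to the input. This is a minimal PyTorch rendering, assuming a Hugging Face-style base model that accepts inputs_embeds; the class and parameter names are hypothetical, not taken from the paper.

```python
# Soft prompt tuning sketch (PyTorch): only the prompt embeddings train.
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    def __init__(self, base_model, embed_layer, num_virtual_tokens=20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False                      # freeze the LLM
        dim = embed_layer.embedding_dim
        # The only trainable weights: a num_virtual_tokens x dim prompt matrix.
        self.soft_prompt = nn.Parameter(0.02 * torch.randn(num_virtual_tokens, dim))
        self.embed = embed_layer

    def forward(self, input_ids, **kwargs):
        tok = self.embed(input_ids)                      # (B, T, D)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        inputs_embeds = torch.cat([prompt, tok], dim=1)  # (B, P+T, D)
        # Simplification: attention masks/labels would also need P extra slots.
        return self.base_model(inputs_embeds=inputs_embeds, **kwargs)
```

Because only the prompt matrix receives gradients, adaptation stays cheap enough to maintain a separate prompt per specialty or per institution while the underlying model is shared.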

The prospective integration of LLMs into healthcare does not imply wholesale automation; rather, it points to a complementary relationship between these models and human medical practitioners. AGI can support and empower the medical field, serving as an auxiliary to the expertise of healthcare professionals.

Challenges in Practical Application

Despite the considerable promise, applying AGI is not straightforward. Medical AI must navigate the intricacies of prompt crafting, legal and ethical data concerns, and the transformation of raw healthcare data into an AI-compatible format without violating privacy constraints. For instance, deploying healthcare-specific LVMs such as the Segment Anything Model (SAM) and AI-generated-content (AIGC) applications requires high-quality annotations that meet rigorous clinical standards. Addressing these challenges calls for mitigative strategies that combine robust supervision, training on diverse and representative data, and fail-safes for model predictions to minimize risk.
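
As an illustration of the annotation and verification point, the sketch below runs the public Segment Anything Model on a single CT slice with one clinician-supplied point prompt. It assumes Meta's segment-anything package and its released vit_b checkpoint; the coordinates and file names are hypothetical, and whether the returned masks actually meet clinical standards is precisely what the SAM-in-medical-imaging studies surveyed by this review set out to test.

```python
# Point-prompted SAM on one CT slice (assumes: pip install segment-anything
# plus the released checkpoint; coordinates and paths are hypothetical).
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

ct_slice = np.load("ct_slice.npy")                        # (H, W) grayscale
lo, hi = ct_slice.min(), ct_slice.max()
rgb = np.uint8(255 * (ct_slice - lo) / (hi - lo + 1e-6))  # window to 8-bit
predictor.set_image(np.stack([rgb] * 3, axis=-1))         # SAM wants HxWx3 RGB

masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),  # one expert click in the target organ
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=True,                # return candidates for review
)
best_mask = masks[int(scores.argmax())]   # still requires clinician sign-off
```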

Conclusion

The migration of AGI into healthcare is a paradigm shift that brings with it a host of transformative implications for patient care. Integrating expert knowledge into AGI models and embedding these models within multimodal healthcare systems is essential for realizing their full potential. While the journey is fraught with challenges, ranging from ethical considerations to the need for extensive data, the collaboration between AI specialists and clinicians will be paramount. As we continue to refine these technologies, we edge closer to a future where AGI is an integral component of medical protocols, augmenting the expertise of healthcare professionals and enhancing patient care significantly.

Authors (19)
  1. Xiang Li
  2. Lu Zhang
  3. Zihao Wu
  4. Zhengliang Liu
  5. Lin Zhao
  6. Yixuan Yuan
  7. Jun Liu
  8. Gang Li
  9. Dajiang Zhu
  10. Pingkun Yan
  11. Quanzheng Li
  12. Wei Liu
  13. Tianming Liu
  14. Dinggang Shen
  15. Hanqi Jiang
  16. Chao Cao
  17. Shaochen Xu
  18. Yiwei Li
  19. Haixing Dai