
Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions (2404.03264v1)

Published 4 Apr 2024 in cs.CY and cs.AI

Abstract: Foundation models, which are pre-trained on broad data and can adapt to a wide range of tasks, are advancing healthcare. They promote the development of healthcare AI models and ease the mismatch between narrowly specialized AI models and diverse healthcare practices. A far wider range of healthcare scenarios stands to benefit from the development of healthcare foundation models (HFMs), which can deliver more advanced intelligent healthcare services. Despite the impending widespread deployment of HFMs, there is currently no clear understanding of how they work in the healthcare field, what challenges they face, and where they are headed. To answer these questions, this paper presents a comprehensive and in-depth survey of the challenges, opportunities, and future directions of HFMs. It first gives an overview of HFMs, covering their methods, data, and applications, to provide a quick grasp of current progress. It then explores in depth the challenges in data, algorithms, and computing infrastructure that arise when constructing and widely deploying foundation models in healthcare. The survey also identifies emerging and promising directions for future development. We believe this survey will enhance the community's understanding of the current progress of HFMs and serve as a valuable source of guidance for future work in this field. The latest HFM papers and related resources are maintained on our website: https://github.com/YutingHe-list/Awesome-Foundation-Models-for-Advancing-Healthcare.

Advancing Healthcare with Foundation Models: Opportunities and Challenges

Comprehensive Overview of Healthcare Foundation Models (HFMs)

Healthcare foundation models (HFMs) represent a transformative approach to applying AI across a wide spectrum of healthcare tasks. Building on the foundation model concept, HFMs harness broad data to learn general capabilities that can be adapted to an expansive range of healthcare applications. This adaptability aims to resolve the inherent mismatch between the specialized nature of existing AI models and the diverse requirements of healthcare practice. Recent research underscores notable progress of HFMs across the major sub-fields of healthcare AI: language, vision, bioinformatics, and multimodality. This survey presents a systematic evaluation of the current status of HFMs, emphasizing their methodologies, underlying data, practical applications, and emerging challenges. It focuses on language foundation models (LFMs), vision foundation models (VFMs), bioinformatics foundation models (BFMs), and multimodal foundation models (MFMs), assessing their contributions to intelligent healthcare services.

Methodological Insights Into HFMs

Language Foundation Models

Language foundation models (LFMs), i.e., LLMs applied to healthcare text, have demonstrated considerable success, particularly in medical text processing and dialogue tasks. Pre-training of LFMs typically revolves around generative learning (GL) and contrastive learning (CL), with fine-tuning (FT) and prompt engineering (PE) playing significant roles in adaptation. Models such as GatorTronGPT and PMC-LLaMA exemplify this approach, showing the versatility of LFMs across a variety of natural language processing tasks in the healthcare domain.
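
Below is a minimal, hedged sketch of the FT path for an LFM: parameter-efficient LoRA fine-tuning of a generic causal language model on a single medical instruction pair. It assumes the Hugging Face transformers and peft libraries; the checkpoint name, instruction text, and LoRA hyperparameters are illustrative placeholders, not the recipes used by GatorTronGPT or PMC-LLaMA.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"   # placeholder general-purpose checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Wrap the frozen base model with low-rank adapters: only a small fraction
# of parameters is trained, a common parameter-efficient FT strategy for LFMs.
lora_cfg = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# One generative-learning (next-token prediction) step on a toy instruction pair.
prompt = ("Instruction: Summarize the discharge note.\n"
          "Input: <de-identified clinical note>\n"
          "Response: ")
target = "The patient was admitted for ..."
batch = tokenizer(prompt + target, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
print(f"toy loss: {loss.item():.3f}")
```

In practice the prompt tokens are usually excluded from the loss and an optimizer step follows over a large instruction corpus; the sketch only shows how the GL objective and low-rank adaptation combine.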

Vision Foundation Models

Vision foundation models (VFMs) have significantly advanced medical image analysis. They adopt supervised learning (SL), generative learning (GL), and contrastive learning (CL) for pre-training, with fine-tuning (FT), adapter tuning (AT), and prompt engineering (PE) used extensively for task adaptation. Models such as UNI and SAM-Med3D illustrate the potential of VFMs for pathology image analysis and promptable medical image segmentation, respectively, showcasing their adaptability and strong performance across diverse medical scenarios.
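
The PE route can be illustrated with the promptable interface that SAM-style VFMs expose: a single user click (or box) conditions the mask decoder, with no gradient updates. The sketch below assumes Meta's segment_anything package and its published ViT-B checkpoint file; the image slice and click coordinates are random stand-ins, and medical variants such as SAM-Med3D keep a similar promptable interface while training on volumetric medical data.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load the general-domain ViT-B SAM weights (published checkpoint file name).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# A CT/MR slice converted to an 8-bit RGB array (H, W, 3); random stand-in here.
slice_rgb = np.random.randint(0, 255, (512, 512, 3), dtype=np.uint8)
predictor.set_image(slice_rgb)

# A single foreground click inside the target structure serves as the prompt;
# no gradient update is required, which is the appeal of PE over full FT.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 300]]),   # (x, y) pixel coordinates of the click
    point_labels=np.array([1]),            # 1 = foreground, 0 = background
    multimask_output=True,
)
print(masks.shape, scores)
```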

Bioinformatics Foundation Models

Bioinformatics foundation models (BFMs) leverage large-scale biological data. They adopt a mix of generative learning (GL) and hybrid learning (HL) for pre-training, with fine-tuning (FT) as the predominant adaptation method. Models such as RNA-FM and ESM-2 represent this approach, deepening our understanding of RNA, DNA, and protein sequences.
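
A hedged sketch of how a protein BFM is queried with its pre-training objective: mask one residue of a sequence and let ESM-2 reconstruct it. It assumes the publicly released facebook/esm2_t6_8M_UR50D checkpoint on the Hugging Face Hub; the toy sequence and the masked position are arbitrary.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

ckpt = "facebook/esm2_t6_8M_UR50D"   # smallest public ESM-2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForMaskedLM.from_pretrained(ckpt).eval()

# Mask one residue of a toy protein sequence and ask the model to recover it,
# the same denoising objective used during BFM pre-training.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
tokens = tokenizer(seq, return_tensors="pt")
pos = 10                                         # arbitrary position after the BOS token
tokens["input_ids"][0, pos] = tokenizer.mask_token_id

with torch.no_grad():
    logits = model(**tokens).logits
pred_id = logits[0, pos].argmax().item()
print("predicted residue:", tokenizer.convert_ids_to_tokens(pred_id))
```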

Multimodal Foundation Models

Multimodal foundation models (MFMs) integrate information from multiple modalities, achieving remarkable capability in interpreting medical data and performing cross-modality tasks. These models leverage a blend of generative, contrastive, and hybrid learning during pre-training, with fine-tuning (FT) and prompt engineering (PE) serving as the essential adaptation methods. Models such as BiomedGPT and RadFM highlight the significant advances in this field and demonstrate its effectiveness in multimodal healthcare settings.
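
The CL component of such models can be sketched with the symmetric InfoNCE objective that CLIP-style medical MFMs use to align paired image and report embeddings. The encoders below are stand-in linear layers rather than the ViT/BERT backbones of real systems such as MedCLIP or BiomedCLIP, so treat this as a shape-level illustration of the loss, not a reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Placeholder encoder mapping a flat feature vector to a shared embedding space."""
    def __init__(self, in_dim, out_dim=256):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)   # unit-norm embeddings

image_enc = ToyEncoder(in_dim=1024)   # e.g., pooled chest X-ray features
text_enc = ToyEncoder(in_dim=768)     # e.g., pooled radiology-report features
temperature = 0.07

images = torch.randn(32, 1024)        # a batch of paired image features (random stand-ins)
reports = torch.randn(32, 768)        # the matching report features

img_emb, txt_emb = image_enc(images), text_enc(reports)
logits = img_emb @ txt_emb.t() / temperature      # cosine-similarity matrix
targets = torch.arange(logits.size(0))            # i-th image matches i-th report

# Symmetric InfoNCE: pull matched pairs together, push mismatched pairs apart.
loss = 0.5 * (F.cross_entropy(logits, targets) +
              F.cross_entropy(logits.t(), targets))
loss.backward()
print(f"toy contrastive loss: {loss.item():.3f}")
```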

Challenges and Future Directions

Despite the promising advances in HFMs, several challenges persist, chiefly concerning data diversity, ethical considerations, algorithmic responsibility, and computational demands. Addressing them requires concerted research effort: collecting data ethically, improving the responsibility and reliability of algorithms, and innovating for computational efficiency. Future developments in HFMs are likely to focus on fostering AI-human collaboration, enhancing models' dynamic capabilities, extending applications to complex real-world scenarios, and emphasizing sustainability and trustworthiness.

In conclusion, HFMs hold immense promise for transforming healthcare through AI, offering sophisticated solutions that bridge the gap between specialized models and the broad spectrum of healthcare needs. Overcoming the existing challenges and pursuing these future directions will be crucial to unlocking the full potential of HFMs, paving the way for more advanced, reliable, and inclusive healthcare AI applications.

  317. V. Rotemberg et al., “A patient-centric dataset of images and metadata for identifying melanomas using clinical context,” Sci. Data, vol. 8, no. 1, p. 34, 2021.
  318. C. De Vente et al., “Airogs: artificial intelligence for robust glaucoma screening challenge,” IEEE transactions on medical imaging, 2023.
  319. M. Subramanian et al., “Classification of retinal oct images using deep learning,” in 2022 International Conference on Computer Communication and Informatics (ICCCI), 2022, pp. 1–7.
  320. A. Montoya et al., “Ultrasound nerve segmentation,” 2016. [Online]. Available: https://kaggle.com/competitions/ultrasound-nerve-segmentation
  321. X. P. Burgos-Artizzu et al., “Evaluation of deep convolutional neural networks for automatic classification of common maternal fetal ultrasound planes,” Sci. Rep., vol. 10, no. 1, p. 10200, 2020.
  322. D. Ouyang et al., “Video-based ai for beat-to-beat assessment of cardiac function,” Nature, vol. 580, no. 7802, pp. 252–256, 2020.
  323. G. Polat et al., “Improving the computer-aided estimation of ulcerative colitis severity according to mayo endoscopic score by using regression-based deep learning,” Nes. Nutr. Ws., p. izac226, 2022.
  324. M. Misawa et al., “Development of a computer-aided detection system for colonoscopy and a publicly accessible large colonoscopy video database (with video),” Gastrointestinal endoscopy, vol. 93, no. 4, pp. 960–967, 2021.
  325. P. H. Smedsrud et al., “Kvasir-capsule, a video capsule endoscopy dataset,” Sci. Data, vol. 8, no. 1, p. 142, 2021.
  326. K. B. Ozyoruk et al., “Endoslam dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos,” Med. Image Anal., vol. 71, p. 102058, 2021.
  327. H. Borgli et al., “Hyperkvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy,” Sci. Data, vol. 7, no. 1, pp. 1–14, 2020.
  328. C. I. Nwoye and N. Padoy, “Data splits and metrics for method benchmarking on surgical action triplet datasets,” arXiv preprint arXiv:2204.05235, 2022.
  329. Y. Ma et al., “Ldpolypvideo benchmark: a large-scale colonoscopy video dataset of diverse polyps,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention.   Springer, 2021, pp. 387–396.
  330. K. Yan et al., “Deeplesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning,” Journal of medical imaging, vol. 5, no. 3, pp. 036 501–036 501, 2018.
  331. S. G. Armato III et al., “The lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans,” Medical physics, vol. 38, no. 2, pp. 915–931, 2011.
  332. S.-L. Liew et al., “A large, curated, open-source stroke neuroimaging dataset to improve lesion segmentation algorithms,” Sci. Data, vol. 9, no. 1, p. 320, 2022.
  333. A. Saha et al., “Artificial intelligence and radiologists at prostate cancer detection in mri—the pi-cai challenge,” in Medical Imaging with Deep Learning, short paper track, 2023.
  334. N. Bien et al., “Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of mrnet,” PLoS medicine, vol. 15, no. 11, p. e1002699, 2018.
  335. G. Duffy et al., “High-throughput precision phenotyping of left ventricular hypertrophy with cardiovascular deep learning,” JAMA cardiology, vol. 7, no. 4, pp. 386–395, 2022.
  336. P. Ghahremani et al., “Deep learning-inferred multiplex immunofluorescence for immunohistochemical image quantification,” Nature machine intelligence, vol. 4, no. 4, pp. 401–412, 2022.
  337. N. L. S. T. R. Team, “The national lung screening trial: overview and study design,” Radiology, vol. 258, no. 1, pp. 243–253, 2011.
  338. K. Ding et al., “A large-scale synthetic pathological dataset for deep learning-enabled segmentation of breast cancer,” Sci. Data, vol. 10, no. 1, p. 231, 2023.
  339. C. S.-C. Biology et al., “Cz cell×gene discover: A single-cell data platform for scalable exploration, analysis and modeling of aggregated data,” bioRxiv, pp. 2023–10, 2023.
  340. D. A. Benson et al., “GenBank,” Nucleic Acids Res., vol. 41, no. D1, pp. D36–D42, 11 2012. [Online]. Available: https://doi.org/10.1093/nar/gks1195
  341. L. Tarhan et al., “Single cell portal: an interactive home for single-cell genomics data,” bioRxiv, 2023.
  342. A. Frankish et al., “GENCODE reference annotation for the human and mouse genomes,” Nucleic Acids Res., vol. 47, no. D1, pp. D766–D773, 10 2018. [Online]. Available: https://doi.org/10.1093/nar/gky955
  343. A. Regev et al., “Science forum: The human cell atlas,” eLife, vol. 6, p. e27041, dec 2017. [Online]. Available: https://doi.org/10.7554/eLife.27041
  344. B. J. Raney et al., “The UCSC Genome Browser database: 2024 update,” Nucleic Acids Res., vol. 52, no. D1, pp. D1082–D1088, 11 2023. [Online]. Available: https://doi.org/10.1093/nar/gkad987
  345. N. J. Edwards et al., “The cptac data portal: A resource for cancer proteomics research,” Journal of Proteome Research, vol. 14, no. 6, pp. 2707–2713, 2015.
  346. F. J. Martin et al., “Ensembl 2023,” Nucleic Acids Res., vol. 51, no. D1, pp. D933–D941, 2023.
  347. The RNAcentral Consortium, “RNAcentral: a hub of information for non-coding RNA sequences,” Nucleic Acids Res., vol. 47, no. D1, pp. D221–D229, 11 2018. [Online]. Available: https://doi.org/10.1093/nar/gky1034
  348. D. R. Armstrong et al., “Pdbe: improved findability of macromolecular structure data in the pdb,” Nucleic acids research, vol. 48, p. D335—D343, 1 2020. [Online]. Available: https://europepmc.org/articles/PMC7145656
  349. T. U. Consortium, “Uniprot: the universal protein knowledgebase in 2023,” Nucleic Acids Research, vol. 51, pp. D523–D531, 1 2023. [Online]. Available: https://doi.org/10.1093/nar/gkac1052
  350. I. NeuroLINCS (University of California, “imn (exp 2) - als, sma and control (unaffected) imn cell lines differentiated from ips cell lines using a long differentiation protocol - rna-seq,” 2017. [Online]. Available: http://lincsportal.ccs.miami.edu/datasets/#/view/LDS-1398
  351. W. Yang et al., “Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells,” Nucleic Acids Research, vol. 41, no. D1, pp. D955–D961, 11 2012.
  352. M. Ghandi et al., “Next-generation characterization of the cancer cell line encyclopedia,” Nature, vol. 569, pp. 503–508, 2019. [Online]. Available: https://doi.org/10.1038/s41586-019-1186-3
  353. C. Bycroft et al., “The uk biobank resource with deep phenotyping and genomic data,” Nature, vol. 562, no. 7726, pp. 203–209, 2018.
  354. Z. Zhao et al., “Chinese glioma genome atlas (cgga): A comprehensive resource with functional genomic data from chinese glioma patients,” Genomics, Proteomics & Bioinformatics, vol. 19, pp. 1–12, 2021.
  355. A. E. Johnson et al., “Mimic-cxr, a de-identified publicly available database of chest radiographs with free-text reports,” Sci. Data, vol. 6, no. 1, p. 317, 2019.
  356. A. Bustos et al., “Padchest: A large chest x-ray image dataset with multi-label annotated reports,” Med. Image Anal., vol. 66, p. 101797, 2020.
  357. J. Irvin et al., “Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison,” in Proc. AAAI Conf. Artif. Intell., vol. 33, no. 01, 2019, pp. 590–597.
  358. A. García Seco de Herrera et al., “Overview of the imageclef 2018 caption prediction tasks,” in Working Notes of CLEF 2018-Conference and Labs of the Evaluation Forum (CLEF 2018), Avignon, France, September 10-14, 2018., vol. 2125.   CEUR Workshop Proceedings, 2018.
  359. X. He, Y. Zhang, L. Mou, E. Xing, and P. Xie, “Pathvqa: 30000+ questions for medical visual question answering,” arXiv preprint arXiv:2003.10286, 2020.
  360. M. Tsuneki and F. Kanavati, “Inference of captions from histopathological patches,” in Proc. Int. Conf. Medical Imaging Deep Learn.   PMLR, 2022, pp. 1235–1250.
  361. P. Wagner et al., “Ptb-xl, a large publicly available electrocardiography dataset,” Sci. Data, vol. 7, no. 1, p. 154, 2020.
  362. O. Pelka et al., “Radiology objects in context (roco): a multimodal image dataset,” in Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis: 7th Joint International Workshop, CVII-STENT 2018 and Third International Workshop, LABELS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Proceedings 3.   Springer, 2018, pp. 180–189.
  363. S. Subramanian et al., “Medicat: A dataset of medical images, captions, and textual references,” in Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020.   Association for Computational Linguistics (ACL), 2020, pp. 2112–2120.
  364. X. Zhang et al., “Pmc-vqa: Visual instruction tuning for medical visual question answering,” arXiv preprint arXiv:2305.10415, 2023.
  365. A. Saha et al., “A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 dce-mri features,” British journal of cancer, vol. 119, no. 4, pp. 508–516, 2018.
  366. W. Li et al., “I-SPY 2 Breast Dynamic Contrast Enhanced MRI Trial (ISPY2).” [Online]. Available: https://doi.org/10.7937/TCIA.D8Z0-9T85
  367. J. Gamper and N. Rajpoot, “Multiple instance captioning: Learning representations from histopathology textbooks and articles,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., June 2021, pp. 16 549–16 559.
  368. K. Clark et al., “The cancer imaging archive (tcia): maintaining and operating a public information repository,” Journal of digital imaging, vol. 26, pp. 1045–1057, 2013.
  369. D. Ueda et al., “Diagnostic performance of chatgpt from patient history and imaging findings on the diagnosis please quizzes,” Radiology, vol. 308, no. 1, p. e231040, 2023.
  370. S.-H. Wu et al., “Collaborative enhancement of consistency and accuracy in us diagnosis of thyroid nodules using large language models,” Radiology, vol. 310, no. 3, p. e232255, 2024.
  371. S. R. Ali et al., “Using chatgpt to write patient clinic letters,” The Lancet Digital Health, vol. 5, no. 4, pp. e179–e181, 2023.
  372. A. Abd-Alrazaq et al., “Large language models in medical education: Opportunities, challenges, and future directions,” JMIR Medical Education, vol. 9, no. 1, p. e48291, 2023.
  373. M. Karabacak et al., “The advent of generative language models in medical education,” JMIR Medical Education, vol. 9, p. e48163, 2023.
  374. T. H. Kung et al., “Performance of chatgpt on usmle: Potential for ai-assisted medical education using large language models,” PLoS Digital Health, vol. 2, no. 2, p. e0000198, 2023.
  375. A. B. Coşkun et al., “Integration of chatgpt and e-health literacy: Opportunities, challenges, and a look towards the future,” Journal of Health Reports and Technology, vol. 10, no. 1, 2024.
  376. P. Lee et al., “Benefits, limits, and risks of gpt-4 as an ai chatbot for medicine,” New Engl. J. Med., vol. 388, no. 13, pp. 1233–1239, 2023.
  377. Y. Chen et al., “Soulchat: Improving llms’ empathy, listening, and comfort abilities through fine-tuning with multi-turn empathy conversations,” in Findings of the Association for Computational Linguistics: EMNLP 2023, 2023, pp. 1170–1183.
  378. Y. Luo et al., “Biomedgpt: Open multimodal generative pre-trained transformer for biomedicine,” arXiv preprint arXiv:2308.09442, 2023.
  379. L. Huawei Technologies Co., “A general introduction to artificial intelligence,” in Artificial Intelligence Technology.   Springer, 2022, pp. 1–41.
  380. D. B. Larson et al., “Ethics of using and sharing clinical imaging data for artificial intelligence: a proposed framework,” Radiology, vol. 295, no. 3, pp. 675–682, 2020.
  381. S. Salerno et al., “Overdiagnosis and overimaging: an ethical issue for radiological protection,” La radiologia medica, vol. 124, pp. 714–720, 2019.
  382. D. Kaur et al., “Trustworthy artificial intelligence: a review,” ACM Computing Surveys (CSUR), vol. 55, no. 2, pp. 1–38, 2022.
  383. M. Haendel et al., “How many rare diseases are there?” Nat. Rev. Drug Discov., vol. 19, no. 2, pp. 77–78, 2020.
  384. H. Guan and M. Liu, “Domain adaptation for medical image analysis: a survey,” IEEE Trans. Biomed. Eng., vol. 69, no. 3, pp. 1173–1185, 2021.
  385. Z. Liu and K. He, “A decade’s battle on dataset bias: Are we there yet?” arXiv preprint arXiv:2403.08632, 2024.
  386. A. Cassidy et al., “Lung cancer risk prediction: a tool for early detection,” Int. J. Cancer, vol. 120, no. 1, pp. 1–6, 2007.
  387. J. Gama et al., “A survey on concept drift adaptation,” ACM computing surveys (CSUR), vol. 46, no. 4, pp. 1–37, 2014.
  388. S. Wang et al., “Annotation-efficient deep learning for automatic medical image segmentation,” Nat. Commun., vol. 12, no. 1, p. 5915, 2021.
  389. N. Tajbakhsh et al., “Guest editorial annotation-efficient deep learning: the holy grail of medical imaging,” IEEE Trans. Med. Imaging, vol. 40, no. 10, pp. 2526–2533, 2021.
  390. L. Sun et al., “Trustllm: Trustworthiness in large language models,” arXiv preprint arXiv:2401.05561, 2024.
  391. K. Sokol and P. Flach, “One explanation does not fit all: The promise of interactive explanations for machine learning transparency,” KI-Künstliche Intelligenz, vol. 34, no. 2, pp. 235–250, 2020.
  392. R. Bommasani et al., “The foundation model transparency index,” arXiv preprint arXiv:2310.12941, 2023.
  393. R. J. Chen et al., “Algorithmic fairness in artificial intelligence for medicine and healthcare,” Nat. Biomed. Eng., vol. 7, no. 6, pp. 719–742, 2023.
  394. F. Motoki et al., “More human than human: Measuring chatgpt political bias,” Available at SSRN 4372349, 2023.
  395. V. Felkner et al., “Winoqueer: A community-in-the-loop benchmark for anti-lgbtq+ bias in large language models,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 9126–9140.
  396. S. Gehman et al., “Realtoxicityprompts: Evaluating neural toxic degeneration in language models,” in Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 3356–3369.
  397. A. Wei et al., “Jailbroken: How does llm safety training fail?” Advances in Neural Information Processing Systems, vol. 36, 2024.
  398. K. Bærøe et al., “How to achieve trustworthy artificial intelligence for health,” Bull. World Health Organ., vol. 98, no. 4, p. 257, 2020.
  399. P.-Y. Chen and C. Xiao, “Trustworthy ai in the era of foundation models,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2023.
  400. M. Dwyer-White et al., “High reliability in healthcare,” in Patient Safety: A Case-based Innovative Playbook for Safer Care.   Springer, 2023, pp. 3–13.
  401. V. Rawte, A. Sheth, and A. Das, “A survey of hallucination in large foundation models,” arXiv preprint arXiv:2309.05922, 2023.
  402. C. Li and J. Flanigan, “Task contamination: Language models may not be few-shot anymore,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 16, 2024, pp. 18 471–18 480.
  403. Y. Yao et al., “Editing large language models: Problems, methods, and opportunities,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 10 222–10 240.
  404. J. Hoelscher-Obermaier et al., “Detecting edit failures in large language models: An improved specificity benchmark,” in Findings of the Association for Computational Linguistics: ACL 2023, 2023, pp. 11 548–11 559.
  405. M. Raghu et al., “On the expressive power of deep neural networks,” in Proc. Int. Conf. Mach. Learn.   PMLR, 2017, pp. 2847–2854.
  406. A. Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” in Proc. Int. Conf. Learn. Represent., 2020.
  407. Z. Liu et al., “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proc. IEEE Int. Conf. Comput. Vis., 2021, pp. 10 012–10 022.
  408. S. Zhao et al., “Elements of chronic disease management service system: an empirical study from large hospitals in china,” Sci. Rep., vol. 12, no. 1, p. 5693, 2022.
  409. C. Chen et al., “Deep learning on computational-resource-limited platforms: a survey,” Mob. Inf. Syst., vol. 2020, pp. 1–19, 2020.
  410. L. Deng et al., “Model compression and hardware acceleration for neural networks: A comprehensive survey,” Proc. IEEE, vol. 108, no. 4, pp. 485–532, 2020.
  411. N. Ding et al., “Parameter-efficient fine-tuning of large-scale pre-trained language models,” Nat. Mach. Intell, vol. 5, no. 3, pp. 220–235, 2023.
  412. E. Griffith, “The desperate hunt for the ai boom’s most indispensable prize.” International New York Times, pp. NA–NA, 2023.
  413. U. Gupta et al., “Chasing carbon: The elusive environmental footprint of computing,” in 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).   IEEE, 2021, pp. 854–867.
  414. P. Henderson et al., “Towards the systematic reporting of the energy and carbon footprints of machine learning,” J. Mach. Learn. Res., vol. 21, no. 1, pp. 10 039–10 081, 2020.
  415. A. Park et al., “Deep learning–assisted diagnosis of cerebral aneurysms using the headxnet model,” JAMA network open, vol. 2, no. 6, pp. e195 600–e195 600, 2019.
  416. D. F. Steiner et al., “Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer,” Am. J. Surg. Pathol., vol. 42, no. 12, p. 1636, 2018.
  417. H.-E. Kim et al., “Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study,” The Lancet Digital Health, vol. 2, no. 3, pp. e138–e148, 2020.
  418. P. Tschandl et al., “Human–computer collaboration for skin cancer recognition,” Nat. Med., vol. 26, no. 8, pp. 1229–1234, 2020.
  419. Y. Han et al., “Dynamic neural networks: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 11, pp. 7436–7456, 2021.
  420. A. Vaswani et al., “Attention is all you need,” Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017.
  421. N. Shazeer et al., “Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,” in Proc. Int. Conf. Learn. Represent., 2016.
  422. C. You et al., “Implicit anatomical rendering for medical image segmentation with stochastic experts,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.   Springer, 2023, pp. 561–571.
  423. A. Gu and T. Dao, “Mamba: Linear-time sequence modeling with selective state spaces,” arXiv preprint arXiv:2312.00752, 2023.
  424. H. Yi, Z. Qin, Q. Lao, W. Xu, Z. Jiang, D. Wang, S. Zhang, and K. Li, “Towards general purpose medical ai: Continual learning medical foundation model,” arXiv preprint arXiv:2303.06580, 2023.
  425. T. Kojima et al., “Large language models are zero-shot reasoners,” Adv. Neur. In., vol. 35, pp. 22 199–22 213, 2022.
  426. Q.-F. Wang et al., “Learngene: From open-world to your learning task,” in Proc. AAAI Conf. Artif. Intell., vol. 36, no. 8, 2022, pp. 8557–8565.
  427. Y. Tan et al., “Federated learning from pre-trained models: A contrastive learning approach,” Adv. Neur. In., vol. 35, pp. 19 332–19 344, 2022.
  428. W. Zhuang et al., “When foundation model meets federated learning: Motivations, challenges, and future directions,” arXiv preprint arXiv:2306.15546, 2023.
  429. J. Zhu et al., “Uni-perceiver-moe: Learning sparse generalist models with conditional moes,” Adv. Neur. In., vol. 35, pp. 2664–2678, 2022.
  430. C. Geng et al., “Recent advances in open set recognition: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 10, pp. 3614–3631, 2020.
  431. Y. Li et al., “Scaling language-image pre-training via masking,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2023, pp. 23 390–23 400.
  432. M. Ma et al., “Are multimodal transformers robust to missing modality?” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2022, pp. 18 177–18 186.
  433. Y. Yuan, “On the power of foundation models,” in Proc. Int. Conf. Mach. Learn.   PMLR, 2023, pp. 40 519–40 530.
  434. J. Jiménez-Luna et al., “Drug discovery with explainable artificial intelligence,” Nat. Mach. Intell, vol. 2, no. 10, pp. 573–584, 2020.
  435. A. Qayyum et al., “Secure and robust machine learning for healthcare: A survey,” IEEE Rev. Biomed. Eng., vol. 14, pp. 156–180, 2020.
  436. C. Schlarmann and M. Hein, “On the adversarial robustness of multi-modal foundation models,” in Proc. IEEE Int. Conf. Comput. Vis., 2023, pp. 3677–3685.
  437. I. Habli et al., “Artificial intelligence in health care: accountability and safety,” Bull. World Health Organ., vol. 98, no. 4, p. 251, 2020.
  438. R. Vinuesa et al., “The role of artificial intelligence in achieving the sustainable development goals,” Nat. Commun., vol. 11, no. 1, pp. 1–10, 2020.
  439. L. H. Kaack et al., “Aligning artificial intelligence with climate change mitigation,” Nat. Clim. Change, vol. 12, no. 6, pp. 518–527, 2022.
  440. G. Menghani, “Efficient deep learning: A survey on making deep learning models smaller, faster, and better,” ACM Computing Surveys, vol. 55, no. 12, pp. 1–37, 2023.
Authors: Yuting He, Fuxiang Huang, Xinrui Jiang, Yuxiang Nie, Minghao Wang, Jiguang Wang, Hao Chen