Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare (2405.06784v1)
Abstract: This survey explores the transformative impact of foundation models (FMs) in artificial intelligence, focusing on their integration with federated learning (FL) for advancing biomedical research. Foundation models such as ChatGPT, LLaMA, and CLIP, which are trained on vast datasets through methods including unsupervised pretraining, self-supervised learning, instruction fine-tuning, and reinforcement learning from human feedback, represent significant advancements in machine learning. These models, with their ability to generate coherent text and realistic images, are crucial for biomedical applications that require processing diverse data forms such as clinical reports, diagnostic images, and multimodal patient interactions. Incorporating FL with these sophisticated models presents a promising strategy to harness their analytical power while safeguarding the privacy of sensitive medical data. This approach not only enhances the capabilities of FMs in medical diagnostics and personalized treatment but also addresses critical concerns about data privacy and security in healthcare. This survey reviews the current applications of FMs in federated settings, underscores the challenges, and identifies future research directions, including scaling FMs, managing data diversity, and enhancing communication efficiency within FL frameworks. The objective is to encourage further research into the combined potential of FMs and FL, laying the groundwork for groundbreaking healthcare innovations.
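The abstract alludes to combining federated learning with foundation models while keeping communication cheap, for example by exchanging only small adapter weights rather than full model parameters. The sketch below is a minimal, self-contained illustration of that idea (FedAvg-style aggregation of a low-rank adapter trained on top of frozen backbone features) using synthetic data; it is not the survey's method or any cited framework's API, and all names (`local_update`, `fed_avg`, the data shapes) are illustrative assumptions.

```python
# Minimal sketch: FedAvg over low-rank adapter weights on frozen features.
# Synthetic, non-IID "hospital" data; not a real FL framework or the paper's method.
import numpy as np

rng = np.random.default_rng(0)
D_FEATURE, RANK, ROUNDS = 32, 4, 20


def make_client_data(n, shift):
    """Synthetic frozen-backbone features for one site; the per-site shift
    skews the label distribution to mimic non-IID clinical data."""
    X = rng.normal(shift, 1.0, size=(n, D_FEATURE))
    true_w = np.ones(D_FEATURE) / np.sqrt(D_FEATURE)
    y = (X @ true_w + rng.normal(0, 0.1, n) > 0).astype(float)
    return X, y


def local_update(A, B, X, y, lr=0.05, epochs=20):
    """Client-side training of only the low-rank adapter W = A @ B
    (a stand-in for LoRA-style parameter-efficient tuning); backbone stays frozen."""
    A, B = A.copy(), B.copy()
    for _ in range(epochs):
        w = (A @ B).ravel()                 # effective classifier weights
        p = 1.0 / (1.0 + np.exp(-(X @ w)))  # sigmoid predictions
        g = X.T @ (p - y) / len(y)          # gradient w.r.t. w
        A -= lr * g[:, None] @ B.T          # chain rule to factor A
        B -= lr * A.T @ g[:, None]          # chain rule to factor B
    return A, B


def fed_avg(updates, sizes):
    """Server-side weighted average of the clients' adapter factors (FedAvg)."""
    total = sum(sizes)
    A = sum(n * a for (a, _), n in zip(updates, sizes)) / total
    B = sum(n * b for (_, b), n in zip(updates, sizes)) / total
    return A, B


clients = [make_client_data(200, shift) for shift in (-0.5, 0.0, 0.5)]
A = rng.normal(0, 0.1, size=(D_FEATURE, RANK))   # global adapter broadcast each round
B = rng.normal(0, 0.1, size=(RANK, 1))

for _ in range(ROUNDS):
    updates = [local_update(A, B, X, y) for X, y in clients]
    A, B = fed_avg(updates, [len(y) for _, y in clients])

acc = np.mean([(((X @ (A @ B).ravel()) > 0).astype(float) == y).mean()
               for X, y in clients])
print(f"average client accuracy after {ROUNDS} rounds: {acc:.2f}")
```

Only the adapter factors `A` and `B` (a few hundred parameters here) cross the network each round, which is the communication-efficiency argument made for parameter-efficient federated fine-tuning of foundation models; raw patient features never leave the client.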
- Chatgpt makes medicine easy to swallow: an exploratory case study on simplified radiology reports. European radiology, pages 1–9, 2023.
- Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
- Deep learning. nature, 521(7553):436–444, 2015.
- Why does unsupervised pre-training help deep learning? In Proceedings of the thirteenth international conference on artificial intelligence and statistics, pages 201–208. JMLR Workshop and Conference Proceedings, 2010.
- Unsupervised pre-training of image features on non-curated data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2959–2968, 2019.
- The lottery tickets hypothesis for supervised and self-supervised pre-training in computer vision models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16306–16316, 2021.
- Dense contrastive learning for self-supervised visual pre-training. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3024–3033, 2021.
- Visual instruction tuning. Advances in neural information processing systems, 36, 2024.
- Deep reinforcement learning from human preferences. Advances in neural information processing systems, 30, 2017.
- Machine learning in biomedical engineering. Biomedical Engineering Letters, 8:1–3, 2018.
- Machine learning in healthcare. Current genomics, 22(4):291, 2021.
- Trustworthy artificial intelligence: a review. ACM computing surveys (CSUR), 55(2):1–38, 2022.
- Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pages 1273–1282. PMLR, 2017.
- Fedlga: Toward system-heterogeneity of federated learning via local gradient approximation. IEEE Transactions on Cybernetics, 2023.
- A survey on federated learning systems: Vision, hype and reality for data privacy and protection. IEEE Transactions on Knowledge and Data Engineering, 35(4):3347–3366, 2021.
- Federated learning with differential privacy: Algorithms and performance analysis. IEEE transactions on information forensics and security, 15:3454–3469, 2020.
- The impact of gdpr on global technology development, 2019.
- Beyond the hipaa privacy rule: enhancing privacy, improving health through research. 2009.
- Fedclip: Fast generalization and personalization for clip in federated learning. In ICLR 2023 Workshop on Trustworthy and Reliable Large-Scale Machine Learning Models, 2023.
- MedCLIP: Contrastive learning from unpaired medical images and text. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3876–3887, Abu Dhabi, United Arab Emirates, December 2022. Association for Computational Linguistics.
- Fedmed: A federated learning framework for language modeling. Sensors, 20(14):4048, 2020.
- Medgpt: Medical concept prediction from clinical narratives. arXiv preprint arXiv:2107.03134, 2021.
- Med-bert: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ digital medicine, 4(1):86, 2021.
- On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
- Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
- Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 1(2):3, 2022.
- Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
- Attention is all you need. Advances in neural information processing systems, 30, 2017.
- KR Chowdhary. Natural language processing. Fundamentals of artificial intelligence, pages 603–649, 2020.
- Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
- An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2020.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT, pages 4171–4186, 2019.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140):1–67, 2020.
- Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- Layer normalization. arXiv preprint arXiv:1607.06450, 2016.
- Novel positional encodings to enable tree-based transformers. Advances in neural information processing systems, 32, 2019.
- Yahui Chen. Convolutional neural network for sentence classification. Master’s thesis, University of Waterloo, 2015.
- Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
- Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pages 1532–1543, 2014.
- Deep contextualized word representations. In Marilyn Walker, Heng Ji, and Amanda Stent, editors, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana, June 2018. Association for Computational Linguistics.
- Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
- A neural probabilistic language model. Advances in neural information processing systems, 13, 2000.
- Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems, 26, 2013.
- Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
- Palm: Scaling language modeling with pathways. Journal of Machine Learning Research, 24(240):1–113, 2023.
- An overview of bard: an early experiment with generative ai. Google Static Documents, 2023.
- Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009.
- An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021.
- Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
- High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022.
- Segment anything. arXiv preprint arXiv:2304.02643, 2023.
- Malwina Anna Wójcik. Foundation models in healthcare: Opportunities, biases and regulatory prospects in europe. In International Conference on Electronic Government and the Information Systems Perspective, pages 32–46. Springer, 2022.
- Artificial intelligence in healthcare. Nature biomedical engineering, 2(10):719–731, 2018.
- On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, pages 610–623, 2021.
- Realizing ai in healthcare: challenges appearing in the wild. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, pages 1–5, 2021.
- Connecting algorithmic research and usage contexts: a perspective of contextualized evaluation for explainable ai. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, volume 10, pages 147–159, 2022.
- Clinician-facing ai in the wild: Taking stock of the sociotechnical challenges and opportunities for hci. ACM Transactions on Computer-Human Interaction, 30(2):1–39, 2023.
- Privacy-preserving federated brain tumour segmentation. In Machine Learning in Medical Imaging: 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings 10, pages 133–141. Springer, 2019.
- Advances and open problems in federated learning. Foundations and Trends® in Machine Learning, 14(1–2):1–210, 2021.
- Federated learning: Challenges, methods, and future directions. IEEE signal processing magazine, 37(3):50–60, 2020.
- Scaffold: Stochastic controlled averaging for federated learning. In International conference on machine learning, pages 5132–5143. PMLR, 2020.
- Generalized federated learning via sharpness aware minimization. In International Conference on Machine Learning, pages 18250–18280. PMLR, 2022.
- Lomar: A local defense against poisoning attack on federated learning. IEEE Transactions on Dependable and Secure Computing, 2021.
- Fedcp: Separating feature information for personalized federated learning via conditional policy. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023.
- Flower: A friendly federated learning research framework. arXiv preprint arXiv:2007.14390, 2020.
- Fedml: A research library and benchmark for federated machine learning. arXiv preprint arXiv:2007.13518, 2020.
- Fate: An industrial grade platform for collaborative learning with data protection. The Journal of Machine Learning Research, 22(1):10320–10325, 2021.
- Federatedscope: A flexible federated learning platform for heterogeneity. Proceedings of the VLDB Endowment, 16(5):1059–1072, 2023.
- Towards federated learning at scale: System design. In A. Talwalkar, V. Smith, and M. Zaharia, editors, Proceedings of Machine Learning and Systems, volume 1, pages 374–388, 2019.
- Papaya: Practical, private, and scalable federated learning. Proceedings of Machine Learning and Systems, 4:814–832, 2022.
- Kali Hays. Elon Musk's plan to charge for Twitter API access is unraveling. Business Insider. https://www.businessinsider.com/elon-musk-plan-to-charge-for-twitter-api-access-unraveling-2023-5. [Accessed 23-02-2024].
- Brian Fung. Reddit sparks outrage after a popular app developer said it wants him to pay $20 million a year for data access. CNN Business. https://www.cnn.com/2023/06/01/tech/reddit-outrage-data-access-charge/index.html. [Accessed 23-02-2024].
- Stack Overflow will charge AI giants for training data. Wired. https://www.wired.com/story/stack-overflow-will-charge-ai-giants-for-training-data/. [Accessed 23-02-2024].
- Embracing change: Continual learning in deep neural networks. Trends in cognitive sciences, 24(12):1028–1040, 2020.
- Adaer: An adaptive experience replay approach for continual lifelong learning. Neurocomputing, 572:127204, 2024.
- Federated continual learning with weighted inter-client transfer. In International Conference on Machine Learning, pages 12073–12086. PMLR, 2021.
- Snapfusion: Text-to-image diffusion model on mobile devices within two seconds. Advances in Neural Information Processing Systems, 36, 2024.
- Fedaffect: Few-shot federated learning for facial expression recognition. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4168–4175, 2021.
- A survey on heterogeneous federated learning. arXiv preprint arXiv:2210.04505, 2022.
- Towards fair and privacy-preserving federated deep models. IEEE Transactions on Parallel and Distributed Systems, 31(11):2524–2541, 2020.
- Practical attribute reconstruction attack against federated learning. IEEE Transactions on Big Data, 2022.
- Federated foundation models: Privacy-preserving and collaborative learning for large models. arXiv preprint arXiv:2305.11414, 2023.
- On the importance and applicability of pre-training for federated learning. In The Eleventh International Conference on Learning Representations, 2022.
- Federated learning from pre-trained models: A contrastive learning approach. Advances in Neural Information Processing Systems, 35:19332–19344, 2022.
- Gpt-fl: Generative pre-trained model-assisted federated learning. arXiv preprint arXiv:2306.02210, 2023.
- Fedmd: Heterogenous federated learning via model distillation. arXiv preprint arXiv:1910.03581, 2019.
- Promptfl: Let federated participants cooperatively learn prompts instead of models-federated learning in age of foundation model. IEEE Transactions on Mobile Computing, 2023.
- Reduce communication costs and preserve privacy: Prompt tuning method in federated learning. arXiv preprint arXiv:2208.12268, 2022.
- Federated text-driven prompt generation for vision-language models. In The Twelfth International Conference on Learning Representations, 2024.
- Kate Crawford. The atlas of AI: Power, politics, and the planetary costs of artificial intelligence. Yale University Press, 2021.
- Designing human-centered ai for mental health: Developing clinically relevant applications for online cbt treatment. ACM Transactions on Computer-Human Interaction, 30(2):1–50, 2023.
- Ethical machine learning in healthcare. Annual review of biomedical data science, 4:123–144, 2021.
- Controlling healthcare costs by removing waste: what american doctors can do now. BMJ quality & safety, 20(6):534–537, 2011.
- Artificial intelligence in clinical health care applications. Interactive journal of medical research, 8(2):e12100, 2019.
- National health expenditure projections, 2019–28: Expected rebound in prices drives rising spending growth: National health expenditure projections for the period 2019–2028. Health Affairs, 39(4):704–714, 2020.
- Considering the possibilities and pitfalls of generative pre-trained transformer 3 (gpt-3) in healthcare delivery. NPJ Digital Medicine, 4(1):93, 2021.
- Data acquisition, curation, and use for a continuously learning health system. Jama, 316(16):1669–1670, 2016.
- Deep neural networks for multimodal imaging and biomedical applications. IGI Global, 2020.
- Daniela Ionescu. Deep learning algorithms and big health care data in clinical natural language processing. Linguistic and Philosophical Investigations, (19):86–92, 2020.
- Avi Ma’ayan. Complex systems biology. Journal of the Royal Society Interface, 14(134):20170391, 2017.
- Deep multimodal learning: A survey on recent advances and trends. IEEE signal processing magazine, 34(6):96–108, 2017.
- An introduction to multisensor data fusion. Proceedings of the IEEE, 85(1):6–23, 1997.
- Federico Castanedo et al. A review of data fusion techniques. The scientific world journal, 2013, 2013.
- A review on machine learning principles for multi-view biological data integration. Briefings in bioinformatics, 19(2):325–340, 2018.
- Multimodal deep learning for biomedical data fusion: a review. Briefings in Bioinformatics, 23(2):bbab569, 2022.
- Prediction of alzheimer’s disease based on deep neural network by integrating gene expression and dna methylation dataset. Expert Systems with Applications, 140:112873, 2020.
- Capsule network based modeling of multi-omics data for discovery of breast cancer-related genes. IEEE/ACM transactions on computational biology and bioinformatics, 17(5):1605–1612, 2019.
- Integrative survival analysis of breast cancer with gene expression and dna methylation data. Bioinformatics, 37(17):2601–2608, 2021.
- Deep learning–based multi-omics integration robustly predicts survival in liver cancer. Clinical Cancer Research, 24(6):1248–1259, 2018.
- Performance comparison of deep learning autoencoders for cancer subtype detection using multi-omics data. Cancers, 13(9):2013, 2021.
- An integrative deep learning framework for classifying molecular subtypes of breast cancer. Computational and structural biotechnology journal, 18:2185–2199, 2020.
- Metacancer: A deep learning-based pan-cancer metastasis prediction model developed using multi-omics data. Computational and Structural Biotechnology Journal, 19:4404–4411, 2021.
- Predicting alzheimer’s disease progression using multi-modal deep learning approach. Scientific reports, 9(1):1952, 2019.
- Hierarchical feature representation and multimodal fusion with deep learning for ad/mci diagnosis. NeuroImage, 101:569–582, 2014.
- Accurately differentiating between patients with covid-19, patients with other viral infections, and healthy individuals: multimodal late fusion learning approach. Journal of Medical Internet Research, 23(1):e25535, 2021.
- Deep fusion learning facilitates anatomical therapeutic chemical recognition in drug repurposing and discovery. Briefings in Bioinformatics, 22(6):bbab289, 2021.
- Multimodal deep learning enhances diagnostic precision in left ventricular hypertrophy. European Heart Journal-Digital Health, 3(3):380–389, 2022.
- A multimodal deep neural network for human breast cancer prognosis prediction by integrating multi-dimensional data. IEEE/ACM transactions on computational biology and bioinformatics, 16(3):841–850, 2018.
- Multimodal fusion with deep neural networks for leveraging ct imaging and electronic health record: a case-study in pulmonary embolism detection. Scientific reports, 10(1):22147, 2020.
- Neuroanatomical segmentation in mri: technological objectives. International Journal of Pattern Recognition and Artificial Intelligence, 11(08):1161–1187, 1997.
- Survey on liver ct image segmentation methods. Artificial Intelligence Review, 37:83–95, 2012.
- Utilizing segmented mri data in image-guided surgery. International Journal of Pattern Recognition and Artificial Intelligence, 11(08):1367–1397, 1997.
- Bo Song and Ahmet Sacan. Automated wound identification system based on image segmentation and artificial neural networks. In 2012 IEEE International Conference on bioinformatics and biomedicine, pages 1–4. IEEE, 2012.
- Hu et al. Survey of recent volumetric medical image segmentation techniques. In Biomedical Engineering. IntechOpen, 2009.
- Metrics for evaluating 3d medical image segmentation: analysis, selection, and tool. BMC medical imaging, 15(1):1–28, 2015.
- Survey statistics of automated segmentations applied to optical imaging of mammalian cells. BMC bioinformatics, 16:1–28, 2015.
- Breast ultrasound image segmentation: a survey. International journal of computer assisted radiology and surgery, 12:493–507, 2017.
- Machine learning techniques for biomedical image segmentation: an overview of technical aspects and introduction to state-of-art applications. Medical physics, 47(5):e148–e167, 2020.
- Biomedical image segmentation: a survey. SN Computer Science, 2:1–22, 2021.
- How long does biomedical research take? studying the time taken between biomedical and health research and its translation into products, policy, and practice. Health research policy and systems, 13(1):1–18, 2015.
- Estimated research and development investment needed to bring a new medicine to market, 2009-2018. Jama, 323(9):844–853, 2020.
- Applications of machine learning and artificial intelligence for covid-19 (sars-cov-2) pandemic: A review. Chaos, Solitons & Fractals, 139:110059, 2020.
- drugan: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Molecular pharmaceutics, 14(9):3098–3104, 2017.
- Artificial intelligence for clinical trial design. Trends in pharmacological sciences, 40(8):577–591, 2019.
- A statistical framework for genomic data fusion. Bioinformatics, 20(16):2626–2635, 2004.
- Gene prioritization through genomic data fusion. Nature biotechnology, 24(5):537–544, 2006.
- Integrative, multimodal analysis of glioblastoma using tcga molecular data, pathology images, and clinical outcomes. IEEE Transactions on Biomedical Engineering, 58(12):3469–3474, 2011.
- Classification and staging of chronic liver disease from multimodal data. IEEE transactions on biomedical engineering, 60(5):1336–1344, 2012.
- Babel enables cross-modality translation between multiomic profiles at single-cell resolution. Proceedings of the National Academy of Sciences, 118(15):e2023070118, 2021.
- Frozen pretrained transformers as universal computation engines. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7):7628–7636, Jun. 2022.
- Generating soap notes from doctor-patient conversations using modular summarization techniques. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4958–4972, 2021.
- Healthcare in the pocket: mapping the space of mobile-phone health interventions. Journal of biomedical informatics, 45(1):184–198, 2012.
- A hierarchical attention retrieval model for healthcare question answering. In The World Wide Web Conference, pages 2472–2482, 2019.
- Artificial intelligence tool for optimizing eligibility screening for clinical trials in a large community cancer center. JCO clinical cancer informatics, 4:50–59, 2020.
- Sundar Pichai. An important next step on our AI journey — blog.google. https://blog.google/intl/en-africa/products/explore-get-answers/an-important-next-step-on-our-ai-journey/, 2023.
- Synthetic data in machine learning for medicine and healthcare. Nature Biomedical Engineering, 5(6):493–497, 2021.
- Highly accurate protein structure prediction for the human proteome. Nature, 596(7873):590–596, 2021.
- Accelerated identification of disease-causing variants with ultra-rapid nanopore genome sequencing. Nature Biotechnology, 40(7):1035–1041, 2022.
- Multi-disciplinary fairness considerations in machine learning for clinical trials. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pages 906–924, 2022.
- When foundation model meets federated learning: Motivations, challenges, and future directions. arXiv preprint arXiv:2306.15546, 2023.
- Where to begin? on the impact of pre-training and initialization in federated learning. arXiv preprint arXiv:2210.08090, 2022.
- Communication-efficient federated learning via knowledge distillation. Nature communications, 13(1):2032, 2022.
- Accelerating federated learning with data and model parallelism in edge computing. IEEE/ACM Transactions on Networking, 2023.
- Pipefl: Hardware/software co-design of an fpga accelerator for federated learning. IEEE Access, 10:98649–98661, 2022.
- Decentralized training of foundation models in heterogeneous environments. Advances in Neural Information Processing Systems, 35:25464–25477, 2022.
- Parameter-efficient transfer learning for nlp. In International Conference on Machine Learning, pages 2790–2799. PMLR, 2019.
- LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022.
- The power of scale for parameter-efficient prompt tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3045–3059, 2021.
- Fedprompt: Communication-efficient and privacy-preserving prompt tuning in federated learning. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2023.
- Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
- Quantization networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7308–7316, 2019.
- What is the state of neural network pruning? Proceedings of machine learning and systems, 2:129–146, 2020.
- Resfed: Communication efficient federated learning with deep compressed residuals. IEEE Internet of Things Journal, 2023.
- Privacy-preserving federated learning and its application to natural language processing. Knowledge-Based Systems, 264:109693, 2023.
- Federated learning meets natural language processing: A survey. arXiv preprint arXiv:2107.12603, 2021.
- Efficient federated learning with pre-trained large language model using several adapter mechanisms. Mathematics, 11(21):4479, 2023.
- Openfedllm: Training large language models on decentralized private data via federated learning. arXiv preprint arXiv:2402.06954, 2024.
- Pretrained models for multilingual federated learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1413–1421, 2022.
- Pfedprompt: Learning personalized prompt for vision-language models in federated learning. In Proceedings of the ACM Web Conference 2023, pages 1364–1374, 2023.
- Fedmm: Federated multi-modal learning with modality heterogeneity in computational pathology. arXiv preprint arXiv:2402.15858, 2024.
- Feddat: An approach for foundation model finetuning in multi-modal heterogeneous federated learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11285–11293, 2024.
- Clip-guided federated learning on heterogeneity and long-tailed data. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 14955–14963, 2024.
- Federated adaptive prompt tuning for multi-domain collaborative learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 15117–15125, 2024.
- General commerce intelligence: Glocally federated nlp-based engine for privacy-preserving and sustainable personalized services of multi-merchants. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 22752–22760, 2024.
- Heterogeneous ensemble knowledge transfer for training large models in federated learning. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI) Main Track, 2022.
- No one left behind: Inclusive federated learning over heterogeneous devices. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3398–3406, 2022.
- Pre-trained models for natural language processing: A survey. Science China Technological Sciences, 63(10):1872–1897, 2020.
- Florence: A new foundation model for computer vision. arXiv preprint arXiv:2111.11432, 2021.
- A medical multimodal large language model for future pandemics. NPJ Digital Medicine, 6(1):226, 2023.
- A survey on deep learning in medical image analysis. Medical image analysis, 42:60–88, 2017.
- Codebert: A pre-trained model for programming and natural languages. arXiv preprint arXiv:2002.08155, 2020.
- Rethinking the role of demonstrations: What makes in-context learning work? arXiv preprint arXiv:2202.12837, 2022.
- Domain-specific language model pretraining for biomedical natural language processing. ACM Transactions on Computing for Healthcare (HEALTH), 3(1):1–23, 2021.
- Language models as knowledge bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2463–2473, 2019.
- Large-scale knowledge synthesis and complex information retrieval from biomedical documents. In 2022 IEEE International Conference on Big Data (Big Data), pages 2364–2369. IEEE, 2022.
- Cross-property deep transfer learning framework for enhanced predictive analytics on small materials data. Nature communications, 12(1):6595, 2021.
- Accuracy of chatgpt generated diagnosis from patient’s medical history and imaging findings in neuroradiology cases. Neuroradiology, 66(1):73–79, 2024.
- A survey and analysis of electronic healthcare record standards. Acm Computing Surveys (Csur), 37(4):277–315, 2005.
- Capturing the patient’s perspective: a review of advances in natural language processing of health-related text. Yearbook of medical informatics, 26(01):214–227, 2017.
- Secnlp: A survey of embeddings in clinical natural language processing. Journal of biomedical informatics, 101:103323, 2020.
- Deep learning for electronic health records: A comparative review of multiple deep neural architectures. Journal of biomedical informatics, 101:103337, 2020.
- Representation learning for electronic health records. arXiv preprint arXiv:1909.09248, 2019.
- Mimic-iii, a freely accessible critical care database. Scientific data, 3(1):1–9, 2016.
- Data resource profile: clinical practice research datalink (cprd). International journal of epidemiology, 44(3):827–836, 2015.
- Cometa: A corpus for medical entity linking in the social media. arXiv preprint arXiv:2010.03295, 2020.
- Covid-twitter-bert: A natural language processing model to analyse covid-19 content on twitter. Frontiers in Artificial Intelligence, 6:1023281, 2023.
- Mimic-cxr-jpg, a large publicly available database of labeled chest radiographs. arXiv preprint arXiv:1901.07042, 2019.
- Dnabert: pre-trained bidirectional encoder representations from transformers model for dna-language in genome. Bioinformatics, 37(15):2112–2120, 2021.
- The philadelphia neurodevelopmental cohort: A publicly available resource for the study of normal and abnormal brain development in youth. Neuroimage, 124:1115–1119, 2016.
- Identification of autism spectrum disorder using deep learning and the abide dataset. NeuroImage: Clinical, 17:16–23, 2018.
- Image processing and quality control for the first 10,000 brain imaging datasets from uk biobank. Neuroimage, 166:400–424, 2018.
- The human protein atlas—integrated omics for single cell mapping of the human proteome. Protein Science, 32(2):e4562, 2023.
- Polyester: simulating rna-seq datasets with differential transcript expression. Bioinformatics, 31(17):2778–2784, 2015.
- Overview of the imageclefmed 2020 concept prediction task: Medical image understanding. CLEF2020 Working Notes, 2696, 2020.
- On position embeddings in bert. In International Conference on Learning Representations, 2020.
- Encoding word order in complex embeddings. In International Conference on Learning Representations, 2019.
- Improving language understanding by generative pre-training. 2018.
- Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
- Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, 2020.
- Transfer learning in biomedical natural language processing: an evaluation of bert and elmo on ten benchmarking datasets. arXiv preprint arXiv:1906.05474, 2019.
- Scibert: A pretrained language model for scientific text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3615–3620, 2019.
- Ammu: a survey of transformer-based biomedical pretrained language models. Journal of biomedical informatics, 126:103982, 2022.
- Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9):1–35, 2023.
- Foundation models in healthcare: Opportunities, risks & strategies forward. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, CHI EA ’23, New York, NY, USA, 2023. Association for Computing Machinery.
- Clinicalbert: Modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342, 2019.
- Scifive: a text-to-text transformer model for biomedical literature. arXiv preprint arXiv:2106.03598, 2021.
- Llms in biomedicine: A study on clinical named entity recognition. arXiv preprint arXiv:2404.07376, 2024.
- Clinicalgpt: large language models finetuned with diverse medical data and comprehensive evaluation. arXiv preprint arXiv:2306.09968, 2023.
- Large language models encode clinical knowledge. Nature, 620(7972):172–180, 2023.
- Chatdoctor: A medical chat model fine-tuned on llama model using medical domain knowledge. arXiv preprint arXiv:2303.14070, 2023.
- Taiyi: a bilingual fine-tuned large language model for diverse biomedical tasks. Journal of the American Medical Informatics Association, page ocae037, 2024.
- Erik Tjong Kim Sang and Fien De Meulder. Introduction to the conll-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, pages 142–147, 2003.
- Qwen technical report. arXiv preprint arXiv:2309.16609, 2023.
- Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096–1103, 2008.
- Generative adversarial nets. Advances in neural information processing systems, 27, 2014.
- A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020.
- Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33:21271–21284, 2020.
- Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020.
- Beit: Bert pre-training of image transformers. In International Conference on Learning Representations, 2021.
- Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022.
- Simmim: A simple framework for masked image modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9653–9663, 2022.
- Don’t stop pretraining: Adapt language models to domains and tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8342–8360, 2020.
- Continual domain-tuning for pretrained language models. arXiv preprint arXiv:2004.02288, 2020.
- Multi-stage pre-training for low-resource domain adaptation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5461–5468, 2020.
- Photorealistic text-to-image diffusion models with deep language understanding. Advances in Neural Information Processing Systems, 35:36479–36494, 2022.
- Roentgen: vision-language foundation model for chest x-ray generation. arXiv preprint arXiv:2211.12737, 2022.
- Adapting pretrained vision-language foundational models to medical imaging domains. In NeurIPS 2022 Foundation Models for Decision Making Workshop, 2022.
- Multi-task learning of hierarchical vision-language representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10492–10501, 2019.
- Pyramidclip: Hierarchical feature alignment for vision-language model pretraining. Advances in neural information processing systems, 35:35959–35970, 2022.
- Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22500–22510, 2023.
- Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 590–597, 2019.
- Carl Doersch. Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908, 2016.
- Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
- Medicat: A dataset of medical images, captions, and textual references. arXiv preprint arXiv:2010.06000, 2020.
- Radiology objects in context (roco): a multimodal image dataset. In Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis: 7th Joint International Workshop, CVII-STENT 2018 and Third International Workshop, LABELS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 16, 2018, Proceedings 3, pages 180–189. Springer, 2018.
- Reliable covid-19 detection using chest x-ray images. In 2021 IEEE International Conference on Image Processing (ICIP), pages 185–189. IEEE, 2021.
- A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE transactions on medical imaging, 36(7):1550–1560, 2017.
- Clipsyntel: clip and llm synergy for multimodal question summarization in healthcare. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 22031–22039, 2024.
- Making the most of text semantics to improve biomedical vision–language processing. In European conference on computer vision, pages 1–21. Springer, 2022.
- Padchest: A large chest x-ray image dataset with multi-label annotated reports. Medical image analysis, 66:101797, 2020.
- Convnext v2: Co-designing and scaling convnets with masked autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16133–16142, 2023.
- Large-scale domain-specific pretraining for biomedical vision-language processing. arXiv preprint arXiv:2303.00915, 2023.
- Biomedclip: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs. arXiv preprint arXiv:2303.00915, 2023.
- Contrastive learning of medical visual representations from paired images and text. In Machine Learning for Healthcare Conference, pages 2–25. PMLR, 2022.
- Gloria: A multimodal global-local representation learning framework for label-efficient medical image recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3942–3951, 2021.
- Expert-level detection of pathologies from unannotated chest x-ray images via self-supervised learning. Nature Biomedical Engineering, 6(12):1399–1406, 2022.
- Joint learning of localized representations from medical images and reports. In European Conference on Computer Vision, pages 685–701. Springer, 2022.
- A comparison of pre-trained vision-and-language models for multimodal representation learning across medical images and reports. In 2020 IEEE international conference on bioinformatics and biomedicine (BIBM), pages 1999–2004. IEEE, 2020.
- Multi-modal understanding and generation for medical images and text via vision-language pre-training. IEEE Journal of Biomedical and Health Informatics, 26(12):6070–6080, 2022.
- Align, reason and learn: Enhancing medical vision-and-language pre-training with knowledge. In Proceedings of the 30th ACM International Conference on Multimedia, pages 5152–5161, 2022.
- Lvit: language meets vision transformer in medical image segmentation. IEEE transactions on medical imaging, 2023.
- Med-unic: Unifying cross-lingual medical vision-language pre-training by diminishing bias. Advances in Neural Information Processing Systems, 36, 2024.
- Vision–language foundation model for echocardiogram interpretation. Nature Medicine, pages 1–8, 2024.
- Llava-med: Training a large language-and-vision assistant for biomedicine in one day. Advances in Neural Information Processing Systems, 36, 2024.
- Health Insurance Portability and Accountability Act of 1996. Public Law 104-191, 1996.
- Artificial intelligence, bias and clinical safety. BMJ Quality & Safety, 28(3):231–237, 2019.
- Privacy preserving distributed machine learning with federated learning. Computer Communications, 171:112–125, 2021.
- Do no harm: a roadmap for responsible machine learning for health care. Nature medicine, 25(9):1337–1340, 2019.
- Treating health disparities with artificial intelligence. Nature medicine, 26(1):16–17, 2020.
- Geographic distribution of us cohorts used to train deep learning algorithms. Jama, 324(12):1212–1213, 2020.
- Training confounder-free deep learning models for medical applications. Nature communications, 11(1):6010, 2020.
- Learning without forgetting. IEEE transactions on pattern analysis and machine intelligence, 40(12):2935–2947, 2017.
- Identification of disease treatment mechanisms through the multiscale interactome. Nature communications, 12(1):1796, 2021.
- Redmed: Extending drug lexicons for social media applications. Journal of biomedical informatics, 99:103307, 2019.
- A neural topic-attention model for medical term abbreviation disambiguation. arXiv preprint arXiv:1910.14076, 2019.
- Contrastive learning of global and local features for medical image segmentation with limited annotations. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 12546–12558. Curran Associates, Inc., 2020.
- High accuracy protein structure prediction using deep learning. Fourteenth critical assessment of techniques for protein structure prediction (abstract book), 22(24):2, 2020.
Authors: Xingyu Li, Lu Peng, Yuping Wang, Weihua Zhang