Large AI Models in Health Informatics: Applications, Challenges, and the Future (2303.11568v2)

Published 21 Mar 2023 in cs.AI and cs.CY

Abstract: Large AI models, or foundation models, are recently emerging models of massive scale in both parameters and training data, with magnitudes that can reach beyond billions. Once pretrained, large AI models demonstrate impressive performance across a variety of downstream tasks. A prime example is ChatGPT, whose capability has captured people's imagination about the far-reaching influence that large AI models can have and their potential to transform different domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for the design of methodologies. The scale of multi-modal data in the biomedical and health domain has been ever-expanding, especially since the community embraced the era of deep learning, which provides the ground to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from their background to their applications. We identify seven key sectors in which large AI models are applicable and may have substantial influence: 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) public health; and 7) medical robotics. We examine their challenges, followed by a critical discussion of the potential future directions and pitfalls of large AI models in transforming the field of health informatics.

Authors (14)
  1. Jianing Qiu (24 papers)
  2. Lin Li (329 papers)
  3. Jiankai Sun (53 papers)
  4. Jiachuan Peng (4 papers)
  5. Peilun Shi (9 papers)
  6. Ruiyang Zhang (11 papers)
  7. Yinzhao Dong (2 papers)
  8. Kyle Lam (6 papers)
  9. Frank P.-W. Lo (10 papers)
  10. Bo Xiao (62 papers)
  11. Wu Yuan (25 papers)
  12. Ningli Wang (10 papers)
  13. Dong Xu (167 papers)
  14. Benny Lo (21 papers)
Citations (96)