Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models (2401.06088v1)

Published 11 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: The Chief Complaint (CC) is a crucial component of a patient's medical record, as it describes the main reason or concern for seeking medical care and provides critical information for healthcare providers to make informed decisions about patient care. However, documenting CCs can be time-consuming for healthcare providers, especially in busy emergency departments. To address this issue, an autocompletion tool that suggests accurate and well-formatted phrases or sentences for clinical notes can be a valuable resource for triage nurses. In this study, we apply text generation techniques to develop machine learning models using CC data. We train a Long Short-Term Memory (LSTM) model and fine-tune three variants of the Biomedical Generative Pretrained Transformer (BioGPT), namely microsoft/biogpt, microsoft/BioGPT-Large, and microsoft/BioGPT-Large-PubMedQA. Additionally, we tune a prompt incorporating exemplar CC sentences and query GPT-4 through the OpenAI API. We evaluate the models' performance using perplexity, a modified BERTScore, and cosine similarity. The results show that BioGPT-Large outperforms the other models: it consistently achieves a remarkably low perplexity of 1.65 when generating CCs, whereas the best perplexity reached by the baseline LSTM model is 170. We further compare the proposed models' outputs against those of GPT-4. Our study demonstrates that LLMs such as BioGPT enable the development of an effective autocompletion tool for generating CC documentation in healthcare settings.
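To make the fine-tuning step concrete, the sketch below shows causal-language-model fine-tuning of BioGPT on a chief-complaint corpus with the Hugging Face Transformers library. The corpus files, sequence length, and hyperparameters are illustrative assumptions; the paper does not publish its exact training configuration.

```python
# Minimal sketch: fine-tune BioGPT as a causal language model on CC text.
# File names and hyperparameters below are assumptions for illustration.
import math

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "microsoft/biogpt"  # or microsoft/BioGPT-Large(-PubMedQA)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical corpus: one chief-complaint sentence per line.
raw = load_dataset("text", data_files={"train": "cc_train.txt",
                                       "eval": "cc_eval.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=64)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="biogpt-cc",
                           per_device_train_batch_size=8,
                           num_train_epochs=3,
                           evaluation_strategy="epoch"),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["eval"],
    # mlm=False selects the next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Perplexity is the exponential of the mean cross-entropy on held-out text,
# matching the standard definition behind the paper's perplexity metric.
print("perplexity:", math.exp(trainer.evaluate()["eval_loss"]))
```

The GPT-4 comparison relies on a prompt built from exemplar CC sentences. A hedged sketch via the OpenAI API follows; the instruction text and exemplars are invented for illustration and are not the study's actual prompt.

```python
# Hedged sketch: few-shot CC autocompletion with GPT-4 via the OpenAI API.
# The exemplars below are invented; the paper's real prompt is not published.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Complete the emergency-department chief complaint in the same style "
    "as the examples.\n"
    "Input: 'pt c/o chest'\n"
    "Completion: 'pt c/o chest pain radiating to left arm'\n"
    "Input: 'sob and'\n"
    "Completion: 'sob and wheezing since this morning'\n"
    "Input: 'abd pain with'\n"
    "Completion:"
)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=30,
    temperature=0.2,  # low temperature for conservative completions
)
print(response.choices[0].message.content)
```

For the reference-based metrics, the sketch below computes off-the-shelf BERTScore and sentence-embedding cosine similarity. Note that the paper uses a modified BERTScore, and the embedding model chosen here is an assumption, not the authors' setup.

```python
# Hedged sketch of the reference-based metrics named in the abstract:
# standard BERTScore (bert-score package) and embedding cosine similarity
# (sentence-transformers). Model choices are illustrative assumptions.
from bert_score import score
from sentence_transformers import SentenceTransformer, util

candidates = ["pt c/o chest pain radiating to left arm"]      # model output
references = ["patient complains of chest pain to left arm"]  # nurse-written CC

# BERTScore: token-level similarity under contextual BERT embeddings.
P, R, F1 = score(candidates, references, lang="en")
print("BERTScore F1:", F1.mean().item())

# Cosine similarity between whole-sentence embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode(candidates + references, convert_to_tensor=True)
print("cosine similarity:", util.cos_sim(emb[0], emb[1]).item())
```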
