Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes (2301.12473v2)
Abstract: The automatic construction of knowledge graphs (KGs) is an important research area in medicine, with far-reaching applications spanning drug discovery and clinical trial design. These applications hinge on the accurate identification of interactions among medical and biological entities. In this study, we propose an end-to-end machine learning solution based on large language models (LLMs) that uses electronic medical record notes to construct KGs. The entities used in the KG construction process are diseases, factors, treatments, and manifestations that co-occur with the disease in a patient. Given the critical need for high-quality performance in medical applications, we conduct a comprehensive assessment of 12 LLMs of various architectures, evaluating both their performance and their safety attributes. To quantify the efficacy of our approach in terms of precision and recall, we manually annotate a dataset provided by the Macula and Retina Institute. We also assess the qualitative performance of the LLMs, such as their ability to generate structured outputs and their tendency to hallucinate. The results show that, in contrast to encoder-only and encoder-decoder models, decoder-only LLMs require further investigation. Additionally, we provide guidance on prompt design for such LLMs. The application of the proposed methodology is demonstrated on age-related macular degeneration.
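The pipeline described above (prompting an LLM to extract structured relations from an EMR note, then assembling the extracted entities into a KG) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the prompt wording, the relation set (`caused_by`, `treated_by`, `coexists_with`), the pipe-delimited output format, and the mocked LLM response are all assumptions introduced for this example.

```python
# Hypothetical sketch of an LLM-based KG construction pipeline:
# (1) prompt an LLM to emit (head, relation, tail) triples from a
# clinical note, (2) parse and validate its output, (3) assemble a KG.
from collections import defaultdict

# Assumed relation set; the paper's actual schema may differ.
RELATIONS = {"caused_by", "treated_by", "coexists_with"}

# Illustrative guided prompt asking for a structured, parseable format.
PROMPT_TEMPLATE = (
    "Extract relations from the clinical note below.\n"
    "Output one triple per line as: head | relation | tail\n"
    "Allowed relations: caused_by, treated_by, coexists_with\n\n"
    "Note: {note}"
)

def parse_triples(llm_output: str):
    """Parse line-oriented LLM output, dropping malformed lines and
    lines with relations outside the schema (a cheap hallucination filter)."""
    triples = []
    for line in llm_output.strip().splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and parts[1] in RELATIONS:
            triples.append(tuple(parts))
    return triples

def build_kg(triples):
    """Adjacency-list KG: entity -> list of (relation, neighbor) edges."""
    kg = defaultdict(list)
    for head, rel, tail in triples:
        kg[head].append((rel, tail))
    return dict(kg)

# Mocked LLM response (no model call is made in this sketch).
mock_output = """age-related macular degeneration | caused_by | smoking
age-related macular degeneration | treated_by | anti-VEGF therapy
not a parseable triple line"""

kg = build_kg(parse_triples(mock_output))
```

Constraining the model to a fixed, line-oriented output format is one practical way to obtain the "structured outputs" the abstract evaluates, and rejecting out-of-schema relations gives a simple guard against hallucinated edge types.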