Extraction of Medication and Temporal Relation from Clinical Text using Neural Language Models (2310.02229v2)
Abstract: Clinical texts, represented in electronic medical records (EMRs), contain rich medical information and are essential for disease prediction, personalised information recommendation, clinical decision support, and medication pattern mining and measurement. Extracting relations between medication mentions and temporal information can further help clinicians better understand patients' treatment histories. To evaluate the performance of deep learning (DL) and large language models (LLMs) on medication extraction and temporal relation classification, we carry out an empirical investigation within the MedTem project using several advanced learning architectures, including BiLSTM-CRF and CNN-BiLSTM for clinical named entity recognition (NER) and BERT-CNN for temporal relation extraction (RE), together with an exploration of different word embedding techniques. Furthermore, we design a set of post-processing rules to generate structured output on medications and their temporal relations. Our experiments show that CNN-BiLSTM slightly outperforms BiLSTM-CRF on the i2b2-2009 clinical NER task, yielding macro-averaged precision, recall, and F1 scores of 75.67, 77.83, and 78.17. The BERT-CNN model also produces reasonable macro-averaged P/R/F1 scores of 64.48, 67.17, and 65.03 on the temporal relation extraction test set from the i2b2-2012 challenge. Code and tools from MedTem will be hosted at https://github.com/HECTA-UoM/MedTem
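To make the tagging architecture mentioned in the abstract concrete, the sketch below shows a minimal CNN-BiLSTM token classifier in PyTorch: a character-level CNN produces sub-word features that are concatenated with word embeddings and fed to a BiLSTM with a per-token softmax over BIO medication labels. This is only an illustrative sketch, not the authors' released MedTem code; all dimensions, the label count, and the random toy inputs are assumptions.

```python
# Minimal CNN-BiLSTM tagger sketch (illustrative only; hyperparameters and
# label set are assumptions, not taken from the MedTem paper or repository).
import torch
import torch.nn as nn

class CNNBiLSTMTagger(nn.Module):
    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=100, char_dim=25, char_filters=30, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # 1-D convolution over the characters of each word, max-pooled to a fixed vector
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(word_dim + char_filters, hidden,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, max_word_len)
        b, s, c = char_ids.shape
        chars = self.char_emb(char_ids.view(b * s, c)).transpose(1, 2)   # (b*s, char_dim, c)
        char_feats = torch.relu(self.char_cnn(chars)).max(dim=2).values  # (b*s, char_filters)
        char_feats = char_feats.view(b, s, -1)
        x = torch.cat([self.word_emb(word_ids), char_feats], dim=-1)
        h, _ = self.bilstm(x)
        return self.classifier(h)  # per-token logits over BIO medication tags

# Toy usage with random ids; in practice the ids would come from tokenised clinical notes.
model = CNNBiLSTMTagger(word_vocab=5000, char_vocab=80, num_tags=7)
logits = model(torch.randint(1, 5000, (2, 12)), torch.randint(1, 80, (2, 12, 15)))
print(logits.shape)  # torch.Size([2, 12, 7])
```

A BiLSTM-CRF variant would replace the per-token softmax with a CRF decoding layer, and the BERT-CNN relation classifier would swap the embedding layers for a pre-trained transformer encoder; the overall token-to-feature-to-label flow is the same.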
- Hangyu Tu (1 paper)
- Lifeng Han (37 papers)
- Goran Nenadic (49 papers)