Comparison of pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction: experiments with the rare disease use-case (2311.13729v2)
Abstract: End-to-end relation extraction (E2ERE) is an important and realistic application of NLP in biomedicine. In this paper, we compare three prevailing paradigms for E2ERE using a complex dataset focused on rare diseases that involves discontinuous and nested entities. We use the RareDis information extraction dataset to evaluate three competing approaches for E2ERE: NER $\rightarrow$ RE pipelines, joint sequence-to-sequence models, and generative pre-trained transformer (GPT) models. We use comparable state-of-the-art models and best practices for each of these approaches and conduct error analyses to assess their failure modes. Our findings reveal that pipeline models are still the best, while sequence-to-sequence models are not far behind; GPT models with eight times as many parameters are worse than even sequence-to-sequence models and lose to pipeline models by over 10 F1 points. Partial matches and discontinuous entities caused many NER errors, contributing to lower overall E2E performance. We also verify these findings on a second E2ERE dataset for chemical-protein interactions. Although generative LM-based methods are more suitable for zero-shot settings, our results show that when training data is available, it is better to work with more conventional models trained and tailored for E2ERE. More innovative methods are needed to marry the best of both worlds, from smaller encoder-decoder pipeline models and the larger GPT models, to improve E2ERE. As of now, we see that well-designed pipeline models offer substantial performance gains at a lower cost and carbon footprint for E2ERE. Our contribution is also the first to conduct E2ERE on the RareDis dataset.
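To illustrate why partial entity matches can lower end-to-end scores, the following is a minimal sketch of strict-match relation scoring. It assumes that a predicted relation counts as correct only when both entity strings and the relation type match the gold annotation exactly; the abstract does not state the scoring protocol, so this is an assumption. The function name `strict_prf`, the example entities, and the `produces` label are hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# A relation is (head entity text, relation type, tail entity text).
Relation = Tuple[str, str, str]


def strict_prf(gold: Set[Relation], pred: Set[Relation]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact triple matches."""
    tp = len(gold & pred)  # true positives: triples present in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted relations for one document.
gold = {
    ("cystic fibrosis", "produces", "chronic cough"),
    ("cystic fibrosis", "produces", "lung infections"),
}
pred = {
    # Partial match: the tail entity span is too short, so the triple is wrong.
    ("cystic fibrosis", "produces", "cough"),
    ("cystic fibrosis", "produces", "lung infections"),
}

precision, recall, f1 = strict_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Under this assumed scoring, a single truncated entity span turns an otherwise correct relation into both a false positive and a false negative, which is one way NER errors can propagate into the end-to-end score.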
Authors: Shashank Gupta, Xuguang Ai, Ramakanth Kavuluru