Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction (2401.13598v1)
Abstract: Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities and the semantic relations between them from a document. Existing methods rely heavily on a substantial amount of fully labeled data. However, collecting and annotating data for newly emerging relations is time-consuming and labor-intensive. Recent LLMs, such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities, inspiring us to explore an alternative approach for obtaining auto-labeled documents with new relations. In this paper, we propose a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework, called GenRDK, which generates labeled data by retrieving and denoising knowledge from LLMs. Specifically, we propose a chain-of-retrieval prompt to guide ChatGPT to generate labeled long-text data step by step. To improve the quality of the synthetic data, we propose a denoising strategy based on the consistency of cross-document knowledge. Using the denoised synthetic data, we fine-tune LLaMA2-13B-Chat to extract document-level relation triplets. We perform experiments on both zero-shot document-level relation extraction and zero-shot document-level triplet extraction on two public datasets. The experimental results show that our GenRDK framework outperforms strong baselines.
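The abstract does not spell out how cross-document consistency is measured, so the following is only a minimal sketch of the general idea, under the assumption that a triplet is treated as reliable when several synthetic documents assert it. The function name `denoise_by_consistency`, the `min_support` threshold, and the toy documents are all hypothetical and are not taken from the paper; GenRDK's actual denoising strategy may differ substantially.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (head entity, relation, tail entity)
Triplet = Tuple[str, str, str]


def denoise_by_consistency(
    docs: Dict[str, List[Triplet]], min_support: int = 2
) -> Dict[str, List[Triplet]]:
    """Keep only triplets asserted by at least `min_support` documents.

    Hypothetical stand-in for a consistency-based filter: a simple
    frequency vote across documents, not the paper's method.
    """
    support: Counter = Counter()
    for triplets in docs.values():
        # Count each triplet once per document, however often it repeats.
        support.update(set(triplets))
    return {
        doc_id: [t for t in triplets if support[t] >= min_support]
        for doc_id, triplets in docs.items()
    }


# Made-up toy labels standing in for LLM-generated synthetic documents.
synthetic_docs = {
    "doc_a": [
        ("Marie Curie", "award received", "Nobel Prize in Physics"),
        ("Marie Curie", "place of birth", "Paris"),
    ],
    "doc_b": [
        ("Marie Curie", "award received", "Nobel Prize in Physics"),
        ("Marie Curie", "place of birth", "Warsaw"),
    ],
    "doc_c": [
        ("Marie Curie", "place of birth", "Warsaw"),
    ],
}

cleaned = denoise_by_consistency(synthetic_docs, min_support=2)
```

In this toy run the triplet naming Paris as the place of birth is supported by only one document, so it is dropped from `doc_a`, while the triplets that two documents agree on are kept. A frequency vote is the simplest possible choice here; it discards rare but correct facts, which is one reason a real system would likely need a more careful consistency measure.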
Authors:
- Qi Sun
- Kun Huang
- Xiaocui Yang
- Rong Tong
- Kun Zhang
- Soujanya Poria