Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models (2310.11085v4)

Published 17 Oct 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Document-level relation extraction aims at inferring structured human knowledge from textual documents. State-of-the-art methods for this task use pre-trained language models (LMs) via fine-tuning, yet fine-tuning is computationally expensive and cannot adapt to new relation types or new LMs. As a remedy, we leverage the generalization capabilities of pre-trained LMs and present a novel framework for document-level in-context few-shot relation extraction. Our framework has three strengths: it eliminates the need (1) for named entity recognition and (2) for human annotations of documents, and (3) it can be updated to new LMs without re-training. We evaluate our framework using DocRED, the largest publicly available dataset for document-level relation extraction, and demonstrate that our framework achieves state-of-the-art performance. We further show that our framework actually performs much better than the original labels from the development set of DocRED. Finally, we conduct an extensive benchmark demonstrating the effectiveness of our framework, achieving state-of-the-art results across six relation extraction datasets and outperforming more than 30 baseline methods. Unlike our framework, the baseline methods have large computational overhead (e.g., from fine-tuning). To the best of our knowledge, we are the first to reformulate the document-level relation extraction task as a tailored in-context few-shot learning paradigm.
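
As a rough illustration of the in-context few-shot formulation described above, the sketch below builds a prompt from demonstration documents and parses (head, relation, tail) triples from the model's completion. The demonstration text, relation names, output format, and the `lm_complete` callable are illustrative assumptions for this sketch, not the authors' actual prompt design or code.

```python
# Minimal sketch of document-level in-context few-shot relation extraction.
# Assumption: `lm_complete` is any text-completion callable supplied by the user
# (e.g., a wrapper around an LM API); demos and output format are hypothetical.
from typing import Callable, List, Tuple

FEW_SHOT_DEMOS = [
    (
        "Document: Marie Curie was born in Warsaw and later moved to Paris.",
        "Triples: (Marie Curie, place of birth, Warsaw); (Marie Curie, residence, Paris)",
    ),
]

def build_prompt(document: str) -> str:
    """Concatenate the task instruction, demonstrations, and the target document."""
    parts = ["Extract (head entity, relation, tail entity) triples from each document.\n"]
    for demo_doc, demo_triples in FEW_SHOT_DEMOS:
        parts.append(f"{demo_doc}\n{demo_triples}\n")
    parts.append(f"Document: {document}\nTriples:")
    return "\n".join(parts)

def extract_relations(
    document: str,
    lm_complete: Callable[[str], str],
) -> List[Tuple[str, ...]]:
    """Prompt the LM in context and parse the returned triples."""
    completion = lm_complete(build_prompt(document))
    triples = []
    for chunk in completion.split(";"):
        fields = [f.strip() for f in chunk.strip().strip("()").split(",")]
        if len(fields) == 3:
            triples.append(tuple(fields))
    return triples
```

The point of the sketch is only the shape of the paradigm: demonstrations plus a target document go into a single prompt, and no fine-tuning or separate named entity recognition step is required at inference time. Demonstration selection and output parsing would be tailored in a real pipeline.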

References (102)
  1. Global normalization of convolutional neural networks for joint entity and relation classification. EMNLP, 2017.
  2. A review on language models as knowledge bases. arXiv preprint arXiv:2204.06031, 2022.
  3. DBpedia: A nucleus for a web of open data. In ISWC, 2007.
  4. SciBERT: A pretrained language model for scientific text. arXiv preprint arXiv:1903.10676, 2019.
  5. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD, 2008.
  6. Language models are few-shot learners. NeurIPS, 2020.
  7. REBEL: Relation extraction by end-to-end language generation. In EMNLP, 2021.
  8. Knowledgeable or educated guess? Revisiting language models as knowledge bases. In ACL-IJCNLP, 2021.
  9. Toward an architecture for never-ending language learning. In AAAI, 2010.
  10. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416, 2022.
  11. Knowledge base question answering by case-based reasoning over subgraphs. In ICML, 2022.
  12. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL, 2019.
  13. Span-based joint entity and relation extraction with transformer pre-training. In ECAI, 2020.
  14. An end-to-end model for entity-level relation extraction using multi-instance learning. In EACL, 2021.
  15. Measuring and improving consistency in pretrained language models. TACL, 2021.
  16. Measuring causal effects of data statistics on language model’s factual predictions. arXiv preprint arXiv:2207.14251, 2022.
  17. A sequence-to-sequence approach for document-level relation extraction. In BioNLP Workshop, 2022.
  18. Ralph Grishman. Twenty-five years of information extraction. Natural Language Engineering, 2019.
  19. Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. Journal of Biomedical Informatics, 2012.
  20. More data, more relations, more context and more openness: A review and outlook for relation extraction. In AACL, 2020.
  21. BertNet: Harvesting knowledge graphs from pretrained language models. arXiv preprint arXiv:2206.14268, 2022.
  22. Surface form competition: Why the highest probability answer isn’t always right. In EMNLP, 2021.
  23. Selective annotation makes language models better few-shot learners. In ICLR, 2023.
  24. Think rationally about what you see: Continuous rationale extraction for relation extraction. In SIGIR, 2023.
  25. A benchmark for fact checking algorithms built on knowledge bases. In CIKM, 2019.
  26. A systematic exploration of the feature space for relation extraction. In NAACL, 2007.
  27. MetaPAD: Meta pattern discovery from massive text corpora. In KDD, 2017.
  28. Large language models struggle to learn long-tail knowledge. arXiv preprint arXiv:2211.08411, 2022.
  29. Going out on a limb: Joint extraction of entity mentions and relations without dependency trees. In ACL, 2017.
  30. The power of scale for parameter-efficient prompt tuning. In EMNLP, 2021.
  31. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In ACL, 2020.
  32. Prefix-tuning: Optimizing continuous prompts for generation. In ACL-IJCNLP, 2021.
  33. Holistic evaluation of language models. arXiv preprint arXiv:2211.09110, 2022.
  34. KagNet: Knowledge-aware graph networks for commonsense reasoning. In EMNLP-IJCNLP, 2019.
  35. Learning entity and relation embeddings for knowledge graph completion. In AAAI, 2015.
  36. What makes good in-context examples for GPT-3? In DeeLIO, 2022a.
  37. Generated knowledge prompting for commonsense reasoning. In ACL, 2021a.
  38. GPT understands, too. arXiv preprint arXiv:2103.10385, 2021b.
  39. P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. In ACL, 2022b.
  40. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
  41. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity. ACL, 2022a.
  42. Unified structure generation for universal information extraction. In ACL, 2022b.
  43. Knowledge base question answering via encoding of complex query graphs. In EMNLP, 2018.
  44. Noisy channel language model prompting for few-shot text classification. In ACL, 2022a.
  45. Rethinking the role of demonstrations: What makes in-context learning work? In EMNLP, 2022b.
  46. End-to-end relation extraction using LSTMs on sequences and tree structures. In ACL, 2016.
  47. PATTY: A taxonomy of relational patterns with semantic types. In EMNLP, 2012.
  48. P-adapters: Robustly extracting factual information from language models with diverse prompts. ICLR, 2022.
  49. Relation extraction from Wikipedia using subtree mining. In AAAI, 2007.
  50. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
  51. Structured prediction as translation between augmented natural languages. In ICLR, 2021.
  52. Relation extraction: A survey. arXiv preprint arXiv:1712.05191, 2017.
  53. True few-shot learning with language models. NeurIPS, 2021.
  54. Language models as knowledge bases? In EMNLP-IJCNLP, 2019.
  55. E-BERT: Efficient-yet-effective entity embeddings for BERT. In EMNLP, 2020.
  56. Language models are unsupervised multitask learners. OpenAI blog, 2019.
  57. Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 2020.
  58. Sentence-BERT: Sentence embeddings using siamese BERT-networks. In EMNLP, 2019.
  59. Modeling relations and their mentions without labeled text. In ECML PKDD, 2010.
  60. A linear programming formulation for global inference in natural language tasks. In CoNLL at HLT-NAACL, 2004.
  61. Learning to retrieve prompts for in-context learning. In NAACL, 2022.
  62. Semi-Markov conditional random fields for information extraction. NeurIPS, 2004.
  63. Autoprompt: Eliciting knowledge from language models with automatically generated prompts. In EMNLP, 2020.
  64. YAGO: a core of semantic knowledge. In WWW, 2007.
  65. Document-level relation extraction with adaptive focal loss and knowledge distillation. In Findings of ACL, 2022.
  66. FACE-KEG: Fact checking explained using knowledge graphs. In WSDM, 2021.
  67. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10):78–85, 2014.
  68. Revisiting relation extraction in the era of large language models. In ACL, 2023.
  69. GPT-RE: In-context learning for relation extraction using large language models. In EMNLP, 2023.
  70. GPT-J-6B: A 6 billion parameter autoregressive language model. https://github.com/kingoflolz/mesh-transformer-jax, 2021.
  71. Fine-tune BERT for DocRED with two-step process. arXiv preprint arXiv:1909.11898, 2019.
  72. DKN: Deep knowledge-aware network for news recommendation. In WWW, 2018.
  73. Two are better than one: Joint entity and relation extraction with table-sequence encoders. In EMNLP, 2020.
  74. Mengqiu Wang. A re-examination of dependency path kernels for relation extraction. In IJCNLP, 2008.
  75. Self-consistency improves chain of thought reasoning in language models. ICLR, 2023.
  76. A new concept of knowledge based question answering (KBQA) system for multi-hop reasoning. In NAACL, 2022.
  77. Knowledge graph embedding by translating on hyperplanes. In AAAI, 2014.
  78. Wang Xu, Kehai Chen, Lili Mou, and Tiejun Zhao. Document-level relation extraction with sentences importance estimation and focusing. In NAACL, 2022.
  79. Emergent abilities of large language models. In TMLR, 2022a.
  80. Chain of thought prompting elicits reasoning in large language models. NeurIPS, 2022b.
  81. Larger language models do in-context learning differently. arXiv preprint arXiv:2303.03846, 2023.
  82. Machine knowledge: Creation and curation of comprehensive knowledge bases. Foundations and Trends in Databases, 10(2-4):108–490, 2021.
  83. SAIS: supervising and augmenting intermediate steps for document-level relation extraction. In NAACL, 2022.
  84. Entity structure within and throughout: Modeling mention dependencies for document-level relation extraction. In AAAI, 2021.
  85. S2ynRE: Two-stage self-training with synthetic data for low-resource relation extraction. In ACL, 2023.
  86. DocRED: A large-scale document-level relation extraction dataset. In NAACL, 2019.
  87. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach. In COLING, 2010.
  88. Relation classification via convolutional deep neural network. In COLING, 2014.
  89. Exploring self-distillation based relational reasoning training for document-level relation extraction. In AAAI, 2023a.
  90. Exploring syntactic features for relation extraction using a convolution tree kernel. In NAACL, 2006a.
  91. A composite kernel to extract relations between entities with both flat and structured features. In ACL, 2006b.
  92. Document-level relation extraction as semantic segmentation. In IJCAI, 2021.
  93. A novel table-to-graph generation approach for document-level joint entity and relation extraction. In ACL, 2023b.
  94. Automatic chain of thought prompting in large language models. ICLR, 2023c.
  95. Calibrate before use: Improving few-shot performance of language models. In ICML, 2021.
  96. Joint entity and relation extraction based on a hybrid neural network. Neurocomputing, 257:59–66, 2017.
  97. Factual probing is [mask]: Learning vs. learning to recall. In NAACL, 2021.
  98. Improving conversational recommender systems via knowledge graph based semantic fusion. In KDD, 2020a.
  99. Attention-based bidirectional long short-term memory networks for relation classification. In ACL, 2016.
  100. Interactive recommender system via knowledge graph-enhanced reinforcement learning. In SIGIR, 2020b.
  101. Document-level relation extraction with adaptive thresholding and localized context pooling. In AAAI, 2021.
  102. Parallel feature selection inspired by group testing. NeurIPS, 2014.
Authors (3)
  1. Yilmazcan Ozyurt (6 papers)
  2. Stefan Feuerriegel (117 papers)
  3. Ce Zhang (215 papers)
Citations (1)