
GPT-RE: In-context Learning for Relation Extraction using Large Language Models (2305.02105v3)

Published 3 May 2023 in cs.CL

Abstract: In spite of the potential for ground-breaking achievements offered by LLMs (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to the two major shortcomings of LLMs in RE: (1) low relevance regarding entity and relation in retrieved demonstrations for in-context learning; and (2) the strong inclination to wrongly classify NULL examples into other pre-defined labels. In this paper, we propose GPT-RE to bridge the gap between LLMs and fully-supervised baselines. GPT-RE successfully addresses the aforementioned issues by (1) incorporating task-specific entity representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic. We evaluate GPT-RE on four widely-used RE datasets, and observe that GPT-RE achieves improvements over not only existing GPT-3 baselines, but also fully-supervised baselines. Specifically, GPT-RE achieves SOTA performances on the Semeval and SciERC datasets, and competitive performances on the TACRED and ACE05 datasets.

Enhancing Relation Extraction with GPT-RE: In-context Learning and Task-Aware Demonstrations

Introduction to GPT-RE for Relation Extraction

The paper presents GPT-RE, a novel approach to addressing the challenges that LLMs such as GPT-3 face in relation extraction (RE). Despite LLMs' remarkable capabilities across NLP tasks via in-context learning (ICL), their performance in RE has remained suboptimal, mainly because retrieved demonstrations often have low relevance to the target entities and relation, and because ICL lacks a mechanism for explaining the input-label mapping.

Key Innovations in GPT-RE

GPT-RE introduces two significant improvements to overcome the limitations mentioned above:

  • Task-Aware Demonstration Retrieval: This method emphasizes retrieving demonstrations that are highly relevant to the specific entities and relations of interest, significantly enhancing the pertinence and quality of the demonstrations used for ICL.
  • Gold Label-Induced Reasoning: By integrating reasoning logic that aligns with the gold label into the demonstrations, GPT-RE provides a richer context for the model, enabling a deeper understanding and more accurate relation predictions. Both mechanisms are illustrated in the sketch after this list.
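
To make these two mechanisms concrete, the following is a minimal, illustrative sketch in Python. It is not the authors' implementation: an off-the-shelf sentence encoder applied to entity-marked text stands in for GPT-RE's task-specific entity representations, the names RESample, entity_aware_view, retrieve_demonstrations, and format_demonstration are hypothetical, and a fixed template replaces the gold label-conditioned explanations that the paper generates with the LLM itself.

```python
# Illustrative sketch only; see the caveats above.
from dataclasses import dataclass

import numpy as np
from sentence_transformers import SentenceTransformer  # stand-in encoder, not the paper's


@dataclass
class RESample:
    text: str    # the sentence
    head: str    # head entity mention
    tail: str    # tail entity mention
    label: str   # gold relation label (known for training-pool samples)


def entity_aware_view(sample: RESample) -> str:
    # Surface the entity pair to the encoder so retrieval is driven by the
    # entities and their relation rather than overall sentence similarity.
    return f"{sample.text} The relation holds between [{sample.head}] and [{sample.tail}]."


def retrieve_demonstrations(query: RESample, pool: list[RESample],
                            encoder: SentenceTransformer, k: int = 5) -> list[RESample]:
    # k-nearest-neighbour retrieval over entity-aware embeddings.
    texts = [entity_aware_view(x) for x in pool] + [entity_aware_view(query)]
    emb = encoder.encode(texts, convert_to_numpy=True, normalize_embeddings=True)
    pool_emb, query_emb = emb[:-1], emb[-1]
    scores = pool_emb @ query_emb            # cosine similarity (vectors are normalized)
    top = np.argsort(-scores)[:k]
    return [pool[i] for i in top]


def format_demonstration(sample: RESample) -> str:
    # Gold label-induced reasoning: each demonstration carries an explanation of why
    # its gold label holds. A fixed template stands in for the LLM-generated reasoning.
    reasoning = (f"The context connects '{sample.head}' and '{sample.tail}' in a way "
                 f"that matches the relation '{sample.label}'.")
    return (f"Sentence: {sample.text}\n"
            f"Entities: head = {sample.head}, tail = {sample.tail}\n"
            f"Reasoning: {reasoning}\n"
            f"Relation: {sample.label}\n")
```

In the paper, the strongest variant derives entity-aware representations from a fine-tuned relation extraction model; the entity-marked sentence embedding above is only a rough stand-in for that idea, while the retrieval step itself is the same nearest-neighbour search over a labeled pool.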

Empirical Results

The effectiveness of GPT-RE is demonstrated through extensive evaluations on four widely-used RE datasets: SemEval, SciERC, TACRED, and ACE05. GPT-RE achieves state-of-the-art (SOTA) performance on SemEval and SciERC and competitive results on TACRED and ACE05, outperforming both existing GPT-3 baselines and fully-supervised approaches.

Theoretical and Practical Implications

The introduction of task-aware demonstration retrieval and gold label-induced reasoning embodies a significant step forward in the application of LLMs like GPT-3 to RE. These strategies not only address the specific shortcomings of ICL in the context of RE but also provide a generalized framework that could influence future research in LLMs' application to other NLP tasks. From a practical standpoint, the ability to enhance LLMs' performance in specialized domains like RE without extensive dataset-specific fine-tuning presents an efficient pathway for developing more versatile and robust NLP systems.
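
As a rough illustration of this fine-tuning-free pathway, the sketch below (again hypothetical, reusing the RESample and format_demonstration helpers from the earlier sketch) assembles retrieved, reasoning-augmented demonstrations into a single prompt for a frozen LLM; the instruction wording and label handling are placeholders rather than the paper's exact prompt.

```python
def build_prompt(query: RESample, demos: list[RESample], labels: list[str]) -> str:
    # Instruction + k retrieved demonstrations (with reasoning) + the test instance.
    header = ("Classify the relation between the head and tail entities. "
              f"Answer with one label from: {', '.join(labels)}, or NULL if none applies.\n\n")
    body = "\n".join(format_demonstration(d) for d in demos)
    target = (f"Sentence: {query.text}\n"
              f"Entities: head = {query.head}, tail = {query.tail}\n"
              f"Relation:")
    return header + body + "\n" + target


# The resulting string goes to a frozen GPT-3-class model as an ordinary completion
# request; the LLM is never fine-tuned, so adapting to a new RE dataset only requires
# a labeled pool to retrieve demonstrations from.
```

In practice, the demos argument would come from retrieve_demonstrations over the dataset's training pool, and the explicit NULL option mirrors the NULL over-classification problem that the abstract highlights.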

Future Directions

The success of GPT-RE opens several avenues for future research, including exploration into additional mechanisms for improving the alignment between demonstrations and target tasks, and further examination of the limitations of LLMs in tasks requiring deep domain knowledge or complex reasoning. Moreover, the approach of integrating task-specific knowledge and reasoning into ICL could be adapted and extended to other areas beyond RE, potentially unlocking new capabilities in LLMs.

Conclusion

GPT-RE showcases a novel and effective approach to leveraging the strengths of LLMs in the domain of relation extraction, through innovations in demonstration retrieval and the incorporation of reasoning logic. Its successes invite further integration of task-specific knowledge into the in-context learning paradigm, promising advancements in both the theory and practice of generative AI in NLP.

Authors (7)
  1. Zhen Wan
  2. Fei Cheng
  3. Zhuoyuan Mao
  4. Qianying Liu
  5. Haiyue Song
  6. Jiwei Li
  7. Sadao Kurohashi