Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study (2210.10678v3)

Published 19 Oct 2022 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract: This paper presents an empirical study on building relation extraction systems in low-resource settings. Building on recent pre-trained language models, we comprehensively investigate three schemes to evaluate performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution of relations; (iii) data augmentation techniques and self-training to generate more labeled in-domain data. We create a benchmark of 8 relation extraction (RE) datasets covering different languages, domains, and contexts, and perform extensive comparisons of the proposed schemes and their combinations. Our experiments show that: (i) although prompt-based tuning is beneficial for low-resource RE, there is still much room for improvement, especially in extracting relations from cross-sentence contexts with multiple relational triples; (ii) balancing methods are not always helpful for RE with long-tailed distributions; (iii) data augmentation complements existing baselines and can bring substantial performance gains, while self-training does not consistently improve low-resource RE. Code and datasets are available at https://github.com/zjunlp/LREBench.
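Since prompt-based tuning is the first scheme the abstract highlights, a minimal sketch may help make the idea concrete. The snippet below illustrates the general cloze-style approach, not the paper's exact method: a masked language model scores a label word ("verbalizer") for each candidate relation at the mask position of a template, and the highest-scoring relation is predicted. The model name, template, relation set, and label words are all illustrative assumptions.

```python
# Illustrative prompt-based relation extraction (zero-shot scoring variant).
# Assumptions: roberta-base as the backbone, a simple "<head> <mask> <tail>"
# template, and a hypothetical relation-to-label-word mapping.
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
model.eval()

def score_relations(sentence, head, tail, verbalizers):
    """Score each candidate relation by the masked-LM logit of its label word."""
    # Cloze template (an illustrative choice): "<sentence> <head> <mask> <tail>."
    prompt = f"{sentence} {head} {tokenizer.mask_token} {tail}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_index = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    scores = {}
    for relation, word in verbalizers.items():
        # Use the first subword of the label word as its verbalizer token.
        token_id = tokenizer(f" {word}", add_special_tokens=False).input_ids[0]
        scores[relation] = logits[token_id].item()
    return scores

# Hypothetical relation set and label words, for illustration only.
verbalizers = {"org:founded_by": "founded", "per:employee_of": "works"}
scores = score_relations(
    "Steve Jobs and Steve Wozniak created Apple in 1976.",
    "Apple", "Steve Jobs", verbalizers,
)
print(max(scores, key=scores.get), scores)
```

In a few-shot setting, the model would additionally be fine-tuned on the small labeled set rather than scored frozen as above, but the template-plus-verbalizer scoring is the core mechanism that prompt-based RE methods share.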

Authors (6)
  1. Xin Xu (187 papers)
  2. Xiang Chen (343 papers)
  3. Ningyu Zhang (148 papers)
  4. Xin Xie (81 papers)
  5. Xi Chen (1035 papers)
  6. Huajun Chen (198 papers)
Citations (10)