
Bootstrapping Relation Extractors using Syntactic Search by Examples (2102.05007v1)

Published 9 Feb 2021 in cs.CL

Abstract: The advent of neural networks in NLP brought with it substantial improvements in supervised relation extraction. However, obtaining a sufficient quantity of training data remains a key challenge. In this work we propose a process for bootstrapping training datasets that can be performed quickly by non-NLP-experts. We take advantage of search engines over syntactic graphs (such as Shlain et al. (2020)) which expose a friendly by-example syntax. We use these to obtain positive examples by searching for sentences that are syntactically similar to user-input examples. We apply this technique to relations from TACRED and DocRED and show that the resulting models are competitive with models trained on manually annotated data and on data obtained from distant supervision. The models also outperform models trained using NLG data augmentation techniques. Extending the search-based approach with the NLG method further improves the results.
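The bootstrapping step rests on a syntactic search engine, SPIKE (Shlain et al., 2020), which lets a user mark capture slots in a single example sentence and retrieve corpus sentences with the same dependency structure. As a rough, self-contained sketch of that idea (not the authors' implementation), the snippet below uses spaCy's DependencyMatcher as a stand-in for the search engine: one example of a hypothetical founded_by relation is turned into a dependency pattern, and matching sentences are harvested as positive training examples. The pattern, corpus, and relation label are all illustrative.

```python
# Illustrative sketch only: the paper uses SPIKE (Shlain et al., 2020);
# here spaCy's DependencyMatcher stands in for the syntactic search engine.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy
from spacy.matcher import DependencyMatcher

nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)

# Pattern derived from one user example, "Bill Gates founded Microsoft":
# anchor on the verb lemma "found" and capture its subject and direct object.
pattern = [
    {"RIGHT_ID": "anchor", "RIGHT_ATTRS": {"LEMMA": "found", "POS": "VERB"}},
    {"LEFT_ID": "anchor", "REL_OP": ">", "RIGHT_ID": "founder",
     "RIGHT_ATTRS": {"DEP": "nsubj"}},
    {"LEFT_ID": "anchor", "REL_OP": ">", "RIGHT_ID": "company",
     "RIGHT_ATTRS": {"DEP": "dobj"}},
]
matcher.add("FOUNDED_BY", [pattern])  # hypothetical relation label

# A toy "corpus" standing in for the large indexed corpus a real engine searches.
corpus = [
    "Larry Page and Sergey Brin founded Google while at Stanford.",
    "In 1976 Steve Wozniak founded Apple together with Steve Jobs.",
    "Analysts praised Microsoft for its quarterly results.",  # no match
]

for text in corpus:
    doc = nlp(text)
    for _, token_ids in matcher(doc):
        # token_ids follow the order of the pattern dicts above.
        anchor, founder, company = (doc[i] for i in token_ids)
        print(f"positive example: ({founder.text}, founded_by, {company.text})")
```

A real bootstrapping pass would go further: expand each matched token to its full argument span (e.g., its dependency subtree), generalize the anchor to a list of trigger lemmas, and then train a supervised relation extractor on the harvested examples, which is the role the search-based data plays in the paper's experiments.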

Authors (4)
  1. Matan Eyal (15 papers)
  2. Asaf Amrami (4 papers)
  3. Hillel Taub-Tabib (7 papers)
  4. Yoav Goldberg (142 papers)
Citations (5)
