
Distant Supervision for Relation Extraction with Linear Attenuation Simulation and Non-IID Relevance Embedding (1812.09516v1)

Published 22 Dec 2018 in cs.CL

Abstract: Distant supervision for relation extraction is an efficient way to reduce labor costs and has been widely used to find novel relational facts in large corpora; the task can be cast as a multi-instance multi-label problem. However, existing distant supervision methods struggle to select the important words in a sentence and to extract the valid sentences in a bag. To address these problems, we propose a novel approach. First, we propose a linear attenuation simulation that reflects the importance of each word in a sentence according to its distance from the entities. Second, we propose a non-independent and identically distributed (non-IID) relevance embedding that captures the relevance of sentences within a bag. Our method not only captures the complex information that words carry about hidden relations, but also expresses the mutual information of instances in the bag. Extensive experiments on a benchmark dataset validate the effectiveness of the proposed method.
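
The idea of weighting words by their distance to the entities can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the function name `linear_attenuation_weights`, the use of the nearer of the two entities, and the `max_dist` cutoff are hypothetical and are not taken from the paper.

```python
def linear_attenuation_weights(num_tokens, head_pos, tail_pos, max_dist=None):
    """Return one weight per token that decays linearly with the distance
    to the nearer entity. Illustrative only; not the paper's formula."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    if max_dist is None:
        # Assumption: by default the weight reaches zero only at a distance
        # equal to the sentence length.
        max_dist = num_tokens
    weights = []
    for i in range(num_tokens):
        # Distance (in tokens) to the closer of the two entity positions.
        d = min(abs(i - head_pos), abs(i - tail_pos))
        # Linear decay from 1.0 at the entity, clipped at 0.0.
        weights.append(max(0.0, 1.0 - d / max_dist))
    return weights


# Usage: a 5-token sentence with entities at positions 0 and 4.
weights = linear_attenuation_weights(5, head_pos=0, tail_pos=4, max_dist=4)
print(weights)
```

In a model, such weights would typically scale the word representations before sentence encoding, so that words near an entity contribute more; how the paper combines them with the encoder is not stated in the abstract.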

Authors (5)
  1. Changsen Yuan (2 papers)
  2. Heyan Huang (107 papers)
  3. Chong Feng (11 papers)
  4. Xiao Liu (402 papers)
  5. Xiaochi Wei (12 papers)
Citations (33)