
Automatic Context Pattern Generation for Entity Set Expansion (2207.08087v4)

Published 17 Jul 2022 in cs.CL and cs.IR

Abstract: Entity Set Expansion (ESE) is a valuable task that aims to find entities of the target semantic class described by given seed entities. Various NLP and Information Retrieval (IR) downstream applications have benefited from ESE due to its ability to discover knowledge. Although existing corpus-based ESE methods have achieved great progress, they still rely on corpora annotated with high-quality entity information, because most of them need to obtain context patterns from the position of the entity in a sentence. Therefore, the quality of the given corpora and their entity annotation has become the bottleneck that limits the performance of such methods. To overcome this dilemma and free ESE models from their dependence on entity annotation, our work explores a new ESE paradigm, namely corpus-independent ESE. Specifically, we devise a context pattern generation module that utilizes autoregressive LLMs (e.g., GPT-2) to automatically generate high-quality context patterns for entities. In addition, we propose GAPA, a novel ESE framework that leverages the aforementioned GenerAted PAtterns to expand target entities. Extensive experiments and detailed analyses on three widely used datasets demonstrate the effectiveness of our method. All the code for our experiments is available at https://github.com/geekjuruo/GAPA.
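
The general idea of expanding a seed set from generated context patterns can be illustrated with a toy sketch. This is not GAPA's actual algorithm, which the abstract does not specify. The canned generator below is a hypothetical stand-in for an autoregressive language model such as GPT-2, and both the `[MASK]` substitution and the overlap-based scoring are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical stand-in for an autoregressive LM: returns canned
# "generated" context patterns for a handful of entities.
_CANNED_PATTERNS = {
    "France": ["France is a country in Europe", "the capital of France"],
    "Japan": ["Japan is a country in Asia", "the capital of Japan"],
    "Germany": ["Germany is a country in Europe", "the capital of Germany"],
    "Toyota": ["Toyota is a company in Japan", "the CEO of Toyota"],
}


def toy_pattern_generator(entity: str) -> List[str]:
    """Return canned context patterns for an entity (empty if unknown)."""
    return _CANNED_PATTERNS.get(entity, [])


def mask_entity(pattern: str, entity: str) -> str:
    """Replace the entity mention so patterns are comparable across entities."""
    return pattern.replace(entity, "[MASK]")


def expand(
    seeds: List[str],
    candidates: List[str],
    generate: Callable[[str], List[str]],
    top_k: int,
) -> List[str]:
    """Rank candidates by how many masked patterns they share with the seeds."""
    # Count how often each masked pattern occurs across the seed entities.
    seed_patterns: Counter = Counter()
    for seed in seeds:
        for pattern in generate(seed):
            seed_patterns[mask_entity(pattern, seed)] += 1

    # Score each non-seed candidate by the seed counts of its own patterns.
    scores = {}
    for cand in candidates:
        if cand in seeds:
            continue
        cand_patterns = {mask_entity(p, cand) for p in generate(cand)}
        scores[cand] = sum(seed_patterns[p] for p in cand_patterns)

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]


if __name__ == "__main__":
    print(expand(["France", "Japan"], ["Germany", "Toyota"],
                 toy_pattern_generator, top_k=1))
```

In this toy setup no annotated corpus is consulted: every pattern comes from the generator, which is the sense in which the approach is corpus-independent. Swapping the canned generator for a real language model would change only the `generate` argument.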

Authors (9)
  1. Yinghui Li (65 papers)
  2. Shulin Huang (12 papers)
  3. Xinwei Zhang (37 papers)
  4. Qingyu Zhou (28 papers)
  5. Yangning Li (49 papers)
  6. Ruiyang Liu (15 papers)
  7. Yunbo Cao (43 papers)
  8. Hai-Tao Zheng (94 papers)
  9. Ying Shen (76 papers)
Citations (19)