
FUSE: Multi-Faceted Set Expansion by Coherent Clustering of Skip-grams (1910.04345v3)

Published 10 Oct 2019 in cs.CL and cs.LG

Abstract: Set expansion aims to expand a small set of seed entities into a complete set of relevant entities. Most existing approaches assume the input seed set is unambiguous and completely ignore the multi-faceted semantics of seed entities. As a result, given the seed set {"Canon", "Sony", "Nikon"}, previous models return one mixed set of entities that are either Camera Brands or Japanese Companies. In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet. We propose an unsupervised framework, FUSE, which consists of three major components: (1) a facet discovery module, which identifies all semantic facets of each seed entity by extracting and clustering its skip-grams; (2) a facet fusion module, which discovers shared semantic facets of the entire seed set by an optimization formulation; and (3) an entity expansion module, which expands each semantic facet by utilizing a masked language model with pre-trained BERT models. Extensive experiments demonstrate that FUSE can accurately identify multiple semantic facets of the seed set and generate quality entities for each facet.
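
As a rough illustration of the skip-gram idea behind the facet discovery step, the sketch below extracts context patterns around each mention of an entity and then keeps the patterns that every seed shares. It is a toy under stated assumptions, not the FUSE implementation: the `__` placeholder convention, the one-token window, the function names, and the plain set intersection (a crude stand-in for the clustering and optimization-based fusion the abstract describes) are all hypothetical choices made here for illustration.

```python
from collections import defaultdict


def extract_skipgrams(tokens, entity, left=1, right=1):
    """Return context patterns around each mention of `entity`.

    The mention itself is replaced by the placeholder "__" (an assumed
    convention), keeping up to `left` tokens before and `right` after it.
    """
    grams = []
    for i, tok in enumerate(tokens):
        if tok == entity:
            lo = max(0, i - left)
            hi = min(len(tokens), i + right + 1)
            grams.append(" ".join(tokens[lo:i] + ["__"] + tokens[i + 1:hi]))
    return grams


def skipgrams_by_entity(sentences, entities):
    """Map each entity to the set of skip-grams observed in the corpus."""
    table = defaultdict(set)
    for sent in sentences:
        tokens = sent.split()  # whitespace tokenization, for simplicity
        for ent in entities:
            table[ent].update(extract_skipgrams(tokens, ent))
    return table


def shared_skipgrams(table, seeds):
    """Skip-grams seen with every seed: a simplistic proxy for a shared facet."""
    sets = [table[s] for s in seeds]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Hypothetical toy corpus built around the abstract's seed set.
    corpus = [
        "the Canon camera is popular",
        "the Sony camera is popular",
        "the Nikon camera is popular",
        "Sony headquarters in Tokyo",
        "Canon headquarters in Tokyo",
    ]
    seeds = ["Canon", "Sony", "Nikon"]
    table = skipgrams_by_entity(corpus, seeds)
    print(shared_skipgrams(table, seeds))
```

On this toy corpus only the camera-related pattern is shared by all three seeds, while the headquarters pattern appears for just two of them. An exact intersection is therefore too strict to surface a second facet, which hints at why a softer grouping of contexts, such as the clustering the abstract mentions, is needed.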

Authors (7)
  1. Wanzheng Zhu (6 papers)
  2. Hongyu Gong (44 papers)
  3. Jiaming Shen (56 papers)
  4. Chao Zhang (907 papers)
  5. Jingbo Shang (141 papers)
  6. Suma Bhat (28 papers)
  7. Jiawei Han (263 papers)
Citations (8)
