Word Sense Induction with Neural biLM and Symmetric Patterns (1808.08518v2)
Published 26 Aug 2018 in cs.CL
Abstract: An established method for Word Sense Induction (WSI) uses a language model to predict probable substitutes for target words, and induces senses by clustering the resulting substitute vectors. We replace the ngram-based language model (LM) with a recurrent one. Beyond being more accurate, the recurrent LM can be queried in a creative way, using what we call dynamic symmetric patterns. The combination of the RNN-LM and the dynamic symmetric patterns results in strong substitute vectors for WSI, allowing us to surpass the current state-of-the-art on the SemEval 2013 WSI shared task by a large margin.
- Asaf Amrami (4 papers)
- Yoav Goldberg (142 papers)
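
The substitute-clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the substitute lists are made-up toy data standing in for language model predictions, the function name `induce_senses` and the `threshold` parameter are hypothetical, and a simple greedy threshold clustering is swapped in for whatever clustering method the paper uses. Each instance of the target word is represented by a bag of its predicted substitutes, and instances whose bags are similar are grouped into one induced sense.

```python
from collections import Counter
from math import sqrt

# Hypothetical substitutes an LM might propose for "bank" in four sentences.
# Made-up toy data for illustration, not output of the paper's model.
SUBSTITUTES = [
    ["lender", "institution", "firm"],
    ["institution", "lender", "company"],
    ["shore", "river", "edge"],
    ["edge", "shore", "side"],
]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def induce_senses(substitute_lists, threshold=0.3):
    """Greedy clustering of substitute vectors (an assumed, simplified scheme).

    Each instance joins the most similar existing cluster if the similarity
    to that cluster's summed vector reaches `threshold`; otherwise it starts
    a new cluster. Returns a list of clusters of instance indices.
    """
    vectors = [Counter(subs) for subs in substitute_lists]
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            centroid = Counter()
            for j in cluster:
                centroid.update(vectors[j])
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


if __name__ == "__main__":
    print(induce_senses(SUBSTITUTES))
```

On this toy input the first two instances (financial substitutes) and the last two (riverside substitutes) end up in separate clusters, which plays the role of two induced senses.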