Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
110 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

SenseFitting: Sense Level Semantic Specialization of Word Embeddings for Word Sense Disambiguation (1907.13237v1)

Published 30 Jul 2019 in cs.CL and cs.LG

Abstract: We introduce a neural network-based system of Word Sense Disambiguation (WSD) for German that is based on SenseFitting, a novel method for optimizing WSD. We outperform knowledge-based WSD methods by up to 25% F1-score and produce a new state-of-the-art on the German sense-annotated dataset WebCAGe. Our method uses three feature vectors consisting of a) sense, b) gloss, and c) relational vectors to represent target senses and to compare them with the vector centroids of sample contexts. Utilizing widely available word embeddings and lexical resources, we are able to compensate for the lower resource availability of German. SenseFitting builds upon the recently introduced semantic specialization procedure Attract-Repel, and leverages sense level semantic constraints from lexical-semantic networks (e.g. GermaNet) or online social dictionaries (e.g. Wiktionary) to produce high-quality sense embeddings from pre-trained word embeddings. We evaluate our sense embeddings with a new SimLex-999 based similarity dataset, called SimSense, that we developed for this work. We achieve results that outperform current lemma-based specialization methods for German, making them comparable to results achieved for English.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Manuel Stoeckel (4 papers)
  2. Sajawel Ahmed (2 papers)
  3. Alexander Mehler (19 papers)

Summary

We haven't generated a summary for this paper yet.