WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context (2004.15016v3)

Published 30 Apr 2020 in cs.CL

Abstract: We present WiC-TSV, a new multi-domain evaluation benchmark for Word Sense Disambiguation. More specifically, we introduce a framework for Target Sense Verification of Words in Context which grounds its uniqueness in the formulation as a binary classification task thus being independent of external sense inventories, and the coverage of various domains. This makes the dataset highly flexible for the evaluation of a diverse set of models and systems in and across domains. WiC-TSV provides three different evaluation settings, depending on the input signals provided to the model. We set baseline performance on the dataset using state-of-the-art LLMs. Experimental results show that even though these models can perform decently on the task, there remains a gap between machine and human performance, especially in out-of-domain settings. WiC-TSV data is available at https://competitions.codalab.org/competitions/23683
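The binary formulation described in the abstract can be illustrated with a minimal sketch: each instance pairs a target word in context with some description of an intended sense, and a system answers yes or no. The field names (`target`, `context`, `sense_hint`, `label`), the `accuracy` helper, and the two example instances are hypothetical and made up for illustration; they are not taken from the dataset or from the authors' code, and the abstract does not say which input signals each evaluation setting provides.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class TSVInstance:
    """One target sense verification example (hypothetical schema)."""
    target: str      # the word whose sense is being verified
    context: str     # a sentence containing the target word
    sense_hint: str  # assumed input signal describing the intended sense
    label: bool      # True if the target is used in the intended sense


def accuracy(instances: Sequence[TSVInstance],
             predict: Callable[[TSVInstance], bool]) -> float:
    """Fraction of instances whose predicted label matches the gold label."""
    if not instances:
        raise ValueError("need at least one instance")
    correct = sum(predict(inst) == inst.label for inst in instances)
    return correct / len(instances)


def always_true(inst: TSVInstance) -> bool:
    """Trivial baseline: claims every usage matches the intended sense."""
    return True


# Illustrative instances only; not drawn from WiC-TSV.
examples = [
    TSVInstance("bank", "She sat on the bank of the river.",
                "sloping land beside a body of water", True),
    TSVInstance("bank", "He deposited the cheque at the bank.",
                "sloping land beside a body of water", False),
]

score = accuracy(examples, always_true)
```

Because the prediction is a yes/no decision about one given sense, no sense inventory has to be consulted at evaluation time; any model that maps an instance to a boolean can be scored the same way.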

Authors (5)
  1. Anna Breit (3 papers)
  2. Artem Revenko (4 papers)
  3. Kiamehr Rezaee (6 papers)
  4. Mohammad Taher Pilehvar (43 papers)
  5. Jose Camacho-Collados (58 papers)
Citations (23)
