Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models (2304.13803v1)

Published 26 Apr 2023 in cs.CL

Abstract: Pretrained language models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating word sense in a zero-shot setting. To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context. We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance. Building on C-WLT, we introduce a zero-shot approach for WSD, tested on 18 languages from the XL-WSD dataset. Our method outperforms fully supervised baselines on recall for many evaluation languages without additional training or finetuning. This study presents a first step towards understanding how to best leverage the cross-lingual knowledge inside PLMs for robust zero-shot reasoning in any language.
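The core idea of C-WLT is that translating a word *in context* forces the model to commit to a sense: ambiguous words often have sense-specific translations in other languages. The sketch below illustrates this prompting pattern with Hugging Face `transformers`; the prompt wording, model choice (`google/flan-t5-base`), and target language are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative sketch of Contextual Word-Level Translation (C-WLT):
# prompt a pretrained model to translate a target word in its sentence
# context, so the chosen translation signals the word's sense.
# Model and prompt template are hypothetical stand-ins, not the
# paper's exact setup.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

def c_wlt(sentence: str, word: str, tgt_lang: str = "French") -> str:
    """Ask the model to translate `word` as used in `sentence`."""
    prompt = (
        f'Translate the word "{word}" in the following sentence into '
        f'{tgt_lang}. Answer with the translated word only.\n\n'
        f'Sentence: {sentence}'
    )
    return generator(prompt, max_new_tokens=8)[0]["generated_text"].strip()

# Sense-specific translations disambiguate the English word "bank":
print(c_wlt("She deposited the check at the bank.", "bank"))  # e.g. "banque"
print(c_wlt("They had a picnic on the river bank.", "bank"))  # e.g. "rive"
```

For the zero-shot WSD step, the predicted translation would then be matched against the target-language lexicalizations of each candidate sense in a multilingual sense inventory such as BabelNet (which underlies XL-WSD), returning the senses whose translations match; no WSD-specific training or finetuning is involved.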

Authors (3)
  1. Haoqiang Kang (7 papers)
  2. Terra Blevins (20 papers)
  3. Luke Zettlemoyer (225 papers)
Citations (1)