Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples
Abstract: The evaluation of cross-lingual semantic search capabilities of models is often limited to existing datasets from tasks such as information retrieval and semantic textual similarity. To allow for domain-specific evaluation, we introduce Cross Lingual Semantic Discrimination (CLSD), a novel cross-lingual semantic search task that does not require a large evaluation corpus, only parallel sentences of the language pair of interest within the target domain. This task focuses on the ability of a model to cross-lingually rank the true parallel sentence higher than challenging distractors generated by an LLM. We create a case study of our introduced CLSD task for the language pair German-French in the news domain. Within this case study, we find that models that are also fine-tuned for retrieval tasks benefit from pivoting through English, while bitext mining models perform best directly cross-lingually. A fine-grained similarity analysis enabled by our distractor generation strategy indicates that different embedding models are sensitive to different types of perturbations.
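The ranking criterion described in the abstract can be illustrated with a minimal sketch. It assumes that sentences have already been embedded as fixed-size vectors, that cosine similarity is the scoring function, and that success is measured as the fraction of examples where the true parallel sentence ranks first; the abstract specifies none of these, so they are illustrative assumptions. The function names and the toy vectors below are hypothetical and do not come from any real model or from the paper's data.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def true_parallel_rank(query: Vector, candidates: List[Vector], true_index: int) -> int:
    """Rank (1 = best) of the true parallel sentence among all candidates.

    `candidates` holds the true parallel sentence plus the distractors,
    all embedded in the target language; `query` is the source-language
    sentence embedding.
    """
    sims = [cosine(query, c) for c in candidates]
    # Rank is 1 plus the number of candidates scoring strictly higher.
    return 1 + sum(s > sims[true_index] for s in sims)


def accuracy_at_1(examples: List[Tuple[Vector, List[Vector], int]]) -> float:
    """Fraction of examples where the true parallel sentence ranks first."""
    hits = sum(true_parallel_rank(q, c, t) == 1 for q, c, t in examples)
    return hits / len(examples)


# Toy data (hypothetical): in the first example the true parallel
# sentence (index 0) is closest to the query; in the second, a
# distractor (index 1) is closer than the true parallel sentence.
examples = [
    ([1.0, 0.0], [[0.9, 0.1], [0.5, 0.5], [0.0, 1.0]], 0),
    ([0.0, 1.0], [[1.0, 0.0], [0.1, 0.9]], 0),
]
score = accuracy_at_1(examples)
```

Under these assumptions, a model that is robust to the distractors would place the true parallel sentence at rank 1 in most examples. Comparing the per-distractor similarities in `sims` is one way to approximate the kind of fine-grained analysis the abstract mentions.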