The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics (2305.11806v1)

Published 19 May 2023 in cs.CL

Abstract: Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments, as compared to traditional metrics based on lexical overlap, such as BLEU. Yet, neural metrics are, to a great extent, "black boxes" returning a single sentence-level score without transparency about the decision-making process. In this work, we develop and compare several neural explainability methods and demonstrate their effectiveness for interpreting state-of-the-art fine-tuned neural metrics. Our study reveals that these metrics leverage token-level information that can be directly attributed to translation errors, as assessed through comparison of token-level neural saliency maps with Multidimensional Quality Metrics (MQM) annotations and with synthetically-generated critical translation errors. To ease future research, we release our code at: https://github.com/Unbabel/COMET/tree/explainable-metrics.
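The token-level saliency idea described in the abstract — attributing a sentence-level metric score back to individual tokens so that high-saliency tokens line up with translation errors — can be illustrated with a minimal gradient-times-input sketch. This is a toy stand-in, not the paper's actual method or COMET's architecture: the "metric" here is an invented one-layer scorer over random embeddings, and the attribution is the standard grad×input heuristic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy metric: score = mean over tokens of tanh(w . e_t).
# (COMET is really a fine-tuned XLM-R encoder; this is only for illustration.)
d = 8
tokens = ["the", "cat", "sat", "mat"]
w = rng.normal(size=d)
embs = [rng.normal(size=d) for _ in tokens]

def score(embs):
    """Sentence-level score of the toy metric."""
    return float(np.mean([np.tanh(w @ e) for e in embs]))

def saliency(embs):
    """Grad-times-input saliency per token, collapsed to one scalar each."""
    n = len(embs)
    out = []
    for e in embs:
        grad = (1.0 - np.tanh(w @ e) ** 2) * w / n  # analytic d(score)/d(e)
        out.append(float(abs(grad @ e)))            # |grad . input|
    return out

sal = saliency(embs)
# Tokens ranked by how strongly they drive the sentence score:
ranked = sorted(zip(tokens, sal), key=lambda pair: -pair[1])
print(ranked)
```

In the paper's setting, the analogous token scores are compared against MQM error-span annotations; here the ranking merely shows which toy tokens the scorer is most sensitive to.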

Authors (6)
  1. Ricardo Rei (34 papers)
  2. Nuno M. Guerreiro (27 papers)
  3. Marcos Treviso (17 papers)
  4. Luisa Coheur (33 papers)
  5. Alon Lavie (12 papers)
  6. André F. T. Martins (113 papers)
Citations (15)