On the Evaluation of Machine Translation for Terminology Consistency (2106.11891v2)
Published 22 Jun 2021 in cs.CL
Abstract: As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios, and particularly in cases of domain adaptation, one expects the MT output to adhere to the constraints provided by a terminology. In this work, we propose metrics to measure the consistency of MT output with regard to a domain terminology. We perform studies on the COVID-19 domain across 5 languages, and also perform terminology-targeted human evaluation. We open-source the code for computing all proposed metrics: https://github.com/mahfuzibnalam/terminology_evaluation
- Md Mahfuz ibn Alam (9 papers)
- Antonios Anastasopoulos (111 papers)
- Laurent Besacier (76 papers)
- James Cross (22 papers)
- Matthias Gallé (31 papers)
- Philipp Koehn (60 papers)
- Vassilina Nikoulina (28 papers)