RoBLEURT Submission for the WMT2021 Metrics Task (2204.13352v1)
Published 28 Apr 2022 in cs.CL
Abstract: In this paper, we present our submission to the WMT2021 Shared Metrics Task: RoBLEURT (Robustly Optimizing the training of BLEURT). After investigating recent advances in trainable metrics, we identify several aspects of vital importance for obtaining a well-performing metric model: 1) jointly leveraging the advantages of the source-included model and the reference-only model, 2) continuously pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy. Experimental results show that our model reaches state-of-the-art correlations with the WMT2020 human annotations on 8 out of 10 to-English language pairs.
- Yu Wan (18 papers)
- Dayiheng Liu (75 papers)
- Baosong Yang (57 papers)
- Tianchi Bi (4 papers)
- Haibo Zhang (25 papers)
- Boxing Chen (67 papers)
- Weihua Luo (63 papers)
- Derek F. Wong (69 papers)
- Lidia S. Chao (41 papers)