SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models (2210.11670v2)

Published 21 Oct 2022 in cs.CL

Abstract: This paper describes the Stevens Institute of Technology's submission for the WMT 2022 Shared Task: Code-mixed Machine Translation (MixMT). The task consisted of two subtasks: subtask 1, Hindi/English to Hinglish translation, and subtask 2, Hinglish to English translation. Our findings lie in the improvements made through the use of large pre-trained multilingual NMT models and in-domain datasets, as well as back-translation and ensemble techniques. The translation output is automatically evaluated against the reference translations using ROUGE-L and WER. Our system achieves 1st position on subtask 2 according to ROUGE-L, WER, and human evaluation, 1st position on subtask 1 according to WER and human evaluation, and 3rd position on subtask 1 with respect to the ROUGE-L metric.
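The automatic evaluation described above relies on ROUGE-L and WER. As a rough illustration of the latter, the sketch below computes word error rate as the word-level Levenshtein distance between hypothesis and reference, normalized by reference length. This is a minimal, assumed implementation for illustration only (whitespace tokenization, no normalization), not the MixMT organizers' official scoring script.

```python
# Illustrative WER (word error rate) computation for scoring MT output
# against a reference. Assumes simple whitespace tokenization; not the
# official MixMT scoring script.

def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost, # substitution
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


if __name__ == "__main__":
    # One deleted word out of five reference words -> WER = 0.2
    print(wer("how are you doing today", "how you doing today"))
```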

Authors (5)
  1. Abdul Rafae Khan (8 papers)
  2. Hrishikesh Kanade (1 paper)
  3. Girish Amar Budhrani (1 paper)
  4. Preet Jhanglani (1 paper)
  5. Jia Xu (87 papers)
Citations (2)
