
Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in Fine-tuning LLMs for Simultaneous Translation (2405.10443v4)

Published 16 May 2024 in cs.CL and cs.LG

Abstract: Large language models (LLMs) have achieved state-of-the-art performance in various language processing tasks, motivating their adoption in simultaneous translation. Current fine-tuning methods for adapting LLMs to simultaneous translation focus on prompting optimization strategies that use either data augmentation or prompt structure modifications. However, these methods suffer from several issues, such as unnecessarily expanded training sets, computational inefficiency from dumping the key and value cache, increased prompt sizes, and restriction to a single decision policy. To eliminate these issues, we propose SimulMask, a new paradigm for fine-tuning LLMs for simultaneous translation. It uses a novel attention-mask approach that models simultaneous translation during fine-tuning by masking attention according to a desired decision policy. Applying SimulMask to a Falcon LLM on the IWSLT 2017 dataset, we observe a significant translation quality improvement over state-of-the-art prompting optimization strategies on five language pairs while reducing computational cost.
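The paper's exact masking scheme is defined in the full text; the core idea of policy-aware attention masking can nonetheless be illustrated with a minimal sketch. The function below is a hypothetical, simplified construction (the function name, the [source | target] sequence layout, and the choice of a wait-k decision policy are assumptions for illustration, not details taken from the paper): it builds a boolean attention mask over a concatenated source-and-target sequence so that each target token attends only to the source prefix a wait-k reader would have seen, while target tokens remain causally masked among themselves.

```python
import torch

def simulmask_wait_k(src_len: int, tgt_len: int, k: int) -> torch.Tensor:
    """Sketch of a wait-k attention mask over a [source | target] layout.

    Target token j attends causally to earlier target tokens but only to
    the first min(k + j, src_len) source tokens, i.e. the source prefix
    that would have been read so far. True means attention is allowed.
    """
    total = src_len + tgt_len
    # Start from a standard causal (lower-triangular) mask.
    mask = torch.ones(total, total).tril().bool()
    for j in range(tgt_len):
        visible = min(k + j, src_len)       # source prefix read so far
        row = src_len + j                   # row of target token j
        mask[row, visible:src_len] = False  # hide unread source tokens
    return mask

# Example: 5 source tokens, 4 target tokens, wait-3 policy.
print(simulmask_wait_k(5, 4, 3).int())
```

Applying such a mask during fine-tuning lets the model learn under the same information constraints it will face at inference, without duplicating training prefixes or recomputing the key-value cache for each read/write decision.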

Authors (3)
  1. Matthew Raffel
  2. Victor Agostinelli
  3. Lizhong Chen