Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models (2312.04691v4)

Published 7 Dec 2023 in cs.CL and cs.AI

Abstract: Large language models (LLMs) with billions of parameters, pretrained on massive amounts of data, are now capable of performance near or better than the state of the art on a variety of downstream natural language processing tasks. Neural machine translation (NMT) is one such task to which LLMs have been applied with great success. However, little research has focused on applying LLMs to the more difficult subset of NMT called simultaneous translation (SimulMT), where translation begins before the entire source context is available to the model. In this paper, we address key challenges facing LLMs fine-tuned for SimulMT, validate classical SimulMT concepts and practices in the context of LLMs, explore adapting LLMs fine-tuned for NMT to the task of SimulMT, and introduce Simul-LLM, the first open-source fine-tuning and evaluation pipeline development framework for LLMs focused on SimulMT.
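
To make the SimulMT setting concrete, below is a minimal sketch of a classical wait-k prefix-to-prefix decoding loop, the kind of fixed read/write policy the paper revisits in the context of LLMs. The names `source_stream` and `translate_prefix` are hypothetical placeholders for an incremental source reader and a prefix-conditioned translation model; they are not part of the Simul-LLM API.

```python
# Sketch of a wait-k simultaneous translation loop (assumed interfaces, not Simul-LLM's).

def wait_k_translate(source_stream, translate_prefix, k=3):
    """Yield target tokens while source tokens are still arriving.

    source_stream: iterable of source tokens, consumed incrementally.
    translate_prefix(src, tgt): returns the next target token given the
        source prefix read so far and the target prefix emitted so far.
    k: number of source tokens to read before emitting the first target token.
    """
    src, tgt = [], []
    for token in source_stream:
        src.append(token)
        if len(src) >= k:                  # initial lag of k source tokens
            next_tok = translate_prefix(src, tgt)
            tgt.append(next_tok)
            yield next_tok                 # emit one target token per source token read
    # Source is exhausted: flush the rest of the translation from the full source.
    while True:
        next_tok = translate_prefix(src, tgt)
        if next_tok == "<eos>":
            break
        tgt.append(next_tok)
        yield next_tok
```

The key point the sketch illustrates is that the model is repeatedly asked to translate from a growing source prefix rather than the full sentence, which is what distinguishes SimulMT fine-tuning and evaluation from standard NMT.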

Authors (5)
  1. Victor Agostinelli (8 papers)
  2. Max Wild (1 paper)
  3. Matthew Raffel (7 papers)
  4. Lizhong Chen (24 papers)
  5. Kazi Ahmed Asif Fuad (2 papers)
Citations (3)