
Transformers for Low-Resource Languages: Is Féidir Linn! (2403.01985v1)

Published 4 Mar 2024 in cs.CL and cs.AI

Abstract: The Transformer model is the state-of-the-art in Machine Translation. However, in general, neural translation models often underperform on language pairs with insufficient training data. As a consequence, relatively few experiments have been carried out using this architecture on low-resource language pairs. In this study, hyperparameter optimization of Transformer models for translating the low-resource English-Irish language pair is evaluated. We demonstrate that choosing appropriate parameters leads to considerable performance improvements. Most importantly, the correct choice of subword model is shown to be the biggest driver of translation performance. SentencePiece models using both unigram and BPE approaches were appraised. Variations on model architectures included modifying the number of layers, testing various regularisation techniques and evaluating the optimal number of heads for attention. A generic 55k DGT corpus and an in-domain 88k public admin corpus were used for evaluation. A Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points when compared with a baseline RNN model. Improvements were observed across a range of metrics, including TER, indicating a substantially reduced post-editing effort for Transformer-optimized models with 16k BPE subword models. Benchmarked against Google Translate, our translation engines demonstrated significant improvements. The question of whether Transformers can be used effectively in the low-resource setting of English-Irish translation has been addressed. Is féidir linn - yes we can.
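
The abstract identifies the choice of subword model as the biggest driver of translation quality, with 16k BPE SentencePiece models performing best, and reports results in BLEU and TER. The following is a minimal sketch of how such a setup could look, assuming SentencePiece for subword training and sacrebleu for scoring; the file names (train.en-ga.txt, hyps.ga, refs.ga) and all settings other than the 16k vocabulary size are illustrative assumptions, not details taken from the paper.

    # Sketch only: trains unigram and BPE SentencePiece models at 16k vocabulary
    # and scores detokenised output with BLEU and TER. File names are hypothetical.
    import sentencepiece as spm
    import sacrebleu

    # Train both subword model types appraised in the study.
    for model_type in ("bpe", "unigram"):
        spm.SentencePieceTrainer.train(
            input="train.en-ga.txt",            # hypothetical English-Irish training text
            model_prefix=f"spm_{model_type}_16k",
            vocab_size=16000,                    # 16k subwords, as in the best-performing setup
            model_type=model_type,
            character_coverage=1.0,              # Irish uses accented Latin characters (á, é, í, ó, ú)
        )

    # Segment a sample sentence with the trained BPE model.
    sp = spm.SentencePieceProcessor(model_file="spm_bpe_16k.model")
    print(sp.encode("Is féidir linn", out_type=str))

    # Score system output against references with BLEU and TER.
    hyps = open("hyps.ga", encoding="utf-8").read().splitlines()   # hypothetical system output
    refs = open("refs.ga", encoding="utf-8").read().splitlines()   # hypothetical references
    print("BLEU:", sacrebleu.corpus_bleu(hyps, [refs]).score)
    print("TER: ", sacrebleu.corpus_ter(hyps, [refs]).score)

In practice the trained .model files would be used to encode the training and test data before feeding them to the translation toolkit, and the decoded output would be desegmented before scoring.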

Authors (3)
  1. Séamus Lankford (17 papers)
  2. Haithem Afli (13 papers)
  3. Andy Way (46 papers)
Citations (14)