Beyond MLE: Investigating SEARNN for Low-Resourced Neural Machine Translation

Published 20 May 2024 in cs.CL and cs.LG (arXiv:2405.11819v1)

Abstract: Structured prediction tasks, like machine translation, involve learning functions that map structured inputs to structured outputs. Recurrent Neural Networks (RNNs) have historically been a popular choice for such tasks, including in NLP applications. However, training RNNs with Maximum Likelihood Estimation (MLE) has known limitations, including exposure bias and a mismatch between the training objective and test-time evaluation metrics. SEARNN, based on the learning-to-search (L2S) framework, has been proposed as an alternative to MLE for RNN training. This project explored the potential of SEARNN to improve machine translation for low-resourced African languages, a task made challenging by limited training data and the morphological complexity of the languages. Through experiments on the English-to-Igbo, French-to-Ewe, and French-to-Ghomálá' translation directions, the project evaluated the efficacy of SEARNN over MLE in addressing the challenges posed by these languages. With an average BLEU score improvement of 5.4% over the MLE objective, we showed that SEARNN is a viable algorithm for effectively training RNNs on machine translation for low-resourced languages.
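The contrast the abstract draws between MLE and SEARNN can be sketched in a toy example. The sketch below is an illustration only, not the paper's implementation: it uses a tiny made-up vocabulary, reference roll-outs, Hamming distance as a stand-in for the BLEU-derived cost, and the "log-loss" variant of SEARNN's cost-sensitive objective. All names and sizes (`VOCAB`, `rollout_cost`, `searnn_local_loss`) are hypothetical.

```python
import numpy as np

VOCAB = 5                              # toy vocabulary size (hypothetical)
reference = np.array([1, 3, 2, 4])     # toy reference target sequence

def rollout_cost(prefix, token, t):
    """Cost of emitting `token` at step t: complete the sequence with the
    reference suffix (a 'reference roll-out' policy) and score the result
    with Hamming distance to the reference (a stand-in for 1 - BLEU)."""
    seq = np.concatenate([prefix, [token], reference[t + 1:]])
    return int(np.sum(seq != reference))

def searnn_local_loss(logits, prefix, t):
    """SEARNN-style local loss at step t: roll out every candidate token to
    build a per-token cost vector, then apply a log-loss against the
    minimum-cost token (the 'log-loss' variant of the cost-sensitive
    objective). MLE, by contrast, always targets the reference token."""
    costs = np.array([rollout_cost(prefix, a, t) for a in range(VOCAB)])
    target = int(np.argmin(costs))
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    return -log_probs[target], costs

def mle_loss(logits, t):
    """MLE (teacher-forcing) baseline: cross-entropy against the reference
    token, independent of any task metric."""
    log_probs = logits - np.log(np.sum(np.exp(logits)))
    return -log_probs[reference[t]]
```

With reference roll-outs and a per-token metric like Hamming distance, the min-cost token coincides with the reference token, so the two losses agree; the objectives diverge once roll-ins come from the learned model and the cost is a sequence-level metric such as BLEU, which is where SEARNN addresses exposure bias.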

Authors (1)
References (13)
  1. A few thousand translations go a long way! Leveraging pre-trained models for African news translation. NAACL, 2022. URL https://arxiv.org/abs/2205.02022v2.
  2. An actor-critic algorithm for sequence prediction. International Conference on Learning Representations, 2016. URL https://arxiv.org/abs/1607.07086v3.
  3. Learning to search better than your teacher. International Conference on Machine Learning, 2015. URL https://arxiv.org/abs/1502.02206v2.
  4. Search-based structured prediction. Mach. Learn., 75(3):297–325, 2009. doi: 10.1007/s10994-009-5106-x. URL https://doi.org/10.1007/s10994-009-5106-x.
  5. MMTAfrica: Multilingual machine translation for African languages. Conference on Machine Translation, 2022. doi: 10.48550/arXiv.2204.04306. URL https://arxiv.org/abs/2204.04306v1.
  6. Igbo-English machine translation: An evaluation benchmark. arXiv, abs/2004.00648, 2020.
  7. SEARNN: Training RNNs with global-local losses. International Conference on Learning Representations, 2017. URL https://arxiv.org/abs/1706.04499v3.
  8. A two-stage RNN-based deep reinforcement learning approach for solving the parallel machine scheduling problem with due dates and family setups. Journal of Intelligent Manufacturing, 35:1107–1140, 2023. doi: 10.1007/s10845-023-02094-4. URL https://link.springer.com/article/10.1007/s10845-023-02094-4/fulltext.html.
  9. Participatory research for low-resourced machine translation: A case study in African languages. In Cohn, T., He, Y., and Liu, Y. (eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 2144–2160, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.findings-emnlp.195. URL https://aclanthology.org/2020.findings-emnlp.195.
  10. Noss, P. A. Review of Gabriel M. Nissim, Le bamiléké-ghomálá: parler de Bandjoun, Cameroun. Langues et Civilisations à Tradition Orale. Paris: CNRS, 1981, 313 pp. Africa, 54(4):106–106, 1984. doi: 10.2307/1160416.
  11. Deep learning for computer vision tasks: A review. arXiv preprint arXiv:1804.03928, 2018. URL https://arxiv.org/abs/1804.03928v1.
  12. Sequence to sequence learning with neural networks. Neural Information Processing Systems, 2014. URL https://arxiv.org/abs/1409.3215v3.
  13. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017. URL https://arxiv.org/abs/1706.03762v7.
