Insights into Low-Resource Neural Machine Translation for English-to-Igbo
The paper "Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo" by Ocheme Anthony Ekle and Biswarup Das explores developing and evaluating Neural Machine Translation (NMT) systems specifically applied to the English-to-Igbo language pair. Igbo, spoken by over 40 million people in Nigeria and West Africa, represents a category of languages considered low-resource due to limited digital textual data availability. This paper presents novel insights by combining recurrent neural network architectures with transfer learning techniques to address translation challenges in these languages.
The paper centers on the design of RNN-based models enhanced with attention mechanisms, built on LSTM and GRU frameworks. It compares the effectiveness of the two RNN variants and highlights the role of attention in improving translation accuracy. Notably, the LSTM model combined with dot-product attention emerged as the stronger architecture, producing smoother and more accurate translations of complex sentence structures.
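Since the paper credits dot-product attention over LSTM encoder states for its best RNN results, a minimal sketch of that scoring step is shown below. It is written in PyTorch with illustrative tensor shapes and is not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DotProductAttention(nn.Module):
    """Luong-style dot-product attention over encoder hidden states."""

    def forward(self, decoder_state, encoder_outputs):
        # decoder_state: (batch, hidden); encoder_outputs: (batch, src_len, hidden)
        scores = torch.bmm(encoder_outputs, decoder_state.unsqueeze(2)).squeeze(2)  # (batch, src_len)
        weights = F.softmax(scores, dim=1)                                           # attention distribution
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)        # (batch, hidden)
        return context, weights

# Toy usage: attend over 7 encoder time steps with a hidden size of 512 (illustrative numbers).
attn = DotProductAttention()
enc_out = torch.randn(2, 7, 512)     # stand-in for LSTM encoder outputs
dec_state = torch.randn(2, 512)      # stand-in for the current LSTM decoder state
context, weights = attn(dec_state, enc_out)
print(context.shape, weights.shape)  # torch.Size([2, 512]) torch.Size([2, 7])
```

The returned context vector is what the decoder would combine with its own state before predicting the next target token.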
Numerical Results
The experiments show that the LSTM model, paired with dot-product attention and tuned hyperparameters, achieved a BLEU score of 0.3817. This surpasses the standalone RNN architectures and closely approaches results reported on established datasets such as Tatoeba and JW300, which fall between 0.38 and 0.395 BLEU. When extended to an English-French dataset, the same model reaches a BLEU score of 0.590, demonstrating adaptability across language pairs.
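For reference, corpus-level BLEU scores of this kind are commonly computed with a tool such as sacreBLEU. The snippet below is an illustration with made-up English-to-Igbo hypothesis and reference sentences, not the paper's evaluation pipeline; note that sacreBLEU reports on a 0-100 scale, so 0.3817 corresponds to 38.17.

```python
import sacrebleu

# Hypothetical system outputs and their references (one reference per sentence).
hypotheses = ["Ndewo, kedu ka ị mere?", "Daalụ maka enyemaka gị."]
references = [["Ndewo, kedu ka ị mere?", "Daalụ maka enyemaka ahụ."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")  # 0-100 scale; divide by 100 to compare with the paper's figures
```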
Transfer Learning Integration
A notable advancement is the integration of MarianNMT-based transfer learning via the SimpleTransformers framework. This model achieves a BLEU score of 0.43 on the English-Igbo test set, a substantial improvement over the RNN baselines and existing benchmarks. The authors also report semantic accuracy exceeding 70% on the evaluated sentences, underscoring the efficacy of transfer learning for low-resource NMT. The result positions MarianNMT's pre-trained linguistic representations as an effective means of narrowing the performance gap in low-resource scenarios.
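A rough sketch of how MarianNMT fine-tuning through SimpleTransformers might look is given below. The checkpoint name (Helsinki-NLP/opus-mt-en-ig), the toy parallel data, and the hyperparameters are illustrative assumptions and do not reflect the authors' configuration.

```python
import pandas as pd
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs

# Toy English-Igbo parallel data; a real run would use the full training corpus.
train_df = pd.DataFrame({
    "input_text": ["Good morning", "Thank you"],
    "target_text": ["Ụtụtụ ọma", "Daalụ"],
})

model_args = Seq2SeqArgs()
model_args.num_train_epochs = 10          # assumed value for illustration
model_args.overwrite_output_dir = True

# Load a pre-trained MarianMT checkpoint and fine-tune it on the parallel data.
model = Seq2SeqModel(
    encoder_decoder_type="marian",
    encoder_decoder_name="Helsinki-NLP/opus-mt-en-ig",  # assumed checkpoint name
    args=model_args,
    use_cuda=False,
)
model.train_model(train_df)

print(model.predict(["How are you?"]))
```

The key idea is that the pre-trained Marian encoder-decoder already encodes useful linguistic representations, so only a comparatively small amount of English-Igbo data is needed to adapt it.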
Practical and Theoretical Implications
Practically, the paper provides a strong baseline for future English-Igbo NMT systems and suggests pathways for scaling NMT to other low-resource languages using advanced attention models and multilingual embeddings. Theoretically, it underscores the robustness of LSTM architectures for sequence modeling when combined with transfer learning, contributing to broader developments in neural translation models.
Future Developments
The research opens avenues for future work on extending vocabulary size, employing diverse attention mechanisms, exploring alternative decoding strategies, and integrating syntactic information via Graph Neural Networks (GNNs). These directions promise to improve translation accuracy by encoding linguistic structure while preserving computational efficiency.
In conclusion, the paper makes valuable progress on low-resource language translation, offering detailed architectural insights and strong numerical results, and outlining promising future research directions in neural machine translation and natural language processing.