Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation (2404.04212v1)
Abstract: Parameter-efficient fine-tuning (PEFT) methods are increasingly vital for adapting large-scale pre-trained language models to diverse tasks, offering a balance between adaptability and computational efficiency. They are particularly important for Low-Resource Language (LRL) Neural Machine Translation (NMT), where translation accuracy must be improved with minimal resources. However, their practical effectiveness varies significantly across languages. We conducted comprehensive empirical experiments across varying LRL domains and dataset sizes to evaluate 8 PEFT methods, covering 15 architectures in total, using the SacreBLEU score. We show that 6 of the PEFT architectures outperform the baseline on both in-domain and out-of-domain tests, and that the Houlsby+Inversion adapter performs best overall, demonstrating the effectiveness of PEFT methods.
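As a concrete illustration of the kind of setup the abstract describes, the sketch below shows one way a Houlsby-style bottleneck adapter with invertible ("inversion") layers could be attached to a pretrained multilingual seq2seq model using the AdapterHub `adapters` library and scored with SacreBLEU. This is a minimal sketch under stated assumptions, not the authors' exact configuration: the base model name, adapter name, hyperparameters, and placeholder data are illustrative choices.

```python
# Minimal, illustrative sketch (not the paper's exact setup): add a Houlsby
# adapter with invertible layers to a pretrained seq2seq model, train only
# the adapter parameters, and evaluate translations with SacreBLEU.
# The base model, adapter name, and placeholder data are assumptions.
import adapters
import sacrebleu
from adapters import DoubleSeqBnInvConfig  # Houlsby bottleneck adapter + invertible layers
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/mbart-large-50-many-to-many-mmt"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

adapters.init(model)  # enable adapter support on the vanilla Transformers model
model.add_adapter("lrl_mt", config=DoubleSeqBnInvConfig())
model.train_adapter("lrl_mt")  # freeze the base model; train only the adapter weights

# ... fine-tune on the low-resource parallel corpus (e.g., with a standard
# Seq2SeqTrainer loop), then generate translations for the test set ...

hypotheses = ["a system translation"]       # placeholder model outputs
references = [["a reference translation"]]  # one aligned list per reference set
print(sacrebleu.corpus_bleu(hypotheses, references).score)
```

The design choice mirrored here is that only the inserted adapter (and invertible embedding) parameters are updated, so the memory and compute cost of adaptation stays far below full fine-tuning of the base model.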