Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation (2401.07456v2)
Abstract: Federated learning (FL) is a promising distributed machine learning paradigm that enables multiple clients to collaboratively train a global model. In this paper, we focus on a practical federated multilingual learning setup where clients with their own language-specific data aim to collaboratively construct a high-quality neural machine translation (NMT) model. However, communication constraints in practical network systems present challenges for exchanging large-scale NMT engines between FL parties. We propose a meta-learning-based adaptive parameter selection methodology, MetaSend, that improves the communication efficiency of model transmissions from clients during FL-based multilingual NMT training. Based on the deviation of each client's tensors between FL rounds, our approach learns a dynamic threshold for filtering parameters prior to transmission without compromising NMT model quality. Through experiments on two NMT datasets with different language distributions, we demonstrate that MetaSend achieves substantial improvements in translation quality over baselines under a limited communication budget.
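The abstract describes filtering a client's parameters before transmission using a threshold derived from how much each tensor has changed since the previous FL round. The following is a minimal PyTorch-style sketch of that filtering idea only; the function name, the relative-norm deviation measure, and the fixed threshold argument are illustrative assumptions, whereas MetaSend itself learns the threshold via meta-learning.

```python
import torch

def filter_update_by_deviation(prev_state, curr_state, threshold):
    """Keep only tensors whose change since the previous round exceeds a threshold.

    Illustrative sketch: the threshold is passed in directly here, while the
    paper's MetaSend method learns it adaptively. Tensors that are filtered out
    are simply not transmitted, so the server reuses its last copy of them.
    """
    update = {}
    for name, curr in curr_state.items():
        prev = prev_state[name]
        # Relative deviation of this tensor between the two FL rounds.
        deviation = torch.norm(curr - prev) / (torch.norm(prev) + 1e-12)
        if deviation >= threshold:  # only sufficiently changed tensors are sent
            update[name] = curr
    return update

# Hypothetical usage inside one client's round:
#   prev_state = state_dict sent to the server in the previous round
#   curr_state = model.state_dict() after local training
#   to_send = filter_update_by_deviation(prev_state, curr_state, threshold=0.01)
```

In this sketch, raising the threshold sends fewer tensors (saving communication) at the risk of dropping useful updates; the point of the paper's meta-learned threshold is to navigate that trade-off adaptively per round.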