How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models? (2309.08565v3)
Abstract: Customizing machine translation models to comply with desired attributes (e.g., formality or grammatical gender) is a well-studied topic. However, most current approaches rely on (semi-)supervised data with attribute annotations. This data scarcity is a bottleneck for democratizing such customization to a wider range of languages, particularly lower-resource ones. This gap is out of sync with recent progress in pretrained massively multilingual translation models. In response, we transfer attribute-controlling capabilities to languages without attribute-annotated data, using an NLLB-200 model as a foundation. Inspired by techniques from controllable generation, we employ a gradient-based inference-time controller to steer the pretrained model. The controller transfers well to zero-shot conditions, as it operates on pretrained multilingual representations and is attribute- rather than language-specific. Through a comprehensive comparison with finetuning-based control, we demonstrate that, despite finetuning's clear dominance in supervised settings, the gap to inference-time control closes when moving to zero-shot conditions, especially with new and distant target languages. Inference-time control also shows stronger domain robustness. We further show that our inference-time control complements finetuning. A human evaluation on a real low-resource language, Bengali, confirms our findings. Our code is available at https://github.com/dannigt/attribute-controller-transfer
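The core idea of gradient-based inference-time control can be illustrated with a minimal, self-contained sketch: a lightweight attribute classifier scores a model's hidden representation, and gradient ascent on the classifier's log-probability nudges that representation toward the desired attribute (e.g., formal vs. informal) without touching the translation model's weights. This is a simplified, pure-Python illustration of the general technique, not the paper's actual implementation; the logistic classifier, its weights, and the vector dimensions are all hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, used as a toy binary attribute classifier head."""
    return 1.0 / (1.0 + math.exp(-z))

def steer(h, w, b, target=1, steps=10, lr=0.5):
    """Nudge hidden vector h so that a logistic attribute classifier
    p(attr=1 | h) = sigmoid(w.h + b) prefers the target attribute.

    For a Bernoulli log-likelihood, the gradient of log p(target | h)
    with respect to h is (target - p) * w, so each step moves h a small
    distance along (or against) the classifier's weight vector."""
    h = list(h)
    for _ in range(steps):
        p = sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)
        g = target - p  # scalar part of the gradient
        h = [hi + lr * g * wi for wi, hi in zip(w, h)]
    return h

# Toy example: a 3-dimensional "hidden state" and hypothetical classifier weights.
h0 = [0.1, -0.2, 0.05]
w = [1.0, -1.0, 0.5]
b = 0.0
h1 = steer(h0, w, b, target=1)
```

Because the steering signal comes only from the attribute classifier, the same controller can be applied to any language the underlying multilingual model represents in the shared space, which is what makes this style of control attractive for zero-shot transfer.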
Authors: Danni Liu, Jan Niehues