LangBridge: Multilingual Reasoning Without Multilingual Supervision (2401.10695v2)
Abstract: We introduce LangBridge, a zero-shot approach to adapt LLMs for multilingual reasoning tasks without multilingual supervision. LangBridge operates by bridging two models, each specialized in different aspects: (1) one specialized in understanding multiple languages (e.g., mT5 encoder) and (2) one specialized in reasoning (e.g., MetaMath). LangBridge connects the two models by introducing minimal trainable parameters between them. Despite utilizing only English data for training, LangBridge considerably enhances the performance of LLMs on low-resource languages across mathematical reasoning, code completion, logical reasoning, and commonsense reasoning. Our analysis suggests that the efficacy of LangBridge stems from the language-agnostic characteristics of multilingual representations. We publicly release our code and models.
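The abstract describes the bridging mechanism only at a high level. As a rough illustration of the idea, the sketch below wires a frozen multilingual encoder into a frozen English-centric reasoning LM through a single trainable projection. This is a minimal sketch under stated assumptions, not the authors' released implementation: the model IDs, the linear bridge, and the soft-prompt-style use of `inputs_embeds` are all assumptions for illustration.

```python
# Minimal sketch (NOT the paper's exact implementation) of bridging a frozen
# multilingual encoder to a frozen reasoning LM via a small trainable layer.
# Model IDs and the single linear bridge are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM, MT5EncoderModel

ENC_NAME = "google/mt5-xl"              # multilingual understanding (assumed ID)
LM_NAME = "meta-math/MetaMath-7B-V1.0"  # reasoning specialist (assumed ID)

enc_tok = AutoTokenizer.from_pretrained(ENC_NAME)
encoder = MT5EncoderModel.from_pretrained(ENC_NAME)
lm = AutoModelForCausalLM.from_pretrained(LM_NAME)

# Freeze both specialists; only the bridge below receives gradients.
for p in encoder.parameters():
    p.requires_grad = False
for p in lm.parameters():
    p.requires_grad = False

# The "minimal trainable parameters": a linear map from the encoder's hidden
# size into the LM's embedding space (a plausible choice; the paper's exact
# bridge module may differ).
bridge = nn.Linear(encoder.config.d_model, lm.config.hidden_size)

def bridged_logits(text: str) -> torch.Tensor:
    """Encode text with the multilingual encoder, project it, and feed it
    to the LM as input embeddings (soft-prompt style)."""
    enc_in = enc_tok(text, return_tensors="pt")
    hidden = encoder(**enc_in).last_hidden_state  # (1, seq, d_enc)
    soft_prompt = bridge(hidden)                  # (1, seq, d_lm)
    return lm(inputs_embeds=soft_prompt).logits   # next-token logits
```

Training would then minimize the usual next-token loss on English-only reasoning data while updating only `bridge`; the abstract's claim is that the encoder's largely language-agnostic representations let the composed model transfer zero-shot to low-resource languages.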
- Quality estimation via backtranslation at the WMT 2022 quality estimation task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 593–596, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Flamingo: a visual language model for few-shot learning. Advances in Neural Information Processing Systems, 35:23716–23736.
- Llemma: An open language model for mathematics. arXiv preprint arXiv:2310.10631.
- LLM augmented LLMs: Expanding capabilities through composition.
- Introducing our multimodal models.
- Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.
- Breaking language barriers in multilingual mathematical reasoning: Insights and observations.
- PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research, 24(240):1–113.
- UniMax: Fairer and more effective language sampling for large-scale multilingual pretraining. In The Eleventh International Conference on Learning Representations.
- Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168.
- Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8440–8451, Online. Association for Computational Linguistics.
- No language left behind: Scaling human-centered machine translation. arXiv preprint arXiv:2207.04672.
- Language-agnostic BERT sentence embedding. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 878–891, Dublin, Ireland. Association for Computational Linguistics.
- The Flores-101 evaluation benchmark for low-resource and multilingual machine translation. Transactions of the Association for Computational Linguistics, 10:522–538.
- Measuring massive multitask language understanding. Proceedings of the International Conference on Learning Representations (ICLR).
- Measuring mathematical problem solving with the MATH dataset. NeurIPS.
- Mistral 7B.
- Is ChatGPT a good translator? Yes with GPT-4 as the engine.
- Large language models struggle to learn long-tail knowledge.
- Turning English-centric LLMs into polyglots: How much multilinguality is needed?
- Okapi: Instruction-tuned large language models in multiple languages with reinforcement learning from human feedback.
- The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset. Advances in Neural Information Processing Systems, 35:31809–31826.
- Mind the gap: Assessing temporal generalization in neural language models.
- The power of scale for parameter-efficient prompt tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3045–3059, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models.
- StarCoder: May the source be with you! arXiv preprint arXiv:2305.06161.
- Tianjian Li and Kenton Murray. 2023. Why does zero-shot cross-lingual generation fail? an explanation and a solution. In Findings of the Association for Computational Linguistics: ACL 2023, pages 12461–12476, Toronto, Canada. Association for Computational Linguistics.
- OpenOrca: An open dataset of GPT-augmented FLAN reasoning traces. https://huggingface.co/Open-Orca/OpenOrca.
- On the language neutrality of pre-trained multilingual representations. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1663–1674, Online. Association for Computational Linguistics.
- Few-shot learning with multilingual language models.
- Improved baselines with visual instruction tuning.
- Visual instruction tuning.
- Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization.
- Mini-model adaptation: Efficiently extending pretrained models to new languages via aligned shallow training. In Findings of the Association for Computational Linguistics: ACL 2023, pages 5474–5490, Toronto, Canada. Association for Computational Linguistics.
- Linearly mapping from image to text space.
- Orca 2: Teaching small language models how to reason.
- Orca: Progressive learning from complex explanation traces of GPT-4.
- Second language acquisition of neural language models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 13557–13572, Toronto, Canada. Association for Computational Linguistics.
- OpenAI. 2023. GPT-4 technical report.
- Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
- Are NLP models really able to solve simple math word problems? In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2080–2094, Online. Association for Computational Linguistics.
- How multilingual is multilingual BERT? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4996–5001, Florence, Italy. Association for Computational Linguistics.
- XCOPA: A multilingual dataset for causal commonsense reasoning. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2362–2376, Online. Association for Computational Linguistics.
- Maja Popović. 2015. chrF: character n-gram F-score for automatic MT evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, pages 392–395, Lisbon, Portugal. Association for Computational Linguistics.
- Cross-lingual prompting: Improving zero-shot chain-of-thought reasoning across languages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 2695–2709.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
- Nils Reimers and Iryna Gurevych. 2020. Making monolingual sentence embeddings multilingual using knowledge distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4512–4525, Online. Association for Computational Linguistics.
- Choice of plausible alternatives: An evaluation of commonsense causal reasoning. In 2011 AAAI Spring Symposium Series.
- Code Llama: Open foundation models for code.
- BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100.
- Self-attention with relative position representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 464–468, New Orleans, Louisiana. Association for Computational Linguistics.
- Language models are multilingual chain-of-thought reasoners. In The Eleventh International Conference on Learning Representations.
- SlimPajama: A 627B token cleaned and deduplicated version of RedPajama.
- Challenging BIG-bench tasks and whether chain-of-thought can solve them. In Findings of the Association for Computational Linguistics: ACL 2023, pages 13003–13051, Toronto, Canada. Association for Computational Linguistics.
- LLaMA: Open and efficient foundation language models.
- Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
- Overcoming catastrophic forgetting in zero-shot cross-lingual generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9279–9300, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- mT5: A massively multilingual pre-trained text-to-text transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 483–498, Online. Association for Computational Linguistics.
- MetaMath: Bootstrap your own mathematical questions for large language models.
- Scaling relationship on learning mathematical reasoning with large language models.
- Extrapolating large language models to non-English by aligning languages.
- Rethinking round-trip translation for machine translation evaluation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 319–337, Toronto, Canada. Association for Computational Linguistics.