Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering (2310.13583v1)
Abstract: Despite the impressive growth in the abilities of multilingual language models such as XLM-R and mT5, it has been shown that they still face difficulties when tackling typologically distant languages, particularly in the low-resource setting. One obstacle to effective cross-lingual transfer is variability in word-order patterns. It can potentially be mitigated via source- or target-side word reordering, and numerous approaches to reordering have been proposed. However, they rely on language-specific rules, work on the level of POS tags, or only target the main clause, leaving subordinate clauses intact. To address these limitations, we present a new, powerful reordering method, defined in terms of Universal Dependencies, that is able to learn fine-grained word-order patterns conditioned on the syntactic context from a small amount of annotated data and can be applied at all levels of the syntactic tree. We conduct experiments on a diverse set of tasks and show that our method consistently outperforms strong baselines over different language pairs and model architectures. This performance advantage holds in both zero-shot and few-shot scenarios.
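The core idea described here, learning how a head's dependents should be ordered relative to the head and to each other, and applying those preferences recursively at every level of a Universal Dependencies tree, can be illustrated with a short sketch. The code below is not the authors' published algorithm but a minimal, self-contained approximation under simplifying assumptions: it estimates pairwise ordering preferences between dependency relations from a small annotated sample and then re-linearizes each subtree greedily. All class and function names (`Token`, `estimate_pairwise_order`, `reorder`) are invented for this illustration.

```python
# Minimal sketch of subtree-aware reordering over a Universal Dependencies
# parse. This is NOT the paper's exact method; it only illustrates
# (1) estimating pairwise ordering preferences between dependency relations
# from a small annotated sample and (2) applying them recursively to every
# subtree. All names here are illustrative.
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Token:
    idx: int      # 1-based position in the source sentence
    form: str
    head: int     # index of the governing token, 0 for the root
    deprel: str   # UD relation label, e.g. "nsubj", "obj"


def group_by_head(sentence):
    children = defaultdict(list)
    for tok in sentence:
        children[tok.head].append(tok)
    return children


def estimate_pairwise_order(treebank):
    """For each pair of sibling relations (a, b), count how often a precedes b.
    The head itself participates under the pseudo-relation 'HEAD'."""
    counts = defaultdict(lambda: [0, 0])   # (a, b) -> [#a-before-b, #b-before-a]
    for sent in treebank:
        for head_idx, deps in group_by_head(sent).items():
            units = sorted([(t.idx, t.deprel) for t in deps]
                           + ([(head_idx, "HEAD")] if head_idx != 0 else []))
            for i in range(len(units)):
                for j in range(i + 1, len(units)):
                    a, b = units[i][1], units[j][1]
                    counts[(a, b)][0] += 1
                    counts[(b, a)][1] += 1
    return counts


def reorder(sentence, counts):
    """Re-linearize the tree projectively, sorting each head's local subtree
    (dependents plus the head) by the learned pairwise preferences."""
    children = group_by_head(sentence)
    by_idx = {t.idx: t for t in sentence}

    def prefer_before(a, b):
        before, after = counts.get((a, b), (0, 0))
        return before > after          # ties keep the original order

    def linearize(head_idx):
        units = [(t.deprel, t) for t in children.get(head_idx, [])]
        if head_idx != 0:
            units.append(("HEAD", None))
        ordered = []                   # simple insertion sort by preference
        for rel, tok in units:
            pos = next((k for k, (r, _) in enumerate(ordered)
                        if prefer_before(rel, r)), len(ordered))
            ordered.insert(pos, (rel, tok))
        out = []
        for rel, tok in ordered:
            out += [by_idx[head_idx]] if rel == "HEAD" else linearize(tok.idx)
        return out

    return [t.form for t in linearize(0)]


if __name__ == "__main__":
    # Toy one-sentence "treebank": "John ate apples" (nsubj <- root -> obj).
    sent = [Token(1, "John", 2, "nsubj"),
            Token(2, "ate", 0, "root"),
            Token(3, "apples", 2, "obj")]
    stats = estimate_pairwise_order([sent])
    print(reorder(sent, stats))        # -> ['John', 'ate', 'apples']
```

In practice, the ordering statistics would be estimated from a UD-annotated sample of the target language, and non-transitive pairwise preferences would need more careful handling than the greedy insertion order used in this sketch.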
Authors: Ofir Arviv, Dmitry Nikolaev, Taelin Karidi, Omri Abend