
Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models (2402.18397v1)

Published 28 Feb 2024 in cs.CL

Abstract: Despite the predominance of English in their training data, English-centric LLMs such as GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities. This paper introduces decomposed prompting, an approach for probing the linguistic structure understanding of these LLMs in sequence labeling tasks. Instead of a single text-to-text prompt covering the whole sentence, our method generates an individual prompt for each token of the input sentence, asking for its linguistic label. We evaluate the method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, using both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in both efficacy and efficiency under zero- and few-shot settings. Further analysis reveals the influence of evaluation methods and of the use of instructions in prompts. Our multilingual investigation shows that English-centric LLMs perform better on average than multilingual models. Our study offers insights into the multilingual transferability of English-centric LLMs, contributing to the understanding of their multilingual linguistic knowledge.
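To make the idea concrete, below is a minimal sketch of token-level decomposed prompting for Universal Dependencies part-of-speech tagging. The `query_llm` callable and the prompt wording are hypothetical placeholders, not the authors' actual implementation; only the one-prompt-per-token structure follows the abstract. Because the per-token prompts are independent of each other, they can in principle be issued in parallel, unlike an iterative baseline that must label tokens one after another.

```python
from typing import Callable, List

# The 17 Universal Dependencies UPOS tags.
UPOS_TAGS = [
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN",
    "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
]

def decomposed_pos_tagging(
    tokens: List[str],
    query_llm: Callable[[str], str],  # hypothetical LLM interface: prompt in, answer out
) -> List[str]:
    """Tag a pre-tokenized sentence by issuing one prompt per token.

    Unlike a single text-to-text prompt that asks for the whole tag
    sequence at once, each token gets its own prompt, so a malformed
    answer for one token cannot corrupt the alignment of the others.
    """
    sentence = " ".join(tokens)
    tags = []
    for token in tokens:
        # Hypothetical prompt wording; the paper's exact template may differ.
        prompt = (
            f"Sentence: {sentence}\n"
            f'What is the part-of-speech tag of the word "{token}" in this sentence? '
            f"Choose one of: {', '.join(UPOS_TAGS)}.\n"
            "Tag:"
        )
        answer = query_llm(prompt).strip().upper()
        # Fall back to the catch-all tag if the answer is not a valid label.
        tags.append(answer if answer in UPOS_TAGS else "X")
    return tags

# Example usage with a stub model that always answers "NOUN":
print(decomposed_pos_tagging(["Dogs", "bark", "."], lambda p: "NOUN"))
# ['NOUN', 'NOUN', 'NOUN']
```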

Authors (7)
  1. Ercong Nie
  2. Shuzhou Yuan
  3. Bolei Ma
  4. Helmut Schmid
  5. Michael Färber
  6. Frauke Kreuter
  7. Hinrich Schütze
Citations (3)