Polyglot or Not? Measuring Multilingual Encyclopedic Knowledge in Foundation Models (2305.13675v2)

Published 23 May 2023 in cs.CL

Abstract: In this work, we assess the ability of foundation models to recall encyclopedic knowledge across a wide range of linguistic contexts. To support this, we: 1) produce a 20-language dataset that contains 303k factual associations paired with counterfactuals, 2) evaluate 5 models in a multilingual test, and 3) benchmark a diverse set of 24 models in an English-only test. Meta's LLaMA achieves the highest scores in both multilingual and English-only evaluations. Yet, an analysis of LLaMA's errors reveals significant limitations in its ability to recall facts in languages other than English, plus difficulties related to the location and gender of fact subjects. Overall, our findings suggest that today's foundation models are far from polyglots.
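
The evaluation described in the abstract pairs each factual association with a counterfactual and asks whether a model prefers the true completion. Below is a minimal sketch of that kind of contrastive scoring, assuming a HuggingFace causal language model; the model name (gpt2 as a stand-in), the helper completion_log_prob, and the France/Paris example are illustrative assumptions, not the paper's exact models, prompts, or pipeline.

```python
# Minimal sketch of contrastive fact vs. counterfactual scoring (illustrative only;
# not the paper's released code). A model is counted as "knowing" a fact when it
# assigns higher probability to the true completion than to the counterfactual one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # stand-in; the paper benchmarks larger foundation models
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def completion_log_prob(stem: str, completion: str) -> float:
    """Sum of log-probabilities the model assigns to `completion` given `stem`.

    Assumes the stem's tokenization is a prefix of the full string's
    tokenization, which holds for this simple example.
    """
    stem_len = tokenizer(stem, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(stem + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Next-token log-probabilities for positions 1..n-1 of the full sequence.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    # Keep only the positions where the completion's tokens are being predicted.
    comp_lp = log_probs[stem_len - 1:]
    comp_tgt = targets[stem_len - 1:]
    return comp_lp.gather(1, comp_tgt.unsqueeze(1)).sum().item()

# Illustrative fact/counterfactual pair (not drawn from the released dataset).
stem = "The capital of France is"
fact, counterfactual = " Paris", " Rome"
knows_fact = completion_log_prob(stem, fact) > completion_log_prob(stem, counterfactual)
print(knows_fact)  # True if the model ranks the factual object above the counterfactual
```

Ranking by summed log-probability rather than by free generation keeps such a test cheap and applicable to any autoregressive model; length-normalising the two scores is a common variant when the true and counterfactual completions differ in token count.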

Authors (3)
  1. Tim Schott (1 paper)
  2. Daniel Furman (1 paper)
  3. Shreshta Bhat (1 paper)