Adapters for Enhanced Modeling of Multilingual Knowledge and Text (2210.13617v2)

Published 24 Oct 2022 in cs.CL and cs.AI

Abstract: LLMs appear to learn facts from the large text corpora they are trained on. Such facts are encoded implicitly within their many parameters, making it difficult to verify or manipulate what knowledge has been learned. LLMs have recently been extended to multilingual LLMs (MLLMs), enabling knowledge to be learned across hundreds of languages. Meanwhile, knowledge graphs contain facts in an explicit triple format, which require careful and costly curation and are only available in a few high-resource languages, restricting their research and application. To address these issues, we propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages, including low-resource ones. Specifically, we introduce a lightweight adapter set to enhance MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages. Experiments on common benchmarks show that such enhancement benefits both MLLMs and MLKGs, achieving: (1) comparable or improved performance for knowledge graph completion and entity alignment relative to baselines, especially for low-resource languages (for which knowledge graphs are unavailable); and (2) improved MLLM performance on language understanding tasks that require multilingual factual knowledge; all while maintaining performance on other general language tasks.
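
The abstract gives no implementation details, but a minimal sketch may help illustrate the kind of lightweight adapter set it describes. The snippet below (PyTorch) shows a residual bottleneck adapter, in the Houlsby style commonly used for such work, plus a toy triple-verbalization helper for feeding knowledge-graph facts to a language model. The adapter style, the hidden/bottleneck sizes, and the verbalization template are all assumptions for illustration, not the paper's actual design:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """A small residual bottleneck adapter: the kind of lightweight module
    that can be inserted into a frozen MLLM. Sizes here are illustrative,
    not the paper's values."""
    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter learns a small correction to the
        # frozen transformer layer's output rather than replacing it, so
        # general language ability is preserved while knowledge is added.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def fact_triple_to_text(head: str, relation: str, tail: str) -> str:
    """Verbalize a KG triple (head, relation, tail) into text that a
    language model can consume; this template is a hypothetical stand-in."""
    return f"{head} {relation} {tail}."

if __name__ == "__main__":
    adapter = BottleneckAdapter()
    # Simulated output of one frozen transformer layer:
    # (batch=2, seq_len=5, hidden=768)
    h = torch.randn(2, 5, 768)
    print(adapter(h).shape)  # torch.Size([2, 5, 768])
    print(fact_triple_to_text("Berlin", "capital of", "Germany"))
```

Because only the adapter parameters are trained while the MLLM stays frozen, this design keeps the added knowledge modular and cheap to train, which is consistent with the abstract's claim of maintaining performance on general language tasks.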

Authors (6)
  1. Yifan Hou (27 papers)
  2. Wenxiang Jiao (44 papers)
  3. Meizhen Liu (1 paper)
  4. Carl Allen (16 papers)
  5. Zhaopeng Tu (135 papers)
  6. Mrinmaya Sachan (124 papers)
Citations (9)