
KILM: Knowledge Injection into Encoder-Decoder Language Models (2302.09170v1)

Published 17 Feb 2023 in cs.CL and cs.AI

Abstract: Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs via a generative knowledge infilling objective through continued pre-training. This is done without architectural modifications to the PLMs or adding additional parameters. Experimental results over a suite of knowledge-intensive tasks spanning numerous datasets show that KILM enables models to retain more knowledge and hallucinate less, while preserving their original performance on general NLU and NLG tasks. KILM also demonstrates improved zero-shot performance on tasks such as entity disambiguation, outperforming state-of-the-art models with 30x more parameters.
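
To make the knowledge infilling objective concrete, below is a minimal sketch of how such a continued-pre-training example might be constructed for an encoder-decoder PLM like BART using Hugging Face Transformers. The special tokens (`<ent>`, `</ent>`, `<ent_desc>`, `</ent_desc>`), the exact infilling format, and the example sentence are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of a KILM-style knowledge infilling step, assuming a BART
# backbone and hypothetical entity/description marker tokens.
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-base"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Hypothetical special tokens delimiting an entity mention and its
# injected description (assumed format, for illustration only).
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<ent>", "</ent>", "<ent_desc>", "</ent_desc>"]}
)
model.resize_token_embeddings(len(tokenizer))

# Encoder input: the sentence with the entity marked and its
# description masked out (using BART's <mask> token).
source = (
    "The <ent> Eiffel Tower </ent> <ent_desc> <mask> </ent_desc> "
    "attracts millions of visitors each year."
)
# Decoder target: the same sequence with the entity description
# filled in, so the model learns to generate the knowledge.
target = (
    "The <ent> Eiffel Tower </ent> <ent_desc> a wrought-iron lattice "
    "tower in Paris, France </ent_desc> attracts millions of visitors "
    "each year."
)

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# Standard seq2seq cross-entropy loss; continued pre-training would
# repeat this over a large entity-linked corpus (e.g., Wikipedia).
loss = model(**inputs, labels=labels).loss
loss.backward()
print(f"infilling loss: {loss.item():.3f}")
```

Because the objective reuses the model's ordinary sequence-to-sequence loss, no new parameters or architectural changes are needed, which is consistent with the abstract's claim that original NLU/NLG performance is preserved.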

Authors (6)
  1. Yan Xu (258 papers)
  2. Mahdi Namazifar (19 papers)
  3. Devamanyu Hazarika (33 papers)
  4. Aishwarya Padmakumar (17 papers)
  5. Yang Liu (2253 papers)
  6. Dilek Hakkani-Tür (164 papers)
Citations (23)