
Pre-training Language Models with Deterministic Factual Knowledge (2210.11165v1)

Published 20 Oct 2022 in cs.CL

Abstract: Previous works show that Pre-trained Language Models (PLMs) can capture factual knowledge. However, some analyses reveal that PLMs fail to do so robustly, e.g., they are sensitive to changes in prompts when extracting factual knowledge. To mitigate this issue, we propose to let PLMs learn the deterministic relationship between the remaining context and the masked content. The deterministic relationship ensures that the masked factual content can be deterministically inferred from the existing clues in the context. This provides more stable patterns for PLMs to capture factual knowledge than random masking. Two pre-training tasks are further introduced to motivate PLMs to rely on the deterministic relationship when filling masks. Specifically, we use an external Knowledge Base (KB) to identify deterministic relationships and continuously pre-train PLMs with the proposed methods. The factual knowledge probing experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge. Further experiments on question-answering datasets show that learning a deterministic relationship with the proposed methods can also help other knowledge-intensive tasks.
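The idea of using a KB to decide which spans are safe to mask can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that a masked object counts as "deterministic" when the KB lists exactly one object for a given (subject, relation) pair. The toy triples, the relation names, and the helpers `build_index`, `is_deterministic`, and `mask_if_deterministic` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy KB of (subject, relation, object) triples.
TOY_TRIPLES = [
    ("France", "capital", "Paris"),
    ("France", "shares_border_with", "Spain"),
    ("France", "shares_border_with", "Germany"),
]


def build_index(triples):
    """Map each (subject, relation) pair to the set of objects the KB lists for it."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
    return index


def is_deterministic(index, subj, rel):
    """Assumed criterion: exactly one object is known for (subject, relation)."""
    return len(index.get((subj, rel), ())) == 1


def mask_if_deterministic(sentence, subj, rel, obj, index, mask_token="[MASK]"):
    """Mask the object only when the remaining context determines it under the
    assumed criterion; otherwise return None to signal 'do not mask this span'."""
    if subj in sentence and obj in sentence and is_deterministic(index, subj, rel):
        return sentence.replace(obj, mask_token, 1)
    return None


INDEX = build_index(TOY_TRIPLES)
masked = mask_if_deterministic(
    "The capital of France is Paris.", "France", "capital", "Paris", INDEX
)
skipped = mask_if_deterministic(
    "France borders Spain.", "France", "shares_border_with", "Spain", INDEX
)
```

In this sketch the first sentence is masked because the toy KB gives France a single capital, while the second is skipped because the border relation has several valid objects, so the context alone would not pin down the answer.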

Authors (8)
  1. Shaobo Li (24 papers)
  2. Xiaoguang Li (71 papers)
  3. Lifeng Shang (90 papers)
  4. Chengjie Sun (9 papers)
  5. Bingquan Liu (9 papers)
  6. Zhenzhou Ji (6 papers)
  7. Xin Jiang (242 papers)
  8. Qun Liu (230 papers)
Citations (8)