
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning (2004.14224v1)

Published 29 Apr 2020 in cs.CL and cs.LG

Abstract: In this work, we aim at equipping pre-trained language models with structured knowledge. We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs. Building upon entity-level masked language models, our first contribution is an entity masking scheme that exploits relational knowledge underlying the text. This is fulfilled by using a linked knowledge graph to select informative entities and then masking their mentions. In addition, we use knowledge graphs to obtain distractors for the masked entities, and propose a novel distractor-suppressed ranking objective which is optimized jointly with masked language modeling. In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training, to inject language models with structured knowledge via learning from raw text. It is more efficient than retrieval-based methods that perform entity linking and integration during fine-tuning and inference, and generalizes more effectively than methods that directly learn from concatenated graph triples. Experiments show that our proposed model achieves improved performance on five benchmark datasets, including question answering and knowledge base completion tasks.
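The two ideas in the abstract — masking mentions of KG-informative entities, and pairing each masked entity with KG-derived distractors for the ranking objective — can be sketched as follows. This is an illustrative toy sketch only, not the authors' implementation; all names (`mask_informative_entities`, the `kg_links`/`kg_neighbors` dictionaries, the entity-mention format) are hypothetical, and "informative" is approximated here as "has at least one linked KG relation".

```python
# Illustrative sketch of KG-guided entity masking with distractor sampling.
# NOT the paper's code: data structures and the informativeness heuristic
# are simplified assumptions for exposition.
import random

MASK = "[MASK]"

def mask_informative_entities(tokens, entity_mentions, kg_links, kg_neighbors, k=1):
    """Mask mentions of entities the linked KG flags as informative
    (approximated: entities with at least one KG relation), and pair
    each masked entity with KG-derived distractors for a ranking loss."""
    informative = [m for m in entity_mentions if kg_links.get(m["entity"])]
    chosen = random.sample(informative, min(k, len(informative)))
    masked = list(tokens)
    pairs = []  # (gold entity, distractor entities) for the ranking objective
    for m in chosen:
        # Replace every token of the mention span with the mask symbol.
        for i in range(m["start"], m["end"]):
            masked[i] = MASK
        # Distractors come from the KG neighborhood of the gold entity,
        # standing in for the paper's distractor-suppressed ranking setup.
        distractors = kg_neighbors.get(m["entity"], [])[:3]
        pairs.append((m["entity"], distractors))
    return masked, pairs

# Toy usage: only "Einstein" (Q937) has a linked relation, so it is masked.
tokens = ["Einstein", "was", "born", "in", "Ulm"]
mentions = [{"entity": "Q937", "start": 0, "end": 1},
            {"entity": "Q3012", "start": 4, "end": 5}]
links = {"Q937": ["born_in"]}
neighbors = {"Q937": ["Q1055", "Q64"]}
masked, pairs = mask_informative_entities(tokens, mentions, links, neighbors)
```

In the paper's full setup the masked-language-modeling loss and the distractor-suppressed ranking loss over these (gold, distractor) pairs are optimized jointly during pre-training, so the KG is never needed at fine-tuning or inference time.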

Authors (6)
  1. Tao Shen (87 papers)
  2. Yi Mao (78 papers)
  3. Pengcheng He (60 papers)
  4. Guodong Long (115 papers)
  5. Adam Trischler (50 papers)
  6. Weizhu Chen (128 papers)
Citations (54)