Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base (2204.07994v2)
Abstract: Pre-trained language models (PLMs) such as BERT have made significant progress on various downstream NLP tasks. However, recent work using cloze-style probes finds that PLMs fall short in acquiring knowledge from unstructured text. To understand the internal behaviour of PLMs when retrieving knowledge, we first define knowledge-bearing (K-B) tokens and knowledge-free (K-F) tokens for unstructured text and ask professional annotators to label a set of samples manually. We then find that PLMs are more likely to give wrong predictions on K-B tokens and pay less attention to those tokens inside the self-attention module. Based on these observations, we develop two solutions that help the model learn more knowledge from unstructured text in a fully self-supervised manner. Experiments on knowledge-intensive tasks show the effectiveness of the proposed methods. To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
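The abstract only sketches the proposed solutions, the core of which is biasing the masked-language-modeling objective toward knowledge-bearing tokens during continual pre-training. The snippet below is a minimal sketch of that idea, assuming a crude capitalization/digit heuristic as a stand-in for the paper's actual K-B identification (which relies on annotation and learned signals); the function names, masking probabilities, and heuristic are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of selective ("knowledgeable salient span") masking for
# MLM-style continual pre-training. The K-B detector below is only a crude
# proxy (capitalized or digit-containing tokens); it is NOT the paper's method.
import random
import re

MASK = "[MASK]"

def is_knowledge_bearing(token: str) -> bool:
    """Heuristic proxy for K-B tokens: entity-like capitalized words and numbers/dates."""
    return bool(re.match(r"^[A-Z][a-z]+", token)) or any(ch.isdigit() for ch in token)

def salient_span_mask(tokens, kb_prob=0.8, fallback_prob=0.15, seed=None):
    """Mask K-B tokens with high probability; keep a small random-masking
    fallback so every sequence still yields some training signal."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        p = kb_prob if is_knowledge_bearing(tok) else fallback_prob
        if rng.random() < p:
            masked.append(MASK)
            labels.append(tok)    # the MLM loss predicts the original token here
        else:
            masked.append(tok)
            labels.append(None)   # position ignored by the MLM loss
    return masked, labels

if __name__ == "__main__":
    sentence = "Barack Obama was born in Honolulu in 1961 .".split()
    masked, labels = salient_span_mask(sentence, seed=0)
    print(" ".join(masked))
```

In this sketch the masking probability, not the vocabulary or model, is what changes: knowledge-bearing positions are masked far more often than the standard 15%, so the model is forced to recover factual content rather than easy function words.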
Authors: Cunxiang Wang, Fuli Luo, Yanyang Li, Runxin Xu, Fei Huang, Yue Zhang