Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries (2008.09036v2)
Abstract: Pretrained LLMs have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose single-token name is found in common LM vocabularies. Furthermore, the main benefit of this paradigm, namely querying the KB using a variety of natural language paraphrases, is underexplored so far. Here, we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to represent millions of entities and present a detailed case study on paraphrased querying of world knowledge in LMs, thereby providing a proof-of-concept that LLMs can indeed serve as knowledge bases.
- Benjamin Heinzerling
- Kentaro Inui