
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models

Published 16 May 2023 in cs.CL and cs.AI (arXiv:2305.09144v2)

Abstract: Memory is one of the most essential cognitive functions, serving as a repository of world knowledge and episodes of activities. In recent years, large-scale pre-trained language models (LLMs) have shown remarkable memorizing ability. In contrast, vanilla neural networks without pre-training have long been observed to suffer from catastrophic forgetting. To investigate this retentive-forgetful contradiction and understand the memory mechanism of LLMs, we conduct thorough experiments controlling the target knowledge types, the learning strategies, and the learning schedules. We find that: 1) Vanilla LLMs are forgetful; 2) Pre-training leads to retentive LLMs; 3) Knowledge relevance and diversification significantly influence memory formation. These conclusions are useful for understanding the abilities of pre-trained LLMs and shed light on designing and evaluating new learning and inference algorithms for LLMs.
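
The probing protocol the abstract describes, teaching a model one set of facts and then another while measuring what survives, can be illustrated with a toy sketch. The code below is not the paper's implementation; the model, the fact encoding, and all hyperparameters are illustrative assumptions. It demonstrates the catastrophic forgetting that the abstract attributes to vanilla (non-pre-trained) networks.

```python
# Toy sketch (assumed, not the paper's code) of a sequential-learning probe:
# memorize fact set A, then fact set B, and check how much of A is retained.
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, DIM = 200, 64

class TinyMemorizer(nn.Module):
    """Toy stand-in for a language model: maps a 'subject' token to an 'object' token."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.out = nn.Linear(DIM, VOCAB)

    def forward(self, x):
        return self.out(self.emb(x))

def make_facts(n, offset):
    # Each "fact" is a (subject, object) token pair; sets are kept disjoint.
    subjects = torch.arange(offset, offset + n)
    objects = torch.randint(0, VOCAB, (n,))
    return subjects, objects

def train(model, facts, steps=300, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    x, y = facts
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()

def recall(model, facts):
    # Fraction of facts for which the model's top prediction is the memorized object.
    x, y = facts
    with torch.no_grad():
        return (model(x).argmax(-1) == y).float().mean().item()

model = TinyMemorizer()
facts_a = make_facts(50, offset=0)    # first knowledge set
facts_b = make_facts(50, offset=50)   # second, disjoint knowledge set

train(model, facts_a)
print(f"recall of A after learning A: {recall(model, facts_a):.2f}")
train(model, facts_b)                 # sequential learning, no rehearsal
print(f"recall of A after learning B: {recall(model, facts_a):.2f}")
```

Running this typically shows near-perfect recall of set A after the first stage and a marked drop after the second, since the shared output layer is overwritten by set B. The paper's finding is that pre-training mitigates exactly this kind of drop, which a probe like this can quantify across knowledge types and learning schedules.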

Citations (12)
