Objective for trading off parametric and non-parametric memory in language models

Develop a principled objective or decision criterion to determine the tradeoff between parametric memory (knowledge encoded in model weights) and non-parametric memory (e.g., long-context conditioning and retrieval augmentation) in language models, including when and how to allocate reliance on each memory type.
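As a purely illustrative sketch (not from the paper), one way such a decision criterion could look is a per-query utility comparison: score each memory source by its estimated answer quality, reward the attributability of non-parametric evidence, and penalize retrieval and long-context overhead. All names, fields, and weights below are hypothetical assumptions.

```python
# Hypothetical per-query decision rule (illustrative only, not the paper's method):
# rely on parametric recall when its expected utility exceeds that of
# retrieval-augmented / long-context conditioning after accounting for cost.

from dataclasses import dataclass

@dataclass
class MemoryEstimates:
    p_parametric_correct: float   # estimated probability of answering correctly from weights alone
    p_retrieval_correct: float    # estimated probability with retrieved / long-context evidence
    retrieval_cost: float         # normalized cost of retrieval and longer context (latency, tokens)
    attribution_weight: float     # how much the task values attributable, lossless evidence

def choose_memory(est: MemoryEstimates) -> str:
    """Return which memory type to rely on for a single query."""
    parametric_utility = est.p_parametric_correct
    nonparametric_utility = (
        est.p_retrieval_correct
        + est.attribution_weight   # bonus: non-parametric memory is attributable
        - est.retrieval_cost       # penalty: extra retrieval / context overhead
    )
    return "parametric" if parametric_utility >= nonparametric_utility else "non-parametric"

# Example: a query the model likely knows, where retrieval is cheap but attribution matters little.
print(choose_memory(MemoryEstimates(0.85, 0.90, 0.10, 0.02)))  # -> "parametric"
```

A fuller objective would presumably estimate these quantities from calibration data or model confidence rather than fixing them by hand; the open problem is precisely how to define and learn such an objective in a principled way.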

Background

The paper emphasizes the complementary roles of parametric memory, which compresses and integrates knowledge but is lossy, and non-parametric memory, which is lossless and attributable. It demonstrates scenarios where parametric memory enables complex reasoning beyond what current long-context or retrieval-augmented LLMs can achieve.

The authors explicitly identify, as an open problem, the lack of a principled framework for deciding how much to rely on parametric versus non-parametric memory for a given task.

References

"How to decide the tradeoff between parametric and non-parametric memory (or, how to define the objective for such a tradeoff) is another interesting open problem for future work."

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization (2405.15071 - Wang et al., 23 May 2024) in Section 6: Related Work