Objective for trading off parametric and non-parametric memory in language models

Develop a principled objective or decision criterion for trading off parametric memory (knowledge encoded in model weights) against non-parametric memory (e.g., long-context conditioning and retrieval augmentation) in language models, specifying when and how much to rely on each memory type. A toy sketch of what such a criterion might look like follows.
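
One way to make the question concrete is to cast it as per-query utility maximization over memory sources. Below is a minimal sketch assuming a toy accuracy-minus-cost objective; every name and number here (MemoryEstimates, parametric_recall, lam, the example values) is an illustrative assumption, not a formulation proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class MemoryEstimates:
    """Per-query estimates; all fields are hypothetical inputs."""
    parametric_recall: float    # P(correct answer from weights alone)
    retrieval_precision: float  # P(retrieved context contains the answer)
    reader_accuracy: float      # P(correct | answer present in context)
    context_tokens: int         # extra tokens consumed by retrieval


def expected_utility(est: MemoryEstimates, lam: float = 1e-4) -> tuple[float, float]:
    """Toy objective: utility = P(correct) - lam * token cost."""
    u_parametric = est.parametric_recall  # no extra context cost
    u_retrieval = (est.retrieval_precision * est.reader_accuracy
                   - lam * est.context_tokens)
    return u_parametric, u_retrieval


def choose_memory(est: MemoryEstimates, lam: float = 1e-4) -> str:
    """Pick whichever memory type has higher expected utility."""
    u_p, u_r = expected_utility(est, lam)
    return "parametric" if u_p >= u_r else "non-parametric"


if __name__ == "__main__":
    # Multi-hop query where the weights have integrated the needed facts:
    print(choose_memory(MemoryEstimates(0.85, 0.70, 0.90, 2000)))  # parametric
    # Rare-entity lookup likely lost to lossy compression in the weights:
    print(choose_memory(MemoryEstimates(0.20, 0.95, 0.90, 2000)))  # non-parametric
```

Under this framing, lam stands in for the missing objective's cost weight; the open problem is, in part, what the objective should actually optimize and how quantities like parametric_recall could be estimated reliably per query.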

Background

The paper emphasizes the complementary roles of parametric memory, which compresses and integrates knowledge but is lossy, and non-parametric memory, which is lossless and attributable. It demonstrates scenarios where parametric memory enables complex reasoning beyond what current long-context or retrieval-augmented LLMs can achieve.

The authors explicitly flag as an open problem the lack of a principled framework for deciding, for a given task, how much to rely on parametric versus non-parametric memory.

References

How to decide the tradeoff between parametric and non-parametric memory (or, how to define the objective for such a tradeoff) is another interesting open problem for future work.

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization (Wang et al., arXiv:2405.15071, 23 May 2024), Section 6: Related Work.