Conjecture: Hidden States as a Major Knowledge Storage Module
Establish whether pre-filling the hidden states of sequence models, and using them as an integrated knowledge storage module, provides a generalizable mechanism for encoding and utilizing knowledge during inference.
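As a concrete illustration only, and not a mechanism the post itself specifies, the sketch below treats one particular form of hidden state, a causal LM's key/value cache, as a pre-filled knowledge store: a knowledge passage is encoded once, its cached states are kept, and a later query conditions on those states rather than re-reading the source text. The model name "gpt2", the example passage, and the reuse of `past_key_values` through Hugging Face `transformers` are all illustrative assumptions.

```python
# Minimal sketch: hidden states (here, the KV cache) as a pre-filled knowledge store.
# Illustrative only; model, passage, and caching pattern are assumptions, not the post's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM exposes the same cache interface
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# 1) Pre-fill: encode the knowledge passage once and keep its key/value cache.
knowledge = "The Eiffel Tower is located in Paris and was completed in 1889."
knowledge_ids = tokenizer(knowledge, return_tensors="pt").input_ids
with torch.no_grad():
    prefill = model(knowledge_ids, use_cache=True)
knowledge_cache = prefill.past_key_values  # the "knowledge storage module" in this sketch

# 2) Use: condition a query on the cached states instead of re-encoding the passage.
query = " Question: Where is the Eiffel Tower? Answer:"
query_ids = tokenizer(query, return_tensors="pt").input_ids
full_ids = torch.cat([knowledge_ids, query_ids], dim=-1)  # cache covers the knowledge prefix
output = model.generate(
    full_ids,
    past_key_values=knowledge_cache,  # reuse the pre-filled hidden states
    max_new_tokens=10,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0][full_ids.shape[-1]:], skip_special_tokens=True))
```

In practice the cache is mutated as generation appends to it, so serving many queries against one pre-filled store would require copying or re-materializing the cached states per query; the sketch only shows the single-query flow.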
To summarize the viewpoints discussed in this blog post, we propose two conjectures regarding potential improvements to the LLM knowledge paradigm: (1) in-context learning demonstrates certain advantages over the traditional LLM knowledge-modeling paradigm, and could potentially be scaled up to the pre-training corpus level to give models stronger and more robust knowledge capabilities; (2) the hidden states of sequence models may offer a highly generalizable mechanism for encoding and utilizing knowledge, and could potentially serve as a major knowledge storage module.