Conjecture: Hidden States as a Major Knowledge Storage Module
Establish whether pre-filling the hidden states of sequence models, and using them as an integrated knowledge storage module, provides a generalizable mechanism for encoding and utilizing knowledge during inference.
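As a concrete illustration only, and not a mechanism the post itself specifies, the sketch below treats one particular form of hidden state, a causal LM's key/value cache, as a pre-filled knowledge store: a knowledge passage is encoded once, its cached states are kept, and a later query conditions on those states rather than re-reading the source text. The model name "gpt2", the example passage, and the reuse of `past_key_values` through Hugging Face `transformers` are all illustrative assumptions.

```python
# Minimal sketch: hidden states (here, the KV cache) as a pre-filled knowledge store.
# Illustrative only; model, passage, and caching pattern are assumptions, not the post's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM exposes the same cache interface
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# 1) Pre-fill: encode the knowledge passage once and keep its key/value cache.
knowledge = "The Eiffel Tower is located in Paris and was completed in 1889."
knowledge_ids = tokenizer(knowledge, return_tensors="pt").input_ids
with torch.no_grad():
    prefill = model(knowledge_ids, use_cache=True)
knowledge_cache = prefill.past_key_values  # the "knowledge storage module" in this sketch

# 2) Use: condition a query on the cached states instead of re-encoding the passage.
query = " Question: Where is the Eiffel Tower? Answer:"
query_ids = tokenizer(query, return_tensors="pt").input_ids
full_ids = torch.cat([knowledge_ids, query_ids], dim=-1)  # cache covers the knowledge prefix
output = model.generate(
    full_ids,
    past_key_values=knowledge_cache,  # reuse the pre-filled hidden states
    max_new_tokens=10,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0][full_ids.shape[-1]:], skip_special_tokens=True))
```

In practice the cache is mutated as generation appends to it, so serving many queries against one pre-filled store would require copying or re-materializing the cached states per query; the sketch only shows the single-query flow.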
To summarize the viewpoints discussed in this blog post, we propose two conjectures regarding potential improvements to the LLM knowledge paradigm: (1) in-context learning demonstrates certain advantages over the traditional LLM knowledge-modeling paradigm, and could potentially be scaled up to the pre-training corpus level to give models stronger and more robust knowledge capabilities; (2) the hidden states of sequence models may offer a highly generalizable mechanism for encoding and utilizing knowledge, and could potentially serve as a major knowledge storage module.