Model size threshold where compressed representations surpass raw text
Determine whether there is a parameter-count threshold for large language models beyond which compressed continuous document representations (memory-token embeddings) used in retrieval-augmented generation outperform raw text context in supporting model understanding and answer generation.
References
An open question for future work is whether there exists a model size threshold beyond which compressed representations surpass raw text in supporting understanding and generation.
— CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
(2511.18659 - He et al., 24 Nov 2025) in Limitations — Model Size paragraph