
Provenance and Knowledge Updating in Large Language Models

Determine effective methods for large language models to provide verifiable provenance for generated outputs and to update their internal world knowledge efficiently and reliably, enabling transparent decision-making and overcoming limitations of static parametric memory.


Background

The survey highlights fundamental limitations of LLMs, including hallucinations and outdated knowledge due to reliance on static training data. It stresses that beyond accuracy, systems must attribute their outputs to precise sources and be able to update internal knowledge to stay current.

This problem motivates Retrieval-Augmented Generation and its multimodal extensions, but the survey emphasizes that provenance and knowledge updating remain unresolved within the core models themselves.
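One way RAG addresses provenance is by attaching retrieved document identifiers to each generated answer, so outputs can be traced back to their sources. The sketch below is a minimal, hypothetical illustration of this idea: the corpus, document IDs, and word-overlap retriever are illustrative stand-ins, not components described in the survey, and the generation step is replaced by echoing the retrieved context.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

# Toy corpus standing in for an external knowledge store (hypothetical data).
CORPUS = [
    Document("kb:001", "The Eiffel Tower is located in Paris."),
    Document("kb:002", "Mount Everest is the highest mountain on Earth."),
]

def retrieve(query: str, corpus: list[Document], k: int = 1) -> list[Document]:
    """Rank documents by naive word overlap with the query (a stand-in
    for a real dense or sparse retriever)."""
    query_words = set(query.lower().split())
    def score(doc: Document) -> int:
        return len(query_words & set(doc.text.lower().split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def answer_with_provenance(query: str) -> dict:
    """Return an answer grounded in retrieved documents, citing their IDs."""
    hits = retrieve(query, CORPUS)
    context = " ".join(d.text for d in hits)
    # A real system would condition an LLM on `context`; here we echo it.
    return {"answer": context, "sources": [d.doc_id for d in hits]}

result = answer_with_provenance("Where is the Eiffel Tower located?")
print(result["answer"], result["sources"])
```

Because the knowledge lives in the external corpus rather than in model weights, updating it is a data operation (swap or re-index documents) rather than retraining, which is the efficiency argument for RAG made in the survey.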

References

Moreover, providing provenance for their decisions and updating their world knowledge remain critical open problems.

Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation (2502.08826 - Abootorabi et al., 12 Feb 2025) in Section 1, Introduction (Background)