Assessing a Model’s Knowledge Remains Open
Develop reliable methodologies for assessing what factual knowledge a large language model possesses, and for evaluating the quality of those methods despite the absence of ground truth about the model's internal knowledge.
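One family of approaches probes the model by repeatedly querying it and categorizing each fact by how consistently the model answers correctly under greedy versus temperature sampling (the cited paper analyzes knowledge categories along these lines). The sketch below is a minimal, hypothetical illustration of that categorization step only; the function names and the boolean correctness records are assumptions, and a real probe would have to generate those records by prompting an actual model and grading its answers.

```python
def categorize(greedy_correct, sampled_correct):
    """Assign a knowledge category from repeated correctness checks.

    greedy_correct: correctness of greedy (temperature 0) answers
        across several prompt variants (list of bools).
    sampled_correct: correctness of temperature-sampled answers (list of bools).
    This is an illustrative sketch, not the paper's exact procedure.
    """
    p_greedy = sum(greedy_correct) / len(greedy_correct)
    p_sampled = sum(sampled_correct) / len(sampled_correct)
    if p_greedy == 1.0:
        return "HighlyKnown"   # always correct under greedy decoding
    if p_greedy > 0.0:
        return "MaybeKnown"    # sometimes correct under greedy decoding
    if p_sampled > 0.0:
        return "WeaklyKnown"   # correct only when sampling
    return "Unknown"           # never correct in any trial

# Synthetic correctness records (no real model involved):
print(categorize([True, True, True], [True, False, True]))  # → HighlyKnown
print(categorize([False, False], [True, False]))            # → WeaklyKnown
```

Note that even this simple scheme exposes the core difficulty the problem statement names: the categories are defined purely by the model's observable behavior, so there is no independent ground truth against which to validate them.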
References
Assessing a model's knowledge remains an open problem, particularly since evaluating the quality of such methods is challenging due to the lack of ground truth about what the model truly knows.
— Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
(arXiv:2405.05904, Gekhman et al., 9 May 2024), Section 6 (Knowledge Categories Analysis)