A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs
The paper introduces a novel dataset designed to enhance the transparency and reliability of explanations generated by large language models (LLMs). This work addresses the critical challenge of interpreting LLM reasoning processes, which has become a focal point for applications in domains that demand decision transparency, such as healthcare and law.
The dataset comprises 24,204 instances, each associated with explanations generated using knowledge graphs (KGs) and graph attention networks (GATs). These explanations illuminate the reasoning behavior of LLMs such as Llama-3 and RoBERTa and include both "why-choose" and "why-not-choose" components. The inclusion of debugger-scores enables a multidimensional quality assessment, providing a structured evaluation framework that extends beyond traditional explanation methods.
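The paper's exact schema is not reproduced here, but a single instance plausibly bundles the input, the model's choice, the paired explanations, the supporting KG triples, and the debugger-scores. The following Python sketch is purely illustrative; every field name and value is an assumption rather than the dataset's published format.

```python
# Illustrative sketch of one dataset instance; all field names and values are
# assumptions, not the published schema.
example_instance = {
    "id": "inst-00001",
    "question": "Which organ pumps blood through the body?",
    "model": "Llama-3",                      # LLM whose reasoning is explained
    "prediction": "heart",                   # the option the model selected
    "why_choose": "The heart is linked to 'pumping blood' in the knowledge graph.",
    "why_not_choose": "The lung relates to respiration, not circulation.",
    "kg_triples": [                          # supporting (head, relation, tail) triples
        ("heart", "capable_of", "pumping blood"),
        ("lung", "used_for", "breathing"),
    ],
    "debugger_scores": {                     # multidimensional quality assessment
        "groundedness": 0.92,
        "faithfulness": 0.90,
        "clarity": 0.95,
    },
}
```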
Technical Approach
The authors integrate KGs and GATs to construct explanations grounded in the model's reasoning process. The framework links the LLM's decision-making to entities and relations within the KG, yielding an interpretable representation of the model's behavior. The GATs capture the salient graph features that influence predictions and help translate them into a human-understandable narrative.
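As a concrete illustration of the KG-plus-GAT idea, the sketch below uses PyTorch Geometric (an assumption; the paper's actual implementation and hyperparameters are not specified here) to score KG edges with a single GAT layer so that high-attention triples can be surfaced in a grounded explanation.

```python
# Minimal sketch, assuming PyTorch Geometric; not the paper's actual architecture.
import torch
from torch_geometric.nn import GATConv

num_nodes, feat_dim, hidden_dim = 5, 16, 8
x = torch.randn(num_nodes, feat_dim)            # KG entity embeddings
edge_index = torch.tensor([[0, 1, 2, 3],        # head entities
                           [1, 2, 3, 4]])       # tail entities

gat = GATConv(feat_dim, hidden_dim, heads=2)
out, (att_edges, alpha) = gat(x, edge_index, return_attention_weights=True)

# Average attention across heads and rank edges: the highest-scoring KG
# relations are candidates for the "why-choose" narrative.
edge_scores = alpha.mean(dim=1)
ranked = edge_scores.argsort(descending=True)
print(att_edges[:, ranked[:2]])                 # two most influential edges
```

Selecting explanation content by attention weight is one plausible reading of how such a framework ties predictions back to specific entities and relations.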
The paper distinguishes its contributions by comparing the dataset with existing explanation datasets such as CoS-E and ECQA, positioning it as the first of its kind to offer structured, grounded explanations of LLM reasoning. The dataset supports the transparency and interpretability of LLM outputs, addressing limitations of current explanation strategies, which often hallucinate and fail to reflect the model's actual reasoning process.
Results and Evaluation
The proposed dataset and framework show potential for reducing hallucinations and improving explanation groundedness. Both human and automated evaluations yielded high scores; human experts rated explanation quality at 0.93/1.00. The framework's applicability to models such as GPT-3.5, GPT-4, and various Llama configurations further extends its utility, with significant improvements in accuracy reported across these models.
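As a rough illustration of how multidimensional ratings such as the debugger-scores might be rolled up into a single quality figure like the reported 0.93/1.00, the snippet below averages per-dimension ratings across raters; the rubric dimensions and equal weighting are assumptions, not the paper's actual protocol.

```python
# Hypothetical aggregation of per-dimension expert ratings; the paper's exact
# rubric, dimensions, and weighting are not specified here.
ratings = [
    {"groundedness": 0.95, "clarity": 0.92, "faithfulness": 0.94},
    {"groundedness": 0.93, "clarity": 0.91, "faithfulness": 0.92},
]

def overall_quality(rater_scores):
    """Average each dimension across raters, then average the dimensions."""
    dims = rater_scores[0].keys()
    per_dim = {d: sum(r[d] for r in rater_scores) / len(rater_scores) for d in dims}
    return sum(per_dim.values()) / len(per_dim)

print(f"overall quality: {overall_quality(ratings):.2f}")  # a value in [0, 1]
```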
Implications and Future Directions
This dataset and accompanying framework mark a significant step toward improving the trustworthiness of LLM outputs. By grounding explanations in factual knowledge, the approach reduces model hallucinations, a critical issue in sensitive applications. The framework's applicability across diverse LLM architectures suggests potential for broader adoption in AI systems, enhancing transparency and accountability.
Future research could focus on extending the dataset with more diverse reasoning scenarios and exploring integration with evolving LLM architectures. Additionally, advancements in KG construction and dynamic adaptation of GAT parameters may further enhance the precision and reliability of model explanations. Such developments would enable more robust AI deployment in critical areas, aligning technical capabilities with ethical standards in AI explainability.