
A Unified Framework with Novel Metrics for Evaluating the Effectiveness of XAI Techniques in LLMs (2503.05050v2)

Published 6 Mar 2025 in cs.CL, cs.AI, and cs.LG

Abstract: The increasing complexity of LLMs presents significant challenges to their transparency and interpretability, necessitating the use of eXplainable AI (XAI) techniques to enhance trustworthiness and usability. This study introduces a comprehensive evaluation framework with four novel metrics for assessing the effectiveness of five XAI techniques across five LLMs and two downstream tasks. We apply this framework to evaluate five XAI techniques, namely LIME, SHAP, Integrated Gradients, Layer-wise Relevance Propagation (LRP), and Attention Mechanism Visualization (AMV), on the IMDB Movie Reviews and Tweet Sentiment Extraction datasets. The evaluation focuses on four key metrics: Human-reasoning Agreement (HA), Robustness, Consistency, and Contrastivity. Our results show that LIME consistently achieves high scores across multiple LLMs and evaluation metrics, while AMV demonstrates superior Robustness and near-perfect Consistency. LRP excels in Contrastivity, particularly with more complex models. Our findings provide valuable insights into the strengths and limitations of different XAI methods, offering guidance for developing and selecting appropriate XAI techniques for LLMs.

Authors (6)
  1. Melkamu Abay Mersha
  2. Mesay Gemeda Yigezu
  3. Hassan Shakil
  4. Sanghyun Byun
  5. Jugal Kalita
  6. Ali K. AlShami
