BELL: Benchmarking the Explainability of Large Language Models (2504.18572v1)
Published 22 Apr 2025 in cs.AI and cs.CL
Abstract: LLMs have demonstrated remarkable capabilities in natural language processing, yet their decision-making processes often lack transparency. This opaqueness raises significant concerns regarding trust, bias, and model performance. Addressing these issues requires understanding and evaluating the interpretability of LLMs. This paper introduces BELL (Benchmarking the Explainability of Large Language Models), a standardised benchmarking technique for evaluating the explainability of LLMs.