An Analysis of Explainability in Human-Agent Systems
The paper "Explainability in Human-Agent Systems" addresses the multifaceted nature of explainability in systems where humans interact with artificial agents. The authors organize the topic around a taxonomy of key questions: Why, Who, What, When, and How. Each question is analyzed in turn to build a framework for understanding and developing explainable systems, particularly those based on machine learning.
The paper first sets out precise definitions vital to the discourse on explainability, including interpretability, transparency, explicitness, and faithfulness. These definitions ground a nuanced discussion of explainability and distinguish it from related concepts: explainability is framed as the degree to which a human user can comprehend the logic underpinning an agent's decision-making.
The authors argue that understanding why a system requires explainability is pivotal, identifying three levels of need: not helpful, beneficial, and critical. Explainability is critical in systems that depend on transparent decision-making to build user trust or to comply with legal standards. This categorization underscores that the kind of explainability needed is tightly bound to the user's interaction context and the system's objectives.
The work then turns to the intended recipients of explanations, the 'Who', identifying three audiences: regular users, expert users, and external entities. The paper argues that explanations should be tailored to each audience, since the audiences differ in the complexity and presentation of explanation they require.
In exploring the 'What', the authors analyze the methods available for generating interpretability. They stress that explainability can come either from directly transparent machine learning algorithms or from post-hoc analysis tools that make an opaque model comprehensible. This section details how such interpretations differ in explicitness and faithfulness, and notes the trade-off often observed between a model's accuracy and the degree of explainability it offers. The six strategies presented span a spectrum of techniques, each suited to different interpretability requirements.
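To make that contrast concrete, the sketch below trains a directly transparent decision tree alongside a black-box random forest that is explained post hoc via permutation importance. The dataset, models, and the choice of permutation importance are illustrative assumptions, not techniques prescribed by the paper.

```python
# Contrast a directly transparent model with a post-hoc explanation of a
# black-box model. All specific choices here are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Directly transparent model: the learned rules can be printed and read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(export_text(tree, feature_names=list(X.columns)))

# Black-box model explained post hoc: permutation importance approximates
# which inputs drive predictions, at some cost in faithfulness.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
result = permutation_importance(forest, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda pair: pair[1], reverse=True)[:5]:
    print(f"{name}: {score:.3f}")
```

The decision tree exposes its own logic, whereas the forest's behaviour is only approximated by the importance scores, which is exactly the explicitness and faithfulness gap the section discusses.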
The 'When' aspect dissects the timeline of explanation delivery: before, during, or after a decision is made. Here, timing dovetails with the need for interpretability, varying based on the operational demands and constraints of different human-agent systems.
Evaluation, the 'How', is a formidable challenge addressed through proposals for measuring the effectiveness of explanations. The authors suggest a framework covering the performance of the algorithm itself, the interpretability of the generated model, and ultimately the user's understanding. This segment ambitiously seeks to quantify these dimensions, while acknowledging limitations, particularly the lack of standardized measures of explicitness and faithfulness.
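As one illustration of how faithfulness might be operationalized, the sketch below fits an interpretable surrogate to a black-box model's own predictions and reports their agreement (fidelity). This metric and the model choices are assumptions made for demonstration, not the paper's proposed standard.

```python
# One possible faithfulness proxy: how well an interpretable surrogate
# reproduces a black-box model's predictions (illustrative assumption).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# The surrogate is trained to mimic the black box, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

fidelity = np.mean(surrogate.predict(X) == black_box.predict(X))
print(f"Surrogate fidelity to the black box: {fidelity:.2%}")
```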
A significant contribution of the paper is its evaluative framework, which introduces a utility function that balances these diverse parameters within a human-agent system. The proposed model emphasizes users' performance and their acceptance of machine outputs as mediated by interpretability, encouraging a system-level evaluation rather than a singular focus on algorithmic outputs.
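The sketch below gives one schematic reading of such a utility function, combining task performance, interpretability, user acceptance, and explanation cost into a single score. The terms, weights, and linear form are illustrative assumptions, not the authors' exact formulation.

```python
# Schematic utility function for a human-agent system. The specific terms,
# weights, and linear form are assumptions made purely for illustration.
def system_utility(task_performance: float,
                   interpretability: float,
                   user_acceptance: float,
                   explanation_cost: float,
                   weights=(0.4, 0.2, 0.3, 0.1)) -> float:
    """Combine normalized scores in [0, 1] into a single utility value."""
    w_perf, w_interp, w_accept, w_cost = weights
    return (w_perf * task_performance
            + w_interp * interpretability
            + w_accept * user_acceptance
            - w_cost * explanation_cost)

# A slightly less accurate but more interpretable agent can score higher
# once user acceptance is taken into account.
opaque = system_utility(0.95, 0.20, 0.60, 0.05)
transparent = system_utility(0.90, 0.80, 0.85, 0.15)
print(f"opaque agent: {opaque:.2f}, transparent agent: {transparent:.2f}")
```

The point of such a formulation is the one the paper makes: the system is judged as a whole, so raw algorithmic accuracy can be outweighed by interpretability and user acceptance.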
While the paper concentrates on systematizing explainability, it also catalyzes discussion of several unresolved issues. Questions such as how to define universal measures of interpretability and how to establish standardized benchmark datasets remain open and invite further exploration.
In conclusion, the paper enriches interpretability research with a detailed and systematic framework that integrates these considerations into the design and evaluation of Human-Agent Systems. It calls for better-grounded evaluation techniques and points toward systems that offer context-appropriate explainability, ultimately building trust and improving the user experience of interacting with AI.