
Contrastive Explanations for Model Interpretability (2103.01378v3)

Published 2 Mar 2021 in cs.CL, cs.AI, and cs.LG

Abstract: Contrastive explanations clarify why an event occurred in contrast to another. They are inherently more intuitive for humans to both produce and comprehend. We propose a methodology to produce contrastive explanations for classification models by modifying the representation to disregard non-contrastive information, and modifying model behavior to be based only on contrastive reasoning. Our method is based on projecting the model representation to a latent space that captures only the features that are useful (to the model) to differentiate two potential decisions. We demonstrate the value of contrastive explanations by analyzing two different scenarios, using both high-level abstract concept attribution and low-level input token/span attribution, on two widely used text classification tasks. Specifically, we produce explanations for answering: for which label, and against which alternative label, is some aspect of the input useful? And which aspects of the input are useful for and against particular decisions? Overall, our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
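
For intuition, the sketch below shows one way a label-contrastive projection could be realized: fit a linear separator between two candidate labels on the model's representations, then project those representations onto the direction that distinguishes the two labels. This is a minimal, illustrative rank-1 variant under assumed inputs (`reps`, `labels` are hypothetical precomputed arrays), not the authors' exact procedure.

```python
# Minimal sketch of a label-contrastive projection.
# Assumes precomputed model representations `reps` (n_examples x dim)
# and integer class labels; this is NOT the paper's exact method.
import numpy as np
from sklearn.linear_model import LogisticRegression

def contrastive_projection(reps, labels, label_a, label_b):
    """Return a rank-1 projector onto the direction separating label_a from label_b."""
    mask = np.isin(labels, [label_a, label_b])
    X, y = reps[mask], (labels[mask] == label_a).astype(int)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    w = clf.coef_[0]
    w = w / np.linalg.norm(w)      # unit-norm contrastive direction
    return np.outer(w, w)          # projection matrix P = w w^T

# Usage example with random data standing in for model representations.
rng = np.random.default_rng(0)
reps = rng.normal(size=(200, 64))
labels = rng.integers(0, 3, size=200)
P = contrastive_projection(reps, labels, label_a=0, label_b=1)
contrastive_reps = reps @ P        # keep only label-contrastive information
```

Attribution methods can then be run on `contrastive_reps` (or on model behavior restricted to it) to ask which parts of the input support one label over the specific alternative, rather than over all labels at once.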

Authors (6)
  1. Alon Jacovi (26 papers)
  2. Swabha Swayamdipta (49 papers)
  3. Shauli Ravfogel (38 papers)
  4. Yanai Elazar (44 papers)
  5. Yejin Choi (287 papers)
  6. Yoav Goldberg (142 papers)
Citations (86)
