CRAFT: Concept Recursive Activation FacTorization for Explainability (2211.10154v2)

Published 17 Nov 2022 in cs.CV and cs.AI

Abstract: Attribution methods, which employ heatmaps to identify the most influential regions of an image that impact model decisions, have gained widespread popularity as a type of explainability method. However, recent research has exposed the limited practical value of these methods, attributed in part to their narrow focus on the most prominent regions of an image -- revealing "where" the model looks, but failing to elucidate "what" the model sees in those areas. In this work, we try to fill in this gap with CRAFT -- a novel approach to identify both "what" and "where" by generating concept-based explanations. We introduce 3 new ingredients to the automatic concept extraction literature: (i) a recursive strategy to detect and decompose concepts across layers, (ii) a novel method for a more faithful estimation of concept importance using Sobol indices, and (iii) the use of implicit differentiation to unlock Concept Attribution Maps. We conduct both human and computer vision experiments to demonstrate the benefits of the proposed approach. We show that the proposed concept importance estimation technique is more faithful to the model than previous methods. When evaluating the usefulness of the method for human experimenters on a human-centered utility benchmark, we find that our approach significantly improves on two of the three test scenarios. Our code is freely available at github.com/deel-ai/Craft.

Citations (77)

Summary

  • The paper introduces a recursive method to decompose neural network activations into interpretable concepts, effectively mitigating neural collapse.
  • The paper leverages Sobol indices to quantitatively assess concept importance, offering a robust alternative to noisy directional derivative methods.
  • The paper employs implicit differentiation to produce detailed pixel-level attribution maps, bridging global concept explanations with local visual evidence.

The CRAFT (Concept Recursive Activation FacTorization) method, detailed in (arXiv:2211.10154), introduces a novel framework for generating concept-based explanations for deep neural networks. It addresses the limitations of traditional attribution methods that primarily focus on identifying "where" the model attends in an image without elucidating "what" the model perceives. CRAFT aims to bridge this gap by identifying both "what" and "where" through the extraction and attribution of human-interpretable concepts within the neural network's activation space.

Key Components of CRAFT

CRAFT comprises three primary components that contribute to its enhanced explainability:

Recursive Concept Detection

This component employs a recursive strategy to detect and decompose concepts across different layers of the neural network. The process begins by extracting concepts from the top layers. If a concept lacks clear interpretability, the method recursively decomposes it into sub-concepts utilizing activations from earlier layers. This recursive decomposition mitigates the issue of "neural collapse," where concepts tend to be amalgamated in deeper layers, thereby reducing their interpretability.
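The factorization-then-recursion idea can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the activation arrays, layer sizes, the `extract_concepts` helper, and the top-quantile membership rule are all made up for the example, and CRAFT operates on crops of real images rather than random matrices.

```python
# Sketch of CRAFT-style recursive concept extraction via NMF.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

def extract_concepts(activations, n_concepts):
    """Factor non-negative activations A (n_patches x d) as A ~= U @ W.

    Rows of W are concept directions in activation space; U holds the
    per-patch concept coefficients."""
    nmf = NMF(n_components=n_concepts, init="nndsvda", max_iter=500)
    U = nmf.fit_transform(activations)   # (n_patches, n_concepts)
    W = nmf.components_                  # (n_concepts, d)
    return U, W

# Fake post-ReLU activations of 200 image patches at a deep layer (d=64)
# and at a shallower layer (d=32) of the same patches.
deep_acts = rng.random((200, 64))
shallow_acts = rng.random((200, 32))

U, W = extract_concepts(deep_acts, n_concepts=5)

# Recursive step: if concept k is hard to interpret, re-factor the
# *earlier-layer* activations of the patches that express it most,
# splitting the muddled concept into finer sub-concepts.
k = 0
members = U[:, k] > np.quantile(U[:, k], 0.8)   # top patches for concept k
U_sub, W_sub = extract_concepts(shallow_acts[members], n_concepts=3)
print(U.shape, W.shape, U_sub.shape)
```

In the real method the decision to recurse is driven by human interpretability of the concept's top crops, not by an automatic rule.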

Sobol Indices for Concept Importance

CRAFT leverages Sobol indices, derived from sensitivity analysis, to estimate the importance of individual concepts in relation to a model's prediction. Unlike previous methods such as TCAV, which rely on potentially noisy directional derivatives, Sobol indices offer a quantitative measure of each concept's contribution, including its interactions with other concepts, to the model's output variance. This approach provides a more faithful assessment of concept importance and reduces confirmation bias.
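A total Sobol index can be estimated by Monte Carlo, resampling one coordinate at a time and measuring how much the output changes (the Jansen estimator). The sketch below uses a toy scalar "logit" in place of the network head; the function `logit` and its weights are invented for illustration, and the real method evaluates the model on masked concept reconstructions.

```python
# Jansen Monte Carlo estimator of total Sobol indices over concept coefficients.
import numpy as np

rng = np.random.default_rng(0)

def total_sobol_indices(f, dim, n=4096):
    """Estimate the total Sobol index of each of `dim` inputs of f.

    f maps a batch (n, dim) of concept coefficients to scalar outputs;
    the total index captures each input's direct effect plus all of its
    interactions with other inputs, as a share of output variance."""
    A = rng.random((n, dim))
    B = rng.random((n, dim))
    fA = f(A)
    var = fA.var()
    S_T = np.empty(dim)
    for i in range(dim):
        ABi = A.copy()
        ABi[:, i] = B[:, i]          # resample only coordinate i
        S_T[i] = np.mean((fA - f(ABi)) ** 2) / (2 * var)
    return S_T

# Toy "logit": concept 0 matters most, concept 2 not at all,
# and concepts 0 and 1 interact.
def logit(U):
    return 3.0 * U[:, 0] + 0.5 * U[:, 1] + U[:, 0] * U[:, 1]

S = total_sobol_indices(logit, dim=3)
print(np.round(S, 2))   # concept 0 dominates; concept 2 is ~0
```

Because the estimator looks at output variance rather than a single gradient direction, it remains informative even where directional derivatives are noisy.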

Implicit Differentiation for Concept Attribution Maps

CRAFT generates concept attribution maps by backpropagating concept scores into the pixel space. It employs the implicit function theorem to enable differentiation through the Non-negative Matrix Factorization (NMF) block utilized for concept discovery. This allows for the localization of pixels associated with a particular concept within a given input image and enables the creation of concept-wise attribution maps using both white-box and black-box attribution methods.
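The implicit-differentiation step can be illustrated on the NMF coding problem with the concept bank held fixed. This is a simplified sketch under invented data, not the paper's implementation: at the non-negative least-squares optimum, stationarity on the active set makes the coefficients a linear function of the activations, so their Jacobian follows from the implicit function theorem and can be checked against finite differences.

```python
# Differentiating through the NMF coding step via the implicit function theorem.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

d, k = 8, 3
W = rng.random((k, d))              # fixed concept bank (k concepts, d dims)
u_true = np.array([0.5, 1.0, 0.3])  # strictly positive ground-truth coefficients
a = u_true @ W                      # activation vector of one patch

# Forward pass: u*(a) = argmin_{u >= 0} ||a - u W||^2.
u, _ = nnls(W.T, a)

# Implicit function theorem on the active set S = {i : u_i > 0}:
# stationarity gives W_S W_S^T u_S = W_S a, hence du_S/da = (W_S W_S^T)^{-1} W_S,
# so concept scores can be backpropagated toward the pixels despite the argmin.
S = u > 1e-9
J = np.zeros((k, d))
J[S] = np.linalg.solve(W[S] @ W[S].T, W[S])

# Sanity check against finite differences (valid while the active set is stable).
eps = 1e-6
J_fd = np.stack([(nnls(W.T, a + eps * e)[0] - u) / eps for e in np.eye(d)],
                axis=1)
print(np.allclose(J, J_fd, atol=1e-4))  # True
```

Chaining this Jacobian with the gradient of the activations with respect to the input pixels is what yields a per-concept attribution map.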

Improved Explainability

CRAFT enhances explainability through the following mechanisms:

  • Granularity of Explanations: By recursively identifying and decomposing concepts, CRAFT offers explanations at appropriate levels of granularity, making them more accessible to human understanding.
  • Concept Importance: Sobol indices ensure that the identified concepts are pertinent to the model's decision-making process, mitigating confirmation bias.
  • Comprehensive Understanding: Concept attribution maps bridge the divide between global concept explanations and local pixel-level explanations, providing a holistic understanding of "what" the model observed and "where" it observed it.

Experimental Evaluation and Results

The authors conducted a series of human and computer vision experiments to validate the efficacy of CRAFT.

Human Experiments (Utility Evaluation)

In the utility evaluation, a human-centered utility benchmark was employed to assess the practical usefulness of CRAFT in real-world scenarios. Participants were trained to predict a model's decisions on unseen images using explanations generated by CRAFT, ACE, and various attribution methods. The benchmark encompassed scenarios such as identifying bias in an AI system (Husky vs Wolf), characterizing visual strategies (Paleobotanical dataset), and understanding complex failure cases (ImageNet "Red fox" vs "Kit fox").

The utility metric quantified the accuracy of users in predicting the model's decision on novel images, normalized by the baseline accuracy of users trained without explanations. Higher utility scores indicated more useful explanations. CRAFT achieved higher utility scores than attribution methods and ACE in the Husky vs. Wolf and paleobotanical (Leaves) scenarios, demonstrating its benefits for human understanding.
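One plausible reading of this normalization is a simple ratio of accuracies; the function name and the numbers below are made up for illustration and are not the benchmark's reported values.

```python
# Hypothetical utility score: participants' accuracy at predicting the
# model's decisions, divided by the accuracy of a baseline group trained
# without explanations (one reading of "normalized by baseline accuracy").
def utility(acc_with_explanations, acc_baseline):
    return acc_with_explanations / acc_baseline

print(round(utility(0.81, 0.60), 2))  # 1.35: explanations helped
print(round(utility(0.57, 0.60), 2))  # 0.95: no better than baseline
```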

Human Experiments (Validation of Recursivity)

Psychophysics experiments were conducted to validate the recursivity ingredient and the meaningfulness of the extracted high-level concepts.

In an intruder detection experiment, users were tasked with identifying an "intruder" image crop from a different concept among a series of image crops. The experiment compared the results of intruder detection using a concept and using one of its sub-concepts. In a binary choice experiment, users were presented with an image crop belonging to both a subcluster and a parent cluster and asked which of the two clusters seemed to accommodate the image best.

The results indicated that both concepts and sub-concepts are coherent and that recursivity can improve the understanding of the generated concepts. Participants more frequently chose the sub-concept cluster, suggesting that recursivity helps form more coherent clusters.

Computer Vision Experiments

Fidelity analysis was conducted using deletion and insertion curves to evaluate the faithfulness of the identified concepts and the concept importance estimator. The metrics measured the change in logit score when adding/removing concepts considered important by Sobol indices vs TCAV scores. Sobol indices led to better estimates of important concepts compared to TCAV.
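A deletion curve of this kind can be sketched as follows. The toy logit, weights, and data are invented for the example; the real evaluation zeroes concepts in the model's activation space and reads off class logits.

```python
# Toy deletion-curve fidelity check: remove concepts in ranked order and
# track the logit. A faithful importance ranking drives the logit down
# fastest, giving a smaller area under the curve.
import numpy as np

rng = np.random.default_rng(0)

def deletion_curve(logit_fn, U, order):
    """Logit after zeroing concept coefficients one by one in `order`."""
    U = U.copy()
    scores = [logit_fn(U)]
    for i in order:
        U[:, i] = 0.0
        scores.append(logit_fn(U))
    return np.array(scores)

# Toy setting: the logit is a weighted sum of three concept coefficients,
# so concept 0 is genuinely the most important.
weights = np.array([3.0, 1.0, 0.2])
U = rng.random((50, 3))
logit_fn = lambda U: float((U @ weights).mean())

faithful = deletion_curve(logit_fn, U, order=[0, 1, 2])    # best concept first
unfaithful = deletion_curve(logit_fn, U, order=[2, 1, 0])  # worst concept first
print(faithful.sum() < unfaithful.sum())  # True: better ranking, lower AUC
```

Comparing such curves for rankings produced by Sobol indices versus TCAV scores is what the deletion/insertion analysis quantifies.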

A sanity check was performed on the method by running the concept extraction pipeline on a ResNet-50v2 model with randomized weights. The concepts extracted were drastically different from those extracted from the trained model, indicating that CRAFT passes the sanity check.