
Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI (2205.12469v1)

Published 25 May 2022 in cs.CL

Abstract: Evaluating an explanation's faithfulness is desirable for many reasons, such as trust, interpretability, and diagnosing the sources of a model's errors. In this work, which focuses on the NLI task, we introduce the methodology of Faithfulness-through-Counterfactuals, which first generates a counterfactual hypothesis based on the logical predicates expressed in the explanation, and then evaluates whether the model's prediction on the counterfactual is consistent with that expressed logic (i.e., whether the new formula is logically satisfiable). In contrast to existing approaches, this does not require any explanations for training a separate verification model. We first validate the efficacy of automatic counterfactual hypothesis generation, leveraging the few-shot priming paradigm. Next, we show that our proposed metric distinguishes between human-model agreement and disagreement on new counterfactual input. In addition, we conduct a sensitivity analysis to validate that our metric is sensitive to unfaithful explanations.
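
The check described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `CounterfactualCase`, `consistent_labels`, and `faithfulness_rate` are hypothetical, and the sketch assumes that an upstream step (not shown) has already produced the counterfactual hypothesis and the set of NLI labels that would be consistent with the logic expressed in the explanation. It only shows the final step: asking whether the model's prediction on the counterfactual falls inside that consistent set.

```python
from typing import Callable, FrozenSet, Iterable, NamedTuple


class CounterfactualCase(NamedTuple):
    """One counterfactual test item (hypothetical structure, for illustration)."""
    premise: str
    counterfactual_hypothesis: str
    # Labels consistent with the logic expressed in the explanation.
    # Assumed to come from an upstream step that is not shown here.
    consistent_labels: FrozenSet[str]


def faithfulness_rate(
    cases: Iterable[CounterfactualCase],
    predict: Callable[[str, str], str],
) -> float:
    """Fraction of cases where the model's prediction on the counterfactual
    is one of the labels consistent with the explanation's logic."""
    cases = list(cases)
    if not cases:
        raise ValueError("at least one counterfactual case is required")
    satisfied = sum(
        1
        for case in cases
        if predict(case.premise, case.counterfactual_hypothesis)
        in case.consistent_labels
    )
    return satisfied / len(cases)
```

Here `predict` stands in for any NLI model wrapped as a function from a premise and a hypothesis to a label string. A low rate would suggest that the explanations do not reflect how the model actually behaves on the counterfactual inputs, under the assumptions stated above.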

Authors (6)
  1. Suzanna Sia (7 papers)
  2. Anton Belyy (6 papers)
  3. Amjad Almahairi (19 papers)
  4. Madian Khabsa (38 papers)
  5. Luke Zettlemoyer (225 papers)
  6. Lambert Mathias (19 papers)
Citations (12)