
A General Search-based Framework for Generating Textual Counterfactual Explanations (2211.00369v2)

Published 1 Nov 2022 in cs.LG and cs.CL

Abstract: One of the prominent methods for explaining the decision of a machine-learning classifier is by a counterfactual example. Most current algorithms for generating such examples in the textual domain are based on generative LLMs. Generative models, however, are trained to minimize a specific loss function in order to fulfill certain requirements for the generated texts. Any change in the requirements may necessitate costly retraining, thus potentially limiting their applicability. In this paper, we present a general search-based framework for generating counterfactual explanations in the textual domain. Our framework is model-agnostic, domain-agnostic, anytime, and does not require retraining in order to adapt to changes in the user requirements. We model the task as a search problem in a space where the initial state is the classified text, and the goal state is a text in a given target class. Our framework includes domain-independent modification operators, but can also exploit domain-specific knowledge through specialized operators. The search algorithm attempts to find a text from the target class with minimal user-specified distance from the original classified object.
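The search formulation in the abstract (initial state = the classified text, goal = a text in the target class, successors generated by modification operators, priority given by a user-specified distance) can be sketched as a simple best-first search. This is an illustrative sketch, not the authors' exact algorithm: the classifier, operators, and distance function shown here are hypothetical stand-ins for whatever the user plugs in.

```python
import heapq


def counterfactual_search(text, classify, target_class, operators,
                          distance, max_expansions=1000):
    """Best-first search for a counterfactual example: a text that the
    classifier assigns to target_class, found by greedily expanding the
    candidate closest (by `distance`) to the original text."""
    frontier = [(0.0, text)]  # priority queue keyed by distance to original
    visited = {text}
    while frontier and max_expansions > 0:
        max_expansions -= 1
        _, current = heapq.heappop(frontier)
        if classify(current) == target_class:
            return current  # goal state: a text in the target class
        # Apply every modification operator to generate successor texts.
        for op in operators:
            for successor in op(current):
                if successor not in visited:
                    visited.add(successor)
                    heapq.heappush(
                        frontier, (distance(text, successor), successor))
    return None  # budget exhausted without reaching the target class


# --- toy usage with hypothetical components -------------------------------
def classify(t):
    """Toy sentiment classifier: 'neg' if the word 'bad' appears."""
    return "neg" if "bad" in t else "pos"


def delete_word(t):
    """Domain-independent operator: delete one word at a time."""
    words = t.split()
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]


def distance(a, b):
    """Crude edit-cost proxy; a real setup might use Levenshtein distance."""
    return abs(len(a) - len(b))


result = counterfactual_search("the movie was bad", classify, "pos",
                               [delete_word], distance)
```

Because the framework is model-agnostic, `classify` can be any black-box predictor, and changing user requirements only means swapping `distance` or the operator set, with no retraining. Bounding `max_expansions` also makes the sketch anytime-friendly: the caller can stop early and keep the best candidate found so far.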

