Navigating the Structured What-If Spaces: Counterfactual Generation via Structured Diffusion (2312.13616v1)

Published 21 Dec 2023 in cs.LG and cs.AI

Abstract: Generating counterfactual explanations is one of the most effective approaches for uncovering the inner workings of black-box neural network models and building user trust. While remarkable strides have been made in generative modeling using diffusion models in domains like vision, their utility for generating counterfactual explanations in structured modalities remains unexplored. In this paper, we introduce the Structured Counterfactual Diffuser (SCD), the first plug-and-play framework leveraging diffusion for generating counterfactual explanations in structured data. SCD learns the underlying data distribution via a diffusion model, which is then guided at test time to generate counterfactuals for any arbitrary black-box model, input, and desired prediction. Our experiments show that our counterfactuals not only exhibit high plausibility compared to the existing state of the art but also significantly better proximity and diversity.
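The test-time guidance idea described in the abstract can be sketched in a few lines: run reverse diffusion from a noised version of the input, and at each step nudge the sample toward the region where the black-box model outputs the desired prediction. The sketch below is illustrative only and is not the authors' implementation; the toy denoiser, the stand-in classifier `black_box_predict`, and the surrogate guidance gradient `guidance_grad` are all hypothetical stand-ins for components that SCD learns or treats as given.

```python
import numpy as np

def black_box_predict(x):
    # Hypothetical stand-in for the black-box model being explained:
    # predicts class 1 iff the first feature is positive.
    return int(x[0] > 0.0)

def denoise_fn(x_t, t):
    # Toy denoiser: shrinks the sample toward the (assumed) data mean at
    # the origin, more strongly at high noise levels t. In SCD this role
    # is played by a diffusion model trained on the structured data.
    return (1.0 - 0.5 * t) * x_t

def guidance_grad(x, target_class=1):
    # Surrogate guidance signal: push the first feature toward the target
    # side of the stand-in classifier's decision boundary. A real system
    # would derive this from the black-box model's output.
    g = np.zeros_like(x)
    if black_box_predict(x) != target_class:
        g[0] = 1.0
    return g

def generate_counterfactual(x_input, steps=50, scale=0.5, seed=0):
    """Guided reverse diffusion from a noised copy of the input."""
    rng = np.random.default_rng(seed)
    x = x_input + rng.normal(size=x_input.shape)  # diffuse the input
    for step in range(steps, 0, -1):
        t = step / steps
        x = denoise_fn(x, t)                      # move toward the data manifold
        x = x + scale * guidance_grad(x)          # steer toward target prediction
        x = x + 0.05 * t * rng.normal(size=x.shape)  # reverse-process noise
    # One final noiseless guided correction so the sample ends on the
    # target side of the stand-in decision boundary.
    x = x + scale * guidance_grad(x)
    return x
```

Starting from an input predicted as class 0 (first feature negative), the loop gradually moves the sample across the boundary while the denoiser keeps it near the data distribution; this tension between guidance and denoising is what gives counterfactuals both validity and plausibility.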

Authors (2)
  1. Nishtha Madaan (12 papers)
  2. Srikanta Bedathur (41 papers)
