
CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem

Published 13 Dec 2023 in cs.AI (arXiv:2312.08157v2)

Abstract: The minimal feature removal problem in post-hoc explanation aims to identify the minimal feature set (MFS). Prior studies compute the minimal feature set with a greedy algorithm; they do not explore feature interactions, and they rely on a monotonicity assumption that cannot be satisfied in general scenarios. To address these limitations, we propose a Cooperative Integrated Dynamic Refining method (CIDR) to efficiently discover minimal feature sets. Specifically, we design Cooperative Integrated Gradients (CIG) to detect interactions between features. By combining CIG with the characteristics of the minimal feature set, we transform the minimal feature removal problem into a knapsack problem. Additionally, we devise an auxiliary Minimal Feature Refinement algorithm to select the minimal feature set from numerous candidate sets. To the best of our knowledge, our work is the first to address the minimal feature removal problem in the field of natural language processing. Extensive experiments demonstrate that CIDR can trace representative minimal feature sets with improved interpretability across various models and datasets.
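
As background for the attribution step named in the abstract, the sketch below shows plain Integrated Gradients on a toy function. It is not the paper's Cooperative Integrated Gradients, whose definition the abstract does not give. The names `integrated_gradients`, `toy_model`, and `toy_grad`, the toy function itself, and the all-zero baseline are assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def integrated_gradients(
    grad_fn: Callable[[Vector], Vector],
    x: Vector,
    baseline: Vector,
    steps: int = 50,
) -> Vector:
    """Approximate standard Integrated Gradients with a midpoint Riemann sum.

    Attribution for feature i is (x_i - baseline_i) times the average
    gradient along the straight path from baseline to x.
    """
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_fn(point)
        for i, g in enumerate(grads):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


# Hypothetical toy model with one interaction term (x0 * x1).
def toy_model(x: Vector) -> float:
    return x[0] * x[1] + 2.0 * x[2]


# Analytic gradient of the toy model.
def toy_grad(x: Vector) -> Vector:
    return [x[1], x[0], 2.0]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(toy_grad, x, baseline)

# Completeness: attributions sum to f(x) - f(baseline).
print(attr, sum(attr), toy_model(x) - toy_model(baseline))
```

In this toy case, plain Integrated Gradients splits the contribution of the interaction term `x0 * x1` between the two features and reports no joint effect, which is the kind of information an interaction-aware attribution would need to recover.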

