X-SHIELD: Regularization for eXplainable Artificial Intelligence

Published 3 Apr 2024 in cs.AI | (2404.02611v3)

Abstract: As artificial intelligence systems become integral across domains, the demand for explainability grows, giving rise to the field of eXplainable Artificial Intelligence (XAI). Existing efforts focus primarily on generating and evaluating explanations for black-box models, leaving a critical gap: these evaluations are rarely used to improve the models themselves. The explanation process can also feed back into training, so XAI may be used to improve model performance while boosting explainability. Under this view, this paper introduces Transformation - Selective Hidden Input Evaluation for Learning Dynamics (T-SHIELD), a family of regularizations designed to improve model quality by hiding input features, forcing the model to generalize without them. Within this family, we propose XAI - SHIELD (X-SHIELD), a regularization for explainable artificial intelligence that uses explanations to select the features to hide. In contrast to conventional approaches, X-SHIELD integrates directly into the objective function, enhancing model explainability while also improving performance. Experiments on benchmark datasets comparing models trained with and without X-SHIELD support its effectiveness on both counts, and further analysis explores the rationale behind its design choices. These results establish X-SHIELD as a promising regularization for developing reliable artificial intelligence.
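
The idea in the abstract (hide explanation-selected features, then add a term to the objective that penalizes the change in the model's output) can be illustrated with a minimal sketch. This is not the paper's implementation. Everything specific here is an assumption made for illustration: the linear softmax classifier, the weight-times-input attribution used as a stand-in explainer, the choice to hide the lowest-attribution features, the zero baseline, the KL divergence as the discrepancy measure, and the names `hide_features`, `shield_loss`, `k` and `lam`.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    # Toy linear classifier: one weight row per class.
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def attribution(W, x, label):
    # Stand-in explainer (assumption): |weight * input| for the labelled class.
    return [abs(w * xi) for w, xi in zip(W[label], x)]

def hide_features(x, scores, k, baseline=0.0):
    # Assumption: hide the k features with the lowest attribution scores
    # by replacing them with a baseline value.
    order = sorted(range(len(x)), key=lambda i: scores[i])
    hidden = set(order[:k])
    return [baseline if i in hidden else xi for i, xi in enumerate(x)]

def kl(p, q, eps=1e-12):
    # KL divergence between two discrete distributions (assumed discrepancy).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def shield_loss(W, x, label, k, lam):
    # Cross-entropy on the original input plus a weighted penalty on how much
    # the prediction changes once the selected features are hidden.
    p = predict(W, x)
    x_hidden = hide_features(x, attribution(W, x, label), k)
    q = predict(W, x_hidden)
    ce = -math.log(p[label] + 1e-12)
    return ce + lam * kl(p, q)

# Hypothetical two-class, three-feature example.
W = [[1.0, 0.5, -1.0], [-0.5, 0.2, 1.0]]
x = [1.0, 2.0, 0.5]
print(shield_loss(W, x, label=0, k=1, lam=0.0))  # plain cross-entropy
print(shield_loss(W, x, label=0, k=1, lam=1.0))  # with the hiding penalty
```

With `lam=0.0` or `k=0` the sketch reduces to ordinary cross-entropy, which is one way to compare training with and without the extra term.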
