Uncertainty-Aware Explanations Through Probabilistic Self-Explainable Neural Networks (2403.13740v1)

Published 20 Mar 2024 in cs.LG

Abstract: The lack of transparency of Deep Neural Networks remains a limitation that severely undermines their reliability and use in high-stakes applications. A promising approach to overcoming this limitation is Prototype-Based Self-Explainable Neural Networks (PSENNs), whose predictions rely on the similarity between the input at hand and a set of prototypical representations of the output classes, therefore offering a deep yet transparent-by-design architecture. So far, such models have been designed with pointwise estimates for the prototypes, which remain fixed after the learning phase. In this paper, we introduce a probabilistic reformulation of PSENNs, called Prob-PSENN, which replaces point estimates of the prototypes with probability distributions over their values. This not only provides a more flexible framework for end-to-end learning of the prototypes, but also captures the explanatory uncertainty of the model, a feature missing in previous approaches. Moreover, since the prototypes determine both the explanation and the prediction, Prob-PSENNs can detect when the model is making uninformed or uncertain predictions and provide valid explanations for them. Our experiments demonstrate that Prob-PSENNs yield more meaningful and robust explanations than their non-probabilistic counterparts, thus enhancing the explainability and reliability of the models.
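
To make the idea concrete, below is a minimal, hypothetical sketch of what a Prob-PSENN-style prediction step could look like: prototypes are drawn from learned per-class distributions (here assumed Gaussian), each draw yields a similarity-based class prediction, and the spread across draws quantifies the model's predictive and explanatory uncertainty. The Gaussian parameterization, function names, and tensor shapes are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch of Monte-Carlo prediction with distributional prototypes.
# Not the paper's implementation; Gaussian prototypes and shapes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def predict_with_uncertainty(z, proto_mu, proto_logvar, n_samples=50, gamma=1.0):
    """Predict class probabilities by sampling prototypes from their distributions.

    z            : (d,) latent encoding of the input (from any encoder).
    proto_mu     : (K, d) per-class prototype means.
    proto_logvar : (K, d) per-class prototype log-variances.
    Returns the averaged class probabilities and the entropy of that average.
    """
    K, d = proto_mu.shape
    probs = np.zeros((n_samples, K))
    for s in range(n_samples):
        # Sample one prototype per class from its learned Gaussian.
        protos = proto_mu + np.exp(0.5 * proto_logvar) * rng.standard_normal((K, d))
        # Similarity = negative squared distance, converted to a softmax over classes.
        logits = -gamma * np.sum((protos - z) ** 2, axis=1)
        exp_logits = np.exp(logits - logits.max())
        probs[s] = exp_logits / exp_logits.sum()
    mean_probs = probs.mean(axis=0)
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12))  # predictive uncertainty
    return mean_probs, entropy

# Toy usage: 3 classes in a 4-dimensional latent space.
z = rng.standard_normal(4)
mu = rng.standard_normal((3, 4))
logvar = np.full((3, 4), -2.0)
p, H = predict_with_uncertainty(z, mu, logvar)
print(p, H)
```

In such a sketch, high entropy of the averaged prediction, or strong disagreement between prototype draws, would flag the kind of uninformed or uncertain prediction the paper aims to expose, while the sampled prototypes themselves would serve as the (uncertainty-aware) explanation.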

References (47)
  1. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv preprint arXiv:1603.04467, 2015. doi: 10.48550/arXiv.1603.04467.
  2. Sanity Checks for Saliency Maps. In Advances in Neural Information Processing Systems, volume 31, pages 9505–9515, 2018.
  3. Fairwashing: The Risk of Rationalization. In Proceedings of the 36th International Conference on Machine Learning (ICML), volume 97, pages 161–170, 2019.
  4. D. Alvarez-Melis and T. Jaakkola. Towards Robust Interpretability with Self-Explaining Neural Networks. In Advances in Neural Information Processing Systems, volume 31, pages 7775–7784, 2018.
  5. M. Ashoori and J. D. Weisz. In AI We Trust? Factors That Influence Trustworthiness of AI-infused Decision-Making Processes. arXiv preprint arXiv:1912.02675, 2019. doi: 10.48550/arXiv.1912.02675.
  6. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLOS ONE, 10(7):e0130140, 2015. ISSN 1932-6203. doi: 10.1371/journal.pone.0130140.
  7. Evasion Attacks against Machine Learning at Test Time. In Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, pages 387–402, 2013. ISBN 978-3-642-40994-3. doi: 10.1007/978-3-642-40994-3_25.
  8. This Looks Like That: Deep Learning for Interpretable Image Recognition. In Advances in Neural Information Processing Systems, volume 32, pages 8930–8941, 2019.
  9. Deep Learning for Classical Japanese Literature. In Neural Information Processing Systems 2018 Workshop on Machine Learning for Creativity and Design, 2018.
  10. EMNIST: Extending MNIST to handwritten letters. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), pages 2921–2926, 2017. doi: 10.1109/IJCNN.2017.7966217.
  11. Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning. In Proceedings of the 35th International Conference on Machine Learning (ICML), volume 80, pages 1184–1193, 2018.
  12. TensorFlow Distributions. arXiv preprint arXiv:1711.10604, 2017. doi: 10.48550/arXiv.1711.10604.
  13. Y. Gal. Uncertainty in Deep Learning. PhD thesis, University of Cambridge, 2016.
  14. ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model. In Advances in Neural Information Processing Systems, volume 35, pages 17940–17952, 2022.
  15. Interpretation of Neural Networks Is Fragile. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 3681–3688, 2019. doi: 10.1609/aaai.v33i01.33013681.
  16. E. Goan and C. Fookes. Bayesian Neural Networks: An Introduction and Survey. Case Studies in Applied Bayesian Data Science, 2259:45–87, 2020. doi: 10.1007/978-3-030-42553-1_3.
  17. On Calibration of Modern Neural Networks. In Proceedings of the 34th International Conference on Machine Learning (ICML), volume 70, pages 1321–1330, 2017.
  18. Interpretable Image Recognition with Hierarchical Prototypes. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, volume 7, pages 32–40, 2019. ISBN 978-1-57735-820-6.
  19. Bayesian Active Learning for Classification and Preference Learning. arXiv preprint arXiv:1112.5745, 2011. doi: 10.48550/arXiv.1112.5745.
  20. E. Hüllermeier and W. Waegeman. Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods. Machine Learning, 110(3):457–506, 2021. ISSN 1573-0565. doi: 10.1007/s10994-021-05946-3.
  21. Distributional Prototypical Methods for Reliable Explanation Space Construction. IEEE Access, 11:34821–34834, 2023. ISSN 2169-3536. doi: 10.1109/ACCESS.2023.3264794.
  22. A. Kendall and Y. Gal. What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? In Advances in Neural Information Processing Systems, volume 30, pages 5574–5584, 2017.
  23. D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations (ICLR), 2015. doi: 10.48550/arXiv.1412.6980.
  24. D. P. Kingma and M. Welling. Auto-Encoding Variational Bayes. In International Conference on Learning Representations (ICLR), 2014.
  25. Uncertainty Quantification Using Bayesian Neural Networks in Classification: Application to Ischemic Stroke Lesion Segmentation. In Medical Imaging with Deep Learning (MIDL), 2022.
  26. H. Lakkaraju and O. Bastani. "How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, AIES ’20, pages 79–85, 2020. ISBN 978-1-4503-7110-0. doi: 10.1145/3375627.3375833.
  27. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. ISSN 1558-2256. doi: 10.1109/5.726791.
  28. Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, pages 3530–3537, 2018.
  29. Z. C. Lipton. The Mythos of Model Interpretability: In Machine Learning, the Concept of Interpretability Is Both Important and Slippery. Queue, 16(3):31–57, 2018. ISSN 1542-7730. doi: 10.1145/3236386.3241340.
  30. Neural Prototype Trees for Interpretable Fine-grained Image Recognition. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14933–14943, 2021. ISBN 978-1-66544-509-2. doi: 10.1109/CVPR46437.2021.01469.
  31. R. M. Neal. Bayesian Learning for Neural Networks, volume 118. Springer Science & Business Media, 2012.
  32. M. Pastore and A. Calcagnì. Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Frontiers in Psychology, 10(1089), 2019. ISSN 1664-1078. doi: 10.3389/fpsyg.2019.01089.
  33. A. Petrov and M. Kwiatkowska. Robustness of Unsupervised Representation Learning without Labels. arXiv preprint arXiv:2210.04076, 2022. doi: 10.48550/arXiv.2210.04076.
  34. C. Rudin. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence, 1(5):206–215, 2019. ISSN 2522-5839. doi: 10.1038/s42256-019-0048-x.
  35. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), pages 618–626, 2017. doi: 10.1109/ICCV.2017.74.
  36. C. E. Shannon. A Mathematical Theory of Communication. The Bell System Technical Journal, 27(3):379–423, 1948. ISSN 0005-8580. doi: 10.1002/j.1538-7305.1948.tb01338.x.
  37. Learning Important Features Through Propagating Activation Differences. In Proceedings of the 34th International Conference on Machine Learning (ICML), volume 70, pages 3145–3153, 2017.
  38. Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis. In Proceedings of the Seventh International Conference on Document Analysis and Recognition, pages 958–963, 2003. doi: 10.1109/ICDAR.2003.1227801.
  39. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. In Workshop of the 2014 International Conference on Learning Representations (ICLR), 2014.
  40. Striving for Simplicity: The All Convolutional Net. In Workshop of the 2015 International Conference on Learning Representations (ICLR), 2015. doi: 10.48550/arXiv.1412.6806.
  41. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning (ICML), volume 70, pages 3319–3328, 2017.
  42. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
  43. When and How to Fool Explainable Models (and Humans) with Adversarial Examples. arXiv preprint arXiv:2107.01943, 2021.
  44. Mixture of Gaussian-distributed Prototypes with Generative Modelling for Interpretable Image Classification. arXiv preprint arXiv:2312.00092, 2023. doi: 10.48550/arXiv.2312.00092.
  45. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv preprint arXiv:2107.01943, 2017. doi: 10.48550/arXiv.1708.07747.
  46. Adversarial Examples: Attacks and Defenses for Deep Learning. IEEE Transactions on Neural Networks and Learning Systems, 30(9):2805–2824, 2019. ISSN 2162-2388. doi: 10.1109/TNNLS.2018.2886017.
  47. A Survey on Neural Network Interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence, 5(5):726–742, 2021. ISSN 2471-285X. doi: 10.1109/TETCI.2021.3100641.
