Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution (2405.09800v1)

Published 16 May 2024 in cs.LG, cs.HC, and math.DG

Abstract: In this paper, we dive into the reliability concerns of Integrated Gradients (IG), a prevalent feature attribution method for black-box deep learning models. We particularly address two predominant challenges associated with IG: the generation of noisy feature visualizations for vision models and the vulnerability to adversarial attributional attacks. Our approach involves an adaptation of path-based feature attribution, aligning the path of attribution more closely to the intrinsic geometry of the data manifold. Our experiments utilise deep generative models applied to several real-world image datasets. They demonstrate that IG along the geodesics conforms to the curved geometry of the Riemannian data manifold, generating more perceptually intuitive explanations and, consequently, substantially increasing robustness to targeted attributional attacks.
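For context on the method being adapted: Integrated Gradients attributes a prediction F(x) by accumulating gradients along a path from a baseline x' to the input x, with the i-th attribution given by (x_i − x'_i) ∫₀¹ ∂F(x' + α(x − x'))/∂x_i dα, conventionally evaluated along the straight line between the two points. Below is a minimal Riemann-sum sketch of this vanilla straight-line IG for a PyTorch image classifier; the `model`, `baseline`, `target`, and `steps` names are illustrative assumptions, and the paper's actual contribution, replacing the straight-line path with a geodesic on the learned data manifold, is only indicated in the comments, not implemented.

```python
import torch

def integrated_gradients(model, x, baseline, target, steps=64):
    """Riemann-sum approximation of vanilla Integrated Gradients
    (Sundararajan et al., 2017) along the straight line from `baseline`
    to `x`. The paper's method would instead follow a geodesic of the
    Riemannian data manifold (e.g. computed via a deep generative
    model's latent space); that step is not implemented here."""
    # Interpolation coefficients alpha in [0, 1], one per path point.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    # Straight-line path x' + alpha * (x - x'); a geodesic variant
    # would generate these intermediate points differently.
    path = baseline.unsqueeze(0) + alphas * (x - baseline).unsqueeze(0)
    path.requires_grad_(True)
    # Target-class score at every point on the path.
    scores = model(path)[:, target]
    grads = torch.autograd.grad(scores.sum(), path)[0]
    # Average path gradients, scaled by the input-baseline difference
    # (completeness: attributions sum to roughly F(x) - F(baseline)).
    return (x - baseline) * grads.mean(dim=0)
```

With a zero baseline, `integrated_gradients(model, x, torch.zeros_like(x), target)` reproduces the standard IG saliency map whose noisiness and attack vulnerability the paper addresses.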

