Learning with Explanation Constraints

Published 25 Mar 2023 in cs.LG, cs.AI, and stat.ML (arXiv:2303.14496v3)

Abstract: As large deep learning models are hard to interpret, there has been a recent focus on generating explanations of these black-box models. In contrast, we may have a priori explanations of how models should behave. In this paper, we formalize this notion as learning from explanation constraints and provide a learning-theoretic framework to analyze how such explanations can improve the learning of our models. One may naturally ask, "When would these explanations be helpful?" Our first key contribution addresses this question via a class of models that satisfy these explanation constraints in expectation over new data. We provide a characterization of the benefits of these models (in terms of the reduction of their Rademacher complexities) for a canonical class of explanations given by gradient information, in the settings of both linear models and two-layer neural networks. In addition, we provide an algorithmic solution for our framework via a variational approximation that achieves better performance and satisfies these constraints more frequently than simpler augmented Lagrangian methods for incorporating these explanations. We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
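As a rough illustration of the setup (not the paper's exact algorithm), consider a gradient-based explanation constraint of the form E_x[ |∇_x h(x) ⊙ m|^2 ] ≤ ε, where the mask m marks features known a priori to be irrelevant, so a model satisfying the constraint has near-zero input gradients on those features in expectation. The sketch below trains a model under such a constraint with the simpler augmented Lagrangian baseline the abstract mentions; the paper's variational method is more involved. The toy data, the irrelevant-feature mask, and the tolerance eps are all illustrative assumptions, not details from the paper.

    # A minimal PyTorch sketch of training under a gradient-based
    # explanation constraint via an augmented Lagrangian penalty.
    # All names and data here are hypothetical illustrations.
    import torch
    import torch.nn as nn

    def explanation_violation(model, x, mask):
        # Mean squared input gradient on a priori irrelevant features.
        x = x.clone().requires_grad_(True)
        (grad,) = torch.autograd.grad(model(x).sum(), x, create_graph=True)
        return ((grad * mask) ** 2).mean()

    # Toy regression data: 10 features, the first 5 known to be irrelevant.
    X = torch.randn(256, 10)
    y = X[:, 5:].sum(dim=1)
    mask = torch.zeros(10)
    mask[:5] = 1.0

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    lam, rho, eps = 0.0, 10.0, 1e-3  # dual variable, penalty weight, tolerance

    for epoch in range(50):
        for i in range(0, 256, 32):
            xb, yb = X[i:i+32], y[i:i+32]
            viol = explanation_violation(model, xb, mask)
            loss = loss_fn(model(xb).squeeze(-1), yb)
            # Augmented Lagrangian term for the constraint  E[viol] <= eps.
            aug = loss + lam * (viol - eps) \
                  + 0.5 * rho * (viol - eps).clamp(min=0) ** 2
            opt.zero_grad()
            aug.backward()
            opt.step()
            lam = max(0.0, lam + rho * (viol.item() - eps))  # dual ascent

The dual update raises lam whenever the constraint is violated, so the penalty tightens over training until the input gradients on the masked features fall below the tolerance.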
