Consistent Explanations in the Face of Model Indeterminacy via Ensembling (2306.06193v2)
Abstract: This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy, which arises when multiple (nearly) equally well-performing models exist for a given dataset and task. Despite their similar performance, such models often produce inconsistent or even contradictory explanations for their predictions, posing challenges for end users who rely on them to make critical decisions. To address this issue, we introduce ensemble methods as a way to enhance the consistency of the explanations produced in these scenarios. Leveraging insights from recent work on neural network loss landscapes and mode connectivity, we devise ensemble strategies to efficiently explore the underspecification set -- the set of models whose performance variations result solely from changes in the random seed during training. Experiments on five benchmark financial datasets show that ensembling yields significant improvements in explanation similarity and demonstrate the potential of existing ensemble methods to explore the underspecification set efficiently. Our findings highlight the importance of accounting for model indeterminacy when interpreting explanations and showcase the effectiveness of ensembles in enhancing the reliability of explanations in machine learning.
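The core recipe the abstract describes can be made concrete with a short sketch. The following is a minimal illustration, not the paper's implementation: it trains several identically configured networks that differ only in their training seed (a small sample of the underspecification set), forms logit-averaging ensembles from disjoint halves of that sample, and compares gradient-saliency explanations. The architecture, hyperparameters, similarity metric, and synthetic data are all illustrative assumptions.

```python
# Sketch: explanation consistency of individual seeds vs. seed ensembles.
# Not the authors' code; every choice below is an illustrative assumption.
import torch
import torch.nn as nn

def train_model(X, y, seed, epochs=200):
    torch.manual_seed(seed)  # the only source of variation across models
    model = nn.Sequential(nn.Linear(X.shape[1], 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.binary_cross_entropy_with_logits(
            model(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return model

def saliency(predict, x):
    # Gradient-of-output explanation (Simonyan et al.-style saliency).
    x = x.clone().requires_grad_(True)
    predict(x).sum().backward()
    return x.grad.detach()

def ensemble(models):
    # Logit-averaging ensemble over a sample of the underspecification set.
    return lambda x: torch.stack([m(x) for m in models]).mean(0)

def cosine(a, b):
    return nn.functional.cosine_similarity(a.flatten(), b.flatten(), dim=0).item()

# Synthetic stand-in for a tabular (e.g., financial) dataset.
torch.manual_seed(0)
X = torch.randn(512, 8)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()
x_test = torch.randn(16, 8)

models = [train_model(X, y, seed=s) for s in range(8)]

# Explanation agreement between two individual seeds...
single = cosine(saliency(models[0], x_test), saliency(models[1], x_test))
# ...versus between two disjoint ensembles drawn from the same set.
ens = cosine(saliency(ensemble(models[:4]), x_test),
             saliency(ensemble(models[4:]), x_test))
print(f"single-model explanation similarity: {single:.3f}")
print(f"ensemble explanation similarity:     {ens:.3f}")
```

Under the abstract's hypothesis, the ensemble-vs-ensemble similarity should typically exceed the seed-vs-seed similarity, since averaging over the underspecification set smooths out seed-specific variation in the learned function and hence in its gradients.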