Learning Locally Interpretable Rule Ensemble (2306.11481v1)
Abstract: This paper proposes a new framework for learning a rule ensemble model that is both accurate and interpretable. A rule ensemble is an interpretable model based on a linear combination of weighted rules. In practice, we often face a trade-off between the accuracy and interpretability of rule ensembles: a rule ensemble needs a sufficiently large number of weighted rules to maintain accuracy, which harms its interpretability for human users. To avoid this trade-off and learn an interpretable rule ensemble without degrading accuracy, we introduce a new concept of interpretability, named local interpretability, which is evaluated by the total number of rules necessary to express individual predictions made by the model, rather than to express the model itself. We then propose a regularizer that promotes local interpretability and develop an efficient algorithm for learning a rule ensemble with the proposed regularizer by coordinate descent with local search. Experimental results demonstrate that our method learns rule ensembles that can explain individual predictions with fewer rules than existing methods, including RuleFit, while maintaining comparable accuracy.
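To make the setup concrete, here is a minimal sketch (not the paper's code; the rules, weights, and the `local_rule_count` helper are illustrative assumptions) of how a rule ensemble prediction decomposes into weighted rules, and how local interpretability counts only the weighted rules that fire on a given input, in contrast to global sparsity, which counts all nonzero weights:

```python
# Minimal sketch of the quantities described in the abstract (illustrative
# rules and weights, not the paper's implementation). A rule ensemble predicts
#   f(x) = w0 + sum_k w_k * r_k(x),   r_k(x) in {0, 1},
# and "local interpretability" counts only the weighted rules that fire on x.
import numpy as np

# Each rule is a boolean predicate over a feature vector x = (x[0], x[1]).
rules = [
    lambda x: x[0] > 0.5,                    # r_1
    lambda x: x[1] <= 0.2,                   # r_2
    lambda x: (x[0] > 0.5) & (x[1] > 0.8),   # r_3
]
w0 = 0.1
w = np.array([1.5, -0.7, 0.0])  # r_3 has zero weight (pruned globally)

def rule_activations(x):
    """0/1 vector indicating which rules fire on x."""
    return np.array([float(r(x)) for r in rules])

def predict(x):
    return w0 + w @ rule_activations(x)

def local_rule_count(x):
    """Rules needed to express this one prediction: nonzero weight AND firing."""
    return int(np.sum((w != 0) & (rule_activations(x) == 1)))

x = np.array([0.9, 0.1])
print(predict(x))            # 0.1 + 1.5 - 0.7 = 0.9
print(local_rule_count(x))   # 2: only r_1 and r_2 enter this explanation
print(int(np.sum(w != 0)))   # global count of weighted rules, also 2 here
```

Under this notion, an ensemble may keep many weighted rules overall while each individual prediction is still explained by only the few rules that fire on that input; the proposed regularizer penalizes roughly this per-instance count during training.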
- D. Alvarez-Melis and T. S. Jaakkola. On the robustness of interpretability methods. In Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning, pages 66–71, 2018.
- E. Angelino, N. Larus-Stone, D. Alabi, M. Seltzer, and C. Rudin. Learning certifiably optimal rule lists. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 35–44, 2017.
- J. Angwin, J. Larson, S. Mattu, and L. Kirchner. Machine bias. ProPublica, 2016. URL: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. Accessed: 2023-06-20.
- C. Bénard, G. Biau, S. Da Veiga, and E. Scornet. Interpretable random forests via rule extraction. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, pages 937–945, 2021.
- L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
- R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1721–1730, 2015.
- S. Dash, O. Günlük, and D. Wei. Boolean decision rules via column generation. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 4660–4670, 2018.
- A. Dedieu, H. Hazimeh, and R. Mazumder. Learning sparse classifiers: Continuous and mixed integer optimization perspectives. Journal of Machine Learning Research, 22(135):1–47, 2021.
- F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.
- J. Eckstein, N. Goldberg, and A. Kagawa. Rule-enhanced penalized regression by column generation using rectangular maximum agreement. In Proceedings of the 34th International Conference on Machine Learning, pages 1059–1067, 2017.
- Explainable Machine Learning Challenge. URL: https://community.fico.com/s/explainable-machine-learning-challenge, 2018. Accessed: 2023-06-20.
- A. A. Freitas. Comprehensible classification models: A position paper. ACM SIGKDD Explorations Newsletter, 15(1):1–10, 2014.
- Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.
- J. Friedman and B. E. Popescu. Gradient directed regularization for linear regression and classification. Technical report, Statistics Department, Stanford University, 2003.
- J. H. Friedman and B. E. Popescu. Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3):916–954, 2008.
- X. Hu, C. Rudin, and M. Seltzer. Optimal sparse decision trees. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, pages 7265–7273, 2019.
- A. Jacovi and Y. Goldberg. Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4198–4205, 2020.
- H. Kato, H. Hanada, and I. Takeuchi. Safe RuleFit: Learning optimal sparse rule model by meta safe screening. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2):2330–2343, 2023.
- G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 3149–3157, 2017.
- The UCI machine learning repository. URL: http://archive.ics.uci.edu/, 2023. Accessed: 2023-06-20.
- I. Lage, E. Chen, J. He, M. Narayanan, B. Kim, S. J. Gershman, and F. Doshi-Velez. Human evaluation of models built for interpretability. In Proceedings of the 7th AAAI Conference on Human Computation and Crowdsourcing, pages 59–67, 2019.
- H. Lakkaraju, S. H. Bach, and J. Leskovec. Interpretable decision sets: A joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1675–1684, 2016.
- Z. C. Lipton. The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3):31–57, 2018.
- J. Liu, C. Zhong, M. Seltzer, and C. Rudin. Fast sparse classification for generalized linear and additive models. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, pages 9304–9333, 2022.
- S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 4765–4774, 2017.
- T. Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38, 2019.
- M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. The MIT Press, 2012.
- K. Nakagawa, S. Suzumura, M. Karasuyama, K. Tsuda, and I. Takeuchi. Safe pattern pruning: An efficient approach for predictive pattern mining. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1785–1794, 2016.
- M. Nalenz and T. Augustin. Compressed rule ensemble learning. In Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, pages 9998–10014, 2022.
- G. Plumb, M. Al-Shedivat, Á. A. Cabrera, A. Perer, E. Xing, and A. Talwalkar. Regularizing black-box models for improved interpretability. In Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 10526–10536, 2020.
- M. T. Ribeiro, S. Singh, and C. Guestrin. “Why Should I Trust You?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
- M. T. Ribeiro, S. Singh, and C. Guestrin. Anchors: High-precision model-agnostic explanations. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, pages 1527–1535, 2018.
- L. Rieger, C. Singh, W. J. Murdoch, and B. Yu. Interpretations are useful: Penalizing explanations to align neural networks with prior knowledge. In Proceedings of the 37th International Conference on Machine Learning, pages 8116–8126, 2020.
- A. S. Ross, M. C. Hughes, and F. Doshi-Velez. Right for the right reasons: Training differentiable models by constraining their explanations. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pages 2662–2670, 2017.
- C. Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1:206–215, 2019.
- C. Rudin and Y. Shaposhnik. Globally-consistent rule-based summary-explanations for machine learning models: Application to credit-risk evaluation. Journal of Machine Learning Research, 24(16):1–44, 2023.
- C. Rudin, C. Chen, Z. Chen, H. Huang, L. Semenova, and C. Zhong. Interpretable machine learning: Fundamental principles and 10 grand challenges. Statistics Surveys, 16:1–85, 2022.
- R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288, 1996.
- B. Ustun and C. Rudin. Learning optimized risk scores. Journal of Machine Learning Research, 20(150):1–75, 2019.
- F. Wang and C. Rudin. Falling rule lists. In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, pages 1013–1022, 2015.
- D. Wei, S. Dash, T. Gao, and O. Günlük. Generalized linear rule models. In Proceedings of the 36th International Conference on Machine Learning, pages 6687–6696, 2019.
- H. Yang, C. Rudin, and M. Seltzer. Scalable Bayesian rule lists. In Proceedings of the 34th International Conference on Machine Learning, pages 3921–3930, 2017.
- J. Yang, O. Lindenbaum, and Y. Kluger. Locally sparse neural networks for tabular biomedical data. In Proceedings of the 39th International Conference on Machine Learning, pages 25123–25153, 2022.
- J. Yoon, S. Ö. Arik, and T. Pfister. LIMIS: Locally interpretable modeling using instance-wise subsampling. Transactions on Machine Learning Research, 2022. ISSN 2835-8856. URL https://openreview.net/forum?id=S8eABAy8P3.
- G. Zhang and A. Gionis. Regularized impurity reduction: accurate decision trees with complexity guarantees. Data Mining and Knowledge Discovery, 37(1):434–475, 2023.