- The paper introduces SPAM, which uses low-rank tensor decompositions to model complex feature interactions while maintaining interpretability.
- It provides both theoretical foundations and empirical evidence, showing SPAM converges to optimal polynomials and rivals DNN and XGBoost performance.
- Human evaluations affirm that SPAM offers superior interpretability, with users more accurately predicting model decisions than with traditional post-hoc methods.
Scalable Interpretability via Polynomials
The paper "Scalable Interpretability via Polynomials" introduces Scalable Polynomial Additive Models (SPAM), a novel approach to improving the interpretability and scalability of Generalized Additive Models (GAMs) in machine learning. GAMs are known for their interpretability, as they model non-linear interactions between features in a way that is comprehensible to humans. However, traditional GAMs often struggle with scalability and expressive power when dealing with large, complex datasets compared to black-box models like deep neural networks (DNNs) or XGBoost.
Core Contributions
- Introduction of Scalable Polynomial Additive Models (SPAM): SPAM leverages low-rank tensor decompositions of polynomials, allowing it to model higher-order feature interactions without the combinatorial blow-up in parameters that usually makes such models impractical at scale (a minimal sketch of the idea follows this list). This approach retains inherent interpretability while achieving performance comparable to black-box models.
- Theoretical and Practical Analysis: The authors provide a theoretical framework demonstrating that SPAM models converge to optimal polynomials under certain regularity conditions as the number of samples increases. They also establish non-asymptotic excess risk bounds that align with classic results for linear models, supporting the practical viability of SPAM.
- Benchmark Performance: SPAM's performance was evaluated across several machine learning tasks involving datasets with hundreds of thousands of features. It outperformed existing interpretable models and matched DNN/XGBoost performance, illustrating its scalability and efficiency across diverse applications.
- Human Evaluation of Interpretability: A study with human subjects confirmed the practical interpretability of SPAM models: users predicted model decisions from SPAM's explanations with higher mean accuracy than from explanations produced by post-hoc methods such as LIME and SHAP.
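To make the low-rank idea in the first bullet concrete, the degree-2 case replaces the full d-by-d pairwise-interaction matrix with a rank-k symmetric decomposition. The sketch below is our own illustration under that reading of the paper, not the authors' implementation; the sizes and names (d, k, U, lam, spam_degree2) are hypothetical.

```python
import numpy as np

# Illustrative sketch of a degree-2, rank-k SPAM-style model (not the authors' code).
# A full quadratic model needs O(d^2) pairwise weights; a rank-k symmetric
# decomposition of that interaction matrix needs only O(k * d).

rng = np.random.default_rng(0)
d, k = 10_000, 32                      # feature count and decomposition rank (hypothetical)

bias = 0.0
w = rng.normal(size=d) * 0.01          # order-1 (linear) weights
U = rng.normal(size=(k, d)) * 0.01     # rank-k factors u_1, ..., u_k
lam = rng.normal(size=k) * 0.01        # per-factor scales lambda_1, ..., lambda_k

def spam_degree2(x):
    """Score f(x) = bias + <w, x> + sum_r lam_r * <u_r, x>^2."""
    projections = U @ x                # the k inner products <u_r, x>
    return bias + w @ x + lam @ projections**2

x = rng.normal(size=d)
print("score:", spam_degree2(x))

# Parameter counts for the pairwise-interaction part alone:
print("full quadratic interactions:", d * (d + 1) // 2)   # ~5.0e7
print("rank-k decomposition       :", k * d + k)          # ~3.2e5

# Interpretability is preserved because the implied pairwise-interaction matrix
# is sum_r lam[r] * outer(U[r], U[r]): the weight tying any two features together
# can still be read off directly.
```

In an actual model these parameters would be learned from data rather than drawn at random; the point of the sketch is only the parameter-count and read-off argument.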
Implications
The introduction of SPAM models represents a significant advancement in interpretable machine learning, offering a scalable approach that narrows the expressiveness and performance gap between traditional GAMs and black-box models. By modeling complex feature interactions efficiently, SPAM makes it possible to fit interpretable models to large-scale datasets without sacrificing accuracy or comprehensibility.
Future Directions
The paper opens several potential pathways for further research:
- Application Beyond Tabular Data: Exploring SPAM models in fields such as natural language processing or computer vision, where feature interactions are inherently complex and high-dimensional, could yield insights into their adaptability and performance outside tabular datasets.
- Explainability in High-Stakes Settings: Investigating whether SPAM's explanations can surface spurious correlations or model failure modes could prove valuable in high-stakes decision-making, assisting in the design of systems that prioritize robustness and transparency.
In conclusion, the Scalable Polynomial Additive Models proposed in this paper offer an innovative way to bring polynomial modeling into scalable, interpretable machine learning, and a promising route toward reconciling the interpretability-performance tradeoff that has long challenged the field.