- The paper introduces SPAM, which uses low-rank tensor decompositions to model complex feature interactions while maintaining interpretability.
- It provides both theoretical foundations and empirical evidence, showing SPAM converges to optimal polynomials and rivals DNN and XGBoost performance.
- Human evaluations affirm that SPAM offers superior interpretability, with users more accurately predicting model decisions than with traditional post-hoc methods.
Scalable Interpretability via Polynomials
The paper "Scalable Interpretability via Polynomials" introduces Scalable Polynomial Additive Models (SPAM), a novel approach to improving the interpretability and scalability of Generalized Additive Models (GAMs) in machine learning. GAMs are known for their interpretability, as they model non-linear interactions between features in a way that is comprehensible to humans. However, traditional GAMs often struggle with scalability and expressive power when dealing with large, complex datasets compared to black-box models like deep neural networks (DNNs) or XGBoost.
Core Contributions
- Introduction of Scalable Polynomial Additive Models (SPAM): SPAM leverages low-rank tensor decompositions of polynomials, allowing it to model higher-order feature interactions without the combinatorial blow-up in parameters that usually makes such models impractical at scale (a minimal sketch of the idea follows this list). This approach retains inherent interpretability while achieving performance comparable to black-box models.
- Theoretical and Practical Analysis: The authors provide a theoretical framework demonstrating that SPAM models converge to optimal polynomials under certain regularity conditions as the number of samples increases. They also establish non-asymptotic excess risk bounds that align with classic results for linear models, supporting the practical viability of SPAM.
- Benchmark Performance: SPAM's performance was evaluated across several machine learning tasks involving datasets with hundreds of thousands of features. It outperformed existing interpretable models and matched DNN/XGBoost performance, illustrating its scalability and efficiency across diverse applications.
- Human Evaluation of Interpretability: A study with human subjects confirmed the practical interpretability of SPAM models: users predicted model decisions from SPAM's explanations with higher mean accuracy than from explanations produced by post-hoc methods such as LIME and SHAP.
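To make the low-rank idea in the first bullet concrete, the degree-2 case replaces the full d-by-d pairwise-interaction matrix with a rank-k symmetric decomposition. The sketch below is our own illustration under that reading of the paper, not the authors' implementation; the sizes and names (d, k, U, lam, spam_degree2) are hypothetical.

```python
import numpy as np

# Illustrative sketch of a degree-2, rank-k SPAM-style model (not the authors' code).
# A full quadratic model needs O(d^2) pairwise weights; a rank-k symmetric
# decomposition of that interaction matrix needs only O(k * d).

rng = np.random.default_rng(0)
d, k = 10_000, 32                      # feature count and decomposition rank (hypothetical)

bias = 0.0
w = rng.normal(size=d) * 0.01          # order-1 (linear) weights
U = rng.normal(size=(k, d)) * 0.01     # rank-k factors u_1, ..., u_k
lam = rng.normal(size=k) * 0.01        # per-factor scales lambda_1, ..., lambda_k

def spam_degree2(x):
    """Score f(x) = bias + <w, x> + sum_r lam_r * <u_r, x>^2."""
    projections = U @ x                # the k inner products <u_r, x>
    return bias + w @ x + lam @ projections**2

x = rng.normal(size=d)
print("score:", spam_degree2(x))

# Parameter counts for the pairwise-interaction part alone:
print("full quadratic interactions:", d * (d + 1) // 2)   # ~5.0e7
print("rank-k decomposition       :", k * d + k)          # ~3.2e5

# Interpretability is preserved because the implied pairwise-interaction matrix
# is sum_r lam[r] * outer(U[r], U[r]): the weight tying any two features together
# can still be read off directly.
```

In an actual model these parameters would be learned from data rather than drawn at random; the point of the sketch is only the parameter-count and read-off argument.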
Implications
The introduction of SPAM models represents a significant advancement in interpretable machine learning, offering a scalable approach that narrows the expressiveness and performance gap between traditional GAMs and black-box models. By modeling complex feature interactions efficiently, SPAM makes it possible to fit interpretable models to large-scale datasets without sacrificing accuracy or comprehensibility.
Future Directions
The paper opens several potential pathways for further research:
- Application Beyond Tabular Data: Exploring SPAM models in fields such as natural language processing or computer vision, where feature interactions are inherently complex and high-dimensional, could yield insights into their adaptability and performance outside tabular datasets.
- Explainability in High-Stakes Settings: Investigating whether SPAM's explanations can surface spurious correlations or model failure modes could prove valuable in high-stakes decision-making, assisting in the design of systems that prioritize robustness and transparency.
In conclusion, the Scalable Polynomial Additive Models proposed in this paper offer an innovative way to bring polynomial modeling into scalable, interpretable machine learning, and a promising route toward reconciling the interpretability-performance tradeoff that has long challenged the field.