PolySHAP: Polynomial Shapley Explanations
- PolySHAP is an explainable AI method that uses polynomial regression to derive consistent and efficient Shapley value estimates while incorporating higher-order feature interactions.
- The approach models the explanation game with a multilinear polynomial that extends beyond additive effects, improving empirical accuracy and addressing computational barriers.
- Empirical benchmarks confirm that PolySHAP reduces estimation error and improves ranking precision, with theoretical guarantees ensuring convergence to true Shapley values via a Möbius transformation.
PolySHAP is a method in explainable artificial intelligence (XAI) designed to estimate Shapley values more accurately and efficiently by fitting low-degree polynomials to the model-explanation game, thereby capturing feature interactions absent from traditional KernelSHAP. Shapley values, grounded in cooperative game theory, quantify individual feature contributions by averaging marginal effects over all feature subsets, but their direct computation requires $2^n$ game evaluations, an exponential barrier for models with moderately large $n$. KernelSHAP ameliorates this cost by approximating the explanation game as a linear function via weighted least squares over a sampled subset of feature coalitions, accounting only for additive effects. PolySHAP generalizes KernelSHAP by fitting higher-order (degree-$\ell$) polynomial models to efficiently incorporate non-linear feature interactions, providing provably consistent Shapley value estimates with improved empirical accuracy on benchmark datasets (Fumagalli et al., 26 Jan 2026).
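As a concrete illustration of the Shapley definition above, the following sketch computes exact Shapley values by enumerating all coalitions of a small toy game. The game `nu` and all names here are illustrative, not taken from the paper:

```python
from itertools import combinations
from math import factorial

def exact_shapley(nu, n):
    """Exact Shapley values: weighted average of marginal contributions
    nu(S + {i}) - nu(S) over all coalitions S not containing feature i.
    nu: callable mapping a frozenset of feature indices to a game value."""
    phi = [0.0] * n
    for i in range(n):
        for size in range(n):  # size of the coalition S excluding i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations([j for j in range(n) if j != i], size):
                S = frozenset(S)
                phi[i] += weight * (nu(S | {i}) - nu(S))
    return phi

# Toy game with a pairwise interaction between features 0 and 1.
def nu(S):
    return 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S)

print(exact_shapley(nu, 3))  # approximately [3.5, 2.5, 0.0]
```

The interaction term of strength 3 is split evenly between features 0 and 1, while the non-participating feature 2 receives zero attribution. The triple loop visits all $2^{n-1}$ coalitions per feature, which is exactly the exponential cost KernelSHAP and PolySHAP avoid.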
1. Limitations of KernelSHAP and Additive Approximation
KernelSHAP estimates Shapley values for a model-agnostic game $\nu : 2^N \to \mathbb{R}$ by solving a weighted linear least squares problem. The exact formulation is:
$$\hat{\phi} = \arg\min_{\phi_0, \phi_1, \dots, \phi_n} \sum_{\emptyset \neq S \subsetneq N} w(S) \Big( \nu(S) - \phi_0 - \sum_{i \in S} \phi_i \Big)^2,$$
where the weight $w(S) = \frac{n-1}{\binom{n}{|S|}\,|S|\,(n-|S|)}$ for $0 < |S| < n$, and the solution is constrained so that $\phi_0 + \sum_{i=1}^n \phi_i = \nu(N)$.
This procedure is computationally tractable, fitting only $n+1$ coefficients from the sampled coalitions, but it is constrained to additive effects and cannot capture interactions beyond linearity.
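For intuition, the constrained weighted least squares problem can be sketched in full enumeration, where it is known to recover the exact Shapley values. The toy game and function names below are illustrative, not the paper's implementation:

```python
import numpy as np
from itertools import combinations
from math import comb

def kernelshap_full(nu, n):
    """Full-enumeration KernelSHAP: weighted least squares over all proper
    nonempty coalitions with the Shapley kernel weights
    w(S) = (n-1) / (C(n,|S|) * |S| * (n-|S|))."""
    subsets = [frozenset(c) for k in range(1, n)
               for c in combinations(range(n), k)]
    X = np.array([[float(i in S) for i in range(n)] for S in subsets])
    y = np.array([nu(S) - nu(frozenset()) for S in subsets])
    w = np.array([(n - 1) / (comb(n, len(S)) * len(S) * (n - len(S)))
                  for S in subsets])
    # Enforce sum(phi) = nu(N) - nu({}) via a Lagrange multiplier appended
    # to the weighted normal equations (KKT system).
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * y)
    K = np.block([[A, np.ones((n, 1))], [np.ones((1, n)), np.zeros((1, 1))]])
    rhs = np.append(b, nu(frozenset(range(n))) - nu(frozenset()))
    return np.linalg.solve(K, rhs)[:n]

# Toy game with an interaction between features 0 and 1 (hypothetical example).
nu = lambda S: 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S)
print(kernelshap_full(nu, 3))  # approximately [3.5, 2.5, 0.0]
```

With full enumeration the estimate matches the exact Shapley values even though the fitted model is purely additive; the gap PolySHAP targets appears only under subsampling, where the unmodeled interactions inflate variance.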
2. PolySHAP Polynomial Regression Formulation
PolySHAP constructs a multilinear polynomial of degree $\ell$ over the feature set $N = \{1, \dots, n\}$, generalizing the additive model. The "interaction frontier" is given by:
$$\mathcal{F}_\ell = \{ S \subseteq N : 1 \le |S| \le \ell \},$$
with one coefficient $\beta_S$ per subset $S \in \mathcal{F}_\ell$. The polynomial approximation is:
$$\hat{\nu}(T) = \beta_0 + \sum_{S \in \mathcal{F}_\ell,\, S \subseteq T} \beta_S,$$
where $T \subseteq N$ is an evaluated coalition. The fitting objective is:
$$\min_{\beta} \sum_{\emptyset \neq T \subsetneq N} w(T) \big( \nu(T) - \hat{\nu}(T) \big)^2,$$
where $w(T)$ is the Shapley kernel weight. This is solved via constrained least squares regression, using a design matrix $X$ with entries $X_{T,S} = \mathbb{1}[S \subseteq T]$ and target vector $y$ with $y_T = \nu(T)$. The solution is subject to the constraint $\hat{\nu}(N) = \nu(N)$, handled by standard projection techniques.
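A minimal full-enumeration sketch of this fit, assuming the Shapley kernel weights and a KKT system for the grand-coalition constraint; the function names and toy game are illustrative, not the paper's code:

```python
import numpy as np
from itertools import combinations
from math import comb

def fit_polyshap(nu, n, degree):
    """Degree-l PolySHAP fit: one coefficient beta_S per nonempty subset S
    with |S| <= degree (the frontier), weighted least squares over all
    proper nonempty coalitions, constrained to match nu on N."""
    frontier = [frozenset(c) for k in range(1, degree + 1)
                for c in combinations(range(n), k)]
    coalitions = [frozenset(c) for k in range(1, n)
                  for c in combinations(range(n), k)]
    # Design matrix: X[T, S] = 1 iff monomial index set S is contained in T.
    X = np.array([[float(S <= T) for S in frontier] for T in coalitions])
    y = np.array([nu(T) - nu(frozenset()) for T in coalitions])
    w = np.array([(n - 1) / (comb(n, len(T)) * len(T) * (n - len(T)))
                  for T in coalitions])
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * y)
    d = len(frontier)
    # KKT system: the coefficients must sum to nu(N) - nu({}).
    K = np.block([[A, np.ones((d, 1))], [np.ones((1, d)), np.zeros((1, 1))]])
    rhs = np.append(b, nu(frozenset(range(n))) - nu(frozenset()))
    beta = np.linalg.solve(K, rhs)[:d]
    return dict(zip(frontier, beta))

# Toy game: main effects 2 and 1 plus a pairwise interaction of strength 3.
nu = lambda S: 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S)
print(fit_polyshap(nu, 3, degree=2))
```

Because this toy game has no third-order interaction, the degree-2 fit recovers its Möbius coefficients exactly: $\beta_{\{0\}} \approx 2$, $\beta_{\{1\}} \approx 1$, $\beta_{\{0,1\}} \approx 3$, and zero elsewhere.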
3. Shapley Value Extraction via Möbius–Shapley Transformation
After determining $\hat{\beta}$, the estimated Shapley values are recovered by mapping monomial coefficients to individual feature attributions via:
$$\hat{\phi}_i = \sum_{S \in \mathcal{F}_\ell :\, i \in S} \frac{\hat{\beta}_S}{|S|}.$$
For $\ell = 1$, this reduces to the KernelSHAP additive solution $\hat{\phi}_i = \hat{\beta}_{\{i\}}$. For $\ell \ge 2$, higher-order coefficients correct the Shapley estimate for non-additive interactions. This transformation, rigorously derived in [(Fumagalli et al., 26 Jan 2026), Theorem 3.1], constitutes a Möbius–Shapley conversion and ensures coherent attribution, including interaction effects.
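The conversion itself is only a few lines: each coefficient is split equally among the features in its index set. The coefficient dictionary below encodes a hypothetical toy game, not data from the paper:

```python
def shapley_from_coefficients(beta, n):
    """Mobius-Shapley conversion: distribute each coefficient beta_S
    equally among the |S| features it involves."""
    phi = [0.0] * n
    for S, b in beta.items():
        for i in S:
            phi[i] += b / len(S)
    return phi

# Coefficients of the toy game nu(S) = 2*1[0 in S] + 1[1 in S] + 3*1[{0,1} <= S].
beta = {frozenset({0}): 2.0, frozenset({1}): 1.0, frozenset({0, 1}): 3.0}
print(shapley_from_coefficients(beta, 3))  # [3.5, 2.5, 0.0]
```

The interaction coefficient of 3 contributes 1.5 to each of features 0 and 1, which is exactly the correction a purely additive model cannot express.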
4. Consistency Guarantees
PolySHAP is consistent in the sense that, as the number of sampled subsets $m \to \infty$, the estimated polynomial coefficients converge to the population minimizer, and the resulting Shapley values converge to the true Shapley values of the explanation game. The proof strategy employs (i) recasting both the degree-1 and degree-$\ell$ weighted least squares problems as constrained matrix regressions, (ii) a projection lemma (Lemma A.1) establishing equivalence between degree-$\ell$ and projected degree-1 solutions under full enumeration, and (iii) a demonstration that Shapley values reconstructed via the Möbius–Shapley transformation recover the population solution exactly, under full-rank design and unbiased sampling assumptions (Fumagalli et al., 26 Jan 2026).
5. Paired Sampling and Algebraic Equivalence to Quadratic PolySHAP
Paired (antithetic) sampling is a KernelSHAP heuristic that draws subsets in complementary pairs ($S$ and $N \setminus S$), empirically observed to improve accuracy by reducing estimator variance. PolySHAP establishes a formal equivalence: KernelSHAP with paired sampling solves the same normal equations as 2-PolySHAP projected to degree 1, so the two estimators produce identical Shapley values on the same sampled coalitions.
The proof rests on symmetries in the cross-moment matrix under paired data, which ensure all quadratic columns collapse to effective additive corrections, yielding identical Shapley estimates [(Fumagalli et al., 26 Jan 2026), Theorem 4.1]. This result provides a theoretical foundation for the practical success of paired sampling and shows it implicitly incorporates all second-order interactions at no additional computational cost.
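A sketch of the paired-sampling step, assuming the standard KernelSHAP size distribution with mass proportional to $(n-1)/(k(n-k))$ for coalition size $k$; implementation details are illustrative:

```python
import numpy as np

def sample_paired_coalitions(n, num_pairs, seed=0):
    """Antithetic sampling sketch: draw a coalition size k from the Shapley
    kernel's size distribution, draw a uniform size-k subset S, and always
    include its complement N \\ S alongside it."""
    rng = np.random.default_rng(seed)
    sizes = np.arange(1, n)
    p = (n - 1) / (sizes * (n - sizes))  # kernel mass per size, up to scale
    p = p / p.sum()
    coalitions = []
    for _ in range(num_pairs):
        k = rng.choice(sizes, p=p)
        S = frozenset(rng.choice(n, size=k, replace=False).tolist())
        coalitions.append(S)
        coalitions.append(frozenset(range(n)) - S)  # the antithetic partner
    return coalitions
```

Every sampled coalition thus arrives together with its complement, which is precisely the symmetry that makes the quadratic columns of the 2-PolySHAP design collapse onto additive corrections in Theorem 4.1.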
6. Empirical Performance and Practical Recommendations
Empirical benchmarks conducted in (Fumagalli et al., 26 Jan 2026) demonstrate that PolySHAP improves mean squared error (MSE), top-5 precision, and Spearman correlation over additive KernelSHAP as the polynomial degree $\ell$ increases, provided the number of sampled coalitions $m$ exceeds the number of fitted coefficients $|\mathcal{F}_\ell|$. Paired sampling enables KernelSHAP ($\ell = 1$) to perform equivalently to quadratic PolySHAP ($\ell = 2$), but higher-order ($\ell \ge 3$) models yield further improvements when computational budgets allow. The recommended regime for moderate $n$ is $\ell = 2$ or $3$; for high-dimensional settings, employing a "partial frontier" (a random subset of the degree-$\ell$ monomials) is suggested. Sampling coalitions in proportion to their row leverage scores further reduces variance, with standard leverage-score guarantees ensuring accurate least-squares fits from near-linearly many samples in the frontier size. Total computational cost is dominated by the constrained regression over the $|\mathcal{F}_\ell|$ coefficients plus the Shapley extraction, with model evaluation typically the dominant factor in practice except when the frontier is extremely large.
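Leverage-score sampling is a standard technique from randomized numerical linear algebra rather than something specific to this paper; a minimal sketch of how row leverages of a design matrix can be computed and used for sampling, with all names illustrative:

```python
import numpy as np

def leverage_scores(X):
    """Row leverage scores via a thin QR factorization: the i-th score is
    the squared norm of row i of Q, i.e. the i-th diagonal entry of the
    hat matrix X (X'X)^{-1} X'. Scores sum to rank(X)."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q * Q, axis=1)

def leverage_sample(X, m, seed=0):
    """Sample m row indices with probability proportional to leverage."""
    p = leverage_scores(X)
    p = p / p.sum()
    idx = np.random.default_rng(seed).choice(len(X), size=m, replace=True, p=p)
    return idx, p
```

High-leverage rows (here, coalitions whose subset-indicator patterns are hard to reconstruct from the others) are sampled more often, which is what drives the variance reduction for the regression estimate.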
7. Summary and Scope
PolySHAP generalizes KernelSHAP by fitting interaction-informed low-degree polynomial models to the model-agnostic explanation game, enabling consistent recovery of Shapley values including non-linear interactions. The method is grounded in weighted least squares regression constrained to match the grand-coalition value, with Shapley value extraction via the Möbius–Shapley transformation. The use of paired sampling is formally justified as algebraically equivalent to including all second-order interactions. PolySHAP provides improved statistical accuracy and theoretical consistency while remaining computationally feasible for modest degree and dimension. Empirical guidelines favor degree $\ell = 2$ or $3$ and leverage-score or paired sampling strategies for practical deployment (Fumagalli et al., 26 Jan 2026).