Real-Time Explanations for Tabular Foundation Models

Published 31 Mar 2026 in cs.LG | (2603.29946v1)

Abstract: Interpretability is central for scientific machine learning, as understanding why models make predictions enables hypothesis generation and validation. While tabular foundation models show strong performance, existing explanation methods like SHAP are computationally expensive, limiting interactive exploration. We introduce ShapPFN, a foundation model that integrates Shapley value regression directly into its architecture, producing both predictions and explanations in a single forward pass. On standard benchmarks, ShapPFN achieves competitive performance while producing high-fidelity explanations ($R^2$=0.96, cosine=0.99) over 1000× faster than KernelSHAP (0.06s vs 610s). Our code is available at https://github.com/kunumi/ShapPFN


Authors (2)

Summary

The paper introduces ShapPFN, a model that integrates Shapley value regression to deliver real-time, high-fidelity feature explanations alongside predictive outputs.
It employs custom decoder heads and a Shapley consistency loss to enforce additive attributions, achieving near-equivalence with KernelSHAP in explanation quality.
Empirical results on OpenML-CC18 datasets show >1000× speedup in explanation time with only a minor predictive accuracy drop (≤0.011 AUC), making interactive analysis feasible.

Real-Time Explanations for Tabular Foundation Models: An Expert Analysis

Introduction

"Real-Time Explanations for Tabular Foundation Models" (2603.29946) addresses a critical challenge in interpretable machine learning for tabular data: producing fast, high-fidelity feature attributions within high-performing foundation models (FMs). The work introduces ShapPFN, a model class that produces predictions and feature attributions simultaneously by integrating explicit Shapley value regression into Prior-Data Fitted Networks (PFNs). The resulting architecture closes the gap between the generalization strength of PFNs and the explainability provided by model-agnostic methods like SHAP, making interactive, scientifically driven model interrogation feasible.

Motivation and Context

Interpretability remains a fundamental requirement in scientific ML, directly impacting hypothesis generation and causal inference. The theoretical appeal of Shapley-based attributions (e.g., additivity, fairness) has driven their widespread adoption. However, exact computation requires enumerating feature coalitions, whose number grows exponentially with the number of features, and this cost limits real-time or high-throughput workflows, especially when using post-hoc explainers such as KernelSHAP. Prior art, such as ViaSHAP, demonstrated that integrating prediction and attribution can dramatically accelerate explanations, but these gains had not been realized in PFN-style, data-generalizing models. ShapPFN is positioned as the first method to integrate Shapley value regression with tabular FMs, offering both predictive and explanatory outputs in a single, efficient forward pass.
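To make the cost argument concrete, here is a minimal sketch of exact Shapley computation by brute-force coalition enumeration. It is an illustration of the textbook definition, not code from the paper; `exact_shapley` and `toy_value` are hypothetical names, and the toy value function is invented for the example.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    features = list(range(n_features))
    phi = [0.0] * n_features
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n_features):
            # Shapley weight for coalitions of this size: |S|! (F - |S| - 1)! / F!
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(coalition):
    """Hypothetical value function: additive terms plus one pairwise interaction."""
    contrib = {0: 1.0, 1: 2.0, 2: -0.5}
    value = sum(contrib[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        value += 1.0  # interaction shared by features 0 and 1
    return value


phi = exact_shapley(toy_value, 3)
print(phi)
```

Each feature requires a pass over all subsets of the remaining features, so the number of value-function calls doubles with every added feature. This is the cost that sampling-based explainers approximate and that a single-pass model avoids.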

ShapPFN Architecture and Training

Architectural Integration

ShapPFN builds on the nanoTabPFN architecture—a lightweight variant of TabPFN—retaining the core Transformer blocks and in-context learning capabilities. The central architectural advance is the inclusion of two custom decoder heads:

BaseDecoder: Computes a global baseline, conceptually analogous to the prediction with all features masked.
ShapDecoder: Outputs per-feature additive contributions, enforcing an explicit decomposition:

$f_{\theta}(x) = \text{base} + \sum_{f=1}^F \phi_f(x)$

The additivity enables extraction of Shapley-like values directly from the network output, rather than via expensive post-hoc sampling.
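The decomposition can be sketched in a few lines. The example below assumes that the two heads emit per-class scores that are summed before a softmax; the summary does not state in which space the sum is taken, so this is an assumption, and `combine_heads` and the numeric values are hypothetical.

```python
import math


def combine_heads(base, phi):
    """Per-class score = base + sum over features of phi.

    base: list of C scores (stand-in for the BaseDecoder output)
    phi:  list of F rows, each with C scores (stand-in for the ShapDecoder output)
    """
    n_classes = len(base)
    return [base[c] + sum(row[c] for row in phi) for c in range(n_classes)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outputs for one instance with 3 features and 2 classes.
base = [0.2, -0.2]
phi = [[0.5, -0.5], [-0.1, 0.1], [0.3, -0.3]]

scores = combine_heads(base, phi)
probs = softmax(scores)
print(scores, probs)
```

Because the prediction is built from the attributions, reading the explanation is a matter of inspecting `phi` rather than running a separate explainer.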

Shapley Consistency Loss

The training objective synthesizes two losses:

Cross-entropy loss for predictive accuracy.
Shapley consistency loss enforcing that the sum of the decoded feature attributions over masked coalitions approximates the expected model output (with masked features marginalized by empirical sampling). This loss is kernel-weighted to reflect the Shapley value calculation.
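A minimal sketch of a kernel-weighted consistency term follows. It uses the standard KernelSHAP weighting and a squared-error penalty; the exact form used in the paper is not given in this summary, so the penalty, the normalization, and the function names are assumptions.

```python
from math import comb


def shapley_kernel_weight(n_features, coalition_size):
    """KernelSHAP weight for a coalition with 0 < size < n_features."""
    s, f = coalition_size, n_features
    if s <= 0 or s >= f:
        raise ValueError("weight is only finite for 0 < coalition_size < n_features")
    return (f - 1) / (comb(f, s) * s * (f - s))


def shapley_consistency_loss(base, phi, coalitions, targets):
    """Weighted squared error between base + sum_{f in S} phi_f and the target v(S).

    base:       scalar baseline
    phi:        list of F per-feature attributions
    coalitions: list of tuples of kept feature indices
    targets:    expected model output for each coalition
    """
    n_features = len(phi)
    total, weight_sum = 0.0, 0.0
    for kept, target in zip(coalitions, targets):
        w = shapley_kernel_weight(n_features, len(kept))
        pred = base + sum(phi[f] for f in kept)
        total += w * (pred - target) ** 2
        weight_sum += w
    return total / weight_sum


# Hypothetical attributions and coalition targets for a 3-feature instance.
base, phi = 0.1, [0.4, -0.2, 0.3]
coalitions = [(0,), (1, 2), (0, 2)]
consistent_targets = [0.5, 0.2, 0.8]    # agree with base + partial sums of phi
inconsistent_targets = [0.9, 0.2, 0.8]  # first target disagrees

loss_ok = shapley_consistency_loss(base, phi, coalitions, consistent_targets)
loss_bad = shapley_consistency_loss(base, phi, coalitions, inconsistent_targets)
print(loss_ok, loss_bad)
```

The term is near zero only when the partial sums of the attributions reproduce the masked-model outputs, which is what pushes the decoder outputs toward Shapley values.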

Masked feature input is generated via interventional sampling (i.e., features are replaced with random values drawn from the data), aligning with the causal interpretation of feature ablation ([pmlr-v108-janzing20a]).
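As a rough illustration of interventional masking, the sketch below replaces every feature outside a coalition with the value from a randomly chosen background row and averages the model output over several draws. Drawing all masked features from the same row is an assumption of this sketch; the function names and the stand-in `model` are hypothetical.

```python
import random


def interventional_mask(x, keep, background, rng):
    """Keep features in `keep`; take all other features from one random background row."""
    donor = rng.choice(background)
    return [x[j] if j in keep else donor[j] for j in range(len(x))]


def coalition_value(model, x, keep, background, n_samples, rng):
    """Monte Carlo estimate of the model output with features outside `keep` marginalized."""
    outputs = [
        model(interventional_mask(x, keep, background, rng))
        for _ in range(n_samples)
    ]
    return sum(outputs) / n_samples


rng = random.Random(0)
x = [1.0, 2.0, 3.0]
background = [[10.0, 20.0, 30.0], [11.0, 21.0, 31.0]]

masked = interventional_mask(x, {0, 2}, background, rng)
full = coalition_value(sum, x, {0, 1, 2}, background, n_samples=8, rng=rng)
partial = coalition_value(sum, x, {0}, background, n_samples=8, rng=rng)
print(masked, full, partial)
```

With every feature kept, the estimate reduces to the plain model output; with fewer features kept, it approaches the expectation over the background data.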

Hyperparameter optimization confirms that strong predictive and explanation performance is robust to the number of SHAP subsets and background samples, but the SHAP loss weight must be carefully balanced to avoid predictivity–attribution trade-offs.
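One way to write the role of the SHAP loss weight, assuming the two terms are combined linearly (the symbol $\lambda$ and the linear form are assumptions; the summary states only that such a weight exists):

$\mathcal{L}(\theta) = \mathcal{L}_{\text{CE}}(\theta) + \lambda \, \mathcal{L}_{\text{SHAP}}(\theta)$

Under this form, a large $\lambda$ prioritizes attribution consistency at the expense of the cross-entropy term, while a small $\lambda$ leaves the attributions weakly constrained.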

Experimental Results

Predictive Performance

Evaluation on OpenML-CC18 datasets shows that ShapPFN delivers competitive ROC-AUC (0.848 average across 36 datasets), on par with the classical Random Forest and the foundation model baseline nanoTabPFN. TabPFN v2 achieves the highest average (0.872), but adding the SHAP heads and Shapley constraints incurs only a minor average predictive cost (≤0.011 AUC drop relative to the base architecture). Results remain strong across both hyperparameter-optimized (HPO) and evaluation-only (Eval) splits.
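For readers less familiar with the metric, the following sketch computes binary ROC-AUC from its pairwise definition. How the paper aggregates AUC over multi-class datasets is not stated here, so the binary case is an assumption, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half.

    Quadratic in the number of samples, which is fine for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

On this scale, a gap of 0.011 corresponds to roughly one additional misordered pair per hundred positive-negative pairs.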

Explanation Fidelity and Efficiency

ShapPFN’s attributions are evaluated against KernelSHAP, the de facto model-agnostic standard for Shapley fidelity. On all datasets tested:

Explanation Quality: High agreement with KernelSHAP (mean $R^2 = 0.963$, cosine similarity = 0.987, Spearman ρ = 0.954) demonstrates near-equivalence in the generated attributions.
Computational Cost: ShapPFN explanations require only 0.06s per instance vs 610s for KernelSHAP (geometric mean), yielding >1000× speedup. On some datasets, speedup approaches 50,000×.
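The three agreement metrics can be sketched as follows for a single pair of attribution vectors. How the paper aggregates them (per instance, per dataset, or over flattened attributions) is not stated in this summary, and the numeric vectors below are hypothetical.

```python
import math


def r2_score(reference, estimate):
    """Coefficient of determination of `estimate` against `reference`."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def cosine_similarity(a, b):
    """Cosine of the angle between two attribution vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def _ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Pearson correlation of the ranks of a and b."""
    ra, rb = _ranks(a), _ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical attributions for one instance with four features.
kernel_shap = [0.40, -0.10, 0.25, 0.05]
single_pass = [0.38, -0.12, 0.27, 0.04]

r2 = r2_score(kernel_shap, single_pass)
cos = cosine_similarity(kernel_shap, single_pass)
rho = spearman_rho(kernel_shap, single_pass)
print(r2, cos, rho)
```

The metrics are complementary: $R^2$ penalizes errors in magnitude, cosine similarity measures direction only, and Spearman ρ checks whether the features are ranked in the same order.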

Critically, the ablated architecture (without SHAP loss) exhibits substantial degradation in explanation fidelity, underscoring the necessity of the loss for SHAP-consistent attributions.

Implications and Future Directions

Practical Implications

ShapPFN effectively transforms model feature attribution from an offline, high-latency diagnostic into an integrated, interactive tool for scientific analytics. This is especially salient for domains where researchers require real-time hypothesis testing, rapid ablation studies, or model debugging. The adoption potential is substantial in high-stakes scientific, medical, or policy contexts that demand both accuracy and transparent reasoning.

Theoretical and Architectural Significance

The design demonstrates that SHAP axioms can be enforced within a strong, pre-trained FM at only a small cost in predictive performance and without sacrificing scalability, provided the attribution regression signal is appropriately regularized and integrated. This raises prospects for further research integrating additional forms of explanation constraints (e.g., causal, counterfactual) into generalizing FMs.

Future Work

Extensions to multi-target regression, probabilistic outputs, and foundation models beyond the tabular domain are areas of direct interest. Investigations into learning more general forms of explanation (e.g., higher-order interactions or submodular attributions), adaptation to federated settings, and scaling to data that is orders of magnitude larger and higher-dimensional are also viable. Research into applying integrated explainability at even lower latency and in broader settings (streaming, embedded) is warranted.

Conclusion

ShapPFN substantiates that high-fidelity, real-time Shapley explanations for tabular foundation models are feasible by integrating Shapley value regression into the FM architecture and loss. The approach stays close to baseline predictive accuracy while achieving explanation quality and computational efficiency that render interactive scientific modeling practical. This development establishes a new paradigm for interpretable FMs in tabular domains, enabling highly explainable, data-efficient, and performant modeling suitable for scientific ML workflows.
