Proof of Efficient Attribution (PoEA)
- Proof of Efficient Attribution (PoEA) is a cryptographic and algorithmic paradigm that ensures efficient, verifiable, and ε-optimal attribution in machine learning.
- It employs interactive PAC protocols and sublinear verifier retrainings to robustly validate computed attributions even in resource-constrained environments.
- Applications include model interpretability, data pricing, fairness, and secure distributed inference, offering scalable and trustworthy ML verification solutions.
Proof of Efficient Attribution (PoEA) is a rigorously defined, algorithmic, and cryptographic paradigm designed to ensure that feature or data attributions in machine learning—especially those derived from computationally intensive methods—are not only computed correctly but can also be checked efficiently and verifiably by resource-constrained parties. PoEA protocols formalize efficiency, completeness, and soundness guarantees in data/model attribution, offering scalable and trustworthy solutions across a broad spectrum of applications, including model interpretability, data pricing, fairness, and distributed AI infrastructure verification.
1. Formal Problem Statement and Theoretical Guarantees
PoEA protocols address the challenge of verifying that an attribution vector—such as a data-influence score, Shapley value, or linear predictor fitted to counterfactual model outputs—is $\varepsilon$-close (in mean squared error or a task-relevant metric) to the optimal attribution, with failure probability at most $\delta$, while requiring only sublinear computational effort by the verifier.
Given:
- A training set $S$ of $n$ examples and an associated function $f : \{0,1\}^n \to \mathbb{R}$ representing a model statistic (e.g., a logit or a test-error differential) that can be evaluated via retraining on the subset of $S$ indicated by a mask $x \in \{0,1\}^n$.
- The objective is to verify a claimed linear datamodel $\hat{\theta} \in \mathbb{R}^n$ without recomputing expensive influence calculations.
Let $\theta^*$ be the optimal attribution vector minimizing the $p$-biased MSE:
$$\theta^* \in \arg\min_{\theta \in \mathbb{R}^n} \; \mathbb{E}_{x \sim \mu_p}\big[(f(x) - \theta^\top x)^2\big].$$
For any candidate $\hat{\theta}$ returned by an untrusted party, PoEA ensures that, with probability at least $1 - \delta$, the sub-optimality error
$$\mathbb{E}_{x \sim \mu_p}\big[(f(x) - \hat{\theta}^\top x)^2\big] - \mathbb{E}_{x \sim \mu_p}\big[(f(x) - \theta^{*\top} x)^2\big]$$
is at most $\varepsilon$ after a number of verifier retrainings that is independent of $n$ (Karchmer et al., 14 Aug 2025).
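The objective above can be estimated directly by Monte Carlo sampling over the Boolean hypercube whenever a retraining oracle is available. The sketch below is illustrative only (not the protocol of Karchmer et al.); the function name, the retraining callable `f`, and all parameters are assumptions of this example. Each oracle call stands in for one retraining.

```python
import numpy as np

def pbiased_mse(theta_hat, f, n, p=0.5, num_samples=200, rng=None):
    """Monte Carlo estimate of the p-biased MSE of a candidate linear
    datamodel `theta_hat` against a retraining statistic `f`, where `f`
    maps a 0/1 subset mask to a scalar (e.g., a test logit of a model
    retrained on that subset). Illustrative sketch only."""
    rng = np.random.default_rng(rng)
    errors = []
    for _ in range(num_samples):
        # Draw a subset mask x ~ mu_p over the Boolean hypercube {0,1}^n.
        x = (rng.random(n) < p).astype(float)
        # Each evaluation of f corresponds to one (expensive) retraining.
        errors.append((f(x) - theta_hat @ x) ** 2)
    return float(np.mean(errors))
```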
2. Interactive PoEA Protocols and PAC Verification
PoEA instantiates a minimally interactive, resource-efficient protocol between a computationally unbounded Prover (P) and a resource-limited Verifier (V), formalizing the verification task in the interactive PAC framework.
Protocol Structure:
- Phase 1: Verifier prepares two challenge sample sets—one for residual estimation and one for mean-squared-error evaluation—by sampling from the Boolean hypercube, using random seeds to fix the retrainings.
- Phase 2: Prover computes the attribution vector $\hat{\theta}$ (using, e.g., empirical influence estimation or datamodeling), trains the models corresponding to the challenge subsets, and returns $\hat{\theta}$ together with the resulting evaluations of $f$.
- Phase 3 (Verifier):
- Spot-checks a random subset of the challenge points: retrains the corresponding models locally to detect Prover deviation.
- Runs a robust degree-2 residual estimator (e.g., Saunshi–Goldwasser 2022) to estimate the error of the best linear fit.
- Computes the mean squared error of $\hat{\theta}$ on the second challenge set using local retrainings and outputs accept if the empirical error is within $\varepsilon$ of the residual estimate; otherwise it aborts.
Guarantees:
- Completeness: An honest Prover is accepted with probability at least $1-\delta$ whenever its attributions are $\varepsilon$-optimal.
- Soundness: A malicious Prover is accepted with probability at most $\delta$ if the sub-optimality exceeds $\varepsilon$.
- Verifier Complexity: a number of local retrainings independent of $n$, plus negligible additional computation (Karchmer et al., 14 Aug 2025).
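To make the Verifier's role concrete, the following sketch combines the spot-check and MSE comparison of Phase 3 under simplifying assumptions (exact agreement up to a tolerance, a caller-supplied residual estimate, and a retraining oracle `f`). It is a hedged illustration rather than the actual protocol of Karchmer et al.

```python
import numpy as np

def verify_attribution(theta_hat, prover_vals, f, spot_masks, mse_masks,
                       residual_estimate, eps, tol=1e-9):
    """Illustrative verifier-side checks (Phase 3). Assumes:
    - prover_vals: dict mapping mask tuples to the Prover's reported f values,
    - f: the Verifier's own retraining oracle (each call = one retraining),
    - spot_masks: masks selected for spot-checking,
    - mse_masks: masks reserved for the mean-squared-error evaluation,
    - residual_estimate: estimate of the best linear fit's error.
    Returns True (accept) or False (abort)."""
    # Spot-check: locally retrain a few masks and compare against the
    # Prover's reported values to detect deviations.
    for x in spot_masks:
        if abs(prover_vals[tuple(x)] - f(x)) > tol:
            return False  # the Prover misreported a retraining result
    # MSE check: evaluate theta_hat on locally retrained masks and accept
    # only if its error is within eps of the estimated optimal residual.
    mse = np.mean([(f(x) - theta_hat @ np.asarray(x)) ** 2 for x in mse_masks])
    return bool(mse <= residual_estimate + eps)
```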
3. Algorithmic Generalizations and Applications
PoEA’s mathematical foundation—linear verification of functions over the Boolean hypercube—broadly encompasses multiple attribution frameworks:
- Any attribution method yielding a linear model (empirical influence, Shapley approximation, representer points) is directly verifiable.
- The core requirement is the ability to spot-check functional evaluations and execute residual estimation over degree-2 Fourier coefficients.
Notable algorithmic instantiations include:
- Least-squares Shapley attribution admits acceleration via block QR decompositions, providing unbiased Monte Carlo estimates of feature attributions for least-squares models with a polynomial speedup over classical Shapley evaluation (Bell et al., 2023).
- Axiomatic multiterm attribution grounded in Shapley–Aumann–Shapley–Shubik theory ensures uniqueness and efficient computation for multilinear functions and permits explicit allocation in economic and network flows (Sun et al., 2011).
Table: Core Attributes of PoEA Protocols
| Guarantee | PoEA Protocols (Karchmer et al., 14 Aug 2025) | LS-Shapley (Bell et al., 2023) | ASS Axiomatic (Sun et al., 2011) |
|---|---|---|---|
| Soundness | PAC | Unbiased Monte Carlo | Unique under five axioms |
| Verifier Complexity | Sublinear retrainings (independent of $n$) | QR/block least-squares solves | Dynamic programming for multilinear functions |
| Class of Attributable Functions | Linear models over the Boolean hypercube | Shapley values for least-squares models | Multilinear and additive functions |
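For the least-squares instantiation, a plain permutation-sampling Monte Carlo estimator conveys the basic idea. The block-QR acceleration of Bell et al. instead reuses matrix factorizations across subsets, which this sketch does not attempt to reproduce; the value function here (training $R^2$ of a subset regression) is an illustrative choice.

```python
import numpy as np

def shapley_ls_mc(X, y, num_permutations=200, rng=None):
    """Permutation-sampling Monte Carlo estimate of Shapley values for
    least-squares feature attribution. The value of a feature subset S is
    the training R^2 of an ordinary least-squares fit restricted to S
    (an illustrative choice of value function)."""
    rng = np.random.default_rng(rng)
    Xc = X - X.mean(axis=0)          # center so R^2 is well defined
    yc = y - y.mean()
    n, d = Xc.shape
    tss = float(np.sum(yc ** 2))

    def value(subset):
        if not subset:
            return 0.0
        Xs = Xc[:, sorted(subset)]
        beta, *_ = np.linalg.lstsq(Xs, yc, rcond=None)
        rss = float(np.sum((yc - Xs @ beta) ** 2))
        return 1.0 - rss / tss       # R^2 of the subset regression

    phi = np.zeros(d)
    for _ in range(num_permutations):
        order = rng.permutation(d)
        chosen, prev = set(), 0.0
        for j in order:
            chosen.add(j)
            cur = value(chosen)
            phi[j] += cur - prev     # marginal contribution of feature j
            prev = cur
    return phi / num_permutations
```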
4. Efficient Model Explanation and Submodular Black-box Attribution
Recent advances extend PoEA principles to black-box and submodular contexts. The LiMA framework defines attribution as a submodular maximization problem, optimizing a structured set function over input regions. Key structural properties—diminishing returns and monotonicity—enable bidirectional greedy algorithms with provable $(1 - 1/e - \epsilon)$-approximation guarantees (Chen et al., 1 Apr 2025). This sharply contrasts with brute-force enumeration and is validated empirically on high-dimensional models.
Faithfulness here is established not by explicit MSE bounds, but via submodular proxy metrics (consistency, collaboration, confidence, and diversity) shown to capture the core attributional signal, with empirical Insertion/Deletion AUC improvements of 30–60% over prior state-of-the-art methods.
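The submodular structure is what makes greedy selection effective. The sketch below shows the standard greedy algorithm for a monotone submodular region-scoring function under a cardinality budget, which attains the classical $(1-1/e)$ guarantee; LiMA's bidirectional variant and its specific scoring terms are not reproduced here, and `regions`, `score`, and `k` are assumptions of this example.

```python
def greedy_region_attribution(regions, score, k):
    """Standard greedy maximization of a set function over candidate input
    regions under a cardinality budget k. If `score` is monotone and
    submodular, greedy selection attains the classical (1 - 1/e)
    approximation to the best k-subset. `regions`, `score`, and `k` are
    assumptions of this sketch (not LiMA's bidirectional variant)."""
    selected, remaining = [], list(regions)
    current = score(selected)
    for _ in range(min(k, len(remaining))):
        # Pick the region with the largest marginal gain in score.
        best_gain, best_region = max(
            ((score(selected + [r]) - current, r) for r in remaining),
            key=lambda pair: pair[0],
        )
        if best_gain <= 0:
            break                     # diminishing returns exhausted
        selected.append(best_region)
        remaining.remove(best_region)
        current += best_gain
    return selected
```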
5. Cryptographic PoEA: Verifiable Attribution in Large-Scale Distributed Inference
PoEA as a cryptographically binding consensus primitive emerges in the context of secure, decentralized inference protocols. As instantiated in Optimistic TEE-Rollups (OTR), PoEA binds each model output to a hardware attestation (e.g., an NVIDIA H100 TEE DCAP quote together with an MRENCLAVE identity), ensuring not only attribution integrity but also model authenticity on-chain (Chan et al., 23 Dec 2025).
- Protocol Overview: After enclave execution, the sequencer publishes a tuple containing the hashed input/output, a cryptographic attestation over those hashes, and a unique enclave measurement.
- Verification: On-chain verification checks (1) validity of attestation under the manufacturer root and (2) that the enclave measurement matches the registered model identity.
- Probabilistic Security: Occasional random zero-knowledge spot-checks ensure that adversaries cannot forge model outputs or downgrade models without an overwhelming probability of detection. The expected cost overhead remains marginal (on the order of \$0.07).
6. Axiomatic Attribution in Combinatorial Settings
For multilinear value functions, $O(N n^2)$ dynamic programming algorithms enable direct computation of the ASS value, permitting practical deployment of PoEA even in combinatorial settings such as advertising auction spend breakdowns, portfolio analysis, and e-commerce funnel attribution.
These results delineate the frontier where efficient, uniquely fair attribution is possible, with necessity and sufficiency grounded in the function class.
7. Practical Impact and Faithfulness Error Bounds
Empirical validation of PoEA instantiations in deep learning and econometrics aligns theoretical guarantees with observed efficiency:
- MFABA demonstrates over 100× speedups in attribution computation.
- Faithfulness error bounds may be statistical ($\varepsilon$-sub-optimality with high probability), submodular ($(1 - 1/e - \epsilon)$ approximation) (Chen et al., 1 Apr 2025), or cryptographic (attestation validity, ZK spot-check sample probability) (Chan et al., 23 Dec 2025).
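As a minimal illustration of the cryptographic flavor of these bounds, the sketch below performs the two on-chain checks described in Section 5—attestation validity over the hashed input/output and an enclave-measurement match—assuming a caller-supplied `verify_quote` predicate. Actual DCAP quote parsing and the OTR contract logic are not reproduced.

```python
import hashlib

def check_poea_attestation(input_bytes, output_bytes, quote,
                           enclave_measurement, registered_measurement,
                           verify_quote):
    """Hedged sketch of the two on-chain checks. Assumes the quote commits
    to report data of the form hash(input) || hash(output), and that
    `verify_quote(quote, report_data) -> bool` validates the quote against
    the hardware manufacturer's attestation root (both supplied by the
    caller; real DCAP parsing and OTR contract logic are not reproduced)."""
    report_data = (hashlib.sha256(input_bytes).digest()
                   + hashlib.sha256(output_bytes).digest())
    # (1) The attestation must be valid and commit to this exact I/O pair.
    if not verify_quote(quote, report_data):
        return False
    # (2) The enclave measurement must match the model identity registered
    #     on-chain, preventing silent model substitution or downgrade.
    return enclave_measurement == registered_measurement
```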
PoEA thus provides the methodological and practical infrastructure necessary for scalable, trustworthy, and fair deployment of attribution in modern machine learning and distributed inference systems.