
Wasserstein-RIF: Robust Data Attribution

Updated 11 December 2025
  • Wasserstein-Robust Influence Functions (W-RIF) are robust extensions of classical influence functions that certify model sensitivity under worst-case distributional perturbations.
  • They employ optimal transport and Wasserstein metrics to compute certified intervals, providing formal coverage guarantees for leave-one-out and population influence.
  • In deep networks, a Natural Wasserstein metric yields tighter certificates and enables robust anomaly detection, overcoming the limitations of Euclidean certification.

Wasserstein-Robust Influence Functions (W-RIF) generalize classical influence functions to provide certified robustness under distributional shifts, using optimal transport metrics. W-RIF quantifies how training examples influence model predictions while accounting for worst-case perturbations measured in Wasserstein distance. For convex models the framework yields formal coverage guarantees, and for deep neural networks it provides new geometric tools for certified data attribution, overcoming severe limitations of Euclidean-based certification by introducing a Natural Wasserstein metric derived from feature covariance geometry (Li et al., 9 Dec 2025).

1. Classical Influence Functions and Their Limitations

Given a data-generating distribution $P$ over $\mathcal{Z}$ and a twice-differentiable loss $L(\theta;z)$, the empirical risk minimizer is

$$\hat{\theta} = \arg\min_{\theta\in\mathbb{R}^p} \;\mathbb{E}_{z\sim P_n}[L(\theta;z)]$$

for the empirical distribution $P_n = \frac{1}{n}\sum_{i=1}^n \delta_{z_i}$. The classical influence of training point $z_i$ on the test loss at $z_{\text{test}}$ is

$$\mathcal{I}(z_i, z_{\text{test}}) = -\nabla_\theta L(\hat{\theta}; z_{\text{test}})^\top H^{-1} \nabla_\theta L(\hat{\theta}; z_i)$$

where $H = \mathbb{E}_{P_n}[\nabla^2_\theta L(\hat{\theta};z)]$ is assumed positive definite. This formula quantifies the first-order impact of up-weighting or removing $z_i$ but lacks robustness to distributional perturbations and fails to provide certified intervals for influence under data shifts (Li et al., 9 Dec 2025).
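
As a reference point, the following minimal sketch evaluates the classical formula for a small $\ell_2$-regularized logistic regression, a convex case where $H$ is positive definite. The data, regularization strength, and optimizer are illustrative assumptions, not details from the paper.

```python
# Classical influence function for L2-regularized logistic regression (sketch).
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 200, 5, 1e-2
X = rng.normal(size=(n, p))
y = (X @ rng.normal(size=p) + 0.3 * rng.normal(size=n) > 0).astype(float)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Fit theta_hat by plain gradient descent on the regularized empirical risk.
theta = np.zeros(p)
for _ in range(2000):
    grad = X.T @ (sigmoid(X @ theta) - y) / n + lam * theta
    theta -= 0.5 * grad

def grad_loss(theta, x, t):
    """Per-example gradient of the regularized logistic loss."""
    return (sigmoid(x @ theta) - t) * x + lam * theta

# Hessian of the empirical risk at theta_hat (positive definite here).
s = sigmoid(X @ theta)
H = (X * (s * (1 - s))[:, None]).T @ X / n + lam * np.eye(p)

# Classical influence: I(z_i, z_test) = -g(z_test)^T H^{-1} g(z_i).
x_test, y_test = X[0], y[0]   # stand-in test point, for illustration only
i = 17
influence = -grad_loss(theta, x_test, y_test) @ np.linalg.solve(
    H, grad_loss(theta, X[i], y[i])
)
print(f"classical influence of z_{i} on the test point: {influence:.4f}")
```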

2. Wasserstein Uncertainty Sets and Rationale

The $p$-Wasserstein distance between distributions $P$ and $Q$ on $\mathcal{Z}$ is

$$W_p(P,Q) = \inf_{\pi\in\Pi(P,Q)} \left(\mathbb{E}_{(z,z')\sim\pi}\left[\|z-z'\|^p\right]\right)^{1/p}$$

The corresponding Wasserstein ball $\mathcal{B}_p(P,\epsilon) = \{Q : W_p(Q,P) \le \epsilon\}$ defines an adversarial uncertainty set for robust analysis. Wasserstein metrics are preferred over Euclidean parameter perturbations because they quantify distributional (mass transport) shifts, accommodate support changes such as outlier addition/removal, and allow tractable duality-based reformulations via Kantorovich–Rubinstein duality. This construction provides a natural notion of uncertainty for robust influence function analysis (Li et al., 9 Dec 2025).
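
To make the uncertainty set concrete, the short sketch below checks membership in a ball $\mathcal{B}_1(P_n,\epsilon)$ for two perturbations of an empirical sample, using SciPy's one-dimensional $W_1$ distance. The samples and the radius $\epsilon$ are illustrative assumptions, not values from the paper.

```python
# Membership check in a 1-Wasserstein ball around an empirical distribution (sketch).
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=1.0, size=500)      # samples defining P_n
z_shift = z + 0.05                                # a small mass-transport shift
z_outlier = np.concatenate([z[:-1], [8.0]])       # same size, one outlier swapped in

eps = 0.1                                         # radius of the ball B_1(P_n, eps)
for name, q in [("shifted", z_shift), ("outlier", z_outlier)]:
    w1 = wasserstein_distance(z, q)
    print(f"{name}: W1 = {w1:.4f}, inside ball: {w1 <= eps}")
```

A uniform shift moves every unit of mass by the shift size, while swapping in a single outlier changes the support yet costs only a small amount of transport, which is why Wasserstein balls capture both kinds of perturbation.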

3. Definition and Computation of W-RIF

For any $x\in\mathcal{Z}$, the W-RIF at radius $\epsilon$ is

$$\mathrm{W\text{-}RIF}(x; \theta, P, \epsilon) = \sup_{Q\in\mathcal{B}_p(P, \epsilon)} \left| \mathbb{E}_{z\sim Q} \left[\nabla_\theta L(\theta;z)^\top \nabla_\theta L(\theta; x)\right] \right|$$

Substituting the empirical estimators $\theta = \hat{\theta}$ and $P = P_n$, a first-order expansion yields

$$\mathcal{I}_Q = \mathcal{I}_{P_n} + \int_{\mathcal{Z}} S(z)\,[dQ(z) - dP_n(z)] + O(\|Q - P_n\|^2)$$

where the complete sensitivity kernel $S(z)$ is the sum of $S_H(z)$ and $S_g(z)$, both involving Hessian and gradient terms. If $S$ is $L_S$-Lipschitz with respect to the input norm, the dual form implies

$$\sup_{Q:\,W_1(Q,P_n)\le \epsilon} \int S(z)\,(dQ - dP_n) = \epsilon L_S$$

leading to the closed-form certified interval

$$\mathcal{I}(z_i, z_{\text{test}}) \pm \epsilon L_S + O(\epsilon^2)$$

This interval is guaranteed to contain the leave-one-out or true population-level influence as detailed below (Li et al., 9 Dec 2025).
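
The following sketch numerically illustrates the duality step behind the certificate: for an $L_S$-Lipschitz kernel, shifting every sample by at most $\delta$ keeps $W_1(Q,P_n)\le\delta$, so the change in the expectation of the kernel is bounded by $\delta L_S$. The scalar kernel used here is a stand-in for the paper's sensitivity kernel, which is an assumption of the sketch.

```python
# Numerical check of the Kantorovich-Rubinstein style bound |E_Q[S] - E_Pn[S]| <= eps * L_S.
import numpy as np

rng = np.random.default_rng(1)
S = np.tanh          # a 1-Lipschitz scalar kernel standing in for S(z)
L_S = 1.0

z = rng.normal(size=1000)   # samples defining P_n
delta = 0.05                # move every point by delta, hence W1(Q, P_n) <= delta
q = z + delta               # a worst-case-style uniform transport

gap = abs(S(q).mean() - S(z).mean())
print(f"|E_Q[S] - E_Pn[S]| = {gap:.4f} <= eps * L_S = {delta * L_S:.4f}: {gap <= delta * L_S}")
```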

4. Provable Certification and Coverage Guarantees

Leave-one-out influence, obtained by removing $z_i$ from $P_n$, is a specific distributional perturbation: $W_1(P_n, P_{n,-i}) \le \frac{\operatorname{diam}(\mathcal{Z})}{n}$. Setting $\epsilon \ge \operatorname{diam}(\mathcal{Z})/n$, the W-RIF interval around $P_n$ certifies the true leave-one-out influence. For population-level guarantees, one selects

$$\epsilon_n = O(n^{-1/2})$$

so that, with probability at least $1-\delta$, $W_p(P_n, P) \le \epsilon_n$ and the population influence is covered:

$$\Pr\left[ \mathcal{I}_{P}(x) \in \mathcal{I}_{P_n}(x) \pm \epsilon_n L_S \right] \ge 1-\delta$$

This provides formal certification for robust attribution and data influence quantification in convex models (Li et al., 9 Dec 2025).
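
As a worked illustration of these radius choices (all numbers are assumptions, not values from the paper), the snippet below compares the leave-one-out radius $\operatorname{diam}(\mathcal{Z})/n$ with a population-level radius $\epsilon_n = c/\sqrt{n}$ and the resulting certificate half-widths $\epsilon L_S$.

```python
# Illustrative arithmetic for the two radius regimes of Section 4.
import numpy as np

n = 50_000        # training-set size (assumed)
diam_Z = 10.0     # diameter of the instance space Z (assumed)
L_S = 3.0         # Lipschitz constant of the sensitivity kernel (assumed)
c = 1.5           # concentration constant hidden in O(n^{-1/2}) (assumed)

eps_loo = diam_Z / n         # covers removal of any single z_i
eps_pop = c / np.sqrt(n)     # covers the population P with probability >= 1 - delta
print(f"LOO radius {eps_loo:.2e} -> certificate half-width {eps_loo * L_S:.2e}")
print(f"population radius {eps_pop:.2e} -> certificate half-width {eps_pop * L_S:.2e}")
```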

5. Computational Strategies

Exact estimation of $L_S$ by pairwise slope calculation has computational complexity $O(n^2 p^2 + p^3)$. Practical approximations include:

  • Estimating $\|\nabla_z S(z)\|$ at a random subset of points, as in the sketch after this list.
  • Solving the dual optimization

$$\max_{\phi:\,\mathrm{Lip}(\phi)\le 1}\;\frac{1}{n}\sum_{i=1}^n S(z_i)\,\phi(z_i)$$

using projected gradient methods, which scale linearly in $n$. All other computations (Hessian inversion, gradient evaluation) follow standard influence-function workflows (Li et al., 9 Dec 2025).
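
A minimal sketch of the first strategy, assuming a generic callable kernel in place of the paper's Hessian-and-gradient sensitivity kernel: it probes $\|\nabla_z S(z)\|$ by finite differences at a random subset of points and returns the largest norm found, which is a lower estimate of $L_S$.

```python
# Subsampled finite-difference estimate of the Lipschitz constant of S (sketch).
import numpy as np

def estimate_lipschitz(S, Z, n_probe=64, h=1e-4, seed=0):
    """Crude lower estimate of sup_z ||grad_z S(z)|| via central differences."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(Z), size=min(n_probe, len(Z)), replace=False)
    best = 0.0
    for i in idx:
        z = Z[i].astype(float)
        g = np.zeros_like(z)
        for d in range(z.size):               # coordinate-wise central differences
            e = np.zeros_like(z)
            e[d] = h
            g[d] = (S(z + e) - S(z - e)) / (2 * h)
        best = max(best, np.linalg.norm(g))
    return best

# Toy kernel standing in for S(z); replace with the actual sensitivity kernel.
Z = np.random.default_rng(0).normal(size=(1000, 5))
S_toy = lambda z: np.sin(z).sum()
print("estimated L_S (lower bound):", estimate_lipschitz(S_toy, Z))
```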

6. Extensions to Deep Networks and the Spectral Amplification Barrier

In non-convex deep networks, the parameter solution map $\hat{\theta}(Q)$ may change discontinuously under data perturbations, rendering classical W-RIF constructions invalid. TRAK (a linearized attribution method at the fixed network) is formulated as

$$\mathrm{TRAK}(z_{\text{test}}, z_i) = \phi(z_{\text{test}})^\top Q^{-1} \phi(z_i),\qquad Q = \frac{1}{n}\sum_j \phi(z_j)\phi(z_j)^\top + \lambda I$$

where $\phi(z) = \nabla_\theta f_{\hat{\theta}}(z)$. However, naive use of Euclidean $W_2$-balls in feature space results in vacuous certificates, because the relevant Lipschitz constant scales inversely with the smallest eigenvalue of $Q$, and deep representations typically exhibit ill-conditioning with condition numbers of $10^4$–$10^6$. Empirically, such Euclidean certificates cover $0\%$ of ranking pairs (Li et al., 9 Dec 2025).
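
A minimal sketch of the TRAK-style score and of the spectral amplification problem, using random anisotropic features as a stand-in for $\phi(z)$; the dimensions, spectrum, and regularizer $\lambda$ are illustrative assumptions.

```python
# TRAK-style linearized attribution score with ill-conditioned features (sketch).
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 2000, 256, 1e-6
# Anisotropic features mimic the ill-conditioned covariance of deep representations.
scales = np.logspace(0, -3, p)              # eigen-spectrum spanning ~3 decades
Phi = rng.normal(size=(n, p)) * scales

Q = Phi.T @ Phi / n + lam * np.eye(p)
print("condition number of Q: %.2e" % np.linalg.cond(Q))

def trak_score(phi_test, phi_i, Q):
    """TRAK(z_test, z_i) = phi(z_test)^T Q^{-1} phi(z_i)."""
    return phi_test @ np.linalg.solve(Q, phi_i)

print("TRAK score for an example pair:", trak_score(Phi[0], Phi[1], Q))
```

The huge condition number is exactly what makes Euclidean feature-space certificates vacuous: the Lipschitz constant scales with $1/\lambda_{\min}(Q)$.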

7. Natural Wasserstein Metric and Robust Neural Attribution

To address spectral amplification, the Natural Wasserstein metric is defined as

$$d_{\mathrm{Nat}}(z, z') = \|\phi(z) - \phi(z')\|_{Q^{-1}} = \sqrt{(\phi(z) - \phi(z'))^\top Q^{-1} (\phi(z) - \phi(z'))}$$

In this induced geometry, the Lipschitz constant of the attribution score is governed by the data-dependent "Self-Influence" score

$$\mathrm{SI}(z) = \phi(z)^\top Q^{-1} \phi(z) = \|Q^{-1/2}\phi(z)\|_2^2$$

The critical bound is

$$L_{\mathrm{Nat}} \le 2\,\sqrt{\mathrm{SI}(z_{\text{test}})}\,\sqrt{\mathrm{SI}(z_i)}\,\max_{j}\sqrt{\mathrm{SI}(z_j)}$$
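
A minimal sketch, under the same illustrative random-feature assumptions as the previous sketch, of computing Self-Influence scores and the resulting bound on $L_{\mathrm{Nat}}$; the last line ranks training points by $\mathrm{SI}$, the leverage-score use discussed below for anomaly detection.

```python
# Self-Influence scores SI(z) = phi(z)^T Q^{-1} phi(z) and the L_Nat bound (sketch).
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 2000, 256, 1e-6
scales = np.logspace(0, -3, p)
Phi = rng.normal(size=(n, p)) * scales
Q = Phi.T @ Phi / n + lam * np.eye(p)

# Self-Influence for every training point: SI(z_j) = phi_j^T Q^{-1} phi_j.
SI = np.einsum("ij,ij->i", Phi, np.linalg.solve(Q, Phi.T).T)

i, phi_test = 17, Phi[0]                    # illustrative pair (z_test, z_i)
SI_test = phi_test @ np.linalg.solve(Q, phi_test)
L_nat_bound = 2 * np.sqrt(SI_test) * np.sqrt(SI[i]) * np.sqrt(SI.max())
print("bound on L_Nat:", L_nat_bound)

# Leverage-score style ranking: high-SI points are candidate anomalies.
print("indices of the 5 highest-SI training points:", np.argsort(-SI)[:5])
```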

Empirically, this yields certified intervals that are $10$–$100\times$ tighter than Euclidean baselines. On CIFAR-10 with ResNet-18, Natural W-TRAK certificates cover $68.7\%$ of ranking pairs, in contrast to $0\%$ for Euclidean approaches (Li et al., 9 Dec 2025).

Furthermore, Self-Influence not only certifies attribution robustness but also provides a mathematically grounded leverage score for anomaly detection, achieving an AUROC of $0.970$ for label-noise detection and identifying $94.1\%$ of corrupted labels when considering the top $20\%$ of the training data (Li et al., 9 Dec 2025).


Wasserstein-Robust Influence Functions unify classical and modern approaches to data attribution by extending influence function analysis to robust, distributional settings. For convex models, W-RIF yields $O(\epsilon^2)$-accurate certified intervals with provable guarantees for leave-one-out and population influence. For deep networks, robust certification is only achievable by linearizing at the feature level and measuring perturbations in the Natural Wasserstein metric, thereby circumventing the issue of spectral amplification. The theory provides a formal basis for certified data valuation, debugging, unlearning, and robust anomaly detection in high-dimensional, non-convex machine learning models (Li et al., 9 Dec 2025).
