
Rescaled Influence Functions (RIF)

Updated 25 November 2025
  • Rescaled Influence Functions (RIF) are advanced techniques that modify classical influence functions to improve effect estimation in high-dimensional settings.
  • They enhance leave-one-out approximations in machine learning and econometrics by using leverage scores and sensitivity curves for more accurate parameter adjustments.
  • Applications include data poisoning detection, machine unlearning, and robust marginal effect estimation, offering improved computational efficiency and precision.

Rescaled Influence Functions (RIF) constitute a class of techniques for data attribution and robust statistical inference that modify classical influence functions to address their empirical and theoretical shortcomings in high-dimensional settings and for complex statistical functionals. The two major contexts for RIF methodology are (1) machine learning model parameter sensitivity in the overparameterized regime and (2) econometric analysis of functional outcomes beyond means, such as quantiles and indices of inequality. RIFs improve the fidelity with which the effect of individual observations, or groups of observations, is estimated in both analytic and empirical tasks.

1. Foundations of Influence Functions and their Rescaling

Classical influence functions (IF) quantify the infinitesimal effect of perturbing a dataset by introducing an outlier or removing an observation. For an estimator $\hat{\theta}$ obtained by minimizing the empirical loss $L(\theta) = \sum_{i=1}^n \ell(x_i, y_i; \theta)$, the IF for observation $z_i$ is

$$\mathrm{IF}_i = -H^{-1} g_i,$$

where $H = \nabla^2 L(\hat{\theta})$ is the empirical Hessian and $g_i = \nabla \ell(x_i, y_i; \hat{\theta})$ the gradient for $z_i$. This first-order Taylor approximation is accurate in low-dimensional, regularized settings but degrades substantially in the high-dimensional regime (number of parameters $d \gtrsim n$), where curvature becomes sample-sensitive and IFs systematically underestimate removal effects (Rubinstein et al., 7 Jun 2025). For statistical functionals $v(F)$, such as quantiles or inequality indices, the classical IF is defined as the Gâteaux derivative in the direction of a point mass,

$$\mathrm{IF}(y; v, F) = \lim_{\epsilon \to 0} \frac{v((1-\epsilon)F + \epsilon \delta_y) - v(F)}{\epsilon},$$

with $F$ the distribution function of $Y$ (Alejo et al., 2021).
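
Two standard textbook examples make this definition concrete (these are classical results, not specific to the cited papers): for the mean $\mu(F) = \mathbb{E}_F[Y]$ and for the $\tau$-quantile $q_\tau(F)$ with density $f_Y$,

$$\mathrm{IF}(y; \mu, F) = y - \mu(F), \qquad \mathrm{IF}(y; q_\tau, F) = \frac{\tau - \mathbf{1}\{y \le q_\tau\}}{f_Y(q_\tau)}.$$

The quantile case already hints at why empirical approximations are attractive in practice: even the analytic IF requires a density estimate at $q_\tau$.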

2. Mathematical Formulation and Variants of RIF

2.1 Machine Learning Data Attribution

Rescaled Influence Functions enhance data attribution in high-dimensional models, notably generalized linear models (GLMs). The RIF approximates the effect of removing a set of observations $T$ by:

$$\widehat{\theta}_{\mathrm{RIF}, T} = \hat{\theta} + \sum_{i \in T} \mathrm{RIF}_i, \qquad \mathrm{RIF}_i := H_{[n] \setminus \{i\}}^{-1} g_i,$$

with $H_{[n] \setminus \{i\}} = \sum_{j \neq i} H_j$ and $H_j = \nabla^2 \ell(x_j, y_j; \hat{\theta})$. For GLMs (rank-one $H_i$), the Sherman–Morrison formula yields the closed form:

$$\mathrm{RIF}_i = \frac{\mathrm{IF}_i}{1 - h_i},$$

where $h_i = \mathrm{tr}(H^{-1} H_i)$ is the "leverage score". For logistic regression, $h_i = \sigma_i (1 - \sigma_i)\, x_i^\top H^{-1} x_i$ with $\sigma_i = \sigma(\hat{\theta}^\top x_i)$ (Rubinstein et al., 7 Jun 2025). The scaling term $(1 - h_i)^{-1}$ corrects underestimation: for small $h_i$, RIF $\approx$ IF; for large $h_i$, the correction can be considerable.
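
The origin of the $(1 - h_i)^{-1}$ factor can be seen from a short Sherman–Morrison computation; the following is a generic derivation for a GLM where $H_i = c_i x_i x_i^\top$ and $g_i$ is proportional to $x_i$, not a step quoted from the cited paper:

$$H_{[n] \setminus \{i\}}^{-1} g_i = (H - c_i x_i x_i^\top)^{-1} g_i = \left( H^{-1} + \frac{c_i H^{-1} x_i x_i^\top H^{-1}}{1 - c_i x_i^\top H^{-1} x_i} \right) g_i = H^{-1} g_i \left( 1 + \frac{h_i}{1 - h_i} \right) = \frac{H^{-1} g_i}{1 - h_i},$$

with $h_i = c_i x_i^\top H^{-1} x_i$. Dropping observation $i$ from the Hessian therefore rescales the full-sample expression by exactly $(1 - h_i)^{-1}$.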

2.2 Econometric and Statistical Applications

In statistical inference, the recentered influence function (also abbreviated RIF in the econometrics tradition but conceptually distinct from the ML RIF) is defined as

$$\mathrm{RIF}(y; v, F) = v(F) + \mathrm{IF}(y; v, F),$$

ensuring $\mathbb{E}_F[\mathrm{RIF}(Y; v, F)] = v(F)$. RIF regression models the conditional expectation $\mathbb{E}[\mathrm{RIF}(Y; v, F) \mid X = x]$ to estimate the marginal effects of covariates on $v(F)$, applicable to quantiles, Gini, polarization, and other complex indices (Alejo et al., 2021).
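
As a minimal sketch of this regression step, the example below uses the standard recentered influence function of the $\tau$-quantile, $\mathrm{RIF}(y; q_\tau, F) = q_\tau + (\tau - \mathbf{1}\{y \le q_\tau\})/f_Y(q_\tau)$, with a kernel density estimate in place of $f_Y$ and plain OLS for the regression; the function and variable names are illustrative and not taken from (Alejo et al., 2021):

import numpy as np
from scipy.stats import gaussian_kde

def rif_quantile_regression(y, X, tau=0.5):
    """OLS regression of the recentered influence function of the tau-quantile on covariates X."""
    q_tau = np.quantile(y, tau)                      # sample tau-quantile
    f_hat = gaussian_kde(y)(q_tau)[0]                # kernel density estimate of f_Y at q_tau
    rif = q_tau + (tau - (y <= q_tau)) / f_hat       # RIF(y; q_tau, F) = v(F) + IF(y; v, F)
    Xc = np.column_stack([np.ones(len(y)), X])       # add an intercept column
    beta, *_ = np.linalg.lstsq(Xc, rif, rcond=None)  # regression coefficients
    return beta

The non-intercept entries of the returned vector estimate the unconditional (marginal) effect of each covariate on the $\tau$-quantile of $Y$.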

3. Theoretical Properties and Accuracy Regimes

The accuracy of RIF in machine learning settings is established via comparison to the “single Newton step” (NS) leave-one-out approximation:

$$\widehat{\theta}_{\mathrm{NS}, T} = \hat{\theta} - H_{[n] \setminus T}^{-1} \sum_{i \in T} g_i.$$

RIF is additive across removals and, under positive semidefiniteness, limited sample dominance, and incoherence conditions, achieves tight signal-to-noise bounds in high dimensions: SNR $\approx \Omega(n/(k\sqrt{d}))$ for batch removal size $k$ (Rubinstein et al., 7 Jun 2025). In contrast, standard IFs lose accuracy as $n/d$ decreases or regularization weakens.

For statistical functionals, sensitivity curve (SC) approximations converge in probability to the true IF under Fréchet differentiability, scale invariance, and a quadratic von Mises remainder. Thus, empirical SCs provide valid approximations for RIF regression estimation in large samples (Alejo et al., 2021).
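
To illustrate the SC construction, the snippet below computes the classical Tukey sensitivity curve, i.e. $n$ times the change in the statistic when a point $y$ is appended to a sample of size $n-1$, for the variance functional and compares it with the analytic influence function $(y - \mu)^2 - \sigma^2$; this is a generic illustration under standard definitions, not code from (Alejo et al., 2021):

import numpy as np

def sensitivity_curve(y_new, sample, functional):
    """Tukey sensitivity curve: n * (v(sample with y_new appended) - v(sample)), an empirical proxy for the IF."""
    n = len(sample) + 1
    return n * (functional(np.append(sample, y_new)) - functional(sample))

rng = np.random.default_rng(0)
sample = rng.normal(size=10_000)
grid = np.linspace(-3.0, 3.0, 7)

sc = np.array([sensitivity_curve(y, sample, np.var) for y in grid])
analytic_if = (grid - sample.mean()) ** 2 - sample.var()   # IF of the variance functional
print(np.round(sc, 3), np.round(analytic_if, 3))

For large samples the two curves agree closely, which is the practical basis for substituting empirical SCs when no closed-form IF is available.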

4. Algorithmic Implementation and Computational Aspects

RIF for GLMs can be computed efficiently:

Input: {(x_i, y_i)}, loss ℓ, minimizer θ̂, Hessian inverse H⁻¹
For i = 1 to n:
    g_i ← ∇ℓ(x_i, y_i; θ̂)
    h_i ← x_i^T H⁻¹ x_i × weight factor (e.g., σ_i(1−σ_i) for logistic regression)
    IF_i ← -H⁻¹ g_i
    RIF_i ← IF_i / (1 - h_i)
Complexity is $O(d^3)$ for Hessian inversion plus $O(nd^2)$ for the matrix-vector products, with negligible additional cost for the scalar rescaling and $O(nd)$ in total for the leverage scores (Rubinstein et al., 7 Jun 2025). For non-GLM losses, low-rank corrections via the Sherman–Morrison or Woodbury formula maintain computational efficiency.
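
A vectorized NumPy sketch of this procedure for (optionally ridge-regularized) logistic regression is given below; the function and variable names are illustrative and follow the formulas above rather than any reference implementation from the cited paper:

import numpy as np

def rescaled_influences_logreg(X, y, theta_hat, reg=0.0):
    """Compute classical influences IF_i and rescaled influences RIF_i = IF_i / (1 - h_i)."""
    d = X.shape[1]
    sigma = 1.0 / (1.0 + np.exp(-X @ theta_hat))     # predicted probabilities σ_i
    g = (sigma - y)[:, None] * X                     # per-sample gradients g_i
    w = sigma * (1.0 - sigma)                        # GLM weights σ_i(1 - σ_i)
    H = X.T @ (w[:, None] * X) + reg * np.eye(d)     # empirical Hessian (plus optional ridge term)
    H_inv_X = np.linalg.solve(H, X.T).T              # rows are H⁻¹ x_i
    h = w * np.einsum("ij,ij->i", X, H_inv_X)        # leverage scores h_i
    IF = -np.linalg.solve(H, g.T).T                  # classical influences IF_i = -H⁻¹ g_i
    RIF = IF / (1.0 - h)[:, None]                    # rescaled influences
    return IF, RIF, h

Following the group-removal formula above, the predicted parameters after removing a set $T$ are then $\hat{\theta} + \sum_{i \in T} \mathrm{RIF}_i$, at the cost of a single Hessian factorization shared across all samples.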

In RIF regression for complex statistics, leave-one-out functionals are approximated using subsample-based sensitivity curve estimation and regression splines, reducing $O(n^2)$ cost to approximately $O(n)$ with negligible loss in precision for large $n$ (Alejo et al., 2021).

5. Empirical Findings Across Applications

5.1 High-Dimensional Data Attribution

Rescaled IFs provide accurate predictions of leave-$T$-out shifts in test loss, prediction probabilities, and self-loss across vision (e.g., ImageNet-derived binary tasks with ResNet or Inception embeddings), audio (ESC-50 with OpenL3), and textual datasets (IMDB, Enron) when compared to ground-truth retraining. Conventional IFs systematically underpredict effect sizes in $d \approx n$ or $d \gg n$ regimes, but RIFs remain on the retrain diagonal (Rubinstein et al., 7 Jun 2025). Poisoning detection is significantly improved: IFs assign low influence to adversarially flipped instances, while RIFs identify them with high removal effect.

5.2 RIF Regression with Sensitivity Curves

Empirically, RIF regression coefficients estimated with restricted spline-based sensitivity curves closely match those from analytical RIFs for variance and Gini, both in simulation and in large-scale wage data for the Duclos–Esteban–Ray polarization index. The approach remains accurate for functionals without closed-form IF, and the computational gains are an order of magnitude for large samples (Alejo et al., 2021).

6. Methodological Extensions and Applications

Rescaled Influence Functions are leveraged for:

  • Data Poisoning Detection: RIFs flag points whose removal produces disproportionately large parameter or prediction shifts—critical for identifying adversarial contamination (Rubinstein et al., 7 Jun 2025).
  • Machine Unlearning: Efficiently estimating parameter adjustment when removing user data from models.
  • Dataset Auditing and Curation: Prioritizing instances for removal or review based on their corrected influence metric.
  • Bias Detection and Debiasing: Detecting group effects and harmful biases that might be underestimated by classical IFs.
  • Robust Marginal Effect Estimation: In econometrics, RIF regression enables marginal effect estimation for quantiles, Gini, and polarization indices (Alejo et al., 2021).

A notable distinction is that RIF in machine learning primarily addresses parameter sensitivity in continuous parameter spaces, while RIF in statistical modeling addresses distributional functionals and marginal effect estimation.

7. Limitations and Interpretive Considerations

While rescaled influence functions improve pointwise removal estimates, their validity relies on sufficient regularity (positive semidefiniteness, limited sample dominance, and incoherence conditions), and the direct analogy between the ML RIF and the econometric RIF is limited, so terminological precision is critical. In both cases, the approximations are asymptotic: for per-sample Hessians of large rank, or for nonlinear models without Hessian access, performance may degrade. For complex statistics without an analytic IF, empirical SC-based estimation is justified asymptotically, but finite-sample behavior should be validated as in (Alejo et al., 2021).

Rescaled Influence Functions restore much of the higher-order accuracy of Newton-based methods without sacrificing additive structure or computational tractability, making them a robust methodological advance for attribution, unlearning, auditing, and effect estimation in both high-dimensional machine learning and semiparametric statistical inference (Rubinstein et al., 7 Jun 2025, Alejo et al., 2021).
