Stability-Guided Online Influence Framework

Updated 27 November 2025
  • The paper introduces SG-OIF, a novel framework that couples influence function approximations with stability control to estimate per-example causal influence in online deep vision systems.
  • It integrates iterative IHVP maintenance with modular curvature approximations to achieve high-fidelity, computationally efficient influence estimation.
  • SG-OIF effectively addresses challenges in noisy-label detection, out-of-distribution detection, poisoning localization, and selective unlearning.

The Stability-Guided Online Influence Framework (SG-OIF) is a methodology for estimating the per-example causal leverage of training points on test predictions in deep learning vision systems under streaming or online training regimes. It couples influence function approximations with real-time algorithmic stability control, providing state-of-the-art ranking and attribution for critical tasks such as noisy-label detection, out-of-distribution (OOD) detection, poisoning localization, and selective unlearning. SG-OIF is distinguished by its integration of stability-guided control, iterative IHVP maintenance, and modular curvature approximations to ensure high-fidelity, computationally efficient, and robust influence estimation during model development (Rao et al., 21 Nov 2025).

1. Architectural Overview and Design Objectives

SG-OIF’s core objective is to provide accurate, streaming estimation of each training point's impact on deployed model predictions. Its architecture is organized into three principal components:

  • Anchor Bank & Curvature Backends: SG-OIF maintains a small set $\mathcal{V}$ of anchors, each representing a test vector $v$ and an associated Inverse Hessian–Vector Product (IHVP) $\phi_v \approx H^{-1} g_v$. The framework supports four curvature-approximation backends: diagonal second moments (fast, coarse), empirical Fisher matrices (full-batch), Kronecker-factored blocks (K-FAC), and hybrid low-rank plus diagonal (Woodbury factorization), allowing explicit tradeoffs between computational speed and statistical fidelity.
  • Streaming IHVP Refinement: Each anchor IHVP $\phi_v$ is iteratively refined using stochastic Richardson iteration and preconditioned Neumann updates. The framework exploits a shared low-rank subspace $Q_r$ ($r \ll d$) to efficiently capture dominant curvature directions and accelerate anchor vector updates.
  • Stability-Guided Controller: An algorithmic-stability monitor tracks solver residuals $r_v = g_v - H \phi_v$, computes a per-anchor confidence $c_v \in [0, 1]$, and adapts thresholds dynamically. Low-confidence anchors can trigger conjugate-gradient refinements, and anchors with persistently low $c_v$ may be replaced to maintain attribution quality.

SG-OIF is designed for robustness in the presence of training drift, nonstationary data, and resource constraints, enabling online, reliable influence estimation across diverse deep learning vision workflows (Rao et al., 21 Nov 2025).
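
As a concrete illustration of these components, here is a minimal sketch of an anchor record and its residual-based confidence gate (all names, e.g. `Anchor` and `update_confidence`, are illustrative rather than taken from the paper):

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Anchor:
    """One test-side anchor: its gradient v and a running IHVP estimate phi."""
    v: np.ndarray                  # test gradient g(z_test)
    phi: np.ndarray = None         # current estimate of H^{-1} v
    confidence: float = 0.0        # stability-gated c_v in [0, 1]

    def __post_init__(self):
        if self.phi is None:
            self.phi = np.zeros_like(self.v)

def update_confidence(anchor, hvp, tau):
    """Gate an anchor by its solver residual r_v = v - H @ phi_v."""
    residual = anchor.v - hvp(anchor.phi)
    anchor.confidence = float(np.clip(1 - np.linalg.norm(residual) / tau, 0, 1))
    return anchor.confidence
```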

2. Mathematical Foundations: Influence Functions and Online Estimation

Classical Influence Function

Given a parameter vector $\theta^*$ minimizing the empirical risk

$$L(\theta) = \frac{1}{n} \sum_{i=1}^n \ell(\theta; z_i),$$

the infinitesimal up-weighting of a single training point $z$ introduces a perturbed minimizer,

$$\theta_\epsilon = \arg\min_\theta \left[L(\theta) + \epsilon\, \ell(\theta; z)\right].$$

By the implicit function theorem,

$$\left.\frac{d\theta_\epsilon}{d\epsilon}\right|_{\epsilon=0} = -H^{-1} \nabla_\theta \ell(\theta^*; z),$$

where $H = \nabla_\theta^2 L(\theta^*)$ is the Hessian. The influence of $z$ on the test loss $\ell(\theta_\epsilon; z_\text{test})$ is

$$I(z; z_\text{test}) = -\, g(z)^\top H^{-1} g(z_\text{test}),$$

with $g(z) = \nabla_\theta \ell(\theta^*; z)$. For an anchor $v = g(z_\text{test})$, this reduces to $I_v(z) = -\, g(z)^\top H^{-1} v$ (Rao et al., 21 Nov 2025).
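
On a model small enough to form $H$ explicitly, the closed-form influence above can be checked end to end. A toy ridge-regression sketch (data, dimensions, and the choice of "test" point are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 5, 1e-2
X, y = rng.normal(size=(n, d)), rng.normal(size=n)

# theta* minimizing L = (1/n) sum 0.5*(x_i^T theta - y_i)^2 + (lam/2)*||theta||^2
H = X.T @ X / n + lam * np.eye(d)          # Hessian of L at the optimum
theta = np.linalg.solve(H, X.T @ y / n)

def grad(i):
    """Per-example gradient g(z_i) = (x_i^T theta - y_i) * x_i."""
    return (X[i] @ theta - y[i]) * X[i]

g_test = grad(0)                   # toy shortcut: reuse a training point as z_test
ihvp = np.linalg.solve(H, g_test)  # H^{-1} g(z_test)
influence = np.array([-grad(i) @ ihvp for i in range(n)])
print(influence.argsort()[:3])     # most negative = most helpful for the test loss
```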

Iterative and Stochastic IHVP Computation

SG-OIF approximates $\phi_v \approx H^{-1} v$ online via two solvers, both sketched in code after this list:

  • Stochastic Richardson Iteration:

$$\phi_v^{(t+1)} = \phi_v^{(t)} + \rho_t \left[ g_v^{(t)} - H_t \phi_v^{(t)} \right],$$

where $g_v^{(t)} = v$, $H_t$ is the curvature surrogate, and $\rho_t$ follows a Robbins–Monro schedule.

  • Preconditioned Neumann Series: For $H_t = \alpha I + \Delta_t$,

$$H_t^{-1} v \approx \sum_{k=0}^{K} (-\alpha^{-1} \Delta_t)^k\, \alpha^{-1} v,$$

with $K$ chosen adaptively based on the norm of the solver residual $\|r_v\|$.
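
Both solvers admit short implementations; a minimal sketch, assuming `hvp(w)` returns a (possibly stochastic) curvature-vector product $H_t w$ and `delta_mv(w)` returns $\Delta_t w$:

```python
import numpy as np

def richardson_step(phi, v, hvp, rho):
    """One stochastic Richardson update toward phi = H^{-1} v."""
    residual = v - hvp(phi)
    return phi + rho * residual, residual

def neumann_ihvp(v, delta_mv, alpha, K):
    """Truncated Neumann series for (alpha*I + Delta)^{-1} v:
    sum_{k=0}^{K} (-Delta/alpha)^k v / alpha."""
    term = v / alpha
    total = term.copy()
    for _ in range(K):
        term = -delta_mv(term) / alpha
        total = total + term
    return total
```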

Stability-Guided Control and Confidence Calibration

A stability proxy $\tilde\beta_t = \gamma_1 (\eta_t \bar{Y}_t)/n + \gamma_2 (\lambda_w)/n$ and a condition-number proxy $\Gamma_t$ yield the threshold $\tau_t = \kappa\, \tilde\beta_t\, \Gamma_t$. The per-anchor residual $r_v^{(t)} = g_v^{(t)} - H_t \phi_v^{(t)}$ gates the confidence

$$c_v^{(t)} = \text{Clip}\left(1 - \frac{\|r_v^{(t)}\|}{\tau_t},\ 0,\ 1\right).$$

Per-example influence is then

$$\tilde{I}(z_i; v) = -\, c_v\, \phi_v^\top g_i,$$

and aggregated across the anchor set $\mathcal{V}$ via

$$\tilde{I}(z_i) = \sum_{v \in \mathcal{V}} w_v\, \tilde{I}(z_i; v), \qquad w_v = \frac{c_v}{\sum_{u \in \mathcal{V}} c_u}.$$
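
A minimal sketch of this confidence-weighted aggregation, assuming each anchor carries the $(\phi_v, c_v)$ pair maintained above:

```python
import numpy as np

def gated_influence(g_i, anchors):
    """Aggregate influence of z_i over anchors: sum_v w_v * (-c_v * phi_v^T g_i),
    with w_v = c_v / sum_u c_u. `anchors` is a list of (phi_v, c_v) pairs."""
    total_c = sum(c for _, c in anchors)
    if total_c == 0.0:
        return 0.0                 # every anchor gated out; abstain from scoring
    score = 0.0
    for phi, c in anchors:
        score += (c / total_c) * (-c * float(phi @ g_i))
    return score
```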

3. Modular Curvature Approximations and Algorithm Workflow

SG-OIF offers a selection of curvature surrogates for practical deployment:

| Backend Type | Compute Cost | Fidelity |
|---|---|---|
| Diagonal second-moment | Minimal | Coarse |
| Empirical Fisher | Moderate (full-batch) | Moderate+ |
| K-FAC | Higher | High |
| Low-rank + diagonal (Woodbury) | Tunable | Adaptable |
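
For the low-rank + diagonal backend, the Woodbury identity turns the IHVP into an $O(dr^2)$ computation; a sketch assuming $H \approx D + U U^\top$ with diagonal $D$ and $U \in \mathbb{R}^{d \times r}$ (names and factorization shapes are illustrative):

```python
import numpy as np

def woodbury_ihvp(v, d_diag, U):
    """Solve (D + U U^T) phi = v via
    (D + U U^T)^{-1} = D^{-1} - D^{-1} U (I + U^T D^{-1} U)^{-1} U^T D^{-1}."""
    Dinv_v = v / d_diag                       # D^{-1} v, O(d)
    Dinv_U = U / d_diag[:, None]              # D^{-1} U, O(dr)
    core = np.eye(U.shape[1]) + U.T @ Dinv_U  # r x r capacitance matrix
    return Dinv_v - Dinv_U @ np.linalg.solve(core, U.T @ Dinv_v)
```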

The main algorithm cycles through parameter updates, anchor IHVP refinements, anchor replacement, and confidence-based gating. Briefly (a schematic loop follows the steps):

  1. Update the model and build the curvature surrogate $H_t$.
  2. At intervals, update the low-rank basis $Q_r$.
  3. For each anchor $v$:
    • Compute $g_v$; update $\phi_v$ via Richardson iteration.
    • Compute and gate the confidence $c_v$.
  4. For each sample $z_i$ in the current minibatch, compute influence via the current $\phi_v$ and $c_v$.
  5. Periodically refresh anchors with low $c_v$ and trigger a short conjugate-gradient refinement for low-confidence but high-influence anchors.
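
Tying the steps together, here is a schematic of the online loop on synthetic data, reusing `Anchor`, `richardson_step`, and `gated_influence` from the sketches above (the fixed curvature, step size, and threshold $\tau = 1$ are stand-ins for the adaptive quantities in the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_anchors = 8, 3
H = np.diag(rng.uniform(1.0, 3.0, d))       # stand-in curvature surrogate H_t
hvp = lambda w: H @ w
anchors = [Anchor(v=rng.normal(size=d)) for _ in range(n_anchors)]

for step in range(200):                      # streaming training steps
    for a in anchors:                        # step 3: refine IHVP, re-gate c_v
        a.phi, residual = richardson_step(a.phi, a.v, hvp, rho=0.3)
        a.confidence = float(np.clip(1 - np.linalg.norm(residual), 0, 1))
    g_batch = rng.normal(size=(4, d))        # step 4: per-example minibatch grads
    scores = [gated_influence(g, [(a.phi, a.confidence) for a in anchors])
              for g in g_batch]

print([round(a.confidence, 3) for a in anchors])  # confidences approach 1.0
```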

Stability-guided control ensures that unreliable influence estimates—due to numerical instability or changing curvature—are suppressed or recalibrated before being used for ranking or intervention.

4. Experimental Protocols and Evaluations

Noisy-Label Detection

  • Datasets: CIFAR-10 (20%, 40%, 70% synthetic noise at different sparsities), CIFAR-100 (20% asymmetric), WebVision, Clothing1M.
  • Baselines: INCV, Mixup, SCE-loss, MentorNet, Co-Teaching, FASTIF, TracIn, Confident Learning.
  • Metrics: Precision@1% (P@1%), AUC-PR, and computational overhead relative to ERM (a P@1% sketch follows this list).
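
For concreteness, P@1% can be computed from influence scores as below (a standard definition rather than paper-specific code; treating larger scores as stronger mislabeling evidence is an assumption):

```python
import numpy as np

def precision_at_k_percent(scores, is_noisy, k_percent=1.0):
    """Fraction of truly mislabeled examples among the top-k% ranked by score."""
    k = max(1, int(len(scores) * k_percent / 100))
    top = np.argsort(scores)[::-1][:k]       # indices of the k highest scores
    return float(np.mean(np.asarray(is_noisy)[top]))
```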

Out-of-Distribution Detection

  • Datasets: MNIST, CIFAR-10, CIFAR-100, ImageNet.
  • Baselines: CutPaste, DRAEM, PatchCore, Gram, EBO, GradNorm, ReAct, MLS, KLM, DICE, VIM, DeepKNN, G-ODIN, CSI, ARPL, MOS, OpenGAN, VOS, LogitNorm, UDG, PixMix.
  • Metrics: AUROC, AUPR, average classification accuracy.

Global confidence calibration and gating are empirically validated via streamed empirical-Bernstein intervals, with theoretical bounds indicating reduced bias and variance for the resulting influence estimates.
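
A fixed-sample form of such an interval is the classical Maurer-Pontil empirical-Bernstein bound; the sketch below is that textbook version, not the paper's streamed variant:

```python
import numpy as np

def empirical_bernstein_halfwidth(x, delta=0.05, value_range=1.0):
    """Half-width h with P(|mean(x) - E[x]| > h) <= delta for samples bounded
    in a range of width `value_range` (Maurer-Pontil empirical Bernstein)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sample_var = np.var(x, ddof=1)
    return (np.sqrt(2 * sample_var * np.log(2 / delta) / n)
            + 7 * value_range * np.log(2 / delta) / (3 * (n - 1)))
```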

5. Performance Results

Key results reported for SG-OIF include:

| Task / Dataset | SG-OIF Score | Notable Comparison |
|---|---|---|
| Noisy-label detection (CIFAR-10, 20% noise, 0 sparsity) | P@1%: 91.1% | +3–5% vs. INCV, Mixup, SCE-loss |
| CIFAR-10, 60% sparsity | P@1%: ≥ 90.9% | Robust to sparsity |
| CIFAR-100 / WebVision / Clothing1M | P@1%: 88.5 / 67.5 / 63.8 | |
| CIFAR-100 / WebVision / Clothing1M | AUC-PR: 69.8 / 55.4 / 52.5 | |
| Out-of-distribution (MNIST, CIFAR-10/100, ImageNet avg.) | AUROC: 89.1%, AUPR: 95.1% | Top among 21 methods |
| MNIST (AUPR) | 99.8% | Exceeds all baselines |
| CIFAR-10 (AUPR) | 96.7% | Exceeds all baselines |
| Overhead | ≤ 1.00× of baseline ERM | Practically negligible |

A plausible implication is that, by decoupling curvature approximation and stability control, SG-OIF can adapt efficiently to training drift and model updates, consistently maintaining high influence estimation fidelity even as data conditions change (Rao et al., 21 Nov 2025).

6. Applications and Impact

SG-OIF directly addresses key limitations of prior influence estimation frameworks in modern vision models—namely, the computational intractability of explicit Hessian inverses and the instability of static influence approximations under dynamic or nonstationary training. Its primary applications include:

  • Noisy-label detection: Identifying and potentially reweighting or removing mislabeled or contaminated samples during supervised learning.
  • Out-of-distribution (OOD) detection: Reliable assessment of sample atypicality relative to the training distribution for robust deployment.
  • Poisoning localization: Pinpointing training samples with disproportionate, potentially adversarial influence on test predictions.
  • Selective unlearning: Quantifying and mitigating the impact of selected samples, with implications for data deletion or privacy compliance.

These results establish SG-OIF as a foundational component for real-time influence analysis in contemporary deep vision pipelines, benchmarking new state-of-the-art in both accuracy and operational efficiency (Rao et al., 21 Nov 2025).

References

  • Rao et al., 21 Nov 2025.
