Explaining BIF’s Advantage over EK-FAC in Small-Data Regimes

Determine why the local Bayesian influence function (BIF) achieves a higher Linear Datamodelling Score (LDS) than EK-FAC when the retrain subset size is small. In particular, establish whether the effect arises from higher-order loss-landscape sensitivity captured by BIF or from approximation errors in EK-FAC’s Kronecker-factored curvature model.

Background

Kreer et al. (2025) evaluate training data attribution methods via retraining experiments summarized by the Linear Datamodelling Score (LDS). They find that when the retrain subsets are small, local BIF consistently outperforms EK-FAC, while EK-FAC remains competitive or superior in larger-data settings.
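
For readers unfamiliar with the metric, the following is a minimal sketch of how an LDS value is commonly computed for a single query: the attribution scores are summed over each retrain subset to predict the retrained model's output, and the prediction is compared with the measured outputs by Spearman rank correlation. The function names, the subset fraction, and the synthetic "retrained outputs" are assumptions made for illustration; they are not taken from the paper, and no actual retraining is performed here.

```python
import random


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def lds_for_query(scores, subsets, retrained_outputs):
    """LDS for one query.

    scores[i]            attribution of training example i to the query
    subsets[j]           indices of the j-th retrain subset
    retrained_outputs[j] query output measured on the model retrained on subsets[j]
    """
    predicted = [sum(scores[i] for i in subset) for subset in subsets]
    return spearman(predicted, retrained_outputs)


# Synthetic illustration only (hypothetical numbers, no real models).
rng = random.Random(0)
n_train, n_subsets, subset_fraction = 200, 50, 0.1  # small retrain subsets
true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_train)]
subsets = [rng.sample(range(n_train), int(subset_fraction * n_train))
           for _ in range(n_subsets)]
# Stand-in for retraining: outputs are additive in the true effects plus noise.
outputs = [sum(true_effect[i] for i in s) + rng.gauss(0.0, 0.5) for s in subsets]
# A noisy attribution method: true effects corrupted by estimation error.
noisy_scores = [e + rng.gauss(0.0, 0.5) for e in true_effect]

lds = lds_for_query(noisy_scores, subsets, outputs)
print(f"LDS on synthetic data: {lds:.3f}")
```

In a real experiment the reported score would typically be averaged over many queries, and comparing methods at several values of the subset fraction is what exposes the small-subset gap discussed here.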

Although they propose possible explanations (e.g., higher-order effects in BIF versus structural approximation bias in EK-FAC), they explicitly state that the reason for BIF’s advantage remains unclear, motivating focused investigation to resolve this discrepancy.

References

It remains unclear why the BIF outperforms EK-FAC in the small-data regime.

— Bayesian Influence Functions for Hessian-Free Data Attribution (2509.26544 - Kreer et al., 30 Sep 2025) in Appendix: Retraining Experiments — LDS Results
