The Approximate Fisher Influence Function: Faster Estimation of Data Influence in Statistical Models (2407.08169v2)
Published 11 Jul 2024 in cs.LG and cs.AI
Abstract: Quantifying the influence of infinitesimal changes in training data on model performance is crucial for understanding and improving machine learning models. In this work, we reformulate this problem as a weighted empirical risk minimization and enhance existing influence function-based methods by using information geometry to derive a new algorithm to estimate influence. Our formulation proves versatile across various applications, and we further demonstrate in simulations how it remains informative even in non-convex cases. Furthermore, we show that our method offers significant computational advantages over current Newton step-based methods.
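To make the weighted empirical risk minimization view concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an L2-regularised logistic regression with objective `sum_i w_i * loss_i + (l2 / 2) * ||theta||^2`, and computes the classical influence of upweighting one training point, `-C^{-1} grad_i`, where the curvature matrix `C` is either the exact Hessian or, as an illustrative stand-in for a Fisher-type approximation, the empirical Fisher (the sum of per-example gradient outer products). All function names, the regularisation strength, and the synthetic data are hypothetical choices made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_weighted_logreg(X, y, w, l2=1e-2, iters=50):
    """Minimise sum_i w_i * logloss_i + (l2/2)*||theta||^2 with Newton's method."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (w * (p - y)) + l2 * theta
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X + l2 * np.eye(d)
        theta = theta - np.linalg.solve(hess, grad)
    return theta

def influence_on_params(X, y, theta, i, l2=1e-2, use_fisher=False):
    """Derivative of the fitted parameters w.r.t. the weight of point i,
    evaluated at unit weights: -C^{-1} grad_i."""
    d = X.shape[1]
    p = sigmoid(X @ theta)
    grad_i = (p[i] - y[i]) * X[i]              # gradient of point i's loss
    if use_fisher:
        # Assumption: empirical Fisher (sum of gradient outer products)
        # used in place of the Hessian; illustrative only.
        G = (p - y)[:, None] * X
        curvature = G.T @ G + l2 * np.eye(d)
    else:
        # Exact Hessian of the regularised objective at unit weights.
        curvature = (X * (p * (1.0 - p))[:, None]).T @ X + l2 * np.eye(d)
    return -np.linalg.solve(curvature, grad_i)

# Synthetic data (hypothetical), fixed seed for reproducibility.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.0, -0.5, 0.25])
y = (rng.random(200) < sigmoid(X @ true_theta)).astype(float)
w = np.ones(200)

theta = fit_weighted_logreg(X, y, w)

# Finite-difference check: refit with point i slightly upweighted.
i, eps = 7, 1e-4
w_eps = w.copy()
w_eps[i] += eps
finite_diff = (fit_weighted_logreg(X, y, w_eps) - theta) / eps

print("refit finite difference:", finite_diff)
print("Hessian-based influence:", influence_on_params(X, y, theta, i))
print("Fisher-based influence: ", influence_on_params(X, y, theta, i, use_fisher=True))
```

In this convex setting the Hessian-based influence should agree closely with the finite-difference refit, which is a useful sanity check. The Fisher-based value is only an approximation, and how well it tracks the exact quantity depends on how well the model fits the data.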