
Influence Functions in Deep Learning Are Fragile (2006.14651v2)

Published 25 Jun 2020 in cs.LG and stat.ML

Abstract: Influence functions approximate the effect of training samples in test-time predictions and have a wide variety of applications in machine learning interpretability and uncertainty estimation. A commonly-used (first-order) influence function can be implemented efficiently as a post-hoc method requiring access only to the gradients and Hessian of the model. For linear models, influence functions are well-defined due to the convexity of the underlying loss function and are generally accurate even across difficult settings where model changes are fairly large such as estimating group influences. Influence functions, however, are not well-understood in the context of deep learning with non-convex loss functions. In this paper, we provide a comprehensive and large-scale empirical study of successes and failures of influence functions in neural network models trained on datasets such as Iris, MNIST, CIFAR-10 and ImageNet. Through our extensive experiments, we show that the network architecture, its depth and width, as well as the extent of model parameterization and regularization techniques have strong effects in the accuracy of influence functions. In particular, we find that (i) influence estimates are fairly accurate for shallow networks, while for deeper networks the estimates are often erroneous; (ii) for certain network architectures and datasets, training with weight-decay regularization is important to get high-quality influence estimates; and (iii) the accuracy of influence estimates can vary significantly depending on the examined test points. These results suggest that in general influence functions in deep learning are fragile and call for developing improved influence estimation methods to mitigate these issues in non-convex setups.

Citations (205)

Summary

  • The paper shows that influence functions yield reliable estimates in shallow networks but become unstable in deep, non-convex settings.
  • The paper demonstrates that regularization techniques, such as weight decay, can improve influence-function accuracy yet are not a complete remedy for fragility.
  • The paper reveals that stochastic Hessian approximations and test point sensitivity further complicate reliable influence estimation, urging the development of robust alternatives.

Influence Functions in Deep Learning are Fragile

The paper "Influence Functions in Deep Learning Are Fragile" by Basu, Pope, and Feizi addresses the efficacy of influence functions in the field of neural networks with non-convex loss functions. Influence functions, originating from robust statistics, are pivotal for estimating the effect of individual training samples on a model's predictions without the computational cost of retraining. Their applications span model interpretability, identifying mislabeled data, and constructing adversarial attacks.

The paper provides a critical evaluation of influence functions in the setting of deep learning architectures. The classical analysis assumes a convex loss, which holds for models such as logistic regression, but this assumption fails in deep learning because the loss is inherently non-convex. The paper conducts empirical assessments on datasets including Iris, MNIST, CIFAR-10, and ImageNet, exploring how architectural factors and training setups affect influence estimation. Key observations include:

  1. Network Depth and Width: Influence estimates are reliable for shallow networks but degrade with depth, which the authors attribute to larger curvature, i.e., larger eigenvalues of the loss Hessian at the trained parameters. Over-parameterization through increased width similarly adds variance to influence scores.
  2. Regularization Techniques: The application of weight-decay regularization improves influence function accuracy in certain architectures, indicating regularization as a potential stabilizing factor for non-convex loss settings.
  3. Hessian Approximation: The paper uses stochastic estimation of the inverse-Hessian-vector product, which is essential for computing influence in large networks; this approximation itself introduces error, further degrading influence estimates in deeper models (see the sketch after this list).
  4. Test Point Sensitivity: The effectiveness of influence functions is highly test point-dependent, implying non-uniform reliability across the input space.
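
To make the estimation pipeline concrete, here is a minimal sketch of first-order influence estimation with a LiSSA-style stochastic inverse-Hessian-vector product, assuming a PyTorch model and a per-example loss callable. The helper names (`loss_fn`, `train_loader`, `damping`, `scale`) and hyperparameter values are illustrative, not the paper's code or settings:

```python
import itertools
import torch

def grad_of_loss(model, x, y, loss_fn):
    """Flattened gradient of the loss at one example (or batch)."""
    loss = loss_fn(model, x, y)
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.reshape(-1) for g in grads])

def hvp(model, x, y, loss_fn, v):
    """Hessian-vector product H v via double backpropagation."""
    loss = loss_fn(model, x, y)
    grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    hv = torch.autograd.grad(torch.dot(flat, v), model.parameters())
    return torch.cat([h.reshape(-1) for h in hv])

def lissa_ihvp(model, v, train_loader, loss_fn,
               damping=0.01, scale=25.0, steps=100):
    """Approximate H^{-1} v with the recursion
    h <- v + h - (H_batch h + damping * h) / scale, returning h / scale."""
    h = v.clone()
    batches = itertools.cycle(train_loader)
    for _ in range(steps):
        x, y = next(batches)
        h = v + h - (hvp(model, x, y, loss_fn, h) + damping * h) / scale
    return h / scale

def influence_estimate(model, z_train, z_test, train_loader, loss_fn):
    """First-order influence of z_train on the test-point loss:
    -grad(test)^T H^{-1} grad(train)."""
    g_test = grad_of_loss(model, *z_test, loss_fn)
    ihvp = lissa_ihvp(model, g_test, train_loader, loss_fn)
    g_train = grad_of_loss(model, *z_train, loss_fn)
    return -torch.dot(ihvp, g_train).item()
```

Because the recursion only sees mini-batch Hessians, its output is itself a noisy estimate; in deep, ill-conditioned networks this noise compounds the approximation error of the first-order formula, which is one source of the fragility the paper documents.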

These insights suggest that influence functions, while useful, exhibit pronounced fragility in deep learning contexts. The findings imply a necessity for developing enhanced methods that can more robustly compute influence estimations across varying experimental conditions, architectural configurations, and dataset scales.
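
Fragility claims of this kind are typically substantiated by comparing first-order estimates against ground truth obtained by leave-one-out retraining. The following is a minimal sketch of such a comparison; `train_model`, `test_loss`, and `estimate_influence` are hypothetical helpers, and the use of Spearman rank correlation is one common choice of agreement metric, not necessarily the paper's exact protocol:

```python
from scipy.stats import spearmanr

def fragility_check(train_set, candidate_ids, z_test,
                    train_model, test_loss, estimate_influence):
    """Correlate estimated influence with the actual change in test loss
    observed when each candidate training point is removed and the model
    is retrained from scratch."""
    base_model = train_model(train_set)
    base_loss = test_loss(base_model, z_test)
    estimated, actual = [], []
    for i in candidate_ids:
        reduced = [z for j, z in enumerate(train_set) if j != i]
        actual.append(test_loss(train_model(reduced), z_test) - base_loss)
        estimated.append(estimate_influence(train_set, i, z_test))
    corr, _ = spearmanr(estimated, actual)
    return corr  # low rank correlation indicates fragile influence estimates
```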

From a broader perspective, the paper suggests future research avenues, including alternatives to first-order influence functions that can handle larger parameter shifts, potentially via higher-order or optimal-transport-inspired approaches. Furthermore, studying 'group influence', the influence of groups of training points rather than single points, may yield more robust interpretations because large groups of samples have cumulative effects on model training dynamics.

By presenting a cautionary examination of influence functions in deep learning, the paper contributes to the discourse on interpretability and reliability in AI. It underlines the complexities introduced by non-convex loss landscapes and argues that robust influence-estimation methods are essential as machine learning models become increasingly integral to critical decision-making systems.