Revisit, Extend, and Enhance Hessian-Free Influence Functions (2405.17490v2)
Abstract: Influence functions are crucial tools for assessing sample influence in model interpretation, training-subset selection, noisy-label detection, and more. By employing a first-order Taylor expansion, influence functions can estimate sample influence without expensive model retraining. However, applying influence functions directly to deep models is challenging, primarily because the loss function is non-convex and the number of model parameters is large. This not only makes computing the inverse of the Hessian matrix costly, but in some cases the inverse does not exist at all. Various approaches, including matrix decomposition, have been explored to accelerate and approximate Hessian inversion with the aim of making influence functions applicable to deep models. In this paper, we revisit a simple, seemingly naive, yet effective approximation method known as TracIn, which substitutes the identity matrix for the inverse of the Hessian. We provide deeper insights into why this simple approximation performs well. Furthermore, we extend its applications beyond measuring model utility to considerations of fairness and robustness. Finally, we enhance TracIn through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy-label detection, sample selection for LLM fine-tuning, and defense against adversarial attacks.
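The contrast the abstract draws, between the classical influence function (which requires the inverse Hessian) and the TracIn-style approximation (which replaces it with the identity), can be sketched on a toy logistic-regression model. This is a minimal illustrative sketch, not the paper's implementation: the functions (`grad_loss`, `hessian`), the synthetic data, and the damping term are all assumptions made here for the example.

```python
import numpy as np

def grad_loss(w, x, y):
    """Gradient of the logistic loss -log sigmoid(y * w.x) w.r.t. w, with y in {-1, +1}."""
    s = 1.0 / (1.0 + np.exp(-y * (w @ x)))
    return -(1.0 - s) * y * x

def hessian(w, X):
    """Hessian of the total logistic loss over the training points X."""
    d = X.shape[1]
    H = np.zeros((d, d))
    for x in X:
        p = 1.0 / (1.0 + np.exp(-(w @ x)))
        H += p * (1.0 - p) * np.outer(x, x)
    return H

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]))
w = np.array([0.8, -1.5, 0.3])  # stand-in for (near-)converged weights

g_test = grad_loss(w, X[0], y[0])    # gradient at a test point
g_train = grad_loss(w, X[1], y[1])   # gradient at a training point

# Classical influence of the training point on the test loss:
# -g_test^T H^{-1} g_train (small damping added so H is invertible).
H = hessian(w, X) + 1e-3 * np.eye(3)
infl_exact = -g_test @ np.linalg.solve(H, g_train)

# TracIn-style approximation: replace H^{-1} with the identity,
# leaving a plain gradient dot product, no Hessian inversion needed.
infl_tracin = -g_test @ g_train
```

The Hessian-free score avoids both the cost of inverting (or implicitly solving against) the Hessian and the failure mode where the Hessian of a non-convex loss is singular or indefinite.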
- DMLR: Data-centric machine learning research – past, present and future. arXiv preprint arXiv:2311.13028, 2023.
- What data benefits my classifier? Enhancing model performance and interpretability through influence-based data selection. In International Conference on Learning Representations, 2024.
- DataInf: Efficiently estimating data influence in LoRA-tuned LLMs and diffusion models. arXiv preprint arXiv:2310.00902, 2023.
- Residuals and influence in regression. New York: Chapman and Hall, 1982.
- Data shapley: Equitable valuation of data for machine learning. In International Conference on Machine Learning, 2019.
- Towards efficient data valuation based on the Shapley value. In International Conference on Artificial Intelligence and Statistics, 2019.
- Beta Shapley: A unified and noise-reduced data valuation framework for machine learning. In International Conference on Artificial Intelligence and Statistics, 2022.
- Efficient task-specific data valuation for nearest neighbor algorithms. In Proceedings of the VLDB Endowment, 2018.
- Understanding black-box predictions via influence functions. In International Conference on Machine Learning, 2017.
- Second-order stochastic optimization for machine learning in linear time. Journal of Machine Learning Research, 2017.
- If influence functions are the answer, then what is the question? Advances in Neural Information Processing Systems, 2022.
- Studying large language model generalization with influence functions. arXiv preprint arXiv:2308.03296, 2023.
- Influence functions in deep learning are fragile. arXiv preprint arXiv:2006.14651, 2020.
- Revisiting the fragility of influence functions. Neural Networks, 162:581–588, 2023.
- Estimating training data influence by tracing gradient descent. Advances in Neural Information Processing Systems, 2020.
- Make every example count: On the stability and utility of self-influence for learning from noisy NLP datasets. arXiv preprint arXiv:2302.13959, 2023.
- Self-influence guided data reweighting for language model pre-training. arXiv preprint arXiv:2311.00913, 2023.
- Training data influence analysis and estimation: A survey. Machine Learning, pages 1–53, 2024.
- Resolving training biases via influence-based data relabeling. In International Conference on Learning Representations, 2021.
- Less is better: Unweighted data subsampling via influence function. In AAAI Conference on Artificial Intelligence, 2020.
- Optimal subsampling with influence functions. Advances in Neural Information Processing Systems, 2018.
- Finding influential training samples for gradient boosted decision trees. In International Conference on Machine Learning, 2018.
- LESS: Selecting influential data for targeted instruction tuning. arXiv preprint arXiv:2402.04333, 2024.
- Influence-balanced loss for imbalanced visual classification. In IEEE/CVF International Conference on Computer Vision, 2021.
- Selective and collaborative influence function for efficient recommendation unlearning. Expert Systems with Applications, 2023.
- Recommendation unlearning via influence function. arXiv preprint arXiv:2307.02147, 2023.
- Influence selection for active learning. In IEEE/CVF International Conference on Computer Vision, 2021.
- Achieving fairness at no utility cost via data reweighing with influence. In International Conference on Machine Learning, 2022.
- Understanding instance-level impact of fairness constraints. In International Conference on Machine Learning, 2022.
- FairIF: Boosting fairness in deep learning via influence functions with validation set sensitive attributes. In International Conference on Web Search and Data Mining, 2024.
- Detecting adversarial samples using influence functions and nearest neighbors. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.
- Characterizing the influence of graph elements. In International Conference on Learning Representations, 2023.
- GIF: A general graph unlearning strategy via influence function. In ACM Web Conference, 2023.
- Machine unlearning: Solutions and challenges. IEEE Transactions on Emerging Topics in Computational Intelligence, 2024.
- Fast yet effective machine unlearning. IEEE Transactions on Neural Networks and Learning Systems, 2023.
- Out-of-distribution generalization analysis via influence function. arXiv preprint arXiv:2101.08521, 2021.
- Evaluating the impact of local differential privacy on utility loss via influence functions. arXiv preprint arXiv:2309.08678, 2023.
- Understanding programmatic weak supervision via source-aware influence function. Advances in Neural Information Processing Systems, 2022.
- On the accuracy of influence functions for measuring group effects. Advances in Neural Information Processing Systems, 2019.
- Deeper understanding of black-box predictions via generalized influence functions. arXiv preprint arXiv:2312.05586, 2023.
- Multi-stage influence function. Advances in Neural Information Processing Systems, 2020.
- Fairness through awareness. In Innovations in Theoretical Computer Science Conference, 2012.
- Adversarial robustness of linear models: regularization and dimensionality. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2019.
- Random projection in dimensionality reduction: applications to image and text data. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001.
- Learning with noisy labels revisited: A study using real-world human annotations. In International Conference on Learning Representations, 2022.
- CMW-Net: Learning a class-aware sample weighting mapping for robust deep learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
- Learning multiple layers of features from tiny images. 2009.
- Learning deep representation for face alignment with auxiliary attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015.
- Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
- GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.
- RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
- Perturbation augmentation for fairer NLP. In Conference on Empirical Methods in Natural Language Processing, 2022.
- On adaptive attacks to adversarial example defenses. Advances in Neural Information Processing Systems, 2020.
- Evasion attacks against machine learning at test time. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013.
- A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 2014.
- Large-scale CelebFaces Attributes (CelebA) dataset. Retrieved August 2018.
- David Noever. Machine learning suites for online toxicity detection. arXiv preprint arXiv:1810.01869, 2018.
- Realistic evaluation of deep semi-supervised learning algorithms. Advances in Neural Information Processing Systems, 2018.