Training Data Attribution via Approximate Unrolled Differentiation (2405.12186v2)
Abstract: Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be made computationally efficient, but fail to account for underspecification, the implicit bias of the optimization algorithm, or multi-stage training pipelines. By contrast, methods based on unrolling address these issues but face scalability challenges. In this work, we connect the implicit-differentiation-based and unrolling-based approaches and combine their benefits by introducing Source, an approximate unrolling-based TDA method that is computed using an influence-function-like formula. While remaining computationally efficient compared to unrolling-based approaches, Source applies in settings where implicit-differentiation-based approaches struggle, such as non-converged models and multi-stage training pipelines. Empirically, Source outperforms existing TDA techniques at counterfactual prediction, especially in settings where implicit-differentiation-based approaches fall short.