TRAK Algorithm for Scalable Data Attribution
- TRAK is an algorithmic framework for data attribution that traces model predictions back to individual training examples using kernel linearization, random projection, and ALO influence estimation.
- It balances computational efficiency with high ranking fidelity, enabling practical applications like data cleaning, anomaly detection, and robustness certification in deep models.
- Empirical results show that TRAK approximates traditional retraining methods at significantly reduced computational cost, and extends to certified robustness via Natural W-TRAK.
TRAK (Tracing with the Randomly-projected After Kernel) is a scalable algorithmic framework for data attribution: tracing model predictions or outputs back to individual training examples in modern machine learning models, particularly deep neural networks. TRAK addresses the computational intractability and efficacy trade-offs of prior data attribution methods by combining a kernel-based linearization, dimension-reducing random projections, and approximate leave-one-out (ALO) risk estimation. Empirical and theoretical results demonstrate that, despite aggressive approximations, TRAK reliably preserves the ranking of influential datapoints, enabling both interpretability and downstream applications such as data cleaning, anomaly detection, and robustness certification (Tong et al., 1 Feb 2026, Park et al., 2023, Li et al., 9 Dec 2025).
1. Problem Setting and Motivation
Given a model $\theta^\star$ trained by empirical risk minimization on a dataset $\{z_1, \dots, z_n\}$ (in general a deep network), the fundamental goal is to estimate the influence of a training point $z_i$ on an output $f(z; \theta^\star)$ (or other prediction of interest) at a test example $z$. The gold-standard influence is computed as the change in output due to retraining after deleting $z_i$:

$$\tau(z_i, z) = f(z; \theta^\star_{\setminus i}) - f(z; \theta^\star),$$

where $\theta^\star_{\setminus i}$ minimizes the empirical risk with $z_i$ removed. For modern overparameterized models (parameter dimension $p \gg n$) and large datasets (large $n$), this retraining-based approach is computationally prohibitive, motivating efficient surrogate algorithms.
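As a minimal sketch of the gold-standard baseline, the following retrains a closed-form ridge regression once per deleted training point and records the change in a test prediction (all names and dimensions are illustrative, with ridge standing in for the learned model):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 40, 5
X = rng.normal(size=(n, p))
w_true = rng.normal(size=p)
y = X @ w_true + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=p)

def fit_ridge(X, y, lam=1e-2):
    # Closed-form ridge solution: theta = (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

theta_full = fit_ridge(X, y)

# Gold-standard LOO influence of each training point on the test prediction:
# tau_i = f(x_test; theta_without_i) - f(x_test; theta_full)
loo_influence = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    theta_i = fit_ridge(X[mask], y[mask])
    loo_influence[i] = x_test @ theta_i - x_test @ theta_full
```

The loop requires $n$ full refits, which is exactly the cost TRAK is designed to avoid.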
Traditional influence-function approaches are limited by their reliance on first- or second-order approximations and convexity assumptions, which can fail in nonconvex deep architectures. By contrast, kernel/Shapley/sampling-based approaches maintain fidelity by brute-force model retraining or marginalization, but incur excessive computational cost. TRAK is designed to deliver consistent, high-quality attribution for differentiable, large-scale models with only a handful of trained models (Park et al., 2023, Tong et al., 1 Feb 2026).
2. The TRAK Algorithm: Formulation and Methodology
TRAK decomposes scalable influence approximation into three core steps:
(a) Kernel-Machine Linearization ("After Kernel")
The model's output $f(z; \theta)$ is first linearized around the learned parameters $\theta^\star$ via a first-order Taylor expansion,

$$f(z; \theta) \approx f(z; \theta^\star) + \nabla_\theta f(z; \theta^\star)^\top (\theta - \theta^\star),$$

so the per-example gradient $g(z) = \nabla_\theta f(z; \theta^\star)$ serves as a feature representation in this tangent space. A surrogate empirical risk minimization is then solved on the gradient features, equivalent to kernel learning with the "after kernel" $K(z, z') = g(z)^\top g(z')$ (Tong et al., 1 Feb 2026, Park et al., 2023).
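The linearization step can be sketched as follows; a linear model output keeps the per-example gradient analytic (it is simply the input vector), so the tangent-space approximation is exact here. All names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 8
X = rng.normal(size=(n, p))
theta_star = rng.normal(size=p)   # stand-in for learned parameters

def model_output(x, theta):
    return x @ theta              # f(z; theta); its gradient w.r.t. theta is x

# Per-example gradient features g_i = grad_theta f(z_i; theta_star).
G = X.copy()                      # for a linear output, the gradient is x_i

# The "after kernel": K(z_i, z_j) = g_i . g_j
K = G @ G.T

# First-order Taylor check: f(z; theta) ~ f(z; theta*) + g(z).(theta - theta*)
delta = 1e-3 * rng.normal(size=p)
lin = model_output(X[0], theta_star) + G[0] @ delta
exact = model_output(X[0], theta_star + delta)
```

For a deep network, `G` would instead hold per-example gradients from automatic differentiation, and the Taylor check would hold only approximately near $\theta^\star$.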
(b) Random Projection for Dimensionality Reduction
For high-dimensional gradients ($p$ large), TRAK applies a Johnson–Lindenstrauss random projection $P \in \mathbb{R}^{p \times k}$ (with $k \ll p$) to the gradient features: $\phi(z) = P^\top \nabla_\theta f(z; \theta^\star)$. This step enables computationally efficient matrix inversion and further suppresses overfitting. The projected features preserve dot products between tangent vectors up to small multiplicative distortion, crucial for robust influence estimation (Park et al., 2023, Tong et al., 1 Feb 2026).
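The dot-product preservation can be checked numerically with a Gaussian projection (dimensions here are illustrative, far smaller than a real network's parameter count):

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 5000, 512                           # ambient and projected dimensions
g1, g2 = rng.normal(size=(2, p))           # two "gradient" vectors

P = rng.normal(size=(p, k)) / np.sqrt(k)   # Gaussian JL projection
phi1, phi2 = g1 @ P, g2 @ P                # projected features

# Dot products are preserved up to ~1/sqrt(k) relative distortion.
exact = g1 @ g2
approx = phi1 @ phi2
rel_err = abs(approx - exact) / (np.linalg.norm(g1) * np.linalg.norm(g2))
```

Shrinking `k` inflates `rel_err`, which is the mechanism behind the projection-error regime discussed in the theory section.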
(c) Approximate Leave-One-Out (ALO) Influence
TRAK adapts the ALO methodology for efficient LOO influence computation. For a loss function $\ell$ with first and second derivatives $\ell'_i$, $\ell''_i$ evaluated at the linearized solution, the ALO-corrected influence of $z_i$ on a test example $z$ is

$$\tau(z_i, z) = \frac{\ell'_i \, \phi(z)^\top G^{-1} \phi(z_i)}{1 - h_i},$$

where $G = \sum_j \ell''_j \, \phi(z_j) \phi(z_j)^\top$ and $h_i = \ell''_i \, \phi(z_i)^\top G^{-1} \phi(z_i)$ is the leverage of $z_i$. This approximation arises from a Sherman–Morrison–Woodbury rank-one update and matches LOO retraining up to higher-order error under mild conditions (Tong et al., 1 Feb 2026).
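A sketch of the ALO correction, checked against brute-force leave-one-out retraining: for ridge regression with squared loss the Sherman–Morrison update is exact, so the two computations agree to numerical precision (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, lam = 50, 10, 1e-1
Phi = rng.normal(size=(n, k))          # projected gradient features
y = rng.normal(size=n)
x = rng.normal(size=k)                 # features of a test example

G = Phi.T @ Phi + lam * np.eye(k)
G_inv = np.linalg.inv(G)
theta = G_inv @ Phi.T @ y

resid = Phi @ theta - y                            # l'_i for squared loss
lev = np.einsum("ij,jk,ik->i", Phi, G_inv, Phi)    # leverages h_i

# ALO influence of each training point on the test prediction x . theta
tau_alo = (resid * (Phi @ G_inv @ x)) / (1.0 - lev)

# Brute-force LOO retraining for comparison
tau_loo = np.empty(n)
for i in range(n):
    m = np.arange(n) != i
    th_i = np.linalg.solve(Phi[m].T @ Phi[m] + lam * np.eye(k), Phi[m].T @ y[m])
    tau_loo[i] = x @ th_i - x @ theta
```

For non-quadratic losses the second derivatives $\ell''_j$ enter $G$ and the match becomes approximate rather than exact.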
Algorithmic Overview
| Step | Input | Output |
|---|---|---|
| Kernel linearization | model $f$, training data $\{z_i\}$ | gradients $g_i = \nabla_\theta f(z_i; \theta^\star)$ |
| Random projection | $g_i$, projection $P$ | features $\phi_i = P^\top g_i$ |
| ALO influence (TRAK) | $\phi_i$, $\ell'_i$, $\ell''_i$ | influence scores $\tau(z_i, z)$ |
Further, the algorithm is often ensembled over several random projections and/or model checkpoints for improved stability, and soft-thresholding is optionally applied to promote sparsity in influence vectors (Park et al., 2023).
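The ensembling and soft-thresholding can be sketched as follows: average influence estimates over several independent random projections, then shrink small entries toward zero (`tau_fn` is a simplified stand-in for the full ALO score, and all dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, k, n_models = 40, 200, 32, 5
G = rng.normal(size=(n, p))            # per-example gradient features
g_test = rng.normal(size=p)

def tau_fn(Phi, phi_test, lam=1e-1):
    # Simplified influence scores in the projected space (no ALO correction).
    Ginv = np.linalg.inv(Phi.T @ Phi + lam * np.eye(Phi.shape[1]))
    return Phi @ Ginv @ phi_test

taus = []
for _ in range(n_models):
    P = rng.normal(size=(p, k)) / np.sqrt(k)   # fresh random projection
    taus.append(tau_fn(G @ P, g_test @ P))
tau_ens = np.mean(taus, axis=0)                # ensemble average

def soft_threshold(v, t):
    # Shrink each entry toward zero by t; entries below t become exactly zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

tau_sparse = soft_threshold(tau_ens, t=0.5 * np.abs(tau_ens).mean())
```

In practice the ensemble members are independently trained (or checkpointed) models rather than mere re-projections of one gradient matrix.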
3. Theoretical Guarantees, Error Bounds, and Robustness
Theoretical analysis of TRAK characterizes the approximation error introduced at each algorithmic step under high-dimensional scaling ($n, p \to \infty$), with sub-Gaussian data and well-conditioned Hessians (Tong et al., 1 Feb 2026):
- Linearization Error: the error from replacing the model by its tangent-space linearization vanishes at explicit dimension-dependent rates, which differ between the dependent and independent cases.
- ALO Error: the additional error from the ALO approximation vanishes at a comparable rate in the dependent case.
- Projection Error: the error grows as the projection dimension $k$ shrinks; under-sized projections collapse the separation between signal and noise influences.
Despite potentially large absolute errors in these approximations, the critical theoretical finding is that the relative ordering (ranking) of influence scores remains robust. The Spearman rank correlation between the true and TRAK influence scores converges to 1 as $n \to \infty$, provided the projection dimension $k$ grows sufficiently fast. Consequently, TRAK excels at distinguishing genuinely influential points from noise (Tong et al., 1 Feb 2026).
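The ranking-robustness phenomenon can be illustrated numerically: approximate scores with large absolute error but an (approximately) monotone relationship to the true scores retain a Spearman correlation near 1. The data below is purely synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
tau_true = rng.normal(size=n) ** 3          # heavy-tailed "true" influences
# Approximation: rescaled (large absolute error) plus modest noise.
tau_hat = 10.0 * tau_true + 0.1 * rng.normal(size=n)

def spearman(a, b):
    # Spearman rho = Pearson correlation of the rank vectors.
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    ra, rb = ra - ra.mean(), rb - rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

abs_err = float(np.mean(np.abs(tau_hat - tau_true)))   # large absolute error
rho = spearman(tau_true, tau_hat)                      # yet rho stays near 1
```

This is why downstream tasks that only need a ranking (data cleaning, top-$k$ retrieval) tolerate TRAK's aggressive approximations.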
Recent advances reveal that standard TRAK scores can be geometrically unstable under Euclidean perturbations—specifically, spectral amplification in the feature covariance renders Euclidean robustness vacuous. The Natural W-TRAK certification framework addresses this by shifting to a Mahalanobis geometry induced by , providing certified bounds on the stability of attributions and quantifying intrinsic instability of high-leverage points (Li et al., 9 Dec 2025).
4. Implementation, Practical Recipe, and Hyperparameters
TRAK is implemented as follows (Park et al., 2023, Tong et al., 1 Feb 2026):
- Gradient Feature Computation: Compute $g_i = \nabla_\theta f(z_i; \theta^\star)$ for all training points $z_i$. In deep networks, this uses batched automatic differentiation.
- Projection: Apply Gaussian random projections to a target dimension $k$ (values up to $20{,}000$ are typical).
- Ensembling: Fit ensembles of subsampled models or checkpoints to reduce estimator variance.
- Thresholding: Apply soft-thresholding to filter noise and enforce sparsity as required.
- Matrix Solution: Use Cholesky or conjugate-gradient solvers for the $k \times k$ system involving $\Phi^\top \Phi$.
Typical settings ensemble on the order of tens (up to $100$) of models, with the projection dimension and subsampling fraction tuned per benchmark (e.g., CIFAR). Per model, computational complexity is dominated by the $n$ per-example gradient computations and a single $k \times k$ factorization, markedly lower than LOO retraining or LASSO-based datamodel approaches.
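The recipe above can be strung together as one pipeline; here the Cholesky route for the $k \times k$ solve is shown explicitly, with synthetic stand-ins for the model gradients:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, k, lam = 60, 300, 24, 1e-1
G = rng.normal(size=(n, p))                 # step 1: per-example gradients
g_test = rng.normal(size=p)

P = rng.normal(size=(p, k)) / np.sqrt(k)    # step 2: random projection
Phi, phi_test = G @ P, g_test @ P

A = Phi.T @ Phi + lam * np.eye(k)           # the k x k system matrix
L = np.linalg.cholesky(A)                   # A = L L^T (symmetric PD)
# Solve A v = phi_test via two triangular solves.
v = np.linalg.solve(L.T, np.linalg.solve(L, phi_test))

tau = Phi @ v                               # simplified influence scores
```

Since `A` is only $k \times k$, factorizing it once and reusing the factor across many test examples keeps the per-query cost at $O(nk)$.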
Limitations include requirements for differentiability (excluding tree- and NN-based models) and sensitivity to the conditioning of the feature covariance matrix.
5. Empirical Performance and Comparative Evaluation
TRAK has been validated extensively on image, language, and multimodal architectures (Park et al., 2023). Key findings include:
- On CIFAR-10, TRAK with 20 models (20 min of compute) attains an LDS matching the gold-standard datamodel, which requires 50,000 models (2,500 min).
- On QNLI (BERT-base), TRAK matches datamodel-level LDS using only 10 models, over four orders of magnitude fewer.
- Competing methods (influence functions, TracIn, gradient cosine) achieve considerably lower LDS ($0.05$–$0.2$).
- Downstream effects such as top-$k$ overlap (CIFAR: 70–80% for moderate $k$) and brittleness to deletion are consistently captured.
W-TRAK (natural geometry) extends these capabilities by enabling robust certification. On CIFAR-10/ResNet-18, Natural W-TRAK certifies 68.7% of ranking pairs for reasonable perturbation radii, compared to 0% for naive Euclidean bounds. The method also identifies corrupted labels via self-influence scores, yielding strong AUROC and recovering 94.1% of corrupted labels by inspecting the 20% highest-influence training points (Li et al., 9 Dec 2025).
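The corrupted-label detection idea can be sketched on synthetic data: points whose labels are flipped tend to have large self-influence (a leverage-weighted squared residual), so inspecting the top fraction by self-influence recovers many corruptions. Everything here is an illustrative stand-in for the cited experiments:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, lam = 400, 16, 1e-1
Phi = rng.normal(size=(n, k))              # projected gradient features
w = rng.normal(size=k)
y = np.sign(Phi @ w)                       # clean +/-1 labels

corrupt = rng.choice(n, size=40, replace=False)   # flip 10% of labels
y_noisy = y.copy()
y_noisy[corrupt] *= -1.0

G_inv = np.linalg.inv(Phi.T @ Phi + lam * np.eye(k))
theta = G_inv @ Phi.T @ y_noisy
resid = Phi @ theta - y_noisy
lev = np.einsum("ij,jk,ik->i", Phi, G_inv, Phi)

# Self-influence: effect of removing a point on its own prediction.
self_inf = resid**2 * lev / (1.0 - lev)

flagged = np.argsort(self_inf)[-n // 5:]   # top 20% by self-influence
recall = float(np.isin(corrupt, flagged).mean())
```

On this toy setup the flipped points concentrate in the flagged set; the cited 94.1% figure is for the real CIFAR-10 experiment, not this sketch.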
6. Limitations, Open Challenges, and Future Directions
TRAK's principal limitations arise from reliance on first-order model linearization, random projection distortion at low projection dimensions, and sensitivity to the feature covariance's condition number. In extremely dense influence regimes, or if the projection dimension $k$ is too small, ranking fidelity can deteriorate as the separation between signal and noise recedes. Empirical performance can be further improved by integrating early-stopped checkpoints for ensembling or fine-tuning soft-thresholding criteria.
Open directions include:
- Incorporating higher-order Taylor expansion terms,
- Structural or learned random projections,
- Generative-model (e.g., autoregressive next-token) attribution,
- Robust attribution using Natural W-TRAK for certified guarantees against distributional drift,
- Application to data cleaning, active learning, and dataset distillation tasks via self-influence metrics (Li et al., 9 Dec 2025, Park et al., 2023).
7. Relationship to Robust Data Attribution and Theoretical Insights
The notion of self-influence (the leverage score in the whitened tangent space) is pivotal for both certification and anomaly detection. W-TRAK leverages the feature-covariance–induced (‘natural’) distance, allowing robustness analysis to remain non-vacuous even in highly ill-conditioned settings typical of deep learning (Li et al., 9 Dec 2025). The scale of self-influence quantifies both susceptibility to attribution instability (outlier detection) and Lipschitz control of attributions under Wasserstein perturbations. This unifies classical robust statistics and modern deep model attribution, enabling new principled routines for both certification and data cleaning in real-world pipelines.
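A minimal sketch of the natural-geometry idea: measure vectors with the Mahalanobis norm induced by the (inverse) feature covariance, under which the self-influence of a point is exactly its squared natural norm divided by $n$, i.e. the classical leverage score. Dimensions and the covariance construction are illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
n, k = 200, 12
Phi = rng.normal(size=(n, k)) * np.linspace(1, 30, k)   # ill-conditioned features
Sigma = Phi.T @ Phi / n                                  # feature covariance
Sigma_inv = np.linalg.inv(Sigma)

def natural_norm_sq(v):
    # Mahalanobis ("natural") squared norm: v^T Sigma^{-1} v
    return float(v @ Sigma_inv @ v)

# Self-influence of point i as leverage in the whitened tangent space:
# h_i = phi_i^T (Phi^T Phi)^{-1} phi_i = natural_norm_sq(phi_i) / n
lev = np.einsum("ij,jk,ik->i", Phi, np.linalg.inv(Phi.T @ Phi), Phi)
nat = np.array([natural_norm_sq(Phi[i]) / n for i in range(n)])
```

The identity holds regardless of conditioning, which is why bounds phrased in the natural norm stay informative even when the Euclidean spectrum is badly amplified.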