Outlier Tokens Tracing (OTT)

Updated 27 February 2026

Outlier Tokens Tracing (OTT) is a methodology that detects anomalous tokens in transformer models using norm-based, probabilistic, and gradient metrics.
It employs dynamic sliding windows, attention sink tracing, and standardized oddballness scores to systematically flag and manage outliers.
OTT enhances applications in quantization, anomaly detection, and privacy tracing, achieving improvements in model compression and interpretability.

Outlier Tokens Tracing (OTT) designates a suite of methodologies for the identification, tracking, and systematic treatment of abnormally behaving tokens—often called outlier tokens or attention sinks—within neural network-based LLMs, particularly in transformers. These procedures are used for a diverse set of applications, such as high-precision quantization of cached hidden states during inference, anomaly detection, privacy tracing, and interpretability. Central to OTT is the explicit construction of token-level judgments based on statistical, geometric, or gradient-based criteria, often involving token-level norms, attention logit statistics, model-generated probabilistic outputs, or loss gradient metrics.

1. Formal Definitions and Taxonomy of Outlier Tokens

Outlier tokens are defined according to different operational contexts:

Quantization Sensitivity: Tokens with abnormally small $\ell_1$ -norms of their key vectors, as quantified by $S_i = \|\mathbf{K}_i\|_1 = \sum_{c=1}^d |K_{i,c}|$ ; the $N$ tokens with the lowest $S_i$ in a group or window are identified as outliers and excluded from quantization to minimize distortion (Su et al., 16 May 2025).
Attention Sinks: Tokens receiving disproportionately large cumulative attention (sum over all queries), characterized by being a statistical outlier in the distribution of attention logits $a_{ij}^{(\ell,h)}$ , where token $j$ is an attention sink in head $h$ of layer $\ell$ if $S_j^{(\ell,h)} = \sum_{i=1}^L \mathbf{1}_{a_{ij}^{(\ell,h)} > \mu^{(\ell,h)} + \alpha \sigma^{(\ell,h)}}$ exceeds a given threshold (Qiu et al., 30 Jan 2026, Zhang et al., 2 Feb 2025).
Anomalous Probabilistic Events: Tokens with high oddballness $E_t = \sum_{v \in V} \max(0,\,p_t(v) - p_t(x_t))$ , marking them as model-surprising events relative to the predicted distribution $p_t$ (Graliński et al., 2024).
Gradient-Norm Outliers: Tokens with per-token gradient norms substantially larger than the empirical mean, dominating influence function estimates and biasing privacy tracing; formally, tokens $(k,j)$ with $\|g_{k,j}\|_2 > \mu_g + \alpha \sigma_g$ (Liu et al., 2024).

This taxonomy captures both geometric (norm-based), probabilistic (oddballness), and functional (gradient-based) perspectives on outlier tokens.

2. Algorithmic Procedures for Outlier Tokens Tracing

The methodological core of OTT comprises dynamic identification, logging, and processing strategies:

Sliding Window and Pool Maintenance: Maintain a pool of outlier tokens of size $N$ . At each quantization step (group of $G$ tokens), merge current outliers and new candidates, rank by $S_i$ , and update the pool to contain those with the lowest scores. Only these are stored in full precision during quantization; others undergo lossy compression (Su et al., 16 May 2025).
Attention Sink Tracing: For each head/layer, calculate per-token sink-scores using standardized attention logit statistics, capturing the dynamic manifestation of attention sinks by tracking which tokens repeatedly attract excess attention. This is realized by instrumented hooks and batchwise logging, supporting both visualization and intervention (Qiu et al., 30 Jan 2026).
Probabilistic Outlier Detection: For sequence anomaly detection, calculate the oddballness metric $E_t$ for each token, optionally standardizing via running mean/variance to obtain z-scores $Z_t$ . Tokens crossing predetermined thresholds are flagged as outliers (Graliński et al., 2024).
Gradient-Norm Adjusted Influence Tracing: In privacy leakage tracing, compute per-token gradient norms to identify outlier tokens. Standard influence functions (IFs) are heuristically reweighted via $A(v) = W(\|v\|_2)\,v$ , typically with $W(u) = 1/u$ , to penalize tokens with excessive gradient amplitudes and restore meaningful attribution (Liu et al., 2024).

3. Mathematical and Statistical Underpinnings

OTT approaches are undergirded by several statistical and optimization-theoretic frameworks:

Quantization: Standard group-wise and channel-wise quantization is adapted in OTT by replacing outlier entries with group means before parameter selection, ensuring that quantization range estimates are unaffected by outlier magnitudes (Su et al., 16 May 2025).
Outlier-Driven Rescaling: Outlier tokens act in conjunction with normalization schemes (softmax, RMSNorm) to globally rescale or regulate representational distributions. Sinks inflate normalization denominators, thus modulating the dynamic range for other coordinates or tokens (Qiu et al., 30 Jan 2026).
Catch, Tag, and Release: Theoretical analysis shows that for certain tasks (e.g., subsequence averaging), the formation and tracking of outlier tokens and their features are a minimal computational primitive; transformers implement this via attention sinks, outlier-feature “tagging,” and downstream uniform release (Zhang et al., 2 Feb 2025).
Influence Function Robustness: High-gradient tokens cause divergence or instability in IF-based provenance attribution. Heuristically Adjusted IFs (HAIF) leverage per-token norm-based suppression to restore the faithfulness of provenance tracing, substantially reducing outlier dominance (Liu et al., 2024).

4. Applications in Model Compression, Privacy, and Anomaly Detection

OTT manifests across several key domains:

KV-Cache Quantization: OTT is integrated into transformer inference pipelines to drastically compress key and value caches (up to $6.4\times$ memory reduction and $2.3\times$ throughput increase at 2-bit quantization) by excluding dynamically selected outlier tokens from quantization, preserving near-FP16 accuracy (Su et al., 16 May 2025).
Attention Mechanism and Interpretability: OTT is used to trace, visualize, and experimentally intervene on attention sinks and outlier features, providing interpretability and informing pruning or low-rank approximation schemes required for robust compression (Zhang et al., 2 Feb 2025).
Anomaly/Grammatical Error Detection: Oddballness-based OTT achieves superior F0.5 scores over low-likelihood and top-K exclusion detectors in unsupervised grammatical error detection across English and other languages, with robust thresholding and model independence (Graliński et al., 2024).
Training Data Influence and Privacy Tracing: OTT, as realized in HAIF, improves privacy tracing accuracy by more than 20 percentage points over baselines, robustly identifying true provenance samples even under challenging offset and length perturbations in user prompts (Liu et al., 2024).

5. Practical Considerations and Hyperparameter Selection

OTT pipeline deployment hinges on several empirically calibrated hyperparameters:

Use Case	Key OTT Hyperparameters	Empirically Supported Values
KV-Quantization	Group size $G$ , window $R$ , pool size $N$	$G=128$ , $R=32$ , $N=3$
Attention Sink Trace	Logit threshold $\alpha$ , sink count threshold $\beta$	$\alpha=3$ , $\beta=0.5\cdot L$
Oddballness OTT	LM context $K_{ctx}$ , stats window $N_{stats}$ , thresholds $E, Z$	$K_{ctx}=512$ -$2048$, $N_{stats}=200$ , $\theta_E=0.85-0.9$ , $\theta_Z=2-3$
HAIF (Privacy Trace)	Weighting function $W(u)$ , outlier cutoff $\alpha$	$W(u)=1/u$ , $\alpha=2$ or $3$

Empirical studies show OTT’s gains plateau at small $N \approx 3$ outliers per layer in quantization. Oddballness scores are robust to model, language, and prompt variability. For IF-based tracing, HAIF’s double adjustment, even under varied prompt/response lengths, maintains >80% tracing accuracy on realistic data (Su et al., 16 May 2025, Graliński et al., 2024, Liu et al., 2024).

6. Open Problems and Theoretical Considerations

Current OTT methods leverage heuristic weight functions, empirical thresholds, and statistical proxies for outlier-ness. Theoretical justifications for choices like $W(u)$ in HAIF or the exact role of outlier-driven rescaling in model generalization are largely open. There is ongoing investigation into principled robustifications of influence functions and generalization to modalities beyond language, as well as the impact of hierarchical (phrase- or sentence-level) outlier phenomena (Liu et al., 2024, Qiu et al., 30 Jan 2026). The necessity of low-rank parameter retention for semantic reliability in task-specific pruning remains an active research area (Zhang et al., 2 Feb 2025).

7. Significance and Future Directions

OTT frameworks have established themselves as crucial in bridging deployment efficiency and functional fidelity in LLMs, providing robust handles on interpretability, outlier-driven model behavior, privacy attribution, and real-world task robustness. Their ongoing evolution will likely expand through more theoretically grounded criteria for outlier detection, efficient implementation in increasingly larger models, and broader application in domains with analogous outlier effects. The integration of OTT into standard model pipelines demonstrates its efficacy and the importance of adaptive, token-level anomaly tracing in contemporary machine learning systems (Su et al., 16 May 2025, Qiu et al., 30 Jan 2026, Zhang et al., 2 Feb 2025, Graliński et al., 2024, Liu et al., 2024).