CircuitSeer: LLM Data Selection
- CircuitSeer is a data selection methodology that leverages transformer reasoning circuits to identify complex, high-quality training data for LLM fine-tuning.
- It quantifies reasoning complexity using attention head probing and variance scoring to prioritize reasoning-rich samples.
- Empirical results show that using just the top 10% of CircuitSeer-selected data improves Pass@1 performance across various LLMs and math benchmarks.
CircuitSeer is a data selection methodology for LLM training, rooted in probing and leveraging the internal structure of mathematical reasoning circuits within transformer models. CircuitSeer reframes data curation for LLMs by quantifying the reasoning complexity of training samples through their influence on the model’s own sparse, highly specialized attention heads. Rather than relying on external heuristics, black-box majority voting, or auxiliary quality scoring models, CircuitSeer provides an efficient, model-internal mechanism for selecting high-quality, reasoning-rich data—demonstrating improved fine-tuning outcomes using dramatically smaller subsets of the original training corpus (Wang et al., 21 Oct 2025).
1. Exploiting Internal Reasoning Circuits in LLMs
CircuitSeer is underpinned by the empirical finding that LLMs, when faced with complex reasoning tasks (such as those in mathematics), consistently exhibit a sparse activation pattern across their attention heads. Only a small, specialized subset of attention heads—termed “reasoning circuits”—are consistently crucial across diverse reasoning problems. These heads can be identified through ablation analysis: each head is ablated in turn (e.g., by replacing its attention with an uninformative, uniform lower-triangular mask), and heads whose ablation causes a significant increase in loss on a reasoning probe set are retained as the “reasoning-sensitive subset.”
This mechanism allows CircuitSeer to bypass opaque or computationally expensive external data selection and instead use interpretable internal dynamics as a reliable indicator of complex reasoning. The approach represents a conceptual shift: sample quality is judged according to the LLM’s own internal “reasoning workload,” making data selection self-consistent with the model’s mechanistic interpretability.
2. Data Scoring via Attention Head Probing
The core of CircuitSeer’s sample selection is a mathematically precise scoring function that captures the non-uniformity and specificity of attention paid by the reasoning heads during inference. For a given sample xᵢ with n tokens, CircuitSeer extracts row-normalized attention matrices Aʰ from the identified reasoning heads h ∈ ℋ_math. The mean attention for each token position k is computed as

  ā_k = (1 / (|ℋ_math| · n)) Σ_{h ∈ ℋ_math} Σ_{j=1}^{n} Aʰ_{jk}.
The aggregate CircuitSeer score for the sample is then defined as the sample variance of these per-position means:

  S(xᵢ) = (1/n) Σ_{k=1}^{n} (ā_k − μ)², where μ = (1/n) Σ_{k=1}^{n} ā_k.
This variance quantifies the concentration and selectivity of the reasoning circuitry. Intuitively, higher variance reflects problems that activate attention more selectively and thus require more complex, multi-step inference. Samples are then ranked by S(xᵢ); high-scoring examples are preferentially retained for fine-tuning or pre-training.
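Under the definitions above, the per-sample score can be sketched in a few lines of plain Python. This is an illustrative reading of the method, assuming the row-normalized attention matrices of the reasoning heads are available as nested lists; the paper’s exact aggregation may differ in detail:

```python
def circuitseer_score(head_attns):
    """CircuitSeer-style score for one sample (illustrative sketch).

    head_attns: one row-normalized n x n attention matrix per reasoning
    head, as nested lists; entry A[j][k] is attention from query j to key k.
    """
    n = len(head_attns[0])
    # Mean attention received by each token position k, averaged over
    # reasoning heads and query positions.
    a_bar = [
        sum(A[j][k] for A in head_attns for j in range(n))
        / (len(head_attns) * n)
        for k in range(n)
    ]
    mu = sum(a_bar) / n
    # Sample variance of the per-position means: higher variance means
    # more selective, concentrated attention from the reasoning circuit.
    return sum((a - mu) ** 2 for a in a_bar) / n
```

A sharply peaked attention pattern (all queries focusing on one key) yields a higher score than the uniform lower-triangular pattern, matching the intuition that selective attention signals harder, multi-step problems.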
During the head ablation phase, a head’s attention is forcibly routed through an undifferentiated lower-triangular matrix (aᵢⱼ = 1/i for i ≥ j, zero otherwise), so that each head’s ablation impact is measured as an expected loss increase on a reference probe set:

  Δℒ(h) = E_{x ∈ D_probe}[ ℒ(x; h ablated) − ℒ(x) ].
Heads with the largest expected loss increases are retained as reasoning-critical.
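The head-ranking step can be sketched as follows. The `loss_fn(x, ablate=h)` interface is a hypothetical stand-in for running the model with head h routed through the uniform mask (`ablate=None` runs the intact model); it is an assumption of this sketch, not an API from the paper:

```python
def rank_heads_by_ablation(loss_fn, heads, probe_set):
    """Rank heads by expected loss increase when ablated (sketch).

    loss_fn(x, ablate=h): hypothetical callable returning the model loss
    on sample x with head h replaced by the uniform lower-triangular
    mask; ablate=None evaluates the unmodified model.
    """
    deltas = {}
    for h in heads:
        # Expected loss increase over the reasoning probe set.
        deltas[h] = sum(
            loss_fn(x, ablate=h) - loss_fn(x, ablate=None)
            for x in probe_set
        ) / len(probe_set)
    # Heads whose ablation hurts most are the reasoning-critical subset.
    ranked = sorted(heads, key=lambda h: deltas[h], reverse=True)
    return ranked, deltas
```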
3. Empirical Evaluation and Performance Impact
CircuitSeer’s methodology was validated across four LLMs (including Qwen2.5-Math-7B, Llama-3.2-3B, and Llama-3.1-8B-Instruct) and nine math reasoning datasets (AIME, AMC, MATH, Olympiad, various Grade-K benchmarks). Models were fine-tuned using (i) the full data and (ii) only the top 10% of data as scored by CircuitSeer, with baselines including random sampling, loss-based selection, quality scoring, and diversity selection.
The results demonstrated that fine-tuning with just the 10% CircuitSeer-selected data yielded a 1.4-point gain in average Pass@1 compared to fine-tuning on the full dataset. This improvement was robust across all model sizes and test sets. The selected subset consistently outperformed or matched full-data training—and outperformed established alternative selection heuristics—highlighting its ability to identify the most reasoning-relevant training examples (Wang et al., 21 Oct 2025).
4. Operational Algorithm and Mathematical Framework
The data selection pipeline in CircuitSeer is as follows:
- Reference Model Probing: Probe the LLM on a hand-verified reasoning dataset, systematically ablating each attention head and recording loss increases.
- Identification of Reasoning Heads: For each head, select as “reasoning-critical” if its ablation leads to a significant loss jump across probe samples.
- Data Scoring: For each candidate sample, pass it through the model, aggregating attention scores from only the reasoning heads. Compute the mean attention per token and the variance S(xᵢ) as described above.
- Sampling Policy: Normalize S(xᵢ) over the dataset to define a probability distribution and soft-sample (or rank) to select the most reasoning-rich subset for training.
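The sampling policy in the final step can be sketched as follows (a minimal illustration; “soft-sample” here means drawing from the normalized score distribution, while the top-10% experiments correspond to hard selection):

```python
def select_top_fraction(scores, frac=0.10):
    """Hard selection: rank samples by CircuitSeer score and keep the
    top fraction (as in the top-10% fine-tuning experiments)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, int(len(scores) * frac))
    return ranked[:k]


def score_distribution(scores):
    """Soft selection: normalize scores into a sampling distribution,
    assuming all scores are non-negative (variances always are)."""
    total = sum(scores)
    return [s / total for s in scores]
```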
The specific attention intervention replaces a head’s attention matrix with the custom masking matrix A defined by aᵢⱼ = 1/i for i ≥ j and aᵢⱼ = 0 otherwise, so that the ablated head attends uniformly to the current and all preceding tokens.
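The masking matrix itself is straightforward to construct (a minimal sketch directly from the definition, using 1-indexed i and j as in the formula):

```python
def uniform_causal_mask(n):
    """Ablation mask A with a_ij = 1/i for i >= j (1-indexed), else 0:
    each query attends uniformly to itself and all earlier tokens,
    so every row sums to 1 like a row-normalized attention matrix."""
    return [
        [1.0 / i if i >= j else 0.0 for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]
```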
5. Theoretical and Practical Implications
CircuitSeer demonstrates that a small number of attention heads (the “reasoning circuits”) in LLMs provide strong, interpretable signals about training sample complexity. By using the internal wiring and activation structure of the model, CircuitSeer is able to prioritize samples that require deep or multi-step reasoning, while excluding those that are simple, repetitive, or offer little further value for mathematical reasoning competence.
The method is distinguished by its efficiency: it eschews external models, avoids repeated auxiliary score computations, and requires only a single ablation pass over the LLM plus shallow forward passes for data ranking.
A plausible implication is that this approach could generalize beyond math to other structured reasoning domains where complex internal circuit activations correlate with human-relevant problem difficulty. It also bridges the field of mechanistic interpretability with practical data curation, using direct measurement of the model’s “reasoning circuits” to optimize learning.
6. Limitations and Future Directions
CircuitSeer, as presented, depends on explicit head ablation and a curated probe dataset for initial critical head detection. This requirement may be mitigated by developing unsupervised or self-supervised head selection techniques, or by exploiting broader patterns of internal activation across tasks. Additionally, extensions could explore input-conditional circuit detection or more granular subnetwork profiling. Finally, while the approach has shown success for mathematical reasoning, its generality and effectiveness in other domains (e.g., logic, code generation, deduction, and higher-order language tasks) warrant systematic exploration.
Further suggested work includes refining head detection criteria, comparing input-side and output-side attention flows, experimenting with harder soft-sampling strategies to balance diversity and reasoning density, and automating probe set construction.
7. Broader Context and Relationship to Prior Work
CircuitSeer represents a departure from prior data curation techniques, which typically rely on random sampling, external loss-based proxies, or heuristics not grounded in the model’s mechanistic function. By integrating the model’s internal logic circuit activity directly into data selection, CircuitSeer offers a more principled method to distill training sets for maximal reasoning gain. This suggests a promising new direction for curriculum learning, dataset construction, and efficient model scaling in the training of LLMs, particularly for domains where exhaustive data annotation is infeasible or cost-prohibitive (Wang et al., 21 Oct 2025).