Persistence of student-over-teacher advantage with more data
Determine whether repository-specialized supervised fine-tuning of Qwen 3-32B on synthetic trajectories continues to match or exceed the performance of the GLM-4.5-Air teacher as the number of repository-specific samples increases beyond approximately 8,000 per repository, or whether the advantage plateaus.
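One way to frame the check is to fit a trend to student scores at the sample counts already measured, extrapolate to a larger count, and compare a new measurement against that extrapolation. The sketch below is a minimal illustration only. It assumes a log-linear form (score = a + b * ln(n)), which is not necessarily the scaling law used in the paper, and every number in it is a made-up placeholder rather than a reported result. The function names `fit_log_linear`, `predict`, and `plateau_gap` are hypothetical.

```python
import math


def fit_log_linear(samples, scores):
    """Ordinary least-squares fit of score = a + b * ln(n)."""
    xs = [math.log(n) for n in samples]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(scores) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, n):
    """Extrapolated score at n samples under the fitted trend."""
    return a + b * math.log(n)


def plateau_gap(a, b, n, observed):
    """Observed score minus extrapolated score at n samples.

    A clearly negative value would suggest the trend is flattening.
    """
    return observed - predict(a, b, n)


# Placeholder values for illustration only; NOT taken from the paper.
samples = [1000, 2000, 4000, 8000]   # samples per repository
student = [30.0, 33.0, 36.0, 39.0]   # hypothetical student scores
teacher = 38.0                       # hypothetical teacher score

a, b = fit_log_linear(samples, student)
predicted_16k = predict(a, b, 16000)
still_ahead = predicted_16k > teacher
```

With real measurements, a new data point beyond 8,000 samples would be passed to `plateau_gap`. A gap near zero would be consistent with continued scaling, while a large negative gap would point toward a plateau.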
References
Our specialization results show that we can match or exceed teacher performance at around 8,000 samples per repository, and our scaling laws predict this trend continues with sufficient data. However, we could not verify whether this advantage scales further due to compute limitations.
Source: SERA: Soft-Verified Efficient Repository Agents (2601.20789, Shen et al., 28 Jan 2026), Section 9 (Limitations), "Matching teacher performance"