Persistence of student-over-teacher advantage with more data

Determine whether repository-specialized supervised fine-tuning of Qwen 3-32B on synthetic trajectories continues to match or exceed the performance of the GLM-4.5-Air teacher as the number of repository-specific samples increases beyond approximately 8,000 per repository, or whether the advantage plateaus.

Background

The authors show that specialization with about 8,000 trajectories per repository allows the student (Qwen 3-32B) to match or exceed the teacher (GLM-4.5-Air), and their scaling laws predict that this trend continues given sufficient data.

They acknowledge that compute constraints prevented them from validating whether this advantage persists at larger scales, leaving open whether the gains saturate or continue.
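One way to make the "saturate or continue" question concrete is to ask what a fitted scaling curve implies about its ceiling. The sketch below assumes a saturating power law, score(n) = s_max - a * n^(-b), which is a common choice for such fits but is not confirmed to be the form used in the paper. The function names and all numeric values are hypothetical and chosen only for illustration; none are results from the paper.

```python
import math

def predicted_score(n, s_max, a, b):
    """Assumed saturating power law: score(n) = s_max - a * n**(-b).

    s_max is the ceiling the student approaches as n grows.
    This functional form is an assumption, not the paper's fit.
    """
    return s_max - a * n ** (-b)

def samples_to_reach(target, s_max, a, b):
    """Return the sample count at which the curve reaches `target`.

    Returns None when the target is at or above the ceiling, i.e. the
    fitted curve plateaus before ever reaching that score.
    """
    if target >= s_max:
        return None
    # Solve s_max - a * n**(-b) = target for n.
    return (a / (s_max - target)) ** (1.0 / b)

# Illustrative (made-up) parameters: ceiling 0.5, a = 1.0, b = 0.5.
needed = samples_to_reach(0.4, s_max=0.5, a=1.0, b=0.5)
unreachable = samples_to_reach(0.6, s_max=0.5, a=1.0, b=0.5)
print(needed, unreachable)
```

Under this assumed form, the open question reduces to whether the fitted ceiling s_max sits above or below the teacher's score, which can only be estimated reliably with measurements at sample counts well beyond the current 8,000.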

References

Our specialization results show that we can match or exceed teacher performance at around 8,000 samples per repository, and our scaling laws predict this trend continues with sufficient data. However, we could not verify whether this advantage scales further due to compute limitations.

— SERA: Soft-Verified Efficient Repository Agents (2601.20789 - Shen et al., 28 Jan 2026) in Section 9 (Limitations), Matching teacher performance
