Cross-model generalization of findings
Determine whether the empirical findings reported for SERA using the Qwen 3-32B base model and GLM-4.5-Air/GLM-4.6 teachers generalize to other language model families when evaluated thoroughly.
References
While we have some experiments with Claude 3.7 Sonnet and Claude 4.0 Sonnet that hint at generalization of our method, we do not know whether our findings generalize to other model families when evaluated thoroughly.
— SERA: Soft-Verified Efficient Repository Agents
(2601.20789 - Shen et al., 28 Jan 2026) in Section 9 (Limitations), Model-specific results