Behavior outside the NTK linearization regime
Determine how SAMerging generalizes and performs, and whether its PAC-Bayes excess-risk guarantees still hold, when the local Neural Tangent Kernel (NTK) linearization assumption fails, specifically when the merged parameters lie far enough from the pretrained checkpoint that the NTK approximation is no longer valid.
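To make the notion of "leaving the linearization regime" concrete, here is a minimal sketch that compares a toy network's output with its first-order expansion around a reference parameter vector as the displacement grows. It is not the SAMerging procedure or the paper's analysis: the toy model, the names `model`, `grad_theta`, and `linearization_gap`, and the random displacement direction are all assumptions made for illustration.

```python
import numpy as np

def model(theta, x):
    """Toy one-hidden-layer network (hypothetical): f(x) = sum(tanh(W x))."""
    W = theta.reshape(4, x.shape[0])
    return np.tanh(W @ x).sum()

def grad_theta(theta, x):
    """Analytic gradient of the toy model with respect to theta."""
    W = theta.reshape(4, x.shape[0])
    pre = W @ x
    # d/dW_ij sum_i tanh(pre_i) = (1 - tanh(pre_i)^2) * x_j
    return np.outer(1.0 - np.tanh(pre) ** 2, x).ravel()

def linearization_gap(theta0, theta, x):
    """|f(theta) - f_lin(theta)|, with f_lin the first-order expansion at theta0."""
    f_lin = model(theta0, x) + grad_theta(theta0, x) @ (theta - theta0)
    return abs(model(theta, x) - f_lin)

rng = np.random.default_rng(0)
x = rng.normal(size=3)
theta0 = rng.normal(size=12)      # stands in for the pretrained checkpoint
direction = rng.normal(size=12)   # stands in for a merge displacement
direction /= np.linalg.norm(direction)

# Gap between the true output and its linearization at increasing distances.
gaps = {
    r: linearization_gap(theta0, theta0 + r * direction, x)
    for r in (0.0, 0.01, 0.1, 1.0, 10.0)
}
for r, g in gaps.items():
    print(f"distance {r:>5}: gap {g:.3e}")
```

In this toy setting the gap is zero at the reference point and grows roughly quadratically for small displacements. A similar diagnostic on a real merged model could indicate whether the merged parameters still sit where a local linearization is a reasonable approximation, though it would not by itself say anything about the excess-risk bound.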
References
The analysis assumes a local NTK-style linearization, so behavior far from this regime is uncertain.
— Model Merging via Multi-Teacher Knowledge Distillation
(2512.21288 - Dalili et al., 24 Dec 2025) in Section 5, Conclusion — Limitations and future work