Principledness of post‑hoc logit calibration after mask discretization
Determine whether fitting an affine calibration (a scale and a shift, optimized with 16 steps of L-BFGS) to the final logits after node masks have been discretized, as done in the structured pruning procedure for weight-sparse transformers, is principled and faithful to the pruned circuit, or whether this post-hoc adjustment introduces bias or artifacts into the evaluation of pruned circuits.
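To make the adjustment in question concrete, the following is a minimal sketch of a scale-and-shift fit, not the authors' implementation. It assumes a binary task in which each example has a scalar logit z and a 0/1 label, so the calibrated logit is a*z + b and both parameters affect the loss (under a multiclass softmax, a shift shared by all classes would cancel out). The function names, the toy data, the hand-written L-BFGS with backtracking line search, and the history size are all illustrative assumptions; only the "scale + shift" form and the 16-step budget come from the source.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def nll_and_grad(params, logits, labels):
    """Mean binary cross-entropy of sigmoid(a*z + b), plus its gradient in (a, b)."""
    a, b = params
    loss = grad_a = grad_b = 0.0
    for z, y in zip(logits, labels):
        t = a * z + b
        softplus = max(t, 0.0) + math.log1p(math.exp(-abs(t)))  # stable log(1 + e^t)
        p = math.exp(t - softplus)  # sigmoid(t), computed without overflow
        loss += softplus - y * t
        grad_a += (p - y) * z
        grad_b += p - y
    n = len(logits)
    return loss / n, [grad_a / n, grad_b / n]


def fit_affine_calibration(logits, labels, steps=16, history=10):
    """Fit (scale, shift) with a fixed budget of L-BFGS iterations (hypothetical helper)."""
    x = [1.0, 0.0]  # start from the identity calibration
    fx, g = nll_and_grad(x, logits, labels)
    pairs = []  # curvature pairs (s, y, rho), oldest first
    for _ in range(steps):
        # Two-loop recursion: approximate inverse-Hessian times gradient.
        q, alphas = list(g), []
        for s, y, rho in reversed(pairs):
            alpha = rho * dot(s, q)
            q = [qi - alpha * yi for qi, yi in zip(q, y)]
            alphas.append(alpha)
        if pairs:
            s, y = pairs[-1][:2]
            gamma = dot(s, y) / dot(y, y)  # initial inverse-Hessian scaling
            q = [gamma * qi for qi in q]
        for (s, y, rho), alpha in zip(pairs, reversed(alphas)):
            beta = rho * dot(y, q)
            q = [qi + (alpha - beta) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]
        # Backtracking line search with the Armijo sufficient-decrease condition.
        slope, t = dot(g, d), 1.0
        while True:
            x_new = [xi + t * di for xi, di in zip(x, d)]
            f_new, g_new = nll_and_grad(x_new, logits, labels)
            if f_new <= fx + 1e-4 * t * slope:
                break
            t *= 0.5
            if t < 1e-10:  # no acceptable step found; keep the current point
                return x, fx
        s = [xn - xi for xn, xi in zip(x_new, x)]
        y = [gn - gi for gn, gi in zip(g_new, g)]
        sy = dot(s, y)
        if sy > 1e-12:  # store the pair only if the curvature condition holds
            pairs = (pairs + [(s, y, 1.0 / sy)])[-history:]
        x, fx, g = x_new, f_new, g_new
    return x, fx


def make_toy_data(n=2000, seed=0):
    """Synthetic miscalibrated logits: labels follow sigmoid(0.5*z - 1), not sigmoid(z)."""
    rng = random.Random(seed)
    logits = [rng.gauss(0.0, 2.0) for _ in range(n)]
    labels = [1.0 if rng.random() < 1.0 / (1.0 + math.exp(-(0.5 * z - 1.0))) else 0.0
              for z in logits]
    return logits, labels


logits, labels = make_toy_data()
loss_before, _ = nll_and_grad([1.0, 0.0], logits, labels)
(scale, shift), loss_after = fit_affine_calibration(logits, labels)
print(f"scale={scale:.3f} shift={shift:.3f} loss {loss_before:.4f} -> {loss_after:.4f}")
```

In this toy setting the fit can only lower the loss on the data it was fitted to, which is one way to see the concern: if the same examples are used both to fit the two parameters and to score the pruned circuit, part of the reported loss reflects the calibration step and not the circuit itself. Whether that matters in the paper's setting is exactly the open question.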
References
As we find that our discretized models often are quite uncalibrated, we optimize a scale+shift transformation to the final logits using 16 steps of LBFGS. It's unclear whether this is principled to do in general.
— Weight-sparse transformers have interpretable circuits
(2511.13653 - Gao et al., 17 Nov 2025) in Appendix, Method details, Subsection “Pruning algorithm,” Mask discretization paragraph