Persistence of INFUSION perturbations through post-training
Determine whether perturbations computed by INFUSION persist through post-training, that is, whether they retain their adversarial effects after procedures such as fine-tuning or alignment.
References
Key open questions: can INFUSION scale to frontier models, and can perturbations persist through post-training?
— Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions
(2602.09987 - Rosser et al., 10 Feb 2026) in Section 7, Discussion — Defenses and future work