SDF performance on highly capable future models
Determine whether Synthetic Document Finetuning (SDF) remains effective at implanting deep, robust beliefs in future, highly capable language models, and characterize how its effectiveness scales with model capability and with increased inference-time compute.
References
While these scaling results are encouraging, it is unclear how SDF will perform on highly capable future systems. Appendix \ref{appendix:future_models} provides some evidence that SDF may scale favorably (effectiveness increases with model size and persists even when models know about the technique), though continued evaluation will be necessary.
Source: Slocum et al., "Believe It or Not: How Deeply do LLMs Believe Implanted Facts?" (arXiv:2510.17941, 20 Oct 2025), Section 4.2, Subsubsection "SDF is robust to increased inference-time compute".