Investigate the tractability of scaling formal verification and program synthesis from toy models to frontier systems
Investigate the tractability of scaling mechanistic interpretability–inspired approaches to formal verification and program synthesis from toy models to frontier AI systems, and identify the technical barriers to such scaling.
References
Several open questions remain about the tractability of scaling these approaches from toy models to frontier systems.
— Open Problems in Mechanistic Interpretability
(arXiv:2501.16496, Sharkey et al., 27 Jan 2025), in "Using mechanistic interpretability for better predictions about AI systems", under "Predicting behavior in novel situations" (Section 3.3.1, paragraph on formal verification and program synthesis)