Self-play in code generation
Determine whether and how self-play can be effectively realized for large language model-based code generation, given that unit-test-based verification is brittle and susceptible to reward hacking and error propagation, so that models can reliably learn without human supervision.
Sponsor
References
Hence self-play in code generation remains an open problem.
— Propose, Solve, Verify: Self-Play Through Formal Verification
(2512.18160 - Wilf et al., 20 Dec 2025) in Section 1 (Introduction)