Reliable synthesis of natural language issue descriptions in self-play
Determine whether, and how, Self-play SWE-RL can reliably generate high-quality, unambiguous natural language issue descriptions during self-play. Generation must avoid collapsing into copies of the test patch or into logically incoherent, repetitive descriptions, so that agents can work from natural language specifications rather than only from formal test patches.
References
While this design minimizes data assumptions and proves effective for learning, our initial attempts failed to reliably generate high-quality and unambiguous issue descriptions. The generated issues tend to copy test patches, are logically incoherent, and collapse to identical patterns.
— Toward Training Superintelligent Software Agents through Self-Play SWE-RL
(2512.18552 - Wei et al., 21 Dec 2025) in Discussion, Subsection "Unsuccessful attempts"