Does extended DGM runtime surpass closed-source SWE-bench systems?
Determine whether allocating more iterations and compute to the Darwin Gödel Machine (DGM), an open-ended, self-improving system that iteratively modifies its own code to design LLM-based coding agents, continues to yield performance gains on SWE-bench, and whether those gains can eventually exceed the performance of closed-source state-of-the-art SWE-bench systems.
References
However, it still falls short of closed-source SoTA SWE-bench solutions. An open question is whether running the DGM for longer would continue to yield performance gains and eventually surpass closed-source solutions.
— Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
(2505.22954 - Zhang et al., 29 May 2025) in Section 6: Conclusion and Limitations