- The paper presents Escher-Loop, a framework that evolves both task and optimizer agents through closed-loop self-referential optimization to outperform static baselines.
- It employs a dynamic benchmarking mechanism using pairwise Elo evaluations, ensuring robust and adaptive resource allocation across complex optimization landscapes.
- Empirical results demonstrate superior peak scores and stability on tasks like Kissing Number and Circle Packing, highlighting scalable, autonomous self-improvement.
Escher-Loop: Mutual Evolution by Closed-Loop Self-Referential Optimization
Motivation and Framework Architecture
The Escher-Loop framework targets a core limitation of current autonomous agent systems: they are dominated by manually crafted workflows and static heuristics. It formalizes mutual evolution between two populations: task agents, which solve concrete problems, and optimizer agents, which refine both the task agents and themselves through recursive optimization. This enables dynamic, open-ended improvement grounded in empirical task feedback. The closed-loop architecture eschews human priors, in line with the "bitter lesson" that scalable computation and adaptive search outperform domain-specific intuition, and operationalizes self-referential optimization at the population level.
Figure 1: Escher-Loop schematic illustrating the closed-loop interaction between task and optimizer agents, with empirical task scores providing dual-purpose feedback for both agent populations.
Dynamic Benchmarking Mechanism
A central innovation in Escher-Loop is its dynamic benchmarking strategy. Absolute performance scores are non-stationary because the task population keeps evolving, so Escher-Loop instead runs pairwise competitions among optimizer agents and derives relative win-loss signals from empirical task outcomes. These signals update optimizer ratings via an Elo rating system, giving a robust, fair evaluation without additional computational overhead. The same empirical task scores thus serve both task evaluation and optimizer benchmarking, which focuses computation on promising optimizers and dynamically refines optimization strategies.
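To make the rating step concrete, here is a minimal sketch of a standard Elo update driven by a pairwise comparison of task scores. The K-factor of 32, the 400-point scale, the initial rating of 1500, and the names `expected_score`, `elo_update`, and `outcome_from_task_scores` are illustrative assumptions; the summary does not state which constants or tie rules Escher-Loop uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (logistic curve, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is an assumed K-factor, not a value taken from the paper.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def outcome_from_task_scores(task_score_a: float, task_score_b: float) -> float:
    """Turn two empirical task scores (higher is better) into a result for A."""
    if task_score_a > task_score_b:
        return 1.0
    if task_score_a < task_score_b:
        return 0.0
    return 0.5


# Hypothetical example: two optimizers start at the same rating, and the
# task agent produced by opt_a scores higher than the one produced by opt_b.
ratings = {"opt_a": 1500.0, "opt_b": 1500.0}
result = outcome_from_task_scores(0.82, 0.79)
ratings["opt_a"], ratings["opt_b"] = elo_update(ratings["opt_a"], ratings["opt_b"], result)
print(ratings)  # {'opt_a': 1516.0, 'opt_b': 1484.0}
```

Because only the sign of the score difference enters the update, the rating is insensitive to drift in the absolute scale of task scores, which is the property the pairwise design relies on.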
Empirical Evaluation
Escher-Loop was empirically assessed against static hand-engineered optimizer baselines using mathematical search tasks including Kissing Number (KN), Circle Packing (CP), and Heilbronn Triangle (HT). All methods operated under matched compute budgets, measured in equivalent tokens.
Figure 2: Comparative best-so-far task performance between the baseline optimizer and Escher-Loop across three optimization tasks. Escher-Loop consistently achieves higher peak scores and shows robustness against plateauing.
Across all tasks, Escher-Loop outperformed static baselines in terms of both converged scores and cross-run stability. In complex landscapes (KN, CP), Escher-Loop reliably avoided suboptimal plateaus and achieved the highest absolute scores. In the simpler HT landscape, the baseline showed early efficiency, but Escher-Loop ultimately reached a superior peak.
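The best-so-far curves and the matched-budget comparison can be illustrated with a short sketch. The function names, the per-evaluation token costs, and the example numbers below are hypothetical; they only show how a running maximum and a token cutoff would be computed, not how the paper's evaluation harness is implemented.

```python
from itertools import accumulate


def best_so_far(scores):
    """Running maximum of a score sequence (higher is better)."""
    return list(accumulate(scores, max))


def best_within_budget(token_costs, scores, budget):
    """Best score among the evaluations that fit within a token budget.

    Evaluations are consumed in order; the first one that would exceed the
    budget, and everything after it, is ignored. Returns None if nothing fits.
    """
    spent = 0
    best = None
    for cost, score in zip(token_costs, scores):
        spent += cost
        if spent > budget:
            break
        best = score if best is None else max(best, score)
    return best


# Hypothetical run: four evaluations costing 100 tokens each.
scores = [0.2, 0.5, 0.4, 0.7]
print(best_so_far(scores))                          # [0.2, 0.5, 0.5, 0.7]
print(best_within_budget([100] * 4, scores, 250))   # 0.5
```

Comparing methods at the same `budget` value, rather than at the same iteration count, is what keeps the comparison fair when one method spends more tokens per step.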
Figure 3: Best-so-far task performance for baseline versus the single best evolved optimizer (SBO) from Escher-Loop. SBO outperforms baseline in complex tasks but does not exceed the joint performance ceiling achieved by evolving populations.
Analysis of the single best evolved optimizer (SBO) from Escher-Loop reveals that static adoption of a fixed optimizer is suboptimal; only the full mutual-evolution population achieves maximal results, particularly in more complex search domains.
Optimizer Population Dynamics
Figure 4: Elo trajectories of optimizer agents. Top agents receive denser evaluation and their trajectories reflect continuous refinement and reallocation of computational resources.
The dynamic benchmarking approach, reflected in optimizer Elo trajectories, allocates resources toward competitive optimizers and maintains hierarchical population diversity. While exploitation dominates, exploration ensures lower-ranked optimizers can ascend, demonstrating a dynamic equilibrium consistent with evolutionary principles.
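One way to obtain this mix of exploitation and exploration is to sample optimizers with probability increasing in Elo while reserving a small uniform share for everyone. The sketch below uses a softmax over ratings blended with uniform sampling; this rule, the `temperature` and `epsilon` values, and the function names are assumptions for illustration, since the summary does not specify the actual selection scheme.

```python
import math
import random


def selection_probs(elos, temperature=100.0, epsilon=0.1):
    """Map optimizer name -> probability of being chosen for the next round.

    A softmax over Elo ratings favours strong optimizers (exploitation);
    the epsilon share is spread uniformly so weak ones still get evaluated.
    """
    names = list(elos)
    top = max(elos.values())
    # Subtract the top rating before exponentiating for numerical stability.
    weights = [math.exp((elos[n] - top) / temperature) for n in names]
    total = sum(weights)
    uniform = 1.0 / len(names)
    return {
        n: (1.0 - epsilon) * w / total + epsilon * uniform
        for n, w in zip(names, weights)
    }


def pick_optimizer(elos, rng, **kwargs):
    """Sample one optimizer name according to selection_probs."""
    probs = selection_probs(elos, **kwargs)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Hypothetical population of three optimizers.
elos = {"opt_a": 1600.0, "opt_b": 1500.0, "opt_c": 1400.0}
print(selection_probs(elos))
print(pick_optimizer(elos, random.Random(0)))
```

With this rule the top-rated optimizer is evaluated most often, matching the denser trajectories of top agents in Figure 4, while every optimizer keeps a nonzero chance of being selected and can therefore climb the ranking.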
Emergence of Advanced Optimization Strategies
Self-referential optimization gives rise to sophisticated emergent behaviors within the optimizer agent population, as illustrated in Figure 5.
Figure 5: Diagnostic feedback mechanism in evolved optimizers, featuring explicit recovery, plateau signaling, and diversity management for adaptive search.
Evolved optimizers exhibit nontrivial behaviors such as adaptive search control (shifting between exploration and exploitation based on evolutionary signals), stage-aware system prompting, reference-program mining, and explicit diagnostic feedback for regression recovery and diversity warnings. These qualitatively advanced strategies transcend static, handcrafted prompts, reinforcing the effectiveness of population-level mutual evolution.
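As a rough illustration of what explicit diagnostic feedback could look like, the sketch below labels a score history as a regression, a plateau, or neither, and maps the label to a hint. The thresholds, labels, hint wording, and function names are all hypothetical; the evolved optimizers described in the paper produce their own logic, which is not reproduced here.

```python
def diagnose(history, window=5, min_gain=1e-6):
    """Return a diagnostic label for a score history (higher is better).

    "regression": the latest score dropped below the previous one.
    "plateau":    the best score of the last `window` steps does not beat
                  the best earlier score by at least `min_gain`.
    "ok":         anything else, including histories too short to judge.
    """
    if len(history) >= 2 and history[-1] < history[-2] - min_gain:
        return "regression"
    if len(history) > window:
        best_earlier = max(history[:-window])
        best_recent = max(history[-window:])
        if best_recent - best_earlier < min_gain:
            return "plateau"
    return "ok"


def feedback_message(label):
    """Map a diagnostic label to a hint that could be shown to an optimizer."""
    hints = {
        "regression": "Latest candidate scored worse; revert to the best known program.",
        "plateau": "No recent improvement; increase exploration and diversity.",
        "ok": "Progress is steady; continue refining the current approach.",
    }
    return hints[label]


print(diagnose([0.5, 0.4]))                                # regression
print(diagnose([0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]))       # plateau
print(feedback_message(diagnose([0.1, 0.2, 0.3, 0.4])))    # steady-progress hint
```

Switching between the "plateau" and "ok" hints is a simple form of the adaptive search control described above: the search leans toward exploration only when recent evidence shows exploitation has stalled.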
Theoretical and Practical Implications
Escher-Loop demonstrates that optimization ability is a continuously improvable and comparable property, not a fixed mechanism. The closed-loop coupling induces evolutionary pressure analogous to natural selection, enabling optimizer agents to adapt and refine their strategies based on real task execution feedback. Joint evolution eliminates reliance on static workflows, leading to open-ended discovery in computational tasks.
Practically, this framework holds promise for scalable, autonomous agents capable of meta-cognitive self-improvement across diverse domains. Theoretically, it provides a pathway toward systems where optimization and generalization are unified via transferable, evolving strategies—supporting future multi-task and cross-domain applications where improvement is an intrinsic system property.
Future Directions
Escher-Loop opens avenues for further exploration in cross-domain multi-task optimization, population-based recursive self-improvement, and universal benchmarking strategies. Integrating more expressive agent representations, richer environmental dynamics, and asynchronous co-evolution frameworks (cf. CORAL, Qu et al., 2 Apr 2026; CoEvolve, Yang et al., 17 Apr 2026) may yield even greater autonomy and scalability.
Conclusion
Escher-Loop advances autonomous agent frameworks by operationalizing closed-loop, self-referential mutual evolution of task agents and optimizers. The robust empirical results across optimization landscapes and emergence of qualitatively advanced strategies underscore the superiority of population-level recursive adaptation over static heuristics. This perspective highlights a fundamental shift from optimizing individual task solutions to optimizing the optimization process itself—suggesting that scalable intelligence is predicated on continual, grounded improvement of optimization capacities. Future work should explore the generalization potential of Escher-Loop across broader domains and validate its capacity for universal, cross-task transferability.