- The paper introduces a unified framework and taxonomy for self-evolving agents, breaking down evolution into what, when, how, and where to evolve.
- It details methodologies including reward-based, imitation, and evolutionary strategies to enable adaptive, self-improving systems.
- The survey highlights challenges such as personalization, generalization, safety, and dynamic evaluation to advance robust, adaptive AI architectures.
A Comprehensive Survey of Self-Evolving Agents: Foundations, Taxonomy, and Pathways Toward Artificial Super Intelligence
The paper "A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence" (2507.21046) provides a systematic and in-depth review of the emerging paradigm of self-evolving agents, positioning them as a critical step toward the realization of Artificial Super Intelligence (ASI). The survey introduces a unified theoretical and practical framework for understanding, designing, and evaluating self-evolving agents, dissecting the field along the axes of what, when, how, and where to evolve, and establishing a roadmap for future research and deployment.
Figure 1: A conceptual trajectory illustrating the progression from LLMs to foundation agents, self-evolving agents, and ultimately toward hypothetical ASI, with increasing intelligence and adaptivity.
Motivation and Conceptual Foundations
The static nature of current LLMs, which are unable to adapt their internal parameters or workflows in response to novel tasks or dynamic environments, is identified as a fundamental bottleneck for deploying robust, general-purpose AI agents. The survey formalizes the notion of self-evolving agents as systems capable of continual, autonomous adaptation—modifying not only their model parameters but also their context, toolset, and architecture in response to real-world feedback and experience.
The paper provides a formal definition of self-evolving agents within a POMDP framework, introducing the self-evolving strategy as a transformation f that maps the current agent system to a new state, conditioned on observed trajectories and feedback. The objective is to maximize cumulative utility across a sequence of tasks, generalizing beyond traditional curriculum learning, lifelong learning, and model editing paradigms by enabling active exploration, structural modification, and self-reflection.
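In notation, this formalization might be sketched as below; the symbols (Π_t for the agent system, τ_t and r_t for the observed trajectory and feedback, U for task utility) are chosen for illustration and may differ from the paper's exact notation.

```latex
% A sketch of the self-evolving-agent objective; the notation is illustrative,
% not necessarily the paper's exact symbols.
\[
  \Pi_{t+1} \;=\; f\!\left(\Pi_t,\ \tau_t,\ r_t\right),
  \qquad
  f^{*} \;=\; \arg\max_{f}\ \mathbb{E}\!\left[\sum_{t=1}^{T} U\!\left(\Pi_t,\ \mathcal{T}_t\right)\right]
\]
% \Pi_t  : the agent system before task t (parameters, context, tools, architecture)
% \tau_t : the trajectory observed while solving task \mathcal{T}_t
% r_t    : the feedback received (scalar reward, textual critique, etc.)
% U      : the utility achieved on task \mathcal{T}_t by agent \Pi_t
```

Read this way, the strategy f is itself the object of optimization: it rewrites the agent after each task so that utility accumulated over the whole task sequence, rather than on any single task, is maximized.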
Taxonomy: What, When, How, and Where to Evolve
The survey organizes the field around four orthogonal dimensions, each illustrated with representative methods and systems.
Figure 2: A comprehensive overview of self-evolving agents across key dimensions: what, when, how, and where to evolve, with evaluation goals and paradigms.
What to Evolve
- Model: Includes both policy (parameter) evolution and experience-driven adaptation. Methods such as SCA, SELF, and RAGEN enable agents to generate their own training data, refine parameters via self-generated supervision, and leverage environmental feedback for continual improvement.
- Context: Encompasses prompt optimization and memory evolution. Techniques like PromptBreeder, DSPy, and Mem0 allow agents to refine instructions and manage long-term memory, supporting both in-context adaptation and knowledge retention.
- Tool: Covers autonomous tool creation, mastery, and scalable management. Systems such as Voyager, Alita, and ToolGen enable agents to discover, synthesize, and select tools, moving from tool users to tool makers.
- Architecture: Involves both single-agent and multi-agent system optimization. Approaches like AgentSquare, Darwin Gödel Machine, and AFlow demonstrate the evolution of agentic workflows, modular design, and even self-rewriting code (a structural sketch of these four evolvable components follows this list).
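Taken together, these four dimensions can be read as independently evolvable components of one agent state. The sketch below illustrates that decomposition in Python; the field names and the shape of the `evolve` step are assumptions of this illustration, not an interface from any system cited in the survey.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Illustrative decomposition of an agent along the survey's 'what to evolve' axis."""
    policy_weights: dict = field(default_factory=dict)  # Model: parameters updated via self-generated supervision or RL
    system_prompt: str = ""                             # Context: instructions refined by prompt optimization
    memory: list = field(default_factory=list)          # Context: long-term experience store (e.g. past trajectories)
    tools: dict = field(default_factory=dict)           # Tool: callables the agent has discovered or synthesized
    workflow: list = field(default_factory=list)        # Architecture: ordered modules / sub-agents in the pipeline

def evolve(state: AgentState, trajectory, feedback, strategy) -> AgentState:
    """One self-evolution step: the strategy decides which component(s) to rewrite
    given the observed trajectory and feedback, and returns the updated state."""
    return strategy(state, trajectory, feedback)
```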
When to Evolve
Figure 3: An overview of when to evolve, contrasting intra-test-time (online, within-task) and inter-test-time (offline, between-task) self-evolution.
- Intra-test-time: Adaptation occurs during task execution, leveraging in-context learning, test-time supervised fine-tuning, or reinforcement learning. Reflexion and AdaPlanner exemplify real-time self-reflection and plan revision (a minimal sketch of this retry-with-critique pattern follows this list).
- Inter-test-time: Learning happens retrospectively, using accumulated trajectories for offline supervised fine-tuning (SFT) or reinforcement learning (RL). SELF, STaR, and RAGEN illustrate iterative self-improvement and curriculum-driven policy updates.
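As a concrete illustration of intra-test-time evolution, the sketch below captures the retry-with-self-critique loop popularized by Reflexion; the callables `act`, `evaluate`, and `reflect` stand in for LLM and environment calls and are assumptions of this sketch rather than an actual API.

```python
def solve_with_reflection(task, act, evaluate, reflect, max_attempts=3):
    """Intra-test-time self-evolution in the spirit of Reflexion: the agent retries a
    task, feeding a verbal self-critique of each failed attempt into the next one."""
    reflections = []  # accumulated textual self-feedback (episodic memory for this task)
    answer = None
    for _ in range(max_attempts):
        answer = act(task, reflections)            # attempt the task, conditioned on past critiques
        success, feedback = evaluate(task, answer)
        if success:
            return answer
        # Distill raw evaluator/environment feedback into a verbal lesson for the next try.
        reflections.append(reflect(task, answer, feedback))
    return answer  # best effort after exhausting the retry budget
```

Inter-test-time methods such as SELF and STaR follow the same basic pattern, except that the accumulated trajectories and critiques are used offline, between tasks, to update the underlying model rather than only to steer the next attempt.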
How to Evolve
Figure 4: Overview of reward-based self-evolution strategies, categorized into textual, implicit, internal, and external rewards.
- Reward-based: Utilizes scalar rewards, textual feedback, or internal confidence as learning signals. Methods such as Reflexion, SCoRe, and TextGrad demonstrate the efficacy of language-based and numerical feedback for self-improvement.
- Imitation/Demonstration: Agents learn from self-generated or cross-agent demonstrations, as in STaR and SiriuS, bootstrapping reasoning and multimodal capabilities.
- Population-based/Evolutionary: Maintains populations of agent variants, leveraging selection, mutation, and self-play. Darwin Gödel Machine and GENOME exemplify open-ended evolution and genetic optimization (a minimal selection-and-mutation loop is sketched below).
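The population-based strategy can be illustrated with a minimal selection-and-mutation loop; the hyperparameters and callables below are assumptions of this sketch, not the actual Darwin Gödel Machine or GENOME procedures.

```python
import random

def evolve_population(init_variants, evaluate, mutate,
                      generations=20, keep_top=4, population_size=12):
    """Population-based self-evolution: keep the fittest agent variants and refill
    the population with mutated copies. `evaluate` scores a variant on benchmark
    tasks and `mutate` perturbs its prompts, tools, or workflow (both assumed)."""
    population = list(init_variants)
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        elites = ranked[:keep_top]                  # selection: retain the strongest variants
        offspring = [mutate(random.choice(elites))  # mutation: derive new candidates from elites
                     for _ in range(population_size - keep_top)]
        population = elites + offspring
    return max(population, key=evaluate)
```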
Figure 5: Illustration of cross-cutting evolutionary dimensions: learning paradigm (offline/online), policy consistency (on/off-policy), and reward granularity (process-based, outcome-based, hybrid).
Where to Evolve
Figure 6: Categorization of where to evolve: general domain evolution (broad capability enhancement) vs. specific domain evolution (domain-specific expertise).
- General Domain: Agents evolve to handle diverse digital tasks, leveraging memory mechanisms, curriculum learning, and model-agent co-evolution (e.g., Voyager, WebEvolver).
- Specialized Domain: Focused on coding, GUI automation, finance, medicine, and education, with agents like SICA, QuantAgent, and Agent Hospital demonstrating domain-specific self-improvement and knowledge accumulation.
Evaluation: Metrics and Paradigms
Figure 7: Overview of evaluation angles for self-evolving agents, including adaptivity, retention, generalization, safety, and efficiency, across static, short-horizon, and long-horizon paradigms.
The survey emphasizes the need for longitudinal, dynamic evaluation frameworks that capture not only immediate task success but also adaptation speed, knowledge retention (mitigating catastrophic forgetting), generalization to novel domains, efficiency, and safety. It distinguishes between static assessment, short-horizon adaptation, and long-horizon lifelong learning, highlighting emerging benchmarks such as LifelongAgentBench and LTMBenchmark.
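As an illustration of what such longitudinal evaluation measures, the sketch below computes generic adaptation and forgetting scores over a task sequence; these are standard continual-learning-style definitions offered as an example, not necessarily the exact formulas used by LifelongAgentBench or LTMBenchmark.

```python
def adaptation_and_forgetting(score_matrix):
    """score_matrix[i][j] is the agent's score on task j, measured after it has
    finished adapting to task i (the matrix is built up over the task sequence).
    Returns (average adaptation, average forgetting) as illustrative metrics."""
    n = len(score_matrix)
    # Adaptation: how well the agent performs on each task right after learning it.
    adaptation = sum(score_matrix[i][i] for i in range(n)) / n
    # Forgetting on task j: best score ever reached on j minus the final score on j.
    forgetting = [
        max(score_matrix[i][j] for i in range(j, n)) - score_matrix[n - 1][j]
        for j in range(n - 1)
    ]
    avg_forgetting = sum(forgetting) / len(forgetting) if forgetting else 0.0
    return adaptation, avg_forgetting
```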
Implications, Open Challenges, and Future Directions
The paper identifies several critical challenges and research directions:
- Personalization: Developing agents that can rapidly adapt to individual user preferences and behaviors, even under limited data, while avoiding bias amplification.
- Generalization and Continual Learning: Achieving robust cross-domain transfer and scalable architecture design while mitigating catastrophic forgetting in resource-constrained settings.
- Safety and Controllability: Ensuring agents remain aligned with human values, avoid unsafe behaviors, and can be reliably controlled as they autonomously evolve.
- Multi-Agent Ecosystems: Balancing individual and collective reasoning, enabling efficient collaboration, and developing dynamic evaluation frameworks for multi-agent systems.
The survey underscores the necessity of integrating advances in models, algorithms, data, and evaluation to realize the full potential of self-evolving agents as precursors to ASI. It calls for the development of more adaptive, robust, and trustworthy agentic systems capable of open-ended, autonomous evolution.
Conclusion
This survey establishes a comprehensive, multi-dimensional framework for understanding and advancing self-evolving agents. By systematically dissecting the field along the axes of what, when, how, and where to evolve, and by highlighting evaluation methodologies and open challenges, the paper provides a foundational reference for researchers and practitioners. The trajectory from static LLMs to self-evolving agents is positioned as a necessary step toward ASI, with significant implications for the design of adaptive, robust, and safe AI systems in both research and real-world deployments.