Meta-Evolving Architectures (MemEvolve)

Updated 17 April 2026

Meta-Evolving Architectures (MemEvolve) are dynamic systems that evolve both model parameters and computational designs through bilevel meta-learning.
They leverage gradient-based optimization and evolutionary strategies to adapt neural architectures, memory modules, and agent toolsets collaboratively.
Empirical evaluations reveal significant gains in efficiency and performance across few-shot learning, NAS benchmarks, and co-evolutionary agent frameworks.

Meta-Evolving Architectures (MemEvolve) refer to a family of algorithmic and system-level approaches where not only model parameters or agent experience are adapted during learning, but the very computational architecture—be it a neural network’s topology, an agent’s memory structure, or an LLM-based agent’s toolset—is itself subject to meta-level adaptation or evolutionary processes. This paradigm departs from conventional fixed-architecture settings by treating the space of architectures as a learnable object, evolving under principled bilevel objectives or co-evolutionary feedback loops. Core motivations stem from the limitations of static model and memory designs in few-shot learning, neural architecture search (NAS), and autonomous agent evolution. MemEvolve methodologies have yielded new computational frameworks for meta-learning, task-aware architecture ranking, experience- and tool-driven agent development, and modular co-evolution of memory systems, demonstrating empirical gains across diverse benchmarks.

1. Foundational Principles and Formalization

Meta-evolving architectures operate via a dual-level optimization, commonly formalized as a bilevel meta-learning problem. For neural models, this framework jointly meta-learns a continuous encoding of architecture parameters $\alpha$ and standard network weights $\theta$, situated within a distribution of tasks $\tau \sim p(\tau)$, each with small training and validation splits. The formal meta-objective is

$\min_{\theta, \alpha} \sum_{\tau \sim p(\tau)} L_\text{val}(\theta^\tau, \alpha^\tau; D_\text{val}^\tau)$

subject to inner-loop adaptation:

$(\theta^\tau, \alpha^\tau) = \underset{\theta', \alpha'}{\arg\min}\; L_\text{train}(\theta', \alpha'; D_\text{train}^\tau)$

Inner updates simultaneously optimize architectural and parameter space via gradient steps:

$\begin{aligned} \theta^{\tau}_{j+1} &= \theta^{\tau}_j - \lambda_\text{task} \nabla_\theta L_\text{train}(\theta^{\tau}_j, \alpha^{\tau}_j; D^\tau_\text{train}) \\ \alpha^{\tau}_{j+1} &= \alpha^{\tau}_j - \eta_\text{task} \nabla_\alpha L_\text{train}(\theta^{\tau}_j, \alpha^{\tau}_j; D^\tau_\text{train}) \end{aligned}$

This end-to-end differentiable formulation allows alignment of architecture evolution with meta-learning efficiency (Elsken et al., 2019). For agent frameworks, the principle generalizes: an outer loop evolves the architecture of memory, tool, or agent systems, while an inner loop accumulates and distills experience under the current architecture (Zhang et al., 21 Dec 2025, Cheng et al., 13 Apr 2026).
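The simultaneous inner-loop updates can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the setup of the cited work: the toy model mixes two candidate operations (identity and square) with softmax weights over $\alpha$, gradients are approximated by central differences instead of backpropagation, and the data, step sizes, and function names are hypothetical.

```python
import math

def softmax(a):
    m = max(a)
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]

def train_loss(theta, alpha, data):
    """Mean squared error of a softmax-weighted mixture of two candidate ops."""
    w = softmax(alpha)
    total = 0.0
    for x, y in data:
        pred = theta * (w[0] * x + w[1] * x * x)  # op 0: identity, op 1: square
        total += (pred - y) ** 2
    return total / len(data)

def num_grad(f, v, eps=1e-5):
    """Central-difference gradient of f at the list v (stand-in for autodiff)."""
    grad = []
    for i in range(len(v)):
        up = v[:i] + [v[i] + eps] + v[i + 1:]
        dn = v[:i] + [v[i] - eps] + v[i + 1:]
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

def inner_adapt(theta, alpha, data, k=100, lr_theta=0.05, lr_alpha=0.2):
    """k joint gradient steps; both gradients are taken at the step-j values."""
    for _ in range(k):
        g_theta = num_grad(lambda t: train_loss(t[0], alpha, data), [theta])[0]
        g_alpha = num_grad(lambda a: train_loss(theta, a, data), alpha)
        theta = theta - lr_theta * g_theta
        alpha = [a - lr_alpha * g for a, g in zip(alpha, g_alpha)]
    return theta, alpha

# Hypothetical task whose targets follow y = 2 * x^2, so the "square" op should win.
data = [(x, 2.0 * x * x) for x in (0.5, 1.0, 1.5, 2.0)]
loss_before = train_loss(1.0, [0.0, 0.0], data)
theta_t, alpha_t = inner_adapt(1.0, [0.0, 0.0], data)
loss_after = train_loss(theta_t, alpha_t, data)
```

In this toy task the adapted mixture weights shift toward the square operation while the weight scale adapts at the same time, which is the joint behavior the two update equations describe.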

2. Methodologies: Learning and Evolving Architectures

MemEvolve algorithms implement meta-evolutionary processes using variants of gradient-based meta-learning, neural architecture search, and evolutionary strategies.

2.1 Gradient-Based Bilevel Optimization

A canonical MemEvolve instantiation is MetaNAS (Elsken et al., 2019), which integrates differentiable architecture search (DARTS) with gradient-based meta-learning algorithms (MAML, REPTILE). At each meta-training iteration, a batch of tasks is sampled, architectures and weights are jointly adapted via $k$ inner steps, and meta-gradients are computed using task validation losses. Updates can use second-order (MAML) or first-order (REPTILE) estimators. The method applies soft-pruning with an annealed temperature, which pushes the architecture weights toward a near one-hot choice during task adaptation so that the discretized architecture can be used at test time without retraining.
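Two ingredients of this paragraph can be sketched in a few lines: a temperature-scaled softmax that sharpens as the temperature is annealed, and a first-order REPTILE-style meta-update that moves the meta-parameters toward the task-adapted ones. This is a simplified illustration, not the MetaNAS implementation; the linear schedule, the start and end temperatures, and all function names are assumptions.

```python
import math

def tempered_softmax(alpha, temperature):
    """Softmax over alpha / temperature; lower temperature gives a sharper mixture."""
    scaled = [a / temperature for a in alpha]
    m = max(scaled)
    e = [math.exp(s - m) for s in scaled]
    z = sum(e)
    return [v / z for v in e]

def linear_temperature(step, total_steps, t_start=1.0, t_end=0.05):
    """Hypothetical linear annealing schedule from t_start down to t_end."""
    frac = step / max(1, total_steps - 1)
    return t_start + frac * (t_end - t_start)

def reptile_update(meta, adapted, meta_lr):
    """First-order REPTILE step: move meta-parameters toward the adapted ones."""
    return [m + meta_lr * (a - m) for m, a in zip(meta, adapted)]

# Illustrative architecture logits for three candidate operations on one edge.
alpha = [0.2, 0.5, 0.1]
schedule = [linear_temperature(s, 5) for s in range(5)]
mixtures = [tempered_softmax(alpha, t) for t in schedule]
```

As the temperature falls, the largest entry of each mixture grows toward 1, so taking the argmax at the end changes the network very little. The same `reptile_update` rule can be applied to both the weight vector and the architecture vector.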

2.2 Evolutionary and Ranking-Based Architecture Search

An alternative MemEvolve methodology employs meta-learned ranking networks to guide architecture evolution. A two-tower MLP is trained with a pairwise ranking loss to predict, for any architecture encoding $\alpha$ and task meta-feature vector $m(T)$, a score approximating its expected task performance. For new tasks, architectures are optimized either via gradient ascent in the encoding space or through evolutionary strategies that mutate, cross over, and select architectures according to the learned ranker, without training child models on the target task (Dubatovka et al., 2019).
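A minimal sketch of ranker-guided evolutionary search follows. The logistic pairwise loss is one common choice and is assumed here, since the exact loss is not specified above; the binary architecture encoding, the `stand_in_ranker` function (which replaces a trained two-tower network), and the `target` vector are all hypothetical.

```python
import math
import random

def pairwise_ranking_loss(score_better, score_worse):
    """Logistic pairwise loss: small when the better architecture scores higher."""
    return math.log1p(math.exp(-(score_better - score_worse)))

def evolve(score, init_population, generations=30, seed=0):
    """Elitist evolutionary search that only queries the ranker, never trains a model."""
    rng = random.Random(seed)
    population = [list(a) for a in init_population]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: len(population) // 2]  # selection: keep the top half
        children = []
        while len(parents) + len(children) < len(population):
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, len(p))
            child = p[:cut] + q[cut:]      # one-point crossover
            i = rng.randrange(len(child))
            child[i] = 1 - child[i]        # bit-flip mutation
            children.append(child)
        population = parents + children
    return max(population, key=score)

# Hypothetical stand-in for a learned ranker conditioned on task meta-features:
# it prefers encodings close to a task-specific target pattern.
target = [1, 0, 1, 1, 0, 1]

def stand_in_ranker(arch):
    return -sum(a != t for a, t in zip(arch, target))

init_rng = random.Random(1)
init = [[init_rng.randint(0, 1) for _ in range(6)] for _ in range(8)]
best = evolve(stand_in_ranker, init)
```

Because the top half of each generation is carried over unchanged, the best ranker score never decreases across generations. The cost per candidate is one ranker evaluation, which is the saving this approach aims for.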

2.3 Modular Memory System Evolution in LLM Agents

For LLM-based agentic systems, MemEvolve refers to a bilevel optimization over memory architectures. The memory system $\Omega$ is modularized into Encode (E), Store (U), Retrieve (R), and Manage (G) primitives. The inner loop exposes each candidate memory system $\Omega$ to agent-environment interaction, growing and utilizing memory. The outer loop applies evolutionary operators (diagnosing performance, inducing targeted mutations across modules, and selecting survivors based on Pareto-ranked score vectors), so that the structure of the memory system itself adapts for downstream performance and efficiency (Zhang et al., 21 Dec 2025).
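The outer-loop selection step can be sketched as Pareto filtering over score vectors, combined with single-module mutation of a four-part memory system. This is an illustrative sketch only: the `MemorySystem` fields hold placeholder strings rather than real module implementations, the module names are invented, and the score vectors (accuracy, negated cost) are made-up numbers, not reported results.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MemorySystem:
    """A candidate memory architecture as a 4-tuple of module choices."""
    encode: str
    store: str
    retrieve: str
    manage: str

def mutate(system, module, new_impl):
    """Targeted mutation: swap exactly one module, keep the other three."""
    return replace(system, **{module: new_impl})

def dominates(a, b):
    """a dominates b if it is no worse on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Names of candidates whose score vectors are not dominated by any other."""
    return [name for name, s in scores.items()
            if not any(dominates(o, s) for other, o in scores.items() if other != name)]

base = MemorySystem("raw_trajectory", "flat_list", "keyword", "none")
candidates = {
    "base": base,
    "a": mutate(base, "retrieve", "embedding"),
    "b": mutate(base, "manage", "aggressive_prune"),
    "c": mutate(base, "encode", "summarized_steps"),
}
# Made-up inner-loop results: (task accuracy, negated cost), higher is better.
scores = {"base": (0.52, -1.0), "a": (0.61, -1.2), "b": (0.50, -1.5), "c": (0.58, -0.9)}
survivors = pareto_front(scores)
```

With these illustrative scores, candidate `c` dominates `base` and `b`, while `a` survives because it has the highest accuracy despite its higher cost. Survivors would seed the next round of diagnosis and mutation.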

3. Architectures, Memory, and Co-Evolutionary Mechanisms

MemEvolve frameworks encompass algorithms that evolve neural architectures, agent capabilities, toolsets, or memory structures.

Continuous Architecture Encoding: Neural network operations are encoded as softmax-weighted mixtures across network edges, generalizing DARTS (Elsken et al., 2019, Dubatovka et al., 2019).
Memory Architecture Modularity: Agent memory is factored into encode/store/retrieve/manage modules, supporting a combinatorial design space. Each architecture is a 4-tuple $\Omega = (E, U, R, G)$, evolved across meta-iterations for agentic performance (Zhang et al., 21 Dec 2025).
Co-Evolution in Agent Systems: Frameworks such as Mem $\theta$ 2Evolve (Cheng et al., 13 Apr 2026) introduce dual "Experience" and "Asset" memories. Agent trajectories lead to distillation of experience units, while asset libraries (tools/agents) expand dynamically, each driving and constraining the evolution of the other through task planning, recruitment, self-correction, and reflective distillation loops.
Ranking Networks for Architecture Search: Learned rankers, exploiting both architecture encodings and task meta-features, guide evolutionary search or gradient ascent in architecture space, effectively sidestepping the cost of full NAS per task (Dubatovka et al., 2019).
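The interplay between the dual "Experience" and "Asset" memories in the co-evolution item above can be sketched as follows. This is a toy illustration under stated assumptions, not the cited framework's code: the class name `DualMemory`, the Jaccard word-overlap similarity, the reuse threshold of 0.5, and the `make_tool` callback (standing in for LLM-driven tool creation) are all hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two task descriptions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

class DualMemory:
    def __init__(self, reuse_threshold=0.5):
        self.assets = {}       # asset memory: task description -> callable tool
        self.experience = []   # experience memory: distilled notes from past runs
        self.reuse_threshold = reuse_threshold

    def find_asset(self, task):
        """Return the most similar stored asset if it clears the reuse threshold."""
        best = max(self.assets, key=lambda d: jaccard(d, task), default=None)
        if best is not None and jaccard(best, task) >= self.reuse_threshold:
            return best
        return None

    def solve(self, task, make_tool):
        desc = self.find_asset(task)
        created = desc is None
        if created:
            # Tool creation can consult accumulated experience.
            desc = task
            self.assets[desc] = make_tool(task, self.experience)
        result = self.assets[desc](task)
        # Each run leaves a distilled note that informs later asset creation.
        self.experience.append(f"{'created' if created else 'reused'} tool for: {task}")
        return result, created

def make_tool(task, experience):
    """Hypothetical stand-in for LLM-driven tool synthesis."""
    return lambda t: f"handled: {t}"

memory = DualMemory()
runs = [memory.solve(t, make_tool) for t in (
    "sum numbers in list",
    "sum numbers in a list",
    "fetch web page title",
)]
```

The second task reuses the asset created for the first because their descriptions overlap strongly, while the third triggers creation of a new asset. Both memories grow together, each run adding experience and, when needed, a new asset.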

4. Experimental Results and Empirical Evaluation

The cited studies report gains in performance, efficiency, and transferability for MemEvolve methods. Headline results are summarized below.

Application Domain	Benchmark/Metric	MemEvolve Gain
Neural Architecture	MiniImageNet (1-shot)	63.1% (new SOTA, <1.1M params)
Memory Architecture	WebWalkerQA (pass@1)	52.35% → 61.18% (+17.06%)
Agent Co-Evolution	GAIA (Pass@1)	+18.53% (vs. LLM baseline)
Ranking-based NAS	10 NLP tasks	+1–3% over L2 predictor; $\theta$ 3 ≈ 0.9

On Omniglot and MiniImageNet, meta-evolving architectures (with DARTS/REPTILE) surpass pure meta-learners and most hand-tuned meta-NAS methods, with post-scaling results exceeding previous SOTA models. For memory architecture evolution, relative performance boosts over static and hand-engineered systems are observed on xBench, WebWalkerQA, and cross-framework scenarios (Zhang et al., 21 Dec 2025). Co-evolutionary asset and experience memory approaches yield major improvements over experience- or asset-only agents, notably increasing first-pass asset validity and halving debug iterations (Cheng et al., 13 Apr 2026).

5. Codebases, Implementation Substrates, and Reproducibility

Broad reproducibility of MemEvolve research is supported by unified open-source codebases and modular design spaces.

EvolveLab (Zhang et al., 21 Dec 2025): Implements a standardized substrate for agent memory systems with a base abstract class for encode/store/retrieve/manage, and 12 representative systems spanning prior agent memory designs. Accepted data carriers include MemoryItem, TrajectoryData, and API-standardized request/response interfaces. It supports benchmarked comparison under consistent task streams, backbone LLMs, and cost/latency settings.
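The idea of a shared abstract base class for encode/store/retrieve/manage can be sketched as below. This is not the EvolveLab API: the class names `BaseMemory` and `KeywordMemory`, the fields of `MemoryItem`, and every method signature are assumptions chosen for illustration, and trajectories are reduced to lists of strings.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    """Hypothetical data carrier: a text snippet plus keyword tags."""
    text: str
    tags: set = field(default_factory=set)

class BaseMemory(ABC):
    """Four-primitive interface that every memory system must implement."""

    @abstractmethod
    def encode(self, trajectory):
        """Turn a raw trajectory into memory items."""

    @abstractmethod
    def store(self, items):
        """Persist encoded items."""

    @abstractmethod
    def retrieve(self, query, k=3):
        """Return up to k items relevant to the query."""

    @abstractmethod
    def manage(self):
        """Maintain the store, e.g. by pruning."""

class KeywordMemory(BaseMemory):
    """Toy implementation: keyword overlap retrieval with a capacity cap."""

    def __init__(self, capacity=100):
        self.items = []
        self.capacity = capacity

    def encode(self, trajectory):
        return [MemoryItem(step, set(step.lower().split())) for step in trajectory]

    def store(self, items):
        self.items.extend(items)

    def retrieve(self, query, k=3):
        q = set(query.lower().split())
        ranked = sorted(self.items, key=lambda it: len(it.tags & q), reverse=True)
        return [it for it in ranked[:k] if it.tags & q]

    def manage(self):
        self.items = self.items[-self.capacity:]  # keep only the newest items

mem = KeywordMemory(capacity=2)
mem.store(mem.encode(["open search page", "click result link", "read page title"]))
mem.manage()
hits = mem.retrieve("page title", k=1)
```

Keeping all systems behind one interface is what allows different memory designs to be swapped into the same agent loop and compared under identical task streams.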
Mem $\theta$ 4Evolve (Cheng et al., 13 Apr 2026): Code and data are released; core dependencies include Python, OpenAI GPT-5-chat API, web search integrations, and sandboxed tool execution. Key hyperparameters involve similarity thresholds for asset retrieval/reuse ( $\theta$ 5), self-correction budgets, and LLM inference settings.

For these agent-side frameworks, no model fine-tuning is needed; the computational burden typically arises from LLM prompt calls and code validation, not gradient-based training.

6. Conceptual and Practical Insights

MemEvolve approaches demonstrate several systematic advantages:

Two-Level Adaptation: Simultaneous evolution of architecture and parameters (or experience and assets) supports tractable adaptation over diverse task distributions and rapid reconfiguration to new demands.
Adaptive Modularity: The ability to mutate selective elements (e.g., encoding, retrieval, storage) delivers task-aligned architectures surpassing monolithic, fixed designs. Diagnosis-driven module mutation yields parsimony, avoiding bloat.
Empirical Grounding and Maintenance: Diagnose-and-design procedures guide evolution via real-world failure/success cases and cost-performance analysis, including scheduled pruning and memory maintenance.
Enhanced Transfer and Lifelong Learning: Meta-evolved architectures designed for one benchmark frequently transfer effectively to new tasks, agent frameworks, and LLM backbones, with minimal negative transfer (Zhang et al., 21 Dec 2025, Cheng et al., 13 Apr 2026).
Broader Agent Capability Spaces: Co-evolving asset (tool/agent) and experience memory enables broad capability expansion and stable skill synthesis, overcoming the limitations of static toolsets or purely improvisational creation.

A plausible implication is that MemEvolve-style adaptive frameworks offer a foundational mechanism toward open-ended, generalizable, and efficient intelligence across domains where both structure and process must adaptively scale.