- The paper introduces LLM-SAEA, an LLM-driven framework that dynamically configures surrogate-assisted evolutionary algorithms using online self-reflection and expert collaboration.
- The paper demonstrates statistically significant performance gains and rapid convergence across 10D and 30D benchmark functions compared to state-of-the-art methods.
- The paper shows that automating online configuration with LLMs reduces manual heuristic design, offering practical benefits for real-world expensive optimization tasks.
LLM-Driven Surrogate-Assisted Evolutionary Algorithm for Expensive Optimization
Introduction and Motivation
The paper "LLM-Driven Surrogate-Assisted Evolutionary Algorithm for Expensive Optimization" (2507.02892) addresses the algorithmic challenge of dynamically configuring surrogate-assisted evolutionary algorithms (SAEAs) for expensive black-box optimization problems (EOPs). The core contribution is LLM-SAEA, a framework in which LLMs automatically perform online configuration of key algorithmic components, specifically the selection of surrogate models and infill sampling strategies, during SAEA optimization. By leveraging the contextual knowledge and decision-making capabilities encoded in LLMs, this approach reduces dependence on human-designed heuristics, reinforcement learning, and multi-armed bandit formulations.
The motivation is grounded in difficulties inherent to static and hand-crafted dynamic algorithm configuration schemes: static choices lack adaptivity to varying optimization landscapes, whereas existing dynamic approaches (e.g., RL, MAB) require significant manual engineering and are sensitive to hyperparameters, reward functions, and state-action representations. LLMs provide a mechanism to synthesize contextual information and learned priors across problem types and optimization histories, promising improved automation and generalization.
Methodology
Overall Framework: Collaboration-of-Experts with LLMs
LLM-SAEA formulates SAEA configuration as a collaboration between two LLM-based experts:
- LLM-DE (Decision Expert): Selects, at each iteration, a subset of model-infill criterion pairs (from a fixed set of 8 actions) based on action scores, frequencies, optimization state, and self-reflection (confidence labeling).
- LLM-SE (Scoring Expert): Assigns quality scores to executed actions using solution performance, which are then used for decision-making in subsequent iterations.
This collaboration-of-experts paradigm allows for context-driven configuration choices, with the DE utilizing both score-based argmax/softmax selection and an explicit self-reflection mechanism to manage action selection confidence and diversity.
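The score-based selection with a self-reflection fallback can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function names `softmax_select` and `select_actions`, the `temperature` parameter, and the handling of "certain"/"uncertain" labels are all assumptions made for the sketch.

```python
import math
import random

def softmax_select(scores, temperature=1.0, rng=random):
    """Sample one action index with probability proportional to exp(score / T)."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    r = rng.random()
    cum = 0.0
    for i, e in enumerate(exps):
        cum += e / total
        if r < cum:
            return i
    return len(scores) - 1

def select_actions(recommended, confidences, scores, k=2):
    """Keep the decision expert's 'certain' recommendations; fill the
    remaining slots by score-driven softmax sampling over all actions."""
    chosen = [a for a, c in zip(recommended, confidences) if c == "certain"]
    while len(chosen) < k:
        cand = softmax_select(scores)
        if cand not in chosen:
            chosen.append(cand)
    return chosen
```

In this reading, the scoring expert's feedback enters through `scores`, so actions that performed well in earlier iterations are more likely to be drawn when the decision expert flags its own recommendation as uncertain.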
Algorithmic Details
- Action Space: The configurable actions comprise combinations of four surrogate models (GP, RBF, PRS, KNN) with several infill criteria (EI, LCB, Prescreening, Local Search, L1-exploration/exploitation). Actions can invoke evolutionary offspring generation (e.g., differential evolution operators) or local optimization on the surrogate.
- Initialization: Latin hypercube sampling seeds the initial population and database.
- Optimization Loop: In each iteration, LLM-DE recommends a candidate action set considering current action statistics and iteration context. Actions are sampled and executed; performance is scored by LLM-SE; statistics are updated.
- Self-Reflection: LLM-DE provides both action recommendations and confidence labels ("certain" or "uncertain"); "uncertain" actions are augmented using a score-driven selection mechanism.
- Surrogate Model Training: Per-action model fitting is triggered on the selected solution set, using context-appropriate regressors and infill strategies.
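Two of the infill criteria named in the action space, EI and LCB, have standard closed forms given a Gaussian surrogate prediction. A minimal sketch for minimization follows; the `xi` and `beta` parameters are conventional defaults, not values taken from the paper.

```python
import math

def expected_improvement(mu, sigma, f_best, xi=0.0):
    """EI for minimization: E[max(f_best - f(x) - xi, 0)] under a
    Gaussian surrogate prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(f_best - mu - xi, 0.0)
    z = (f_best - mu - xi) / sigma
    # Standard normal pdf and cdf, the latter via math.erf
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_best - mu - xi) * cdf + sigma * pdf

def lower_confidence_bound(mu, sigma, beta=2.0):
    """LCB for minimization: smaller values are more promising;
    larger beta weights the surrogate's uncertainty more heavily."""
    return mu - beta * sigma
```

EI favors points whose predicted improvement over the incumbent is large relative to the surrogate's uncertainty, while LCB trades off predicted value against uncertainty directly, which is one way the action set spans the exploration-exploitation spectrum.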
Complexity Analysis
The overall complexity is dominated by the number of true black-box evaluations. LLM inference and surrogate modeling overheads are asymptotically negligible given the computational cost of the objective function in EOPs.
Experimental Results
Benchmark Suite and Baselines
Evaluation is performed on a diverse set of expensive optimization test problems: classical benchmark functions (Ellipsoid, Rosenbrock, Ackley, Griewank, Rastrigin) and 15 canonical complex test functions at D=10 and D=30 dimensions, compared to eight leading SAEA baselines, including both static (e.g., ESAO, IKAEA, GL-SADE) and dynamic (e.g., ESA, AutoSAEA) approaches.
Main Results
- Numerical Superiority: LLM-SAEA achieves statistically significant improvements in function error on the majority of problems, outperforming all static and dynamic state-of-the-art baselines on 10D and 30D testbeds, except for ties in a few cases against advanced dynamic schemes.
- Statistical Tests: The Friedman test assigns LLM-SAEA the best average rank, and the associated p-values indicate that its advantage over competing methods is statistically significant.
- Convergence: LLM-SAEA demonstrates robust and rapid convergence behavior across diverse functions, illustrating effective exploration-exploitation balancing.
- Ablation Studies: Variants with static or random action selection, alternative dynamic selectors (Q-learning), or removal of self-reflection and collaboration mechanisms all yield significantly degraded performance, highlighting the necessity and impact of the LLM-based configuration pipeline.
Mechanistic Insights
- Dynamic Adaptivity: Frequency analyses of action selection across functions reveal behavioral shifts in surrogate and infill-criterion choice, demonstrating effective context sensitivity: the LLM-based decision process adapts its configuration strategy to the optimization state and problem structure.
- Self-Reflection and Collaboration: Explicit self-assessment in LLM-DE mitigates overconfidence and error propagation; the interplay between DE and SE supports robust credit assignment for action efficacy.
- Parameter Insensitivity: The single hyperparameter (population size N) demonstrates stable performance; recommended values in [80,120] yield consistent results.
Implications
Practical Impact
Incorporating LLMs to automate online algorithmic configuration obviates the laborious manual engineering previously necessary for high-performance SAEA deployment. This has direct implications for real-world expensive design and engineering optimization scenarios (e.g., crashworthiness, supply chains, drug discovery, traffic networks) where problem characteristics and budgets may shift dynamically, and domain expert knowledge may not be readily available.
Theoretical Significance
The LLM-SAEA architecture points to the viability of treating large-scale neural priors as generic meta-optimizers or configuration agents across combinatorial algorithmic spaces. Importantly, the collaboration-of-experts paradigm, leveraging both context understanding and self-evaluation, shows robust statistical advantages over existing RL/MAB-based formulations.
Future Research
Open directions include:
- Scaling the approach to richer surrogate-model and infill-criterion spaces and to higher-dimensional problems.
- Exploring architectural or prompt engineering mechanisms to increase LLM decision interpretability and reliability.
- Integrating LLM-driven configuration with problem-adaptive surrogate modeling and hybrid optimization frameworks.
- Investigating transfer and few-shot configuration on novel EOPs.
Conclusion
LLM-SAEA demonstrates that LLMs, embedded as decision-making and scoring modules, provide a scalable and adaptive approach to the dynamic configuration of surrogate-assisted evolutionary algorithms for expensive black-box optimization. The empirical results show significant performance gains over both traditional and learning-driven baselines, and the methodological ideas of algorithmic self-reflection and collaboration-of-experts offer a broad foundation for further AI-augmented optimization research.