Adaptive Meta Black-box Optimization (ABOM)
- ABOM is a meta-optimization framework that integrates offline meta-learning, surrogate modeling via an Attentive Neural Process, and Bayesian optimization for rapid adaptation to unseen tasks.
- It constructs a diverse meta-dataset through parallel simulation and uses well-calibrated uncertainty estimates to significantly reduce the number of expensive online evaluations.
- Empirical evaluations in urban traffic management demonstrate ABOM's ability to reduce the number of waiting vehicles and improve throughput with fewer than 100 trials.
The Adaptive Meta Black-box Optimization Model (ABOM) formalizes a class of meta-optimization frameworks that leverage offline meta-learning and modern surrogate modeling to enable rapid, data-efficient adaptation to unseen black-box optimization tasks. The paradigm is motivated by real-world applications such as urban traffic network design, where conventional optimization or heuristic controllers fail to generalize due to heterogeneity and limited sample budgets. ABOM, as introduced in traffic light management (Yun et al., 2024), combines an offline meta-dataset of task–design–performance pairs, a meta-learned Attentive Neural Process (ANP) surrogate, and Bayesian optimization to deliver sample-efficient, uncertainty-aware optimization of system-level designs such as traffic phase configurations and timing.
1. Formulation of the Meta-Black-box Optimization Problem
ABOM considers a family of black-box optimization tasks drawn from an unknown distribution $\rho$. Each task $p \sim \rho$ (e.g., a traffic pattern) induces a black-box objective $f_p(x)$ that evaluates high-level design variables $x$ (e.g., intersection phase combinations, green-time allocations). The core challenge is to efficiently optimize $f_{p^*}$ for a new, unseen pattern $p^* \sim \rho$ using as few expensive function evaluations as possible.
ABOM operates in a meta-learning regime: it is provided with an offline meta-dataset $\mathcal{D} = \{\mathcal{D}_{p_i}\}_{i=1}^{N}$, where each $\mathcal{D}_{p_i} = \{(x_j, y_j)\}$ comprises design–performance pairs collected under historical tasks $p_i$ (traffic scenarios). This meta-dataset is used to inform and "warm-start" the search for a good design in a new task $p^*$, given a limited budget of online simulations.
Key components:
- Variable: global traffic-light design vector $x = (x_1, \dots, x_M)$, with one sub-vector $x_m$ per intersection $m$ and a fixed set of phase/timing choices at each.
- Objective: $f_{p^*}(x)$, noisy, nonconvex, and non-differentiable.
- Data: $\mathcal{D}$ aggregates prior realizations of $(x, f_{p_i}(x))$ for the historical tasks $p_1, \dots, p_N$.
- Goal: Use $\mathcal{D}$ to minimize the number of online evaluations needed to optimize $f_{p^*}$.
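As a toy illustration of this setup, the sketch below models a family of tasks, each inducing a noisy black-box objective over a design vector. `make_task` and its piecewise-linear shape are hypothetical stand-ins for a traffic pattern and its simulator, not the paper's actual objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(pattern_seed):
    """Hypothetical stand-in for a traffic pattern p: returns a noisy,
    nonconvex, non-differentiable black-box objective f_p over a design x."""
    task_rng = np.random.default_rng(pattern_seed)
    shift = task_rng.uniform(-1.0, 1.0, size=4)  # task-specific optimum location

    def f(x, noise=0.1):
        # Higher is better; non-differentiable via |.| and corrupted by noise.
        return -np.sum(np.abs(x - shift)) + noise * rng.standard_normal()

    return f

# A "new, unseen" task drawn from the same family
f_new = make_task(pattern_seed=42)
y = f_new(np.zeros(4))  # one expensive evaluation
```

The meta-learning question is then how to minimize the number of such `f_new` calls by exploiting data collected from related tasks.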
2. Construction and Role of the Offline Meta-Dataset
Data collection is performed entirely offline in parallelized simulation. For each reference pattern $p_i$, candidate designs $x$ are sampled uniformly from the design space (phase-combination one-hot logits, or unnormalized green-time splits). Their simulated performance yields $\mathcal{D}_{p_i} = \{(x_j, y_j)\}$ without sequential dependency, ensuring the dataset is unbiased and exploits parallel compute.
This process is repeated independently for each pattern $p_i$, $i = 1, \dots, N$, yielding a meta-dataset that covers a diverse spectrum of traffic conditions and associated interventions. The data supports learning a prior over the mapping from designs to outcomes, enabling generalization to unseen traffic scenarios in a principled few-shot adaptation framework.
A held-out validation split (a $5{:}1$ train/val split over the patterns $p_i$) is used for hyperparameter tuning of the meta-surrogate.
3. Attentive Neural Process Surrogate: Architecture and Training
3.1 Latent Neural Process Structure
The ABOM surrogate is an Attentive Neural Process (ANP), modeling the conditional stochastic process $p(y_T \mid x_T, C)$ over targets given a context set:
- For any context set $C = \{(x_i, y_i)\}$ and any target set $T = \{(x_t, y_t)\}$, the model defines a consistent predictive distribution $p(y_T \mid x_T, C)$.
- ANP introduces a global latent vector $z$ with prior $p(z)$ and defines:
Encoder: a representation $r_i = \mathrm{enc}(x_i, y_i)$ for each context pair; updated via self-attention over $\{r_i\}$ and aggregated into a context summary.
Recognition model: $q(z \mid C)$, a Gaussian parameterized from the aggregated context summary.
Decoder: $p(y_t \mid x_t, r^*(x_t), z)$, where $r^*(x_t)$ is a cross-attention over the context representations conditioned on the query $x_t$.
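To make the query-conditioned aggregation concrete, here is a minimal NumPy sketch of the cross-attention step $r^*(x_t)$, using plain dot-product scores over context design locations (a simplification of the learned multi-head attention an ANP would use):

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_target, x_context, r_context):
    """Query-conditioned aggregation r*(x_target): attend over context
    representations, using design locations as keys and queries."""
    d = x_target.shape[-1]
    scores = x_target @ x_context.T / np.sqrt(d)  # (n_tgt, n_ctx)
    w = softmax(scores, axis=-1)                  # attention weights
    return w @ r_context                          # (n_tgt, d_r)

# Two context points with distinguishable representations
x_ctx = np.array([[0.0], [1.0]])
r_ctx = np.array([[1.0, 0.0], [0.0, 1.0]])
# A query nearer the second context point attends mostly to it
r_star = cross_attention(np.array([[1.0]]), x_ctx, r_ctx)
```

The weights sum to one per query, so $r^*$ stays a convex combination of context representations, biased toward contexts relevant to the query.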
3.2 Training Objective
The surrogate is meta-trained across all tasks to maximize the evidence lower bound (ELBO):

$$\log p(y_T \mid x_T, C) \;\geq\; \mathbb{E}_{q(z \mid C \cup T)}\Big[\sum_{t \in T} \log p(y_t \mid x_t, r^*(x_t), z)\Big] \;-\; \mathrm{KL}\big(q(z \mid C \cup T)\,\|\,q(z \mid C)\big)$$
This process alternates between random context/target splits of each task's dataset $\mathcal{D}_{p_i}$, encouraging the model to learn transferable structure for rapid adaptation. The ANP models predictive mean and variance, naturally encoding the epistemic uncertainty critical for Bayesian optimization.
3.3 Inference
During online adaptation on a new task $p^*$, the ANP receives the current online trials $C_t = \{(x_s, y_s)\}_{s \le t}$ as context and returns:
Predictive mean: $\mu(x) = \mathbb{E}[y \mid x, C_t]$.
Predictive variance: $\sigma^2(x) = \mathrm{Var}[y \mid x, C_t]$. These are estimated via Monte Carlo sampling of $z \sim q(z \mid C_t)$ or by moment matching.
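A sketch of the Monte Carlo estimate via the law of total variance, where `sample_mu_sigma` is a hypothetical hook standing in for one decoder pass with a fresh latent draw $z \sim q(z \mid C_t)$:

```python
import numpy as np

def mc_predictive_moments(sample_mu_sigma, n_samples=256, seed=0):
    """Monte Carlo estimate of the predictive mean and variance by averaging
    decoder outputs over latent samples z ~ q(z | context)."""
    rng = np.random.default_rng(seed)
    mus, sigmas = [], []
    for _ in range(n_samples):
        mu, sigma = sample_mu_sigma(rng)  # one decoder pass for one z draw
        mus.append(mu)
        sigmas.append(sigma)
    mus, sigmas = np.array(mus), np.array(sigmas)
    mean = mus.mean(axis=0)
    # Law of total variance: E_z[sigma^2] + Var_z[mu]
    var = (sigmas ** 2).mean(axis=0) + mus.var(axis=0)
    return mean, var

# Toy decoder: mu varies with z, observation noise fixed at sigma = 0.5
decoder = lambda rng: (rng.standard_normal(), 0.5)
mean, var = mc_predictive_moments(decoder)
```

Splitting the variance this way keeps the epistemic part (spread of `mu` across latent draws) separate from the aleatoric part (`sigma` itself), which is what makes the estimate useful for acquisition functions.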
4. Bayesian Optimization with the ANP Surrogate
ABOM applies Bayesian optimization (BO) to maximize $f_{p^*}$ using the ANP as a probabilistic surrogate. Standard acquisition functions are employed:
- Upper Confidence Bound (UCB): $\alpha_{\mathrm{UCB}}(x) = \mu(x) + \beta\,\sigma(x)$
- Expected Improvement (EI): $\alpha_{\mathrm{EI}}(x) = (\mu(x) - y^{+})\,\Phi(u) + \sigma(x)\,\phi(u)$, where $u = (\mu(x) - y^{+})/\sigma(x)$ and $y^{+}$ is the best value observed so far
- Probability of Improvement (PI): $\alpha_{\mathrm{PI}}(x) = \Phi\big((\mu(x) - y^{+})/\sigma(x)\big)$
Constraints are enforced:
For phase time allocation: simplex constraints on each intersection's green-time vector $x_m$: $x_{m,k} \geq 0$ and $\sum_k x_{m,k} = 1$.
For phase combinations: argmax over softmax probabilities ensures valid discrete selection.
Acquisition maximization over feasible $x$ is solved with L-BFGS-B on the continuous logits.
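The three acquisition functions and the softmax reparameterization of the simplex constraint can be written directly from the formulas above; this is a generic sketch, not the paper's implementation:

```python
import math
import numpy as np

def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def ucb(mu, sigma, beta=2.0):
    """Upper Confidence Bound: mean plus a beta-weighted uncertainty bonus."""
    return mu + beta * sigma

def ei(mu, sigma, best):
    """Expected Improvement over the best observed value (maximization)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    u = (mu - best) / sigma
    return (mu - best) * norm_cdf(u) + sigma * norm_pdf(u)

def pi(mu, sigma, best):
    """Probability of Improvement over the best observed value."""
    if sigma <= 0.0:
        return float(mu > best)
    return norm_cdf((mu - best) / sigma)

def green_time_split(logits):
    """Softmax maps unconstrained logits to a valid green-time allocation on
    the simplex (non-negative, sums to 1), so the acquisition can be
    maximized unconstrained over the logits, e.g. with L-BFGS-B."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

s = green_time_split(np.array([0.0, 1.0, 2.0]))
```

Reparameterizing through the softmax is what turns the constrained search over timings into an unconstrained continuous problem.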
Optimization Loop
At each online trial:
Compute the surrogate mean $\mu(\cdot)$ and variance $\sigma^2(\cdot)$ given the current context $C_t$.
Maximize the acquisition function under the constraints to propose the next design $x_{t+1}$.
Evaluate $y_{t+1} = f_{p^*}(x_{t+1})$ via simulation.
Augment the online context with $(x_{t+1}, y_{t+1})$. After the budget is exhausted, the best design found is returned.
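The loop above can be sketched as follows, with `toy_surrogate` as a hypothetical stand-in for the meta-trained ANP's predictive moments and a fixed candidate grid in place of L-BFGS-B over logits:

```python
import numpy as np

def bo_loop(f, surrogate, candidates, budget, beta=2.0):
    """Minimal BO loop with a UCB acquisition over a fixed candidate set.
    surrogate(X_ctx, y_ctx, candidates) -> (mu, sigma) is an illustrative
    interface for the surrogate's predictive moments."""
    X_ctx, y_ctx = [], []
    for _ in range(budget):
        mu, sigma = surrogate(np.array(X_ctx), np.array(y_ctx), candidates)
        x_next = candidates[np.argmax(mu + beta * sigma)]  # propose design
        y_next = f(x_next)                                 # expensive evaluation
        X_ctx.append(x_next)                               # grow online context
        y_ctx.append(y_next)
    best = int(np.argmax(y_ctx))
    return X_ctx[best], y_ctx[best]

def toy_surrogate(X_ctx, y_ctx, cand):
    """Toy surrogate: nearest-neighbor mean, uncertainty shrinking near data."""
    if len(X_ctx) == 0:
        return np.zeros(len(cand)), np.ones(len(cand))
    dist = np.abs(cand[:, None] - X_ctx[None, :]).min(axis=1)
    mu = np.array([y_ctx[np.argmin(np.abs(X_ctx - c))] for c in cand])
    return mu, np.minimum(dist, 1.0)

cand = np.linspace(0.0, 1.0, 21)
x_best, y_best = bo_loop(lambda x: -(x - 0.3) ** 2, toy_surrogate, cand, budget=10)
```

Even with this crude surrogate, the UCB trade-off steers evaluations toward the optimum at 0.3 within a handful of trials; the meta-trained ANP plays the same role with far better-calibrated moments.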
5. Algorithmic Workflow and Pseudocode
Phase 1: Offline Data Collection
- For $i = 1$ to $N$: sample candidate designs uniformly, evaluate them under $p_i$, and store the resulting pairs in $\mathcal{D}_{p_i}$.
Phase 2: Meta-training ANP
- Until convergence: randomly select a task dataset $\mathcal{D}_{p_i}$, split it into context/target sets, and update the ANP via stochastic gradient ascent on the ELBO.
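The random context/target split at the heart of each ELBO update can be sketched as:

```python
import numpy as np

def context_target_split(X, y, rng, min_ctx=2):
    """Random context/target split of one task's data, as performed for
    each meta-training update of the surrogate."""
    n = len(X)
    idx = rng.permutation(n)                 # shuffle the task's points
    n_ctx = int(rng.integers(min_ctx, n - 1))  # random context size
    ctx, tgt = idx[:n_ctx], idx[n_ctx:]
    return (X[ctx], y[ctx]), (X[tgt], y[tgt])

rng = np.random.default_rng(0)
X = np.arange(10.0).reshape(10, 1)
y = np.sin(X[:, 0])
(ctx_X, ctx_y), (tgt_X, tgt_y) = context_target_split(X, y, rng)
```

Varying the context size across updates is what forces the model to behave sensibly from very few observations, which is exactly the regime of online adaptation.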
Phase 3: Online Adaptation on New Task
Initialize the online context $C_0 = \emptyset$.
For $t = 0, \dots, T-1$:
- Compute $\mu(x)$, $\sigma^2(x)$ given the ANP and $C_t$.
- Maximize acquisition (UCB, EI, or PI) under constraints to obtain $x_{t+1}$.
- Query $y_{t+1} = f_{p^*}(x_{t+1})$, record $(x_{t+1}, y_{t+1})$ in $C_{t+1}$.
- After $T$ trials, return the design with the best observed $y$.
6. Empirical Evaluation and Impact
ABOM was instantiated on both synthetic grid networks of varying size and real networks (Hangzhou, Manhattan). The meta-dataset was constructed as described in Section 2. Baselines included basic BBO (GA, PSO, CMA-ES), meta-BBO (LGA, LES, RGPE, ABLR, FSBO), and RL controllers cast as 1-step MDPs (e.g., DQN, PPO).
Key results:
- On Hangzhou (phase combination), ABOM achieved $292.7$ average waiting vehicles vs. $294.8$ for the best baseline.
- On Manhattan (time allocation), ABOM: $3805.4$ vs. $3890.4$ for the next best.
- ABOM reduced waiting vehicles across all networks and converged in fewer than 100 trials.
- Real deployment: 26 intersections, with improved vehicle throughput over the baseline when run hourly for a week.
- Ablations: robust to a halved meta-dataset; all three acquisition criteria give similar improvements; the ANP is preferred over transformer-based NPs at the available data scale.
7. Interpretation and Significance
ABOM demonstrates that offline meta-learning of surrogate models (ANP) over diverse design–performance datasets, combined with probabilistic BO, enables high sample-efficiency black-box optimization in complex, real-world system design settings. The approach achieves rapid adaptation beyond traditional BBO and RL variants, robustly outperforms a range of meta and naïve baselines on metrics relevant to practice (e.g., waiting vehicles, throughput), and is deployable at urban scale without domain-specific tuning (Yun et al., 2024). The framework is general and could be extended to other black-box systems where acquiring task-generalizable learned surrogates with well-calibrated uncertainty is feasible.
Empirical findings confirm the following:
- Parallel offline data generation (random design sampling) is highly effective for meta-surrogate training and allows for maximal use of simulation infrastructure.
- Well-calibrated epistemic uncertainty from the ANP is critical for BO sample efficiency, irrespective of acquisition function.
- Robustness to meta-dataset size and acquisition choice confirms that meta-BBO with few-shot adaptation is a practical architecture for scalable, adaptive design in high-stakes black-box domains.
Principal Reference:
"An Offline Meta Black-box Optimization Framework for Adaptive Design of Urban Traffic Light Management Systems" (Yun et al., 2024)