Active Generation of Pareto Sets (A-GPS)
- The paper introduces an amortized generative model framework that efficiently approximates Pareto fronts while enabling active, user-adaptive sampling.
- It leverages class probability estimators to assess non-dominance and preference alignment without the need for explicit hypervolume calculations.
- The method demonstrates practical success in complex problems like biomolecular design and combinatorial optimization by integrating a-posteriori preference conditioning.
Active generation of Pareto sets (A-GPS) addresses the challenge of efficiently and flexibly identifying, representing, and sampling from the set of non-dominated solutions in multi-objective optimization—especially in discrete black-box settings where objectives may be expensive or noisy to evaluate and user preferences are a-priori unknown or evolving. The A-GPS framework leverages an amortized generative model, class probability estimation, and a-posteriori conditioning to enable sample-efficient, user-adaptable, and high-quality approximation of complex Pareto fronts without relying on hand-crafted scalarization or explicit hypervolume calculations.
1. Framework Definition and Motivation
A-GPS is formulated as an online framework for multi-objective optimization where, instead of iteratively selecting new points solely via acquisition functions (e.g., expected hypervolume improvement, scalarization, or grid-based methods), one learns an amortized conditional generative model capable of active, preference-aware sampling across the Pareto front.
Key motivations:
- Avoid repeated retraining for different user-specific trade-offs;
- Efficiently exploit all previously collected evaluations (non-dominated and dominated) for better sample efficiency;
- Provide posterior conditioning capabilities, i.e., after the model is trained, user preferences (trade-off vectors) can be used to direct generation to specific regions on the Pareto frontier;
- Circumvent explicit costly computation of the hypervolume indicator, instead relying on discriminative class probability estimators (CPEs) to assess Pareto optimality and preference alignment.
This approach is particularly well-suited for expensive or combinatorial design problems such as protein sequence design, discrete biomolecular engineering, or general black-box optimization where evaluating all possible trade-off directions is prohibitive.
2. Amortized Generative Modeling of Pareto Sets
The core of A-GPS is an amortized generative model (AGM), $q_\theta(x \mid \boldsymbol{u})$, parameterized by neural networks (e.g., transformers for discrete sequences), which models the conditional distribution of design variables $x$ given a preference direction $\boldsymbol{u}$ in objective space. The model is trained across all sequential rounds of observation, thus amortizing the “cost” of exploration: future queries and preference changes can be handled without re-optimization.
- Training objective: The AGM is updated in each round by minimizing a reverse KL divergence or, equivalently, maximizing an evidence lower bound (ELBO) incorporating both Pareto membership and preference alignment (a sketch of this update follows the list below). The update admits the form

$$\theta^{(t+1)} = \arg\max_{\theta}\; \mathbb{E}_{q_\theta(x \mid \boldsymbol{u})}\!\left[ \log p(z = 1 \mid x) + \log p(a = 1 \mid x, \boldsymbol{u}) \right] - \beta\, \mathrm{KL}\!\left[ q_\theta(x \mid \boldsymbol{u}) \,\Vert\, p(x) \right],$$

where $p(z = 1 \mid x)$ and $p(a = 1 \mid x, \boldsymbol{u})$ are guidance likelihoods derived from class probability estimators, $a$ is an alignment label, and $\beta$ controls exploration regularization.
- Preference a-posteriori conditioning: Preference direction vectors are constructed as unit-normalized difference vectors in objective space from a reference point $\boldsymbol{r}$:

$$\boldsymbol{u}_n = \frac{\boldsymbol{y}_n - \boldsymbol{r}}{\lVert \boldsymbol{y}_n - \boldsymbol{r} \rVert},$$

where $\boldsymbol{y}_n = f(x_n)$ is the vector of objectives for the $n$th design in the observed set. This conditioning enables immediate adaptation to arbitrary user-specified trade-offs without retraining.
- Architectures: The model can be auto-regressive (generating one variable at a time) or masked (updating existing candidates), with adjustable FiLM-style layers or preference-embedding modules for incorporating $\boldsymbol{u}$.
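To make the round update concrete, below is a minimal score-function (REINFORCE-style) sketch in PyTorch. It is an illustration under stated assumptions, not the paper's implementation: the function name and the per-sample log-probability inputs are hypothetical, and A-GPS may use a different gradient estimator.

```python
import torch

def agps_surrogate_loss(log_q, log_prior, log_cpe_pareto, log_cpe_align, beta=1.0):
    """Score-function surrogate for the (hypothesized) round update
        max_theta  E_q[ log p(z=1|x) + log p(a=1|x,u) ] - beta * KL(q_theta || prior).
    Inputs are per-sample log-probabilities for samples x ~ q_theta(.|u);
    only `log_q` must carry gradients to the generative model's parameters.
    """
    # Per-sample "reward": CPE guidance plus the KL term expanded as log p(x) - log q(x).
    reward = log_cpe_pareto + log_cpe_align + beta * (log_prior - log_q.detach())
    advantage = reward - reward.mean()            # baseline subtraction reduces variance
    return -(advantage.detach() * log_q).mean()   # minimizing this ascends the objective

# Toy usage with random stand-ins for the per-sample quantities:
log_q = torch.randn(32, requires_grad=True)
loss = agps_surrogate_loss(log_q, torch.randn(32), -torch.rand(32), -torch.rand(32))
loss.backward()  # gradients flow through log_q only
```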
3. Class Probability Estimators for Non-dominance and Alignment
A central component of A-GPS is the use of class probability estimators (CPEs) for both Pareto membership and preference alignment:
- Pareto CPE ($p(z = 1 \mid x)$): Trained with binary labels indicating non-dominance in the current dataset. For a candidate $x_n$ with objective vector $\boldsymbol{y}_n = f(x_n)$:

$$z_n = \mathbb{1}\!\left[ \boldsymbol{y}_n \ \text{is non-dominated in}\ \mathcal{Y}^\ast \right],$$

with $\mathcal{Y}^\ast$ the current observed Pareto set. The CPE predicts the probability that $x_n$ is non-dominated. Importantly, it is shown that non-dominance is equivalent to positive hypervolume improvement, so this estimator effectively predicts the probability of hypervolume improvement (PHVI) for $x_n$ (see the labeling sketch below).
- Alignment CPE ($p(a = 1 \mid x, \boldsymbol{u})$): Evaluates how the sample's realized objectives align with a chosen preference direction $\boldsymbol{u}$. This uses contrastive labeling: correct alignment ($a = 1$) for matched preference-design pairs and $a = 0$ for mismatched (permuted) pairs, enhancing the model's discrimination over alignment in objective space. Conditioning the AGM on this alignment signal increases flexibility and allows for interactive control.
This combination enables the training process to steer the generative model toward high-quality, non-dominated, and user-aligned regions of the design space.
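As an illustration of the non-dominance labeling used to train such a Pareto CPE, here is a minimal NumPy sketch (maximization convention; the function name is illustrative, not from the paper):

```python
import numpy as np

def non_dominance_labels(Y):
    """z_n = 1 iff row y_n of Y is non-dominated by any other row (maximization).
    Since non-dominance coincides with positive hypervolume improvement, a
    classifier fit on (x_n, z_n) pairs estimates PHVI without computing it."""
    z = np.ones(len(Y), dtype=int)
    for i, y in enumerate(Y):
        others = np.delete(Y, i, axis=0)
        # y is dominated if some other point is >= in all objectives and > in at least one
        if np.any(np.all(others >= y, axis=1) & np.any(others > y, axis=1)):
            z[i] = 0
    return z

Y = np.array([[1.0, 3.0], [2.0, 1.0], [0.5, 0.5]])  # third point is dominated
print(non_dominance_labels(Y))  # -> [1 1 0]
```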
4. Preference Vector Construction and Conditioning
Preference direction vectors $\boldsymbol{u}$ are used to encode user-specified or automatically inferred trade-off directions among objectives. For a candidate with objective vector $\boldsymbol{y} = f(x)$ in $M$-dimensional objective space and a fixed reference $\boldsymbol{r}$:

$$\boldsymbol{u} = \frac{\boldsymbol{y} - \boldsymbol{r}}{\lVert \boldsymbol{y} - \boldsymbol{r} \rVert}.$$

The reference point $\boldsymbol{r}$ is typically chosen so that all feasible objective vectors dominate it. Each sample in the dataset is thus indexed by an implicit preference vector, allowing the conditional model to learn to map user-specified directions to diverse regions of the Pareto frontier. Once the generative model has been trained in this manner, users can generate candidates from any region of the frontier by instantiating $\boldsymbol{u}$ as required. This a-posteriori conditioning is a key advantage over (a-priori) scalarization schemes, where each region typically requires a separate optimization pass.
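As a concrete illustration, computing these direction vectors takes a few lines of NumPy (maximization convention; names are illustrative):

```python
import numpy as np

def preference_vectors(Y, r):
    """u_n = (y_n - r) / ||y_n - r||, one unit direction per observed objective vector."""
    diff = Y - r
    return diff / np.linalg.norm(diff, axis=1, keepdims=True)

Y = np.array([[1.0, 3.0], [2.0, 1.0]])
r = np.zeros(2)                      # chosen so every feasible y dominates r
U = preference_vectors(Y, r)
print(U)  # rows are unit vectors pointing from r toward each observed trade-off
```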
5. Avoiding Explicit Hypervolume Computation
One of the main technical outcomes is that explicit computation of the hypervolume improvement at each candidate is unnecessary. Non-dominance is equivalent to positive hypervolume improvement for any $\boldsymbol{y}$ not currently in the observed Pareto set $\mathcal{Y}^\ast$, i.e.,

$$\boldsymbol{y}\ \text{is non-dominated w.r.t.}\ \mathcal{Y}^\ast \iff \mathrm{HVI}(\boldsymbol{y}; \mathcal{Y}^\ast) > 0,$$

so the Pareto CPE can be trained using only standard non-dominance labeling. This produces a reliable surrogate for PHVI and obviates the computational bottlenecks inherent in hypervolume-based acquisition functions, especially in high dimensions.
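This equivalence is easy to verify numerically in two objectives, where the dominated hypervolume has a closed form. A small NumPy sketch (helper names are illustrative, not from the paper):

```python
import numpy as np

def pareto_mask(Y):
    """True for rows of Y not dominated by any other row (maximization)."""
    return np.array([not np.any(np.all(Y >= y, axis=1) & np.any(Y > y, axis=1))
                     for y in Y])

def hypervolume_2d(Y, r):
    """Exact 2-D hypervolume dominated by Y relative to reference r (maximization)."""
    P = Y[pareto_mask(Y)]
    P = P[np.argsort(P[:, 0])]                 # ascending f1 => descending f2 on the front
    widths = np.diff(np.concatenate([[r[0]], P[:, 0]]))
    return float(np.sum(widths * (P[:, 1] - r[1])))

# Check: HVI(y) > 0 exactly when y is non-dominated by the current observations.
rng = np.random.default_rng(0)
Y, r = rng.uniform(0.1, 1.0, size=(50, 2)), np.zeros(2)
base = hypervolume_2d(Y, r)
agree = [
    (hypervolume_2d(np.vstack([Y, y]), r) - base > 1e-12)
    == (not np.any(np.all(Y >= y, axis=1) & np.any(Y > y, axis=1)))
    for y in rng.uniform(0.1, 1.0, size=(200, 2))
]
print(all(agree))  # expected: True (up to numerical tolerance)
```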
6. Empirical Performance and Applications
Empirical results on canonical synthetic functions (Branin–Currin, ZDT3, DTLZ2, DTLZ7, multimodal Gaussian mixtures) demonstrate that A-GPS is able to:
- Efficiently approximate Pareto fronts with complex, multi-branched, and disconnected geometries;
- Achieve high relative hypervolume and sample quality;
- Easily sample different trade-off segments via preference vector specification;
- Maintain sample efficiency and adaptation across iterative rounds of evaluation.
In protein design tasks (e.g., maximizing "Ehrlich" synthetic landscape score vs. naturalness, optimizing bi-gram frequencies, stability vs. SASA), A-GPS matches or exceeds alternative methods (GP-based baselines, VSD, CbAS, diffusion-guided LaMBO-2) in both hypervolume and front coverage. The conditional sampling enables rapid exploration of the Pareto front under newly specified preferences without retraining the model.
7. Mathematical Foundations
The A-GPS framework directly leverages several key mathematical formulations:
- Pareto set: $\mathcal{X}^\ast = \{\, x \in \mathcal{X} : \nexists\, x' \in \mathcal{X} \ \text{with}\ f(x') \succ f(x) \,\}$, where $\succ$ denotes Pareto dominance;
- Amortized generative training objective: the round update $\max_\theta\ \mathbb{E}_{q_\theta(x \mid \boldsymbol{u})}[\log p(z = 1 \mid x) + \log p(a = 1 \mid x, \boldsymbol{u})] - \beta\,\mathrm{KL}[q_\theta(x \mid \boldsymbol{u}) \,\Vert\, p(x)]$ given in Section 2;
- Preference direction vector computation: $\boldsymbol{u} = (\boldsymbol{y} - \boldsymbol{r}) / \lVert \boldsymbol{y} - \boldsymbol{r} \rVert$;
- Optimization of the conditional model: $q_\theta(x \mid \boldsymbol{u}) \propto p(x)\, p(z = 1 \mid x)\, p(a = 1 \mid x, \boldsymbol{u})$, leveraging Bayesian decomposition into Pareto and alignment likelihoods (a sampling sketch follows below).
These mathematical elements ensure the framework's theoretical soundness and flexibility for multi-objective optimization across various domains.
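To make the decomposition concrete, the following minimal sketch reweights prior samples by the two CPE outputs via self-normalized importance sampling. This is one plausible way to realize the decomposition, not necessarily the paper's sampler, and all names are illustrative.

```python
import numpy as np

def reweight_by_cpes(p_pareto, p_align):
    """Self-normalized importance weights over prior samples x_i ~ p(x):
    w_i ∝ p(z=1|x_i) * p(a=1|x_i,u), targeting q(x|u) ∝ p(x) * both CPE terms."""
    w = p_pareto * p_align
    return w / w.sum()

rng = np.random.default_rng(0)
p_pareto = rng.uniform(size=1000)   # stand-ins for CPE outputs on 1000 prior samples
p_align = rng.uniform(size=1000)
w = reweight_by_cpes(p_pareto, p_align)
idx = rng.choice(1000, size=10, p=w)  # resampled indices approximate draws from q(x|u)
```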
A-GPS, as formalized in this framework, enables sample-efficient, scalable, and user-adaptable approximation of Pareto fronts in multi-objective optimization, leveraging amortized conditional generative models, discriminative class probability estimation, and a-posteriori preference conditioning, all while eschewing explicit hypervolume calculation and supporting practical interactive design (Steinberg et al., 23 Oct 2025).