Active Generation of Pareto Sets (A-GPS)
- The paper introduces an amortized generative model framework that efficiently approximates Pareto fronts while enabling active, user-adaptive sampling.
- It leverages class probability estimators to assess non-dominance and preference alignment without the need for explicit hypervolume calculations.
- The method demonstrates practical success in complex problems like biomolecular design and combinatorial optimization by integrating a-posteriori preference conditioning.
Active generation of Pareto sets (A-GPS) addresses the challenge of efficiently and flexibly identifying, representing, and sampling from the set of non-dominated solutions in multi-objective optimization—especially in discrete black-box settings where objectives may be expensive or noisy to evaluate and user preferences are a-priori unknown or evolving. The A-GPS framework leverages an amortized generative model, class probability estimation, and a-posteriori conditioning to enable sample-efficient, user-adaptable, and high-quality approximation of complex Pareto fronts without relying on hand-crafted scalarization or explicit hypervolume calculations.
1. Framework Definition and Motivation
A-GPS is formulated as an online framework for multi-objective optimization where, instead of iteratively selecting new points solely via acquisition functions (e.g., expected hypervolume improvement, scalarization, or grid-based methods), one learns an amortized conditional generative model capable of active, preference-aware sampling across the Pareto front.
Key motivations:
- Avoid repeated retraining for different user-specific trade-offs;
- Efficiently exploit all previously collected evaluations (non-dominated and dominated) for better sample efficiency;
- Provide posterior conditioning capabilities, i.e., after the model is trained, user preferences (trade-off vectors) can be used to direct generation to specific regions on the Pareto frontier;
- Circumvent explicit costly computation of the hypervolume indicator, instead relying on discriminative class probability estimators (CPEs) to assess Pareto optimality and preference alignment.
This approach is particularly well-suited for expensive or combinatorial design problems such as protein sequence design, discrete biomolecular engineering, or general black-box optimization where evaluating all possible trade-off directions is prohibitive.
2. Amortized Generative Modeling of Pareto Sets
The core of A-GPS is an amortized generative model (AGM), $q_\theta(x \mid \boldsymbol{u})$, parameterized by neural networks (e.g., transformers for discrete sequences), which models the conditional distribution of design variables $x$ given a preference direction $\boldsymbol{u}$ in objective space. The model is trained across all sequential rounds of observation, thus amortizing the “cost” of exploration: future queries and preference changes can be handled without re-optimization.
- Training objective: The AGM is updated in each round by minimizing a reverse KL divergence or, equivalently, maximizing an evidence lower bound (ELBO) incorporating both Pareto membership and preference alignment (a sketch of this update follows the list below). The update admits the form

$$\theta^{(t+1)} = \arg\max_{\theta}\; \mathbb{E}_{q_\theta(x \mid \boldsymbol{u})}\!\left[ \log p(z = 1 \mid x) + \log p(a = 1 \mid x, \boldsymbol{u}) \right] - \beta\, \mathrm{KL}\!\left[ q_\theta(x \mid \boldsymbol{u}) \,\Vert\, p(x) \right],$$

where $p(z = 1 \mid x)$ and $p(a = 1 \mid x, \boldsymbol{u})$ are guidance likelihoods derived from class probability estimators, $a$ is an alignment label, and $\beta$ controls exploration regularization.
- Preference a-posteriori conditioning: Preference direction vectors are constructed as unit-normalized difference vectors in objective space from a reference point $\boldsymbol{r}$:

$$\boldsymbol{u}_n = \frac{\boldsymbol{y}_n - \boldsymbol{r}}{\lVert \boldsymbol{y}_n - \boldsymbol{r} \rVert},$$

where $\boldsymbol{y}_n = f(x_n)$ is the vector of objectives for the $n$th design in the observed set. This conditioning enables immediate adaptation to arbitrary user-specified trade-offs without retraining.
- Architectures: The model can be auto-regressive (generating one variable at a time) or masked (updating existing candidates), with adjustable FiLM-style layers or preference-embedding modules for incorporating $\boldsymbol{u}$.
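To make the round update concrete, below is a minimal score-function (REINFORCE-style) sketch in PyTorch. It is an illustration under stated assumptions, not the paper's implementation: the function name and the per-sample log-probability inputs are hypothetical, and A-GPS may use a different gradient estimator.

```python
import torch

def agps_surrogate_loss(log_q, log_prior, log_cpe_pareto, log_cpe_align, beta=1.0):
    """Score-function surrogate for the (hypothesized) round update
        max_theta  E_q[ log p(z=1|x) + log p(a=1|x,u) ] - beta * KL(q_theta || prior).
    Inputs are per-sample log-probabilities for samples x ~ q_theta(.|u);
    only `log_q` must carry gradients to the generative model's parameters.
    """
    # Per-sample "reward": CPE guidance plus the KL term expanded as log p(x) - log q(x).
    reward = log_cpe_pareto + log_cpe_align + beta * (log_prior - log_q.detach())
    advantage = reward - reward.mean()            # baseline subtraction reduces variance
    return -(advantage.detach() * log_q).mean()   # minimizing this ascends the objective

# Toy usage with random stand-ins for the per-sample quantities:
log_q = torch.randn(32, requires_grad=True)
loss = agps_surrogate_loss(log_q, torch.randn(32), -torch.rand(32), -torch.rand(32))
loss.backward()  # gradients flow through log_q only
```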
3. Class Probability Estimators for Non-dominance and Alignment
A central component of A-GPS is the use of class probability estimators (CPEs) for both Pareto membership and preference alignment:
- Pareto CPE ($p(z = 1 \mid x)$): Trained with binary labels indicating non-dominance in the current dataset. For a candidate $x_n$ with objective vector $\boldsymbol{y}_n = f(x_n)$:

$$z_n = \mathbb{1}\!\left[ \boldsymbol{y}_n \ \text{is non-dominated in}\ \mathcal{Y}^\ast \right],$$

with $\mathcal{Y}^\ast$ the current observed Pareto set. The CPE predicts the probability that $x_n$ is non-dominated. Importantly, it is shown that non-dominance is equivalent to positive hypervolume improvement, so this estimator effectively predicts the probability of hypervolume improvement (PHVI) for $x_n$ (see the labeling sketch below).
- Alignment CPE ($p(a = 1 \mid x, \boldsymbol{u})$): Evaluates how the sample's realized objectives align with a chosen preference direction $\boldsymbol{u}$. This uses contrastive labeling: correct alignment ($a = 1$) for matched preference-design pairs and $a = 0$ for mismatched (permuted) pairs, enhancing the model's discrimination over alignment in objective space. Conditioning the AGM on this alignment signal increases flexibility and allows for interactive control.
This combination enables the training process to steer the generative model toward high-quality, non-dominated, and user-aligned regions of the design space.
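As an illustration of the non-dominance labeling used to train such a Pareto CPE, here is a minimal NumPy sketch (maximization convention; the function name is illustrative, not from the paper):

```python
import numpy as np

def non_dominance_labels(Y):
    """z_n = 1 iff row y_n of Y is non-dominated by any other row (maximization).
    Since non-dominance coincides with positive hypervolume improvement, a
    classifier fit on (x_n, z_n) pairs estimates PHVI without computing it."""
    z = np.ones(len(Y), dtype=int)
    for i, y in enumerate(Y):
        others = np.delete(Y, i, axis=0)
        # y is dominated if some other point is >= in all objectives and > in at least one
        if np.any(np.all(others >= y, axis=1) & np.any(others > y, axis=1)):
            z[i] = 0
    return z

Y = np.array([[1.0, 3.0], [2.0, 1.0], [0.5, 0.5]])  # third point is dominated
print(non_dominance_labels(Y))  # -> [1 1 0]
```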
4. Preference Vector Construction and Conditioning
Preference direction vectors $\boldsymbol{u}$ are used to encode user-specified or automatically inferred trade-off directions among objectives. For a candidate with objective vector $\boldsymbol{y} = f(x)$ in $M$-dimensional objective space and a fixed reference $\boldsymbol{r}$:

$$\boldsymbol{u} = \frac{\boldsymbol{y} - \boldsymbol{r}}{\lVert \boldsymbol{y} - \boldsymbol{r} \rVert}.$$

The reference point $\boldsymbol{r}$ is typically chosen so that all feasible objective vectors dominate it. Each sample in the dataset is thus indexed by an implicit preference vector, allowing the conditional model to learn to map user-specified directions to diverse regions of the Pareto frontier. Once the generative model has been trained in this manner, users can generate candidates from any region of the frontier by instantiating $\boldsymbol{u}$ as required. This a-posteriori conditioning is a key advantage over (a-priori) scalarization schemes, where each region typically requires a separate optimization pass.
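As a concrete illustration, computing these direction vectors takes a few lines of NumPy (maximization convention; names are illustrative):

```python
import numpy as np

def preference_vectors(Y, r):
    """u_n = (y_n - r) / ||y_n - r||, one unit direction per observed objective vector."""
    diff = Y - r
    return diff / np.linalg.norm(diff, axis=1, keepdims=True)

Y = np.array([[1.0, 3.0], [2.0, 1.0]])
r = np.zeros(2)                      # chosen so every feasible y dominates r
U = preference_vectors(Y, r)
print(U)  # rows are unit vectors pointing from r toward each observed trade-off
```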
5. Avoiding Explicit Hypervolume Computation
One of the main technical outcomes is that explicit computation of the hypervolume improvement at each candidate is unnecessary. Non-dominance is equivalent to positive hypervolume improvement for any $\boldsymbol{y}$ not currently in the observed Pareto set $\mathcal{Y}^\ast$, i.e.,

$$\boldsymbol{y}\ \text{is non-dominated w.r.t.}\ \mathcal{Y}^\ast \iff \mathrm{HVI}(\boldsymbol{y}; \mathcal{Y}^\ast) > 0,$$

so the Pareto CPE can be trained using only standard non-dominance labeling. This produces a reliable surrogate for PHVI and obviates the computational bottlenecks inherent in hypervolume-based acquisition functions, especially in high dimensions.
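This equivalence is easy to verify numerically in two objectives, where the dominated hypervolume has a closed form. A small NumPy sketch (helper names are illustrative, not from the paper):

```python
import numpy as np

def pareto_mask(Y):
    """True for rows of Y not dominated by any other row (maximization)."""
    return np.array([not np.any(np.all(Y >= y, axis=1) & np.any(Y > y, axis=1))
                     for y in Y])

def hypervolume_2d(Y, r):
    """Exact 2-D hypervolume dominated by Y relative to reference r (maximization)."""
    P = Y[pareto_mask(Y)]
    P = P[np.argsort(P[:, 0])]                 # ascending f1 => descending f2 on the front
    widths = np.diff(np.concatenate([[r[0]], P[:, 0]]))
    return float(np.sum(widths * (P[:, 1] - r[1])))

# Check: HVI(y) > 0 exactly when y is non-dominated by the current observations.
rng = np.random.default_rng(0)
Y, r = rng.uniform(0.1, 1.0, size=(50, 2)), np.zeros(2)
base = hypervolume_2d(Y, r)
agree = [
    (hypervolume_2d(np.vstack([Y, y]), r) - base > 1e-12)
    == (not np.any(np.all(Y >= y, axis=1) & np.any(Y > y, axis=1)))
    for y in rng.uniform(0.1, 1.0, size=(200, 2))
]
print(all(agree))  # expected: True (up to numerical tolerance)
```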
6. Empirical Performance and Applications
Empirical results on canonical synthetic functions (Branin–Currin, ZDT3, DTLZ2, DTLZ7, multimodal Gaussian mixtures) demonstrate that A-GPS is able to:
- Efficiently approximate Pareto fronts with complex, multi-branched, and disconnected geometries;
- Achieve high relative hypervolume and sample quality;
- Easily sample different trade-off segments via preference vector specification;
- Maintain sample efficiency and adaptation across iterative rounds of evaluation.
In protein design tasks (e.g., maximizing "Ehrlich" synthetic landscape score vs. naturalness, optimizing bi-gram frequencies, stability vs. SASA), A-GPS matches or exceeds alternative methods (GP-based baselines, VSD, CbAS, diffusion-guided LaMBO-2) in both hypervolume and front coverage. The conditional sampling enables rapid exploration of the Pareto front under newly specified preferences without retraining the model.
7. Mathematical Foundations
The A-GPS framework directly leverages several key mathematical formulations:
- Pareto set: $\mathcal{X}^\ast = \{\, x \in \mathcal{X} : \nexists\, x' \in \mathcal{X} \ \text{with}\ f(x') \succ f(x) \,\}$, where $\succ$ denotes Pareto dominance;
- Amortized generative training objective: the round update $\max_\theta\ \mathbb{E}_{q_\theta(x \mid \boldsymbol{u})}[\log p(z = 1 \mid x) + \log p(a = 1 \mid x, \boldsymbol{u})] - \beta\,\mathrm{KL}[q_\theta(x \mid \boldsymbol{u}) \,\Vert\, p(x)]$ given in Section 2;
- Preference direction vector computation: $\boldsymbol{u} = (\boldsymbol{y} - \boldsymbol{r}) / \lVert \boldsymbol{y} - \boldsymbol{r} \rVert$;
- Optimization of the conditional model: $q_\theta(x \mid \boldsymbol{u}) \propto p(x)\, p(z = 1 \mid x)\, p(a = 1 \mid x, \boldsymbol{u})$, leveraging Bayesian decomposition into Pareto and alignment likelihoods (a sampling sketch follows below).
These mathematical elements ensure the framework's theoretical soundness and flexibility for multi-objective optimization across various domains.
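To make the decomposition concrete, the following minimal sketch reweights prior samples by the two CPE outputs via self-normalized importance sampling. This is one plausible way to realize the decomposition, not necessarily the paper's sampler, and all names are illustrative.

```python
import numpy as np

def reweight_by_cpes(p_pareto, p_align):
    """Self-normalized importance weights over prior samples x_i ~ p(x):
    w_i ∝ p(z=1|x_i) * p(a=1|x_i,u), targeting q(x|u) ∝ p(x) * both CPE terms."""
    w = p_pareto * p_align
    return w / w.sum()

rng = np.random.default_rng(0)
p_pareto = rng.uniform(size=1000)   # stand-ins for CPE outputs on 1000 prior samples
p_align = rng.uniform(size=1000)
w = reweight_by_cpes(p_pareto, p_align)
idx = rng.choice(1000, size=10, p=w)  # resampled indices approximate draws from q(x|u)
```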
A-GPS, as formalized in this framework, enables sample-efficient, scalable, and user-adaptable approximation of Pareto fronts in multi-objective optimization, leveraging amortized conditional generative models, discriminative class probability estimation, and a-posteriori preference conditioning, all while eschewing explicit hypervolume calculation and supporting practical interactive design (Steinberg et al., 23 Oct 2025).