
Pareto-aware Acquisition

Updated 21 July 2025
  • Pareto-aware acquisition is a multi-objective optimization framework that identifies and expands the Pareto front by preserving trade-off structures.
  • It avoids scalarization by employing strategies like Bayesian optimization, EHVI, and posterior sampling to target non-dominated solutions.
  • Its applications span neural architecture search, AutoML, fairness-constrained classification, and molecular design, delivering tailored solution sets.

Pareto-aware acquisition is a framework and set of methodologies in multi-objective optimization where acquisition, selection, or learning processes are explicitly guided by the goal of identifying or expanding the Pareto front—i.e., the set of solutions that are non-dominated with respect to all target objectives. Unlike traditional scalarization approaches that convert multiple objectives into a single aggregate score (often necessitating repeated searches for different weightings), Pareto-aware acquisition methods preserve the multi-dimensional structure of objective trade-offs, thereby enabling the discovery of solution sets that optimally represent the spectrum of desired compromises. This concept is foundational in numerous modern applications, such as neural architecture search, automated machine learning under resource constraints, cost-aware Bayesian optimization, fairness-constrained classification, model merging for multi-task systems, and molecular design.

1. Core Concepts and Motivation

Pareto-aware acquisition is rooted in multi-objective optimization, where the Pareto front comprises those solutions for which no objective can be strictly improved without degrading at least one other. A solution x is Pareto optimal if there does not exist another solution y such that f_i(y) ≥ f_i(x) for all i and f_j(y) > f_j(x) for some j. This approach is critical where objectives are inherently at odds—for instance, maximizing prediction accuracy while minimizing inference latency in neural networks, or simultaneously optimizing potency and safety in molecular design (Cheng et al., 2018, Yong et al., 18 Jul 2025). Pareto-aware acquisition methods directly target the trade-off structure of such settings by focusing data acquisition, model selection, and optimization efforts on identifying, expanding, or sampling from the Pareto set.
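The dominance relation above translates directly into a non-dominated filter. A minimal sketch (assuming all objectives are maximized; the function name is illustrative):

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated rows of `points` (rows = solutions,
    columns = objectives, all to be maximized)."""
    points = np.asarray(points, dtype=float)
    keep = np.ones(len(points), dtype=bool)
    for i, x in enumerate(points):
        # y dominates x  <=>  y >= x in every objective and y > x in at least one
        dominators = np.all(points >= x, axis=1) & np.any(points > x, axis=1)
        if dominators.any():
            keep[i] = False
    return points[keep]
```

For example, among the objective vectors (1, 3), (3, 1), (2, 2), and (0, 0), only (0, 0) is dominated; the other three represent distinct, incomparable trade-offs and all lie on the front.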

2. Methodological Frameworks

Different research areas implement Pareto-aware acquisition under specific methodologies suited to their domains:

  • Reinforcement Learning in Neural Architecture Search (NAS): Methods such as MONAS and DPP-Net use tailored reward functions or SMBO procedures, with multi-objective rewards or regression models designed to select architectures that lie on the Pareto front for targets like accuracy, energy, and latency (Cheng et al., 2018).
  • Hyperparameter and Architecture Search with Resource Constraints: AutoML platforms like RA-AutoML employ hybrid algorithms (e.g., MOBOGA) that combine Bayesian optimization and genetic algorithms, explicitly incorporating resource and hardware constraints within a Pareto framework (Yang et al., 2020).
  • Bayesian Optimization with Pareto-aware Acquisition Functions: Acquisition functions such as Expected Hypervolume Improvement (EHVI) (Yong et al., 18 Jul 2025), Pareto-efficient adaptations of Expected Improvement (EI), and information-theoretic measures (e.g., Pareto Frontier Entropy Search) are proposed to quantify the expected improvement to the Pareto front as a function of candidate selections (Guinet et al., 2020, Qing et al., 2022).
  • Neural Generators for Conditional Acquisition: Budget-conditional neural architecture generators such as NAG and PNAG learn mappings from resource budgets to Pareto-optimal models, leveraging policy gradients and pairwise ranking losses to ensure solutions lie on the desired frontier (Guo et al., 2021, Guo et al., 2022).
  • Pareto Set Learning: Model-based approaches, often with neural network parameterizations or hypernetworks, learn a continuous mapping from preference vectors (usually sampled from a simplex) to the decision space, thus approximating the entire Pareto set with a single trained model and supporting flexible, preference-aware solution retrieval (Lin et al., 2022, Nguyen et al., 23 Dec 2024).
  • Bandit and Posterior Sampling: Recent work such as PSIPS applies posterior sampling (Thompson sampling) both in sampling and stopping decisions to efficiently identify the true Pareto set in structured or correlated multi-objective settings (Kone et al., 7 Nov 2024).
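The EHVI acquisition function mentioned above can be sketched for the two-objective case via Monte Carlo: sample a candidate's objective values from its (here, assumed independent) Gaussian posterior and average the resulting hypervolume gain. This is an illustrative sketch, not a reference implementation; function names and the closed-form-free Monte Carlo estimator are choices made for brevity:

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective front (maximization), measured
    against a reference point `ref` worse than every front point."""
    # keep points strictly better than the reference in both objectives
    pts = [tuple(p) for p in front if p[0] > ref[0] and p[1] > ref[1]]
    # discard dominated points
    pts = [p for p in pts
           if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in pts)]
    pts.sort()  # ascending in objective 1 => descending in objective 2
    hv = 0.0
    for i, (x, y) in enumerate(pts):
        y_next = pts[i + 1][1] if i + 1 < len(pts) else ref[1]
        hv += (x - ref[0]) * (y - y_next)  # slab between consecutive y-levels
    return hv

def mc_ehvi(mu, sigma, observed_front, ref, n_samples=2000, seed=0):
    """Monte Carlo Expected Hypervolume Improvement for one candidate:
    draw objective vectors from N(mu, sigma^2) per objective and average
    the hypervolume gain over the draws."""
    rng = np.random.default_rng(seed)
    base = hypervolume_2d(observed_front, ref)
    samples = rng.normal(mu, sigma, size=(n_samples, 2))
    gains = [hypervolume_2d(list(observed_front) + [tuple(s)], ref) - base
             for s in samples]
    return float(np.mean(gains))
```

A candidate whose posterior mass sits inside the already-dominated region scores near zero, while a candidate expected to extend the front scores positively — which is exactly why EHVI steers evaluations toward non-dominated regions.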

3. Pareto-aware Acquisition in Practice

Pareto-aware acquisition is widely adopted in real-world settings characterized by conflicting objectives and expensive evaluations:

  • Device-aware NAS and Embedded Deployment: Models must balance accuracy, resource use, and real-world constraints such as latency or energy. Pareto-aware acquisition yields sets of architectures tailored for a full spectrum of deployment targets, as demonstrated by MONAS and DPP-Net, which outperform traditional single-objective and random baselines (Cheng et al., 2018).
  • Cost-aware Automated Machine Learning: Resource-aware AutoML frameworks integrate hardware constraints into the design search, using Pareto fronts to present users with feasible configurations that negotiate accuracy and resource trade-offs (Yang et al., 2020).
  • Fairness in Machine Learning: Pareto-Efficient Fairness (PEF) selects classifier operating points lying closest to the fairness hyperplane while remaining Pareto efficient over subgroup performances, mitigating the accuracy loss typically induced by strict fairness constraints (Balashankar et al., 2019).
  • Batch and Batched Bayesian Optimization: Algorithms leverage Pareto-aware acquisition to address issues of diversity, batch efficiency, and unknown constraints when proposing sets of evaluations—seen e.g. in PDBO’s use of multi-armed bandits and DPPs to maintain high-quality, diverse Pareto fronts (Ahmadianshalchi et al., 13 Jun 2024, Qing et al., 2022).
  • Molecular Optimization: Directly optimizing for Pareto front expansion using EHVI produces more diverse and higher-quality candidate molecules than scalarized approaches, especially in low-data regimes (Yong et al., 18 Jul 2025).
  • Multi-task Model Merging: Preference-aware model merging leverages Pareto set learning to generate entire families of merged models tailored to differing user priorities, outperforming one-size-fits-all strategies (Chen et al., 22 Aug 2024).

4. Evaluation Metrics and Empirical Outcomes

The evaluation of Pareto-aware acquisition, and comparison with scalarized or myopic alternatives, relies on metrics that reflect both the quality and diversity of approximated Pareto fronts:

  • Hypervolume and Expected Hypervolume Improvement (EHVI): The hypervolume indicator quantifies the size of the objective space dominated by the solution set. EHVI computes the expected gain in hypervolume with each evaluation and is empirically superior to scalarized EI in molecular and engineering domains (see Table 1 below for summary characteristics) (Yong et al., 18 Jul 2025).
  • Diversity Measures: Additional criteria such as the number of structurally distinct solutions (#Circles in molecular design) or the mean pairwise distance in the objective space (DPF metric) are used to quantify how well the Pareto front spans the possible trade-offs (Ahmadianshalchi et al., 13 Jun 2024).
  • Regret and Convergence Speed: Pareto-aware acquisition functions have been shown to converge more quickly to a high-quality Pareto front compared to scalarized or random strategies, as measured by normalized regret or area under improvement curves (Candelieri, 2023).
Table 1: Evaluation metrics for Pareto-aware acquisition.

| Metric | Description | Context of Use |
|--------|-------------|----------------|
| Hypervolume (HV) | Volume of objective space dominated by the Pareto front | Molecule design, NAS, BO |
| EHVI | Expected gain in HV from a candidate evaluation | BO for molecules, engineering |
| Diversity (#Circles, DPF) | Spread/uniqueness of front solutions | Molecule design, batch BO |
| Normalized Regret | Convergence of the best observed function value | BO batch strategies |
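One plausible reading of the DPF-style diversity criterion — mean pairwise distance in objective space — can be computed directly (a sketch; the function name is illustrative and the exact metric definitions vary by paper):

```python
import numpy as np

def mean_pairwise_distance(front):
    """Diversity of an approximate Pareto front: mean pairwise Euclidean
    distance between its points in objective space (larger = better
    spread across the trade-off curve)."""
    front = np.asarray(front, dtype=float)
    if len(front) < 2:
        return 0.0
    diffs = front[:, None, :] - front[None, :, :]   # all pairwise differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))      # pairwise distance matrix
    upper = np.triu_indices(len(front), k=1)        # each pair counted once
    return float(dists[upper].mean())
```

Two fronts with equal hypervolume can differ sharply on this measure, which is why diversity criteria are reported alongside HV rather than in place of it.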

5. Algorithmic Innovations and Theoretical Guarantees

Research on Pareto-aware acquisition has produced notable algorithmic and theoretical advances:

  • Posterior Sampling for Pareto Set Identification: The PSIPS algorithm replaces computationally expensive oracle-based approaches with posterior sampling for both sampling and stopping, achieving asymptotic optimality in expected sample complexity for both Bayesian and frequentist settings (Kone et al., 7 Nov 2024).
  • Surrogate Modeling and Pareto Set Learning: Differentiable surrogate models (e.g. Gaussian processes with Matern kernels, hypernetworks) allow efficient, batch-aware selection of candidate solutions and gradient-based training of mappings from preferences to decisions, generalizing MOEA/D to continuous model-based settings (Lin et al., 2022, Nguyen et al., 23 Dec 2024).
  • Batch and Diversity-aware Selection: DPP-based selection for batch optimization promotes output-space diversity directly, outperforming pure greedy or scalarization-based approaches by ensuring the selected batch jointly extends the Pareto front as measured by hypervolume contribution (Ahmadianshalchi et al., 13 Jun 2024).
  • Cost- and Constraint-aware Acquisition: Pareto-efficient modifications of standard acquisition functions (e.g. EIα, CEI) enable robust navigation of cost-accuracy trade-offs and ensure only evaluations that are on or near the current cost Pareto front are candidates for selection (Guinet et al., 2020).
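The posterior-sampling idea behind PSIPS can be illustrated for Gaussian bandit arms: draw one sample of every arm's mean objective vector from its posterior and keep the arms that are non-dominated in that draw. This is a simplified, illustrative single round in the spirit of the method, not the published algorithm (which also uses posterior sampling in its stopping rule); all names are hypothetical:

```python
import numpy as np

def posterior_sample_pareto(counts, sums, rng, sigma=1.0):
    """Draw each arm's mean objective vector from a Gaussian posterior
    (flat prior, known observation noise `sigma`), then return the
    indices of arms that are non-dominated in the sampled draw."""
    mu = sums / counts[:, None]               # empirical mean per arm
    scale = sigma / np.sqrt(counts)[:, None]  # posterior standard deviation
    theta = rng.normal(mu, scale)             # one joint posterior sample
    nondominated = []
    for i, x in enumerate(theta):
        dominated = np.any(np.all(theta >= x, axis=1) &
                           np.any(theta > x, axis=1))
        if not dominated:
            nondominated.append(i)
    return nondominated
```

Sampling the next arm from this set concentrates pulls on plausibly-Pareto-optimal arms, while posterior uncertainty keeps borderline arms in play until the data resolves them.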

6. Comparative Performance, Insights, and Applications

Empirical evidence from controlled studies and benchmarks demonstrates:

  • Superiority over scalarization in sparse regimes: Even strong deterministic scalarizations can underperform Pareto-aware acquisition methods (e.g. EHVI) in data-limited settings, especially where objective trade-offs are nontrivial (Yong et al., 18 Jul 2025).
  • Robustness to Objectives and Constraints: Adaptive Pareto-aware strategies are more robust when the number of objectives increases, or when constraints are present but unknown, as they explicitly avoid the limitations of rigid or hand-tuned scalarization weights (Qing et al., 2022).
  • Flexible, Personalized Model Selection: Methods that learn the entire Pareto set in a single process, such as preference-aware model merging and neural architecture generators, provide users and systems with the ability to adaptively select solutions that best match current constraints or priorities (Guo et al., 2022, Chen et al., 22 Aug 2024).

7. Future Directions and Open Problems

Several open challenges and promising avenues have been identified:

  • Design of Multi-objective Search Spaces: There is a need to move beyond retrofitting search spaces developed for single-objective optimization, toward spaces that explicitly encode resource, latency, or other practical trade-offs (Cheng et al., 2018).
  • Surrogate Model Reliability and Bias: Improved surrogate models and kernels that better handle fragmented, multi-modal, or high-dimensional objective landscapes are needed, as unreliable surrogates can cause pseudo-local optima (Nguyen et al., 23 Dec 2024).
  • Efficient, Adaptive Batch and Diversity Strategies: Further refinement of batch-acquisition techniques, possibly using learned notions of diversity, could improve the coverage and efficiency of Pareto front identification in expensive multi-objective problems (Ahmadianshalchi et al., 13 Jun 2024).
  • Integration of Human Preferences and External Feedback: Incorporating expert or user feedback (either interactively or indirectly via cost models, fairness metrics, or other side information) can inform acquisition and expand the practical relevance of these methods across domains such as negotiation, fairness-aware classification, and customized deployment (Kwon et al., 2021, Balashankar et al., 2019).

Pareto-aware acquisition frameworks therefore constitute a powerful and flexible set of strategies for multi-objective machine learning and optimization, enabling both principled exploration of trade-off spaces and practical, task-relevant deployment across a broad set of modern challenges in science and engineering.