
Cost-Aware Physical Optimizer

Updated 2 December 2025
  • Cost-aware physical optimizers are frameworks that select optimal operator implementations by modeling trade-offs between cost, latency, and quality using detailed cost models.
  • They combine traditional cost estimation with adaptive sampling and ML-based techniques, enabling near-optimal plan selection under uncertainty.
  • They support multi-objective and constraint-based optimization in diverse environments, improving both query efficiency and overall cost-effectiveness.

A cost-aware physical optimizer is a query or systems optimization framework whose core algorithms explicitly reason about the trade-offs between alternative physical implementations, resource allocation, and end-to-end system objectives—typically cost, latency, and/or quality—using detailed cost models and, in advanced cases, formal guarantees under uncertainty. Such optimizers represent a shift from traditional cost-based physical plan enumerators by integrating active or adaptive sampling, fine-grained operator modeling, modern statistical/ML cost estimators, and multi-dimensional or probabilistic objectives.

1. Problem Definition and Foundational Models

A cost-aware physical optimizer seeks, given a logical query or computation graph, to select for each operator in the plan a physical implementation (and possibly resource assignment) that minimizes (or constrains) some function of cost, latency, or quality. In the classical RDBMS context, this means selecting the plan $p^* \in P$ minimizing $C(p,s)$, where $C$ is a cost model parameterized by cardinality/selectivity vector $s$ over $m$ "black-box" predicates, and $P$ is the set of physical plans (e.g., join ordering × operator implementations) (Trummer et al., 2015).

For cloud-native pipelines, cost models such as $C_{\text{total}} = C_{\text{storage}} + C_{\text{maintenance}} + C_{\text{compute}} + C^t$ account for resource usage, provider pricing, and response time, often requiring multi-objective treatments (Perriot et al., 2017). For semantic-operator and ML-native settings, $C(p)$ generalizes to monetary LLM inference cost, token cost, or Pareto trade-offs among monetary, latency, and accuracy metrics (Zhu et al., 25 Nov 2025, Russo et al., 20 May 2025).

Recent work advocates for "Probably Approximately Optimal" (PAO) query optimization, which weakens the traditional goal of exact optimality (often infeasible under selectivity uncertainty) to near-optimality within a user-specified factor $\alpha > 1$ and probability $1 - \delta$, leading to the problem: find $p$ such that $\Pr[s^* \in R_p] \geq 1 - \delta$, where $R_p = \{\, s \in S : C(p,s) \leq \alpha\, C(p^*(s), s) \,\}$ (Trummer et al., 2015).
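The PAO acceptance condition can be checked empirically by Monte Carlo sampling over the selectivity space. Below is a minimal sketch under stated assumptions: toy single-selectivity cost functions and uniform sampling of $S = [0,1]$; the plans, constants, and sampling scheme are illustrative, not from the cited work.

```python
import random

random.seed(0)  # deterministic sampling for the illustration

def pao_accepts(plan_cost, all_plan_costs, samples, alpha, delta):
    """Monte Carlo check of the PAO condition Pr[s in R_p] >= 1 - delta."""
    hits = 0
    for s in samples:
        optimal = min(c(s) for c in all_plan_costs)   # C(p*(s), s)
        if plan_cost(s) <= alpha * optimal:           # s lies in R_p
            hits += 1
    return hits / len(samples) >= 1 - delta

# Toy plans over a single uncertain selectivity s in [0, 1].
plan_a = lambda s: 10 + 100 * s    # scan-like: cheap at low selectivity
plan_b = lambda s: 60              # hash-like: constant cost
samples = [random.uniform(0, 1) for _ in range(10_000)]
print(pao_accepts(plan_b, [plan_a, plan_b], samples, alpha=2.0, delta=0.25))  # True
print(pao_accepts(plan_b, [plan_a, plan_b], samples, alpha=2.0, delta=0.05))  # False
```

Here `plan_b` is $\alpha$-optimal for roughly 80% of the sampled selectivities, so it passes a loose tolerance ($\delta = 0.25$) but fails a strict one ($\delta = 0.05$); the adaptive algorithms in the cited work decide where to sample next rather than sampling uniformly.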

2. Cost Estimation Techniques

Cost-aware physical optimizers require robust and detailed cost models tailored to the execution environment and operator granularity.

  • Traditional cost models decompose cost per operator into CPU, I/O, network, and memory components, aggregated as $C(\text{plan}) = \sum_{op \in \text{plan}} \left( C_{cpu}(op) + C_{io}(op) \right)$ (Heinrich et al., 3 Feb 2025, Kruse et al., 2018).
  • Deep Query Optimization (DQO) frameworks further unbox each "physical operator" into fine-grained subcomponents, each with analytically derived cost formulas, e.g., for grouping and join-operator alternatives such as static-perfect-hash grouping $c_{\mathrm{SPHG}}(|R|) = |R|$ or sort-order grouping $c_{\mathrm{SOG}}(|R|) = |R| \log_2 |R| + |R|$ (Dittrich et al., 2019).
  • Adaptive and learned models leverage ML regressors or neural architectures trained over real or synthetic workloads; e.g., in Cleo the cost of each physical operator is $C_{\text{learned-}o}(f_o, R_o)$, where $f_o$ is a feature vector comprising cardinalities and row widths, and $R_o$ is the resource assignment (e.g., parallelism), with cost models fit via regularized regression or GBDT (Siddiqui et al., 2020).
  • Probabilistic and risk-aware models (e.g., Roq) add uncertainty quantification by predicting $(\mu, \sigma^2)$ pairs per plan, supporting risk-averse selection mechanisms that minimize worst-case or average suboptimality risk (Kamali et al., 26 Jan 2024).
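Given per-plan $(\mu, \sigma^2)$ predictions, one common risk-averse heuristic (an assumption here, not necessarily Roq's exact criterion) scores each plan by an upper confidence bound $\mu + k\sigma$ and picks the minimum; the plan names and numbers below are illustrative.

```python
import math

def pick_plan(predictions, k=2.0):
    """predictions: dict plan_name -> (mu, sigma_squared). Returns the plan
    with the smallest upper-confidence-bound cost mu + k * sigma."""
    def ucb(item):
        mu, var = item[1]
        return mu + k * math.sqrt(var)
    return min(predictions.items(), key=ucb)[0]

preds = {
    "hash_join":  (100.0, 400.0),   # cheap on average, but volatile
    "merge_join": (110.0,  25.0),   # slightly dearer, highly predictable
}
print(pick_plan(preds, k=0.0))  # risk-neutral choice: hash_join
print(pick_plan(preds, k=2.0))  # risk-averse choice: merge_join
```

Raising `k` trades average cost for predictability, mirroring the worst-case versus average suboptimality-risk objectives described above.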

These estimators can be hybridized: for instance, residual learning augments $C_{\text{trad}}(P)$ with an ML-predicted correction $\Delta(P)$, leading to $C_{\text{hybrid}}(P) = C_{\text{trad}}(P) + f_{\text{learned}}(\text{features}(P))$ (Heinrich et al., 3 Feb 2025).
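A minimal sketch of this residual-learning scheme, assuming a toy analytic model that is linear in cardinality and a one-feature least-squares regressor (the cited systems use richer plan features and GBDT-style learners):

```python
def c_trad(rows):
    """Analytic cost model: cost assumed linear in input cardinality."""
    return 2.0 * rows

def fit_residual(samples):
    """samples: list of (rows, observed_cost).
    Fit the residual observed_cost - c_trad(rows) as a line a*rows + b."""
    xs = [r for r, _ in samples]
    ys = [cost - c_trad(r) for r, cost in samples]   # residuals vs. analytic model
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return lambda rows: a * rows + b

# Observed executions where the true cost behaves like 2.6 * rows.
observed = [(100, 260.0), (200, 520.0), (400, 1040.0)]
delta = fit_residual(observed)

def c_hybrid(rows):
    return c_trad(rows) + delta(rows)

print(round(c_hybrid(300), 1))  # ≈ 780.0, vs. 600.0 from the analytic model alone
```

The learned correction absorbs the systematic error of the analytic model while keeping its structure as a prior, which is the appeal of the hybrid formulation.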

3. Plan Enumeration and Optimization Algorithms

Cost-aware physical optimizers embed their cost models within advanced plan enumeration and search algorithms:

  • Dynamic Programming (DP) remains the foundation for combinatorial plan spaces (join ordering, access path, operator variants), but modern approaches extend the search to multi-dimensional objectives (cost, latency, quality) and allow for Pareto-frontier tracking over plans and operator assignments (Perriot et al., 2017, Russo et al., 20 May 2025).
  • Iterative/Adaptive Methods for Uncertainty: For selectivity-driven cost uncertainty, algorithms alternate between selecting the current best plan for the empirical mean selectivities $\hat{s}$, discovering the "near-optimality region" where the plan is $\alpha$-optimal, and determining further sampling to reach the desired confidence level $1 - \delta$ (Trummer et al., 2015).
  • Bottom-Up Enumeration with Augmented Operator Properties: When folding Bloom-filter pushdown and selection into plan enumeration, augmented sub-plans carry state (e.g., unresolved Bloom-filter side-sets $\delta$), tracked through bottom-up join enumeration with aggressive pruning heuristics to control search-space blow-up (Zeyl et al., 5 May 2025).
  • Graph Transformation and Enumerative Algebra: Cross-platform and polystore optimizers (e.g., Rheem) represent the plan as an “inflated” dataflow graph, encoding all alternatives via graph pattern substitutions, then employ lossless pruning rules, data movement cost solving (minimum conversion tree), and an enumeration algebra to maintain optimality across platforms (Kruse et al., 2018).

A particular frontier is the generalization of plan enumeration to semantic or multi-modal operator spaces, as in Abacus, where each logical operator’s physical implementation is selected from thousands of parameterized LLM-pipeline variants, utilizing multi-armed bandit search and Pareto-Cascades extensions to Cascades-style memoization (Russo et al., 20 May 2025).
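As a concrete instance of the DP foundation discussed above, a Selinger-style enumeration over relation subsets can be sketched as follows; the cardinalities, selectivity discount, and cost model are toy assumptions for illustration only.

```python
from itertools import combinations

# best[S] holds the cheapest (cost, plan_string) for joining the relation set S.
# Toy cost model (an assumption): joining two subplans costs the sum of their
# costs plus the product of their estimated cardinalities.
card = {"R": 100, "S": 10, "T": 1000}            # made-up base cardinalities
rels = sorted(card)

def subset_card(s):
    out = 1
    for r in s:
        out *= card[r]
    return out / (10 ** (len(s) - 1))             # crude join-selectivity discount

best = {frozenset([r]): (0.0, r) for r in rels}   # base tables cost nothing here
for size in range(2, len(rels) + 1):
    for combo in combinations(rels, size):
        s = frozenset(combo)
        cands = []
        for k in range(1, size):
            for left in combinations(combo, k):
                l, r = frozenset(left), s - frozenset(left)
                cl, pl = best[l]
                cr, pr = best[r]
                cands.append((cl + cr + subset_card(l) * subset_card(r),
                              f"({pl} JOIN {pr})"))
        best[s] = min(cands)

print(best[frozenset(rels)])
```

With these numbers the enumeration joins the two small relations first, deferring the large one, which is the classic behavior a cost-aware enumerator should recover. Multi-objective variants replace the scalar `min` with Pareto-set maintenance per subset.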

4. Multi-Objective and Constraint-Based Optimization

Modern cost-aware physical optimizers explicitly model and solve for multi-objective or constrained targets:

  • Weighted Sum and Bi-Objective Formulations: For workloads where both monetary cost and response time matter, objectives such as $\min_{x,y}\ \alpha\, T_{\text{response}} + (1-\alpha)\, C_{\text{total}}$ are solved, or bi-objective constraints imposed (e.g., minimize cost under a response-time bound) (Perriot et al., 2017).
  • Pareto-Frontier Enumeration: Systems such as Abacus and Nirvana maintain, per-group in the search graph, the Pareto set of subplans across dimensions (cost, latency, quality), ensuring final plan selection respects constraints and enables users to sweep trade-off curves (Russo et al., 20 May 2025, Zhu et al., 25 Nov 2025).
  • Probabilistic / Risk-Aware Guarantees: PAO and Roq render robust optimization explicit by setting PAC-style or risk-based thresholds: plans are returned only if the probability of being within $\alpha$ of the optimum is at least $1 - \delta$, or if the suboptimality risk is minimized per user-defined tolerance (Trummer et al., 2015, Kamali et al., 26 Jan 2024).
  • Resource-Aware Partition and Container Optimization: ML-based optimizers (e.g., Cleo) extend cost minimization to dynamic resource allocation (e.g., degree of parallelism $P$), optimizing over both plan and resource assignment in a single loop (Siddiqui et al., 2020).
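Pareto-set maintenance, the core primitive behind the frontier-enumeration systems above, can be sketched directly; subplans are scored on (cost, latency, negated quality) so that lower is better on every axis, and the scores below are illustrative assumptions.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every axis and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_set(points):
    """Keep only the non-dominated points (the Pareto frontier)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# (monetary cost, latency ms, -quality): lower is better everywhere.
subplans = [
    (1.0, 200, -0.90),   # cheap, slow, accurate
    (5.0, 100, -0.90),   # expensive, fast, accurate
    (6.0, 150, -0.85),   # dominated by the second entry on all three axes
]
print(pareto_set(subplans))
```

In a Cascades-style search, this filter runs per memo group, so constraint checking and trade-off sweeps at the root only ever consult non-dominated subplans.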

5. Extensions: Incremental, Adaptive, and Streaming Optimization

Cost-aware physical optimizers support dynamic and streaming workloads through incremental re-optimization and time-conscious extensions:

  • Incremental Query Re-Optimization: By modeling plan enumeration and cost estimation as recursive Datalog views and maintaining them as incrementally updatable state ($\Delta$-views), optimizers can propagate cost updates (e.g., selectivity drift) through the plan space orders of magnitude faster than full re-optimization, supporting near-real-time adaptivity (Liu et al., 2014).
  • Time-Varying Relations (TVRs) and Progressive Optimization: Tempura (TIP model) generalizes the physical plan space to sequences over time ($\vec{T}$), supporting explicit modeling of snapshots and deltas, with plan selection optimized for global cost functions such as weighted-sum or lexicographic (for SLA compliance), enabling multi-run, resource-tiered, or progressive data analysis contexts (Wang et al., 2020).
  • State Sharing and Multi-Query Optimization (MQO): For workloads or streaming settings where subplan results can be shared, MQO-style state sharing is combined with cost-aware plan selection to further amortize long-term resource consumption (Wang et al., 2020).
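The incremental re-optimization idea can be illustrated with a toy dependency-tracked memo: when one selectivity drifts, only the memoized subplan costs that depend on it are recomputed. This is a simplified sketch, not the Datalog $\Delta$-view machinery of the cited work; all names and numbers are assumptions.

```python
class Memo:
    def __init__(self):
        self.cost = {}        # subplan name -> current cost
        self.deps = {}        # predicate -> entries depending on it
        self.recomputed = 0   # counts cost evaluations, for illustration

    def put(self, subplan, cost_fn, predicates, selectivities):
        self.cost[subplan] = cost_fn(selectivities)
        self.recomputed += 1
        for p in predicates:
            self.deps.setdefault(p, set()).add((subplan, cost_fn))

    def on_drift(self, predicate, selectivities):
        """Recompute only the subplans whose cost depends on `predicate`."""
        for subplan, cost_fn in self.deps.get(predicate, ()):
            self.cost[subplan] = cost_fn(selectivities)
            self.recomputed += 1

sel = {"p1": 0.125, "p2": 0.5}
memo = Memo()
memo.put("scan_R", lambda s: 1000 * s["p1"], ["p1"], sel)   # cost 125.0
memo.put("scan_S", lambda s: 400 * s["p2"], ["p2"], sel)    # cost 200.0

sel["p1"] = 0.25            # selectivity drift on p1 only
memo.on_drift("p1", sel)    # scan_R updated; scan_S untouched
print(memo.cost)
```

Only three cost evaluations occur in total; a from-scratch re-optimization would re-evaluate every entry on each drift, which is the overhead incremental maintenance avoids.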

6. Specializations: Multi-Modal, ML, and AI-Operator System Optimization

Cost-aware physical optimization is critical in emerging data processing paradigms featuring LLMs, semantic operators, or unstructured multi-modal data:

  • Backend Model Selection for Semantic Operators: Systems such as Nirvana and Abacus introduce cost-aware physical optimizers that, for each logical operator, select the cheapest model/ensemble variant that meets a target accuracy or improvement over baseline, using improvement-score metrics and per-operator independence hypotheses (Zhu et al., 25 Nov 2025, Russo et al., 20 May 2025).
  • Sample-Efficient Bandit and Evaluation-Lite Search: Evaluation pushdown and computation reuse transform naive $O(|D_s| \cdot |M|)$ benchmark cascades into highly efficient routines, minimizing expensive LLM evaluations by filtering to only those records/operators where improvement is possible (Zhu et al., 25 Nov 2025).
  • Active and Bayesian Learning for Unknown Operator Quality/Cost: For operator systems with unknown performance profiles, Bayesian updating over small validation sets, combined with multi-armed bandit sampling, permits rapid convergence to Pareto-optimal implementations with quantified uncertainty (Russo et al., 20 May 2025).
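The backend-selection policy described above, picking the cheapest model variant that meets a target accuracy, can be sketched as follows; the model names, per-token prices, and validation accuracies are illustrative assumptions, not benchmark results.

```python
def select_backend(candidates, target_accuracy):
    """candidates: list of (name, cost_per_1k_tokens, validation_accuracy).
    Return the cheapest candidate meeting the accuracy target; if none
    qualifies, fall back to the highest-accuracy candidate."""
    feasible = [c for c in candidates if c[2] >= target_accuracy]
    if not feasible:
        return max(candidates, key=lambda c: c[2])[0]
    return min(feasible, key=lambda c: c[1])[0]

models = [
    ("small-llm",  0.10, 0.81),
    ("medium-llm", 0.50, 0.90),
    ("large-llm",  2.00, 0.93),
]
print(select_backend(models, target_accuracy=0.88))   # medium-llm
print(select_backend(models, target_accuracy=0.99))   # large-llm (fallback)
```

Running this per logical operator, under the per-operator independence hypothesis mentioned above, yields a plan-level assignment; the bandit and Bayesian techniques in the cited systems additionally decide how many validation samples each candidate deserves.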

7. Empirical Validation and Tuning Guidelines

Experimental and systems studies consistently show that cost-aware physical optimizers, when carefully designed:

  • Achieve substantial performance, cost, and robustness improvements compared to static or non-adaptive baselines (e.g., up to 52.2% lower TPC-H query latency integrating Bloom filters (Zeyl et al., 5 May 2025); 2–3 orders-of-magnitude better cost estimation and 70% of plan changes yielding practical benefits in cloud data pipelines (Siddiqui et al., 2020); 18.7–39.2% higher quality and 23.6× lower monetary cost in semantic operator systems (Russo et al., 20 May 2025)).
  • Require strong, context-sensitive cost models with explicit treatment of uncertainty and negative examples to avoid pathological misranking (confirmed for learned cost models by (Heinrich et al., 3 Feb 2025)).
  • Benefit from iterative or active sampling strategies—uniform, exponential, or adaptive—where sample allocation is guided by local plan-cost sensitivity and confidence bounds, ensuring sample complexity matches planning cost and required robustness (Trummer et al., 2015).
  • Support practical integration into existing query optimizers via non-intrusive API calls, rule-catalog extensions, or plan-annotation; empirical overheads are typically negligible compared to execution benefits (Kamali et al., 26 Jan 2024, Trummer et al., 2015, Kruse et al., 2018).

The field continues to evolve toward unified frameworks combining advanced cost modeling, adaptive/feedback-driven estimation, probabilistic/risk-aware plan selection, and multi-objective (including monetary) optimization for diverse data processing paradigms.
