Percentile-Based Optimization
- Percentile-Based Optimization is a framework that defines objectives and constraints by targeting specific data quantiles, offering robust decision-making under skewed or heavy-tailed distributions.
- It is applied in diverse fields such as risk-sensitive reinforcement learning, wireless communications, and robust statistics, where tail performance and fairness are critical.
- Advanced algorithms employ techniques like Lagrangian duality, Bayesian design, and fractional transforms to tackle non-smooth, non-convex, and NP-hard optimization challenges.
A percentile-based optimization method is any optimization framework in which objectives, constraints, or key design decisions are defined in terms of percentiles of a data-driven or stochastic quantity. Rather than focusing on means or totals, these methods target policies, models, or allocations that optimize for specific quantiles—such as the minimum level of service achieved by 95% of users, the 99th percentile delay, or the probability that a performance metric exceeds a critical threshold. Percentile-based optimization arises in a diverse array of applications, including robust statistics, reinforcement learning, wireless communications, risk management, citation normalization, and resource allocation. Its distinguishing feature is the explicit control of tail performance, fairness, or outlier-robustness, often in contexts marked by heavy-tailed or skewed distributions.
1. Foundational Concepts and Rationale
Percentile-based frameworks were developed in part to address the inadequacy of mean- or moment-based metrics in the presence of skewed or heavy-tailed distributions, where a small set of outliers could dominate average or total values. For example, in citation impact analysis, arithmetic averages yield misleading results due to the hyper-skewness of citation distributions; in risk-sensitive reinforcement learning, mean performance masks catastrophic tail events. Percentile-based normalization maps individual values to their rank positions within an empirical or reference distribution, enabling non-parametric assessment that is less susceptible to distortions caused by outliers (Bornmann et al., 2013).
Definitionally, for a sample of values $x_1, \dots, x_n$, the $\tau$-percentile (quantile) is the smallest value $q_\tau$ such that at least a fraction $\tau$ of the values satisfy $x_i \le q_\tau$. In optimization, percentile criteria manifest as constraints on quantile exceedance probabilities, utility functions that sum across ordered (worst-case) subsets, or policy objectives maximizing/minimizing a target quantile.
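As a concrete illustration, the following minimal Python sketch computes an empirical $\tau$-percentile under the "at least a fraction $\tau$" convention above, together with a percentile-rank normalization of a single observation; the lognormal toy data and function names are illustrative only.

```python
import numpy as np

def empirical_percentile(values, tau):
    """Smallest value q such that at least a fraction tau of the sample is <= q."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    k = int(np.ceil(tau * n)) - 1          # index of the tau-quantile under this convention
    return x[max(k, 0)]

def percentile_rank(values, v):
    """Fraction of the reference sample lying at or below v (rank-based normalization)."""
    x = np.asarray(values, dtype=float)
    return np.mean(x <= v)

rng = np.random.default_rng(0)
delays = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)   # heavy-tailed toy data
print(empirical_percentile(delays, 0.99))                  # 99th-percentile delay
print(percentile_rank(delays, 5.0))                        # rank of a single observation
```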
2. Percentile-Based Formulations Across Domains
2.1 Reinforcement Learning and Robust Control
A central application is risk-sensitive RL and robust MDPs, where decisions are evaluated with respect to their performance at, or beyond, a percentile threshold. In percentile risk-constrained MDPs, the goal is to find a policy that minimizes expected cost while ensuring that the Conditional Value-at-Risk (CVaR) or the Value-at-Risk (VaR) of the cumulative cost does not exceed a threshold (e.g., $\mathrm{CVaR}_\alpha(Z) \le \beta$). The CVaR at level $\alpha$ is defined as $\mathrm{CVaR}_\alpha(Z) = \min_{\nu \in \mathbb{R}} \big\{ \nu + \tfrac{1}{1-\alpha}\,\mathbb{E}[(Z-\nu)^+] \big\}$, where $Z$ is the random cumulative cost (Chow et al., 2015). Optimization proceeds via Lagrangian saddle-point methods, where the dual variable enforces the tail constraint, and policy and auxiliary parameters are jointly updated using stochastic approximation or actor–critic algorithms. Value iteration and policy gradient methods are adapted to support VaR or CVaR constraints directly in their update rules (Lobo et al., 7 Apr 2024).
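The following sketch, assuming the standard Rockafellar–Uryasev variational form quoted above, estimates $\mathrm{CVaR}_\alpha$ from cost samples and evaluates a Lagrangian-penalized objective for a CVaR constraint; the gamma-distributed cost samples, threshold `beta`, and multiplier `lam` are placeholders, not values from the cited works.

```python
import numpy as np

def cvar(costs, alpha):
    """Sample CVaR_alpha via the Rockafellar-Uryasev form:
    min_nu  nu + E[(Z - nu)^+] / (1 - alpha);  the minimizer nu is the VaR."""
    z = np.asarray(costs, dtype=float)
    var = np.quantile(z, alpha)                        # VaR_alpha attains the minimum
    return var + np.mean(np.maximum(z - var, 0.0)) / (1.0 - alpha)

def lagrangian_objective(costs, alpha, beta, lam):
    """Expected cost plus a Lagrangian penalty enforcing CVaR_alpha(Z) <= beta."""
    z = np.asarray(costs, dtype=float)
    return z.mean() + lam * (cvar(z, alpha) - beta)

rng = np.random.default_rng(1)
costs = rng.gamma(shape=2.0, scale=1.0, size=50_000)   # toy cumulative-cost samples
print(cvar(costs, 0.95))
print(lagrangian_objective(costs, 0.95, beta=8.0, lam=0.5))
```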
In robust offline RL, percentile criteria are often approximated by solving a min–max problem over an ambiguity set of transition models (e.g., derived from Bayesian credible regions); the resulting robust return provides a lower bound that holds with the prescribed percentile guarantee (Behzadian et al., 2019). Advanced value iteration algorithms now embed the VaR operator directly in the Bellman recursion, delivering less conservative, implicit ambiguity sets and improved performance bounds (Lobo et al., 7 Apr 2024).
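A minimal sketch of a VaR-style robust Bellman backup over sampled transition models is given below; it only illustrates the idea of replacing a worst-case ambiguity set with a quantile over posterior samples and is not the algorithm of Lobo et al. The MDP sizes, discount factor, and risk level are arbitrary.

```python
import numpy as np

def var_bellman_backup(v, rewards, p_samples, gamma, alpha):
    """For each state-action pair, take the lower alpha-quantile (VaR) of the Bellman
    target across sampled transition models instead of a worst case over a credible set.
    p_samples has shape (M, S, A, S): M sampled transition matrices."""
    # Bellman target under each sampled model: r(s,a) + gamma * sum_s' P(s'|s,a) v(s')
    targets = rewards[None, :, :] + gamma * np.einsum("msap,p->msa", p_samples, v)
    q_var = np.quantile(targets, alpha, axis=0)   # quantile over the M models
    return q_var.max(axis=1)                      # greedy improvement over actions

S, A, M = 4, 2, 200
rng = np.random.default_rng(2)
p = rng.dirichlet(np.ones(S), size=(M, S, A))     # sampled transition models
r = rng.uniform(size=(S, A))
v = np.zeros(S)
for _ in range(100):                              # value iteration with the VaR backup
    v = var_bellman_backup(v, r, p, gamma=0.9, alpha=0.1)
print(v)
```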
2.2 Experiments, Black-Box Optimization, and Reliability
Percentile estimation for expensive black-box functions (such as simulations in engineering or reliability studies) motivates sequential design-of-experiments in the percentile space. Gaussian-process metamodels, coupled with stepwise uncertainty reduction (SUR), guide the sequential selection of inputs to maximally reduce uncertainty in a target percentile estimate. In this setting, percentiles of the GP mean over a Monte Carlo sample stand in for the objective, and infill criteria are constructed either to directly target probability of exceedance or to maximize the variance of the percentile estimator across the input space, leveraging closed-form bivariate Gaussian formulas for efficiency (Labopin-Richard et al., 2016).
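The sketch below fits a Gaussian-process metamodel, estimates a target percentile of the GP mean over a Monte Carlo input sample, and selects infill points with a crude uncertainty-based score; the simulator, kernel, and infill score are simplifications standing in for a proper SUR criterion, not the method of the cited work.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulator(x):                 # placeholder for the costly black box
    return np.sin(3 * x) + 0.5 * x**2

rng = np.random.default_rng(3)
X_mc = rng.uniform(-2, 2, size=(2000, 1))   # Monte Carlo sample of the input distribution
X = rng.uniform(-2, 2, size=(8, 1))         # small initial design
y = expensive_simulator(X).ravel()

tau = 0.95
for _ in range(10):                                        # sequential design loop
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6).fit(X, y)
    mean, std = gp.predict(X_mc, return_std=True)
    q_hat = np.quantile(mean, tau)                         # plug-in percentile of the GP mean
    # Crude infill: evaluate where the GP is most uncertain about being above/below q_hat.
    # This is a simplification of a stepwise-uncertainty-reduction (SUR) criterion.
    score = std / (np.abs(mean - q_hat) + 1e-9)
    x_next = X_mc[np.argmax(score)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_simulator(x_next).ravel())

print("estimated 95th percentile:", q_hat)
```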
2.3 Robust Statistics and Outlier-Resilient Estimation
Robust model fitting frequently uses percentile-based (order-statistics-driven) loss functions. For data with potentially $k$ outliers out of $n$ points, the optimal estimator minimizes the $(k{+}1)$-st largest residual, $r_{(n-k)}$, where $r_{(1)} \le \dots \le r_{(n)}$ are the sorted residuals. Such formulations are typically non-smooth and non-convex, but have explicit properties: solutions are realized as global minimizers over "subset fit" problems, i.e., minimizing the worst-case (max) residual over inlier subsets of size $n-k$ (Domingos et al., 15 May 2024). Similar percentile-driven frameworks underpin methods such as LMS or LQS regression, bridging robust statistics with value-at-risk optimization.
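A hedged illustration: the snippet below evaluates the $(k{+}1)$-st-largest-residual loss and fits a line by naive random search over two-point hypotheses, rather than by the subset-characterization algorithms of the cited work; the synthetic data and trial budget are arbitrary.

```python
import numpy as np

def percentile_loss(residuals, k):
    """(k+1)-st largest absolute residual: the loss ignores the k worst points."""
    r = np.sort(np.abs(residuals))
    return r[len(r) - k - 1]

def robust_line_fit(x, y, k, n_trials=5000, seed=0):
    """Naive random search for the line minimizing the (k+1)-st largest residual."""
    rng = np.random.default_rng(seed)
    best, best_loss = None, np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])           # line through two sampled points
        b = y[i] - a * x[i]
        loss = percentile_loss(y - (a * x + b), k)
        if loss < best_loss:
            best, best_loss = (a, b), loss
    return best, best_loss

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.2, 100)
y[:15] += 20.0                                      # 15 gross outliers
print(robust_line_fit(x, y, k=15))
```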
2.4 Network Resource Allocation and Wireless Communications
Percentile-based optimization generalizes fairness criteria in resource allocation, specifically by optimizing user throughput at the cell edge rather than system-wide totals. The sum-least-$q$-th-percentile (SLqP) utility takes the form $U_q(\mathbf{r}) = \sum_{k=1}^{\lceil qK \rceil} r_{(k)}$, where $r_{(k)}$ is the $k$-th smallest user rate among $K$ users; it subsumes both max–min (worst-user) and sum-rate (average) objectives. Algorithms for power control and beamforming in multiuser MIMO exploit quadratic and logarithmic fractional transforms to recast non-smooth, non-convex percentile utility functions into block-concave forms that are amenable to efficient iterative optimization with convergence guarantees (Khan et al., 25 Mar 2024). In these formulations, percentile-based objectives grant operators fine-grained control over the tradeoff between overall throughput and the service provided to the worst-off users.
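Assuming the sum-of-the-lowest-$\lceil qK \rceil$-rates form given above, the sketch below evaluates the SLqP utility and a subgradient with respect to the rate vector; it does not implement the fractional-transform power-control algorithms themselves, and the toy rate vector is illustrative.

```python
import numpy as np

def slqp_utility(rates, q):
    """Sum of the lowest ceil(q*K) user rates (sum-least-q-th-percentile utility).
    Small q approaches max-min (worst user); q = 1 recovers the plain sum rate."""
    r = np.sort(np.asarray(rates, dtype=float))
    m = max(1, int(np.ceil(q * len(r))))
    return r[:m].sum()

def slqp_subgradient(rates, q):
    """Subgradient w.r.t. the rate vector: 1 for users currently in the bottom q-fraction."""
    r = np.asarray(rates, dtype=float)
    m = max(1, int(np.ceil(q * len(r))))
    g = np.zeros_like(r)
    g[np.argsort(r)[:m]] = 1.0
    return g

rates = np.array([0.3, 1.2, 0.8, 2.5, 0.1, 1.9])   # toy per-user throughputs (bits/s/Hz)
print(slqp_utility(rates, q=0.3), slqp_subgradient(rates, q=0.3))
```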
2.5 Mechanism Design
Order-statistics-inspired percentile mechanisms feature in Bayesian facility location problems, where the facilities are placed at prescribed order statistics of the agents' preferences. The expected social cost is analyzed through its connection to Wasserstein-1 (optimal transport) projections, with asymptotically optimal percentile rules characterized by a system of equations linking quantile levels to the first-order "local median" of the population distribution (Auricchio et al., 8 Jul 2024). The Bayesian approximation ratio for such mechanisms converges to the ratio of Wasserstein distances as the agent population grows.
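As a simple illustration (not the analysis of the cited work), the sketch below places facilities at prescribed percentile levels of reported one-dimensional positions and evaluates the resulting social cost; the Beta-distributed population and the chosen levels are arbitrary.

```python
import numpy as np

def percentile_mechanism(reports, levels):
    """Place one facility at each prescribed percentile of the reported positions."""
    return np.quantile(np.asarray(reports, dtype=float), levels)

def social_cost(reports, facilities):
    """Total distance from each agent to its nearest facility."""
    d = np.abs(np.asarray(reports)[:, None] - np.asarray(facilities)[None, :])
    return d.min(axis=1).sum()

rng = np.random.default_rng(5)
agents = rng.beta(2, 5, size=1000)                 # toy population of 1-D preferences
facilities = percentile_mechanism(agents, levels=[0.25, 0.75])
print(facilities, social_cost(agents, facilities))
```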
3. Computational Techniques and Algorithmic Advances
Percentile optimization is intrinsically non-smooth and, in most settings, non-convex and NP-hard. Critical algorithmic advances address these hurdles via:
- Fractional Transform Methods: In wireless resource allocation, quadratic and logarithmic fractional transforms introduce auxiliary variables that smooth the optimization landscape, allowing block coordinate ascent with provable monotonic improvement to stationary points (Khan et al., 25 Mar 2024).
- Sequential Bayesian Design: Closed-form update rules for percentiles under Gaussian process models enable efficient active learning strategies that maximize information gain about the relevant quantile (Labopin-Richard et al., 2016).
- Lagrangian Dual and VaR Operators: In RL and online optimization, embedding risk constraints as Lagrange multipliers or replacing ambiguity sets with direct VaR quantile scheduling provides more practical, less conservative solutions, along with formal sample complexity and sublinear regret guarantees (Chow et al., 2015, Lobo et al., 7 Apr 2024, Dai et al., 2023).
- Subset-Enumeration and Reduction: For robust estimation, minimizer characterization theorems dramatically reduce the effective search space, particularly for convex residual losses, where only a greatly reduced family of small candidate subsets needs to be examined in depth (Domingos et al., 15 May 2024).
- Distribution-Fitting-Free Estimation: Using L-moments and normal-polynomial transformations for percentile function estimation avoids the pitfalls of distributional mis-specification and yields monotonic, valid quantile curves even with limited data or contamination (Chen et al., 6 Mar 2025). A brief sketch of sample L-moments appears below.
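Below, a minimal computation of the first four sample L-moments from probability-weighted moments; the subsequent normal-polynomial transform step of the LMNPT method is not shown, and the lognormal travel-time sample is synthetic.

```python
import numpy as np

def sample_l_moments(data):
    """First four sample L-moments via unbiased probability-weighted moments b_r."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((i - 1) * (i - 2) * (i - 3) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2        # location, scale, L-skewness, L-kurtosis

rng = np.random.default_rng(6)
travel_times = rng.lognormal(3.0, 0.4, size=200)   # toy skewed travel-time sample
print(sample_l_moments(travel_times))
```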
4. Performance, Limitations, and Tradeoffs
The empirical efficacy of percentile-based optimization depends heavily on the specifics of the percentile estimator and the application domain. Key observations include:
- Fixed-Scale, Ambiguity, and Ties: Methods such as P100 yield fixed-scale percentile ranks and unambiguous tie handling but may underperform in predictive utility versus composite approaches (e.g., those using auxiliary covariates such as journal impact) (Bornmann et al., 2013).
- Conservatism versus Efficiency: Bayesian credible region-based ambiguity sets are often overly conservative in RL, resulting in policies with diminished returns. VaR-based operators yield smaller, dynamically-adjusted ambiguity sets and less conservative, but still robust, guarantees (Lobo et al., 7 Apr 2024).
- Resource-Performance Tradeoffs: Approaches that directly optimize percentile objectives (e.g., delay violation probability, lower-percentile user throughput) allow for significant reductions in resource usage and improved worst-user QoS, but may increase solution complexity relative to mean-based formulations (Tehrani et al., 24 Jul 2025, Khan et al., 25 Mar 2024).
- Robustness and Sample Efficiency: L-moment-based percentile function estimation is demonstrably robust to outliers and maintains higher validity (monotonicity and coverage) in small samples compared to conventional parametric or Cornish-Fisher methods (Chen et al., 6 Mar 2025).
5. Application Case Studies
A wide spectrum of real-world applications is built on percentile-based optimization:
- Citation Impact Assessment: Normalizing citation counts using percentile-based measures such as P100, SCImago, or double-rank power law fitting delivers inter-field comparability and robust long-run impact prediction (Bornmann et al., 2013, Brito et al., 2017).
- Wireless Networks: SLqP-optimized power control and beamforming algorithms deliver configurable cell-edge throughput improvements in both short-term and ergodic scenarios, with design flexibility for hybrid utility functions (Khan et al., 25 Mar 2024).
- Deep RL for Network Slicing: In open RAN environments, percentile-based delay constraints embedded in DRL reward functions ensure probabilistic QoS guarantees while optimizing resource (PRB) usage, outperforming mean-delay-based RL methods and traditional heuristics (Tehrani et al., 24 Jul 2025); a hedged sketch of such a reward term appears after this list.
- Online Ad Allocation: Budget pacing algorithms leveraging percentile-normalized dual variables and monotonic mapping functions achieve sublinear dynamic regret and improved smoothness in ad delivery under traffic drift (Dai et al., 2023).
- Robust Travel Time Estimation: The LMNPT approach yields accurate, robust percentile travel time predictions for ITS, retaining monotonicity and low volatility even under outlier contamination and small sample sizes (Chen et al., 6 Mar 2025).
- Model Diagnostic Tools: Percentile-based residuals improve calibration of outlier detection and fit diagnostics in statistical models—particularly those with non-Gaussian or hierarchical structure (Bérubé et al., 2019).
- Facility Location Mechanisms: Bayesian optimal percentile mechanisms for k-facility placement possess provably finite, distribution-dependent approximation ratios via optimal transport projections, providing both theoretical performance bounds and practical implementation guidelines (Auricchio et al., 8 Jul 2024).
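For the network-slicing case above, here is a hedged sketch of a reward that embeds a percentile delay constraint: it penalizes resource (PRB) usage plus any violation of a 95th-percentile delay target. The functional form, penalty weight, and toy delay samples are illustrative assumptions, not the reward of the cited work.

```python
import numpy as np

def slicing_reward(prb_used, prb_total, delays, delay_target, percentile=0.95, penalty=10.0):
    """Illustrative DRL reward: favor low resource usage, but penalize any step
    whose empirical delay percentile violates the probabilistic QoS target."""
    d_p = np.quantile(np.asarray(delays, dtype=float), percentile)
    violation = max(0.0, d_p - delay_target)
    return -(prb_used / prb_total) - penalty * violation

delays = np.random.default_rng(7).exponential(4.0, size=500)   # per-packet delays (ms)
print(slicing_reward(prb_used=30, prb_total=100, delays=delays, delay_target=10.0))
```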
6. Challenges, Misconceptions, and Future Directions
Percentile-based optimization is not universally "better" than mean-based or moment-based formulations; its appropriateness is determined by the application's sensitivity to tail behavior, fairness, or robustness concerns. Empirically, the superiority of any given percentile method depends crucially on tie handling, the treatment of covariates, the degree of skewness in the distribution, and the predictive consistency of percentile assignments over time (Bornmann et al., 2013).
Computational tractability remains a critical challenge, as many percentile-based formulations are strongly NP-hard, motivating research into advanced transformations, block coordinate minimization, and subset-reduction theorems. Robustness against input distribution shifts, especially in online or high-dimensional settings, is a leading area of research focus (Khan et al., 25 Mar 2024, Dai et al., 2023).
Emerging topics include tighter theoretical characterizations of the approximation loss in percentile mechanisms under distributional mis-specification (Auricchio et al., 8 Jul 2024), further integration with Bayesian optimization frameworks for cost-effective black-box quantile estimation (Labopin-Richard et al., 2016), and advancements in hybrid percentile–mean/utility functions to tune multi-faceted objectives in wireless networks (Khan et al., 25 Mar 2024).
7. Summary Table: Representative Percentile-Based Methods
| Application Area | Key Approach/Formulation | Notable Feature or Formula |
|---|---|---|
| RL (risk-sensitive, robust) | CVaR/VaR constraints in policy optimization | $\mathrm{CVaR}_\alpha(Z) \le \beta$; robust Bellman operators |
| Wireless network optimization | SLqP/percentile utility | Sum of the lowest $\lceil qK \rceil$ user rates |
| Robust statistics | Percentile loss (outlier discard) | $(k{+}1)$-st largest residual |
| Mechanism design & facility location | Order-statistics-based placement | Facilities at prescribed percentiles; Wasserstein cost |
| Model assessment | Percentile-based residuals | Calibrated outlier detection and fit diagnostics |
| Black-box function optimization | Bayesian GP + percentile SUR | Infill via SUR criterion on the target quantile |
| ITS/travel time percentile estimation | L-moments + normal-polynomial transform (NPT) | Distribution-fitting-free, monotone quantile estimates |
Percentile-based optimization thus provides a rigorous, adaptable suite of methods for robust, fair, and tail-sensitive decision making across complex modern data- and model-driven environments.