Minimax Quantiles Overview

Updated 10 June 2026

Minimax quantiles are the smallest risk thresholds that the loss exceeds with probability at most δ, uniformly over the model, thereby controlling tail risk.
They extend the classical minimax risk concept by focusing on the behavior of loss quantiles rather than averages, which is crucial for managing rare, catastrophic events in various models.
Analytical techniques such as high-probability Le Cam and Fano methods, along with adaptive estimation, underpin their optimality in both parametric and nonparametric frameworks.

A minimax quantile, or minimax-(1−δ)-quantile, is the smallest risk threshold that some estimator or policy in a given class can guarantee uniformly over the model: for every data-generating process in the model, the probability that the loss exceeds this threshold is at most δ. Unlike classical minimax risk, which focuses on expected loss, the minimax quantile formalism controls the tail behavior of the loss at a user-specified confidence level. This perspective is central in robust statistics, high-dimensional inference, learning theory, optimal treatment assignment, and interactive decision making, particularly where rare but catastrophic errors are a primary concern.

1. Formal Definition and Distinction from Classical Minimax Risk

Let $\Theta$ be a parameter (or model) space, $\mathcal{A}$ a class of estimators or policies, and $L(\theta, \hat\theta)$ a nonnegative loss functional. For any estimator $\hat\theta(X)$ and distribution $P_\theta$, the usual minimax risk (in expectation) is

$\mathcal{M} = \inf_{\hat\theta} \sup_{\theta\in\Theta} \mathbb{E}_{P_\theta}[L(\theta, \hat\theta(X))].$

The minimax quantile at level $\delta \in (0,1)$ is defined as (Ma et al., 2024, Bongole et al., 7 Oct 2025):

$\mathcal{M}(\delta) = \inf_{\hat\theta} \sup_{\theta \in \Theta} \inf\{ r \ge 0 : P_\theta\big( L(\theta, \hat\theta(X)) > r \big) \le \delta \}.$

The lower minimax quantile moves the infimum over thresholds $r$ outside the inf-sup over estimators and parameters:

$\mathcal{M}_-(\delta) = \inf\left\{ r \ge 0 : \inf_{\hat\theta}\sup_{\theta\in\Theta} P_\theta(L(\theta, \hat\theta(X)) > r) \le \delta \right\}.$
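To make the inner part of the definition concrete, the following minimal sketch estimates the worst-case $(1-\delta)$-quantile of the loss for one fixed estimator and compares it with the worst-case expected loss. The setting is an assumption chosen for illustration, not taken from the cited works: the sample mean of $n$ i.i.d. $N(\theta, 1)$ observations under squared-error loss, evaluated on a small grid of $\theta$ values. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def loss_quantile(losses, delta):
    """Smallest observed r such that the empirical P(loss > r) is at most delta."""
    ordered = sorted(losses)
    # At least ceil((1 - delta) * N) of the N losses must be <= r.
    k = math.ceil((1 - delta) * len(ordered) - 1e-9)
    return ordered[max(k, 1) - 1]


def simulate_losses(theta, n, reps, rng):
    """Squared-error losses of the sample mean over `reps` simulated datasets."""
    losses = []
    for _ in range(reps):
        xbar = fmean(rng.gauss(theta, 1.0) for _ in range(n))
        losses.append((xbar - theta) ** 2)
    return losses


rng = random.Random(0)
n, reps, delta = 25, 5000, 0.05
thetas = [-3.0, 0.0, 3.0]  # illustrative grid standing in for the supremum over Theta

per_theta = {t: simulate_losses(t, n, reps, rng) for t in thetas}
worst_quantile = max(loss_quantile(ls, delta) for ls in per_theta.values())
worst_mean = max(fmean(ls) for ls in per_theta.values())

# Closed form for this toy model: (Xbar - theta)^2 is distributed as Z^2 / n.
exact_quantile = NormalDist().inv_cdf(1 - delta / 2) ** 2 / n
exact_mean = 1.0 / n

print(f"worst-case quantile: simulated {worst_quantile:.4f}, exact {exact_quantile:.4f}")
print(f"worst-case mean:     simulated {worst_mean:.4f}, exact {exact_mean:.4f}")
```

The sketch only evaluates one estimator; $\mathcal{M}(\delta)$ additionally takes the infimum over all estimators, which a simulation of this kind cannot certify.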

Minimax quantiles capture distributional tails: for a fixed estimator, the $(1-\delta)$-quantile of the loss is the minimal $r$ such that, with probability at least $1-\delta$, $L(\theta, \hat\theta(X)) \le r$ holds for every $\theta \in \Theta$ (equivalently, the worst-case probability of exceeding $r$ is at most $\delta$).

Unlike the minimax mean risk, which averages over tail events, minimax quantiles control worst-case high-probability errors. The two notions are linked by quantile-to-expectation conversions (for instance, Markov's inequality gives $\mathcal{M} \ge \delta\,\mathcal{M}_-(\delta)$), and $\mathcal{M}_-(\delta)$ and $\mathcal{M}(\delta)$ nearly coincide, differing at most on a countable set of levels $\delta$ (Ma et al., 2024, Bongole et al., 7 Oct 2025).
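One elementary quantile-to-expectation conversion follows from Markov's inequality and the definitions of $\mathcal{M}$ and $\mathcal{M}_-(\delta)$ alone. The derivation below is a sketch of that elementary bound; it is not claimed to be the sharpest conversion available in the cited works.

```latex
% Fix any estimator \hat\theta and any threshold 0 < r < \mathcal{M}_-(\delta).
% The set of admissible thresholds in the definition of \mathcal{M}_-(\delta)
% is upward closed in r, so r itself is not admissible:
\begin{align*}
\sup_{\theta\in\Theta} P_\theta\big(L(\theta,\hat\theta(X)) > r\big) &> \delta .
\intertext{Markov's inequality for the nonnegative loss gives, for each $\theta$,}
\mathbb{E}_{P_\theta}\big[L(\theta,\hat\theta(X))\big]
  &\ge r\, P_\theta\big(L(\theta,\hat\theta(X)) > r\big).
\intertext{Taking the supremum over $\theta$ and then the infimum over $\hat\theta$,}
\mathcal{M} = \inf_{\hat\theta}\sup_{\theta\in\Theta}
  \mathbb{E}_{P_\theta}\big[L(\theta,\hat\theta(X))\big] &\ge r\,\delta .
\intertext{Letting $r \uparrow \mathcal{M}_-(\delta)$,}
\mathcal{M} &\ge \delta\,\mathcal{M}_-(\delta).
\end{align*}
```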

2. Information-Theoretic Lower Bounds for Minimax Quantiles

The sharpest minimax quantile lower bounds in classical and interactive statistical inference are obtained via high-probability extensions of the Le Cam and Fano methods (Ma et al., 2024, Bongole et al., 7 Oct 2025):

High-probability Le Cam (two-point) method:

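Let $d$ denote a pseudo-metric between parameters, and suppose the loss has the form $L(\theta, \hat\theta) = g(d(\theta, \hat\theta))$ with $g$ increasing. If two models $\theta_0, \theta_1 \in \Theta$ induce distributions $P_{\theta_0}, P_{\theta_1}$ whose total variation distance is bounded away from one by a margin depending on $\delta$, then for any estimator the worst-case probability of incurring a loss of order $g(d(\theta_0, \theta_1)/2)$ exceeds $\delta$, which lower-bounds the minimax quantile at level $\delta$. The precise total variation condition and constants are stated in (Ma et al., 2024).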
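As a concrete instance, a two-point argument can be evaluated in closed form for the mean of $n$ i.i.d. $N(\theta, 1)$ observations under absolute-error loss. The sketch below assumes the standard reduction (not necessarily the exact form used in the cited works): if two parameters are $2s$ apart and the total variation distance between their data distributions is below $1 - 2\delta$, then every estimator errs by at least $s$ with probability exceeding $\delta$ under one of them. The Gaussian total variation formula $2\Phi(\sqrt{n}\,|\theta_0 - \theta_1|/2) - 1$ is a standard fact; the function names and the `shrink` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal


def tv_gaussian_means(theta0, theta1, n):
    """Total variation distance between N(theta0, 1)^n and N(theta1, 1)^n."""
    return 2.0 * STD.cdf(sqrt(n) * abs(theta0 - theta1) / 2.0) - 1.0


def two_point_lower_bound(n, delta, shrink=0.999):
    """Half-separation s of a pair of means that cannot be told apart at level delta.

    TV < 1 - 2*delta holds exactly when sqrt(n) * sep / 2 < z_{1-delta}, so the
    separation is taken slightly below that threshold (factor `shrink`).
    """
    if not 0.0 < delta < 0.5:
        raise ValueError("delta must lie in (0, 1/2)")
    sep = shrink * 2.0 * STD.inv_cdf(1.0 - delta) / sqrt(n)
    if not tv_gaussian_means(0.0, sep, n) < 1.0 - 2.0 * delta:
        raise RuntimeError("total variation condition violated")
    return sep / 2.0


def sample_mean_quantile(n, delta):
    """Exact (1 - delta)-quantile of |Xbar - theta| for the sample mean."""
    return STD.inv_cdf(1.0 - delta / 2.0) / sqrt(n)


n, delta = 100, 0.05
lower = two_point_lower_bound(n, delta)   # lower bound on the quantile of any estimator
upper = sample_mean_quantile(n, delta)    # quantile achieved by the sample mean
print(f"two-point lower bound: {lower:.4f}, sample-mean quantile: {upper:.4f}")
```

In this toy model the lower bound and the quantile achieved by the sample mean both scale as $1/\sqrt{n}$ and differ only through the normal quantiles $z_{1-\delta}$ and $z_{1-\delta/2}$.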

High-probability Fano method:

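For a packing $\{\theta_1, \dots, \theta_M\} \subseteq \Theta$ whose elements are well separated in the pseudo-metric, if the information that the data carry about the index of the packing element is small relative to $\log M$, then the minimax quantile at level $\delta$ is bounded below by an increasing function of the packing separation, for all $\delta$ in a range determined by the information bound (Ma et al., 2024).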
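A simple high-probability statement of this type can be read off from the classical Fano inequality. The sketch below assumes a loss of the form $L = g \circ d$ with $g$ nondecreasing and $d$ a pseudo-metric, a packing with pairwise separation at least $2s$, and an index $J$ drawn uniformly from $\{1, \dots, M\}$ with $X \mid J = j \sim P_{\theta_j}$; it is a rearrangement of the classical inequality and is not claimed to match the sharper result of the cited work.

```latex
% Nearest-point test: \psi(X) \in \arg\min_j d(\hat\theta(X), \theta_j).
% By the triangle inequality and 2s-separation, d(\hat\theta, \theta_j) < s forces \psi = j.
\begin{align*}
\max_{j} P_{\theta_j}\big(d(\hat\theta(X), \theta_j) \ge s\big)
  &\ge \frac{1}{M}\sum_{j=1}^{M} P_{\theta_j}\big(\psi(X) \ne j\big)
  && \text{(max dominates average)} \\
  &\ge 1 - \frac{I(J; X) + \log 2}{\log M}
  && \text{(classical Fano inequality)} .
\intertext{Hence, whenever $I(J;X) + \log 2 < (1-\delta)\log M$, every estimator satisfies}
\sup_{\theta\in\Theta} P_\theta\big(L(\theta, \hat\theta(X)) \ge g(s)\big) &> \delta ,
\intertext{so no threshold $r < g(s)$ is admissible and}
\mathcal{M}_-(\delta) &\ge g(s).
\end{align*}
```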

These tools establish that nontrivial tail control requires matching the classical minimax risk lower bounds but now at specified risk levels. In the interactive framework, high-probability lower bounds extend to adaptive protocols and bandit-type regret for quantile-level guarantees (Bongole et al., 7 Oct 2025).

3. Minimax Quantiles in Specific Parametric and Nonparametric Models

Location-Scale Families

For i.i.d. $X_1, \dots, X_n$ from a location-scale family with density $\frac{1}{\sigma} f\left(\frac{x-\mu}{\sigma}\right)$, consider estimation of the $p$-quantile $\theta = \mu + \eta_p \sigma$, where $\eta_p$ is the $p$-quantile of $f$, under an invariant loss of the form

$L(\theta, \hat\theta) = \rho\left(\frac{\hat\theta - \theta}{\sigma}\right).$

The general result of Marchand and Strawderman shows that the minimum risk equivariant (MRE) estimator, which is Bayes with respect to the right-Haar prior, is minimax for both the unrestricted and many restricted parameter spaces (e.g., lower bounds on the quantile or on $\sigma$) (Marchand et al., 2012). In particular, for normal errors and squared-error loss, the MRE estimator, an equivariant rule of the form $\bar X + cS$ with $S$ the sample standard deviation and $c$ a constant fixed by $p$ and $n$, remains minimax with constant risk even under the constraints, and truncation yields no further minimax improvement.

Lévy Measures and Generalized Quantiles

For real-valued Lévy processes with Lévy measure $\nu$, the generalized quantiles $q_\tau^{+}$ and $q_\tau^{-}$, $\tau > 0$, solve $\nu((q_\tau^{+}, \infty)) = \tau$ and $\nu((-\infty, -q_\tau^{-})) = \tau$; i.e., the expected number of jumps larger than $q_\tau^{+}$ per unit time is $\tau$ (Trabs, 2014). Nonparametric estimators based on empirical characteristic functions and Fourier-based kernel smoothing attain minimax-optimal rates in both moderately and severely ill-posed settings, with rates given explicitly in terms of sample size, smoothness, and ill-posedness parameters (Trabs, 2014).
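The defining equation of a generalized quantile can be illustrated with a known Lévy measure. The sketch below solves "tail mass above $q$ equals $\tau$" by bisection for a hypothetical compound Poisson process with intensity 5 and exponential jump sizes of mean 2, for which the tail mass $5\,e^{-q/2}$ gives the closed form $q = 2\log(5/\tau)$. The model, parameter values and function names are assumptions for illustration only; the estimators of Trabs (2014) work from discrete observations of the process and are not reproduced here.

```python
import math


def upper_tail(q, intensity=5.0, scale=2.0):
    """Tail mass of the (assumed) Levy measure above q: expected jumps > q per unit time."""
    return intensity * math.exp(-q / scale)


def generalized_quantile(tail, tau, hi=1.0, tol=1e-10):
    """Solve tail(q) = tau for q > 0 by bisection; tail must be continuous and decreasing."""
    if not 0.0 < tau < tail(0.0):
        raise ValueError("tau must lie strictly between 0 and the total jump intensity")
    while tail(hi) > tau:          # expand the bracket until the tail drops below tau
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail(mid) > tau:
            lo = mid               # still too many large jumps: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau = 0.5
q_plus = generalized_quantile(upper_tail, tau)
closed_form = 2.0 * math.log(5.0 / tau)
print(f"bisection: {q_plus:.6f}, closed form: {closed_form:.6f}")
```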

High-dimensional, Nonparametric and Sparse Cases

For models with bounded variation or other structural constraints, such as trend filtering (Padilla et al., 2020) and $s$-sparse quantile regression (Chen et al., 2020), minimax quantile rates are governed by the sparsity level $s$, the ambient dimension, and the sample size $n$ for $s$-sparse linear functionals, and are of order $n^{-2/3}$ for bounded-variation signals, up to logarithmic factors. Regularization methods (e.g., penalized or constrained) achieve tail probability bounds that decay exponentially in the sample size and are explicit in sparsity and complexity (Belitser et al., 2022, Padilla et al., 2020).

Regularized empirical risk minimization with the pinball/check loss achieves minimax-optimal rates, of the classical nonparametric form $n^{-2\alpha/(2\alpha+d)}$ up to logarithmic factors, for conditional quantile estimation under $\alpha$-Hölder or Sobolev smoothness of the conditional quantile function in dimension $d$ (Steinwart et al., 2011, Padilla et al., 2020). Deep networks with ReLU activations similarly achieve these rates on broad nonparametric classes, even under minimal assumptions and heavy-tailed errors (Padilla et al., 2020).
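For readers unfamiliar with the pinball (check) loss, the minimal sketch below shows its defining property in the simplest possible case: minimizing the empirical pinball risk over a constant recovers an empirical quantile of the sample. This is the unregularized, covariate-free case only; it is not the kernel or network estimator analyzed in the cited works, and the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, and (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def empirical_pinball_risk(c, ys, tau):
    """Average pinball loss of the constant prediction c on the sample ys."""
    return sum(pinball_loss(y - c, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Minimize the empirical pinball risk over the sample points.

    The risk is piecewise linear and convex in c with kinks at the data,
    so a minimizer can always be found among the sample values.
    """
    return min(sorted(ys), key=lambda c: empirical_pinball_risk(c, ys, tau))


ys = [1.0, 2.0, 3.0, 4.0, 10.0]
print(best_constant(ys, 0.5))  # a sample median
print(best_constant(ys, 0.7))  # an empirical 0.7-quantile
```

The asymmetry of the loss is what makes the outlier at 10.0 harmless here: it shifts the minimizer no more than any other observation above the target quantile would.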

4. Minimax Quantile Risk in Linear Regression and Beyond

For linear regression $Y = \langle w^*, X\rangle + \varepsilon$ (with Gaussian or heavy-tailed errors) in dimension $d$ with $n$ samples and noise variance $\sigma^2$, the minimax quantile risk for squared error is attained by the empirical risk minimizer (OLS) when the noise is Gaussian and the design is well-behaved; it is of order $\sigma^2 (d + \log(1/\delta))/n$, and is matched (up to constants) by robust min–max procedures under weaker assumptions (Hanchi et al., 2024). For more general convex and symmetric error functions $e$ (including $p$-th power losses, $e(t) \propto |t|^p$), trimmed empirical min–max estimators yield explicit quantile risk bounds, provided the design satisfies moment and regularity assumptions (Hanchi et al., 2024). This extension relies on a sharp characterization of the quantiles of the smallest eigenvalue of the sample covariance matrix, together with tailored trimming to protect against outliers and heavy tails.
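The role of trimming under heavy tails can be seen in a much simpler one-dimensional analogue. The sketch below is not the trimmed min–max regression estimator of the cited work: it compares the high quantile of the estimation error of the plain sample mean with that of a symmetrically trimmed mean, for a location problem with symmetric noise of infinite variance. The noise distribution (a symmetrized Pareto with tail index 1.5), the 10% trimming fraction, and all names are assumptions for illustration.

```python
import math
import random
from statistics import fmean


def trimmed_mean(xs, trim_frac):
    """Mean after discarding the trim_frac smallest and largest observations."""
    ordered = sorted(xs)
    k = int(trim_frac * len(ordered))
    return fmean(ordered[k:len(ordered) - k])


def symmetric_pareto(rng, alpha):
    """Symmetric heavy-tailed draw centered at 0 (infinite variance when alpha < 2)."""
    magnitude = rng.paretovariate(alpha) - 1.0
    return magnitude if rng.random() < 0.5 else -magnitude


def error_quantile(errors, delta):
    """Smallest observed r with empirical P(error > r) at most delta."""
    ordered = sorted(errors)
    k = math.ceil((1 - delta) * len(ordered) - 1e-9)
    return ordered[max(k, 1) - 1]


rng = random.Random(1)
n, reps, delta, alpha = 100, 2000, 0.01, 1.5
mean_err, trim_err = [], []
for _ in range(reps):
    xs = [symmetric_pareto(rng, alpha) for _ in range(n)]
    mean_err.append(abs(fmean(xs)))              # true location is 0
    trim_err.append(abs(trimmed_mean(xs, 0.1)))

q_mean = error_quantile(mean_err, delta)
q_trim = error_quantile(trim_err, delta)
print(f"(1 - delta)-quantile of error: mean {q_mean:.3f}, trimmed mean {q_trim:.3f}")
```

Comparing error quantiles rather than mean squared errors is the point of the exercise: under infinite variance the expected squared error of the sample mean is not even finite, while its quantiles remain well defined.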

5. Applications, Extensions, and Implications

Treatment Choice and Minimax Quantile Regret

In finite-sample treatment choice, the minimax quantile regret problem is fully identified for a broad class of assignment rules and sampling designs. For binary treatment and a given target quantile level, all treatment rules are minimax under fixed design and random assignment, since the supremum regret takes the same value for every rule; unique solutions arise only when some baseline quantiles are known a priori and differ from a specific value (Guggenberger et al., 6 Jan 2026). This contrasts with expectation-based minimax rules, where data-dependent algorithms can strictly outperform naive rules.

Interactive and Safety-Critical Learning

Quantile-based minimax risk is fundamentally necessary for safety-critical learning protocols, such as reinforcement learning or stochastic MDPs, where rare maximum-regret events are catastrophic (Bongole et al., 7 Oct 2025). High-probability minimax lower bounds deliver risk-level explicit tradeoffs and a deeper understanding of the limits of learning under tail constraints. Key technical advances include Fano and Le Cam tail bounds, quantile-to-expectation conversion, and equivalence between strict and lower minimax quantiles.

Adaptive and Robust Estimation

Lepski’s method enables minimax-adaptive quantile estimation with rates that match lower bounds up to logarithmic factors under both mild and severe ill-posedness (Trabs, 2014). Penalized and constrained quantile estimators yield oracle-type adaptive rates under sparsity, and robust confidence sets with adaptive diameter can be constructed under quantile-based tail assumptions (Belitser et al., 2022).
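The selection step behind Lepski-type adaptation can be sketched generically. The code below implements one common variant of the rule, assuming a family of candidate estimates ordered from least to most smoothing, each with a known bound on its stochastic error that decreases along the family: it picks the most-smoothed candidate that agrees with every less-smoothed one up to a multiple of that candidate's error bound. The constant `c`, the toy numbers, and the function name are hypothetical, and this is not the specific procedure of Trabs (2014).

```python
def lepski_select(estimates, noise_levels, c=2.0):
    """Index of the most-smoothed candidate consistent with all less-smoothed ones.

    estimates[j]    : estimate from candidate j (j = 0 is the least smoothed).
    noise_levels[j] : bound on the stochastic error of candidate j (decreasing in j).
    Candidate j is accepted if |estimates[j] - estimates[i]| <= c * noise_levels[i]
    for every i < j; the largest accepted index is returned.
    """
    if not estimates or len(estimates) != len(noise_levels):
        raise ValueError("estimates and noise_levels must be non-empty and equally long")
    chosen = 0
    for j in range(len(estimates)):
        consistent = all(
            abs(estimates[j] - estimates[i]) <= c * noise_levels[i]
            for i in range(j)
        )
        if consistent:
            chosen = j
    return chosen


# Toy family: the last two candidates oversmooth and drift away from the others.
estimates = [1.02, 0.97, 1.01, 1.30, 1.80]
noise_levels = [0.20, 0.10, 0.05, 0.025, 0.0125]
print(lepski_select(estimates, noise_levels))
```

The rule never needs the unknown smoothness: a candidate is rejected as soon as its disagreement with a noisier but less biased candidate is too large to be explained by noise alone.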

6. Summary Table: Minimax Quantile Estimation Problems

Problem/Setting	Achievable Rate (up to logarithmic factors)	Reference
Location-scale quantile (parametric)	Constant (explicit closed form, normal case)	(Marchand et al., 2012)
Generalized Lévy quantile	Polynomial in $n$ (mildly ill-posed)	(Trabs, 2014)
Trend filtering, total variation	$n^{-2/3}$, or faster (fast regime, spline)	(Padilla et al., 2020)
High-dimensional sparse regression	Depends on sparsity $s$, dimension, and $n$	(Chen et al., 2020)
Nonparametric RKHS (pinball)	$n^{-2\alpha/(2\alpha+d)}$ (Hölder/Sobolev smoothness $\alpha$, dimension $d$)	(Steinwart et al., 2011)
Linear regression (square loss)	$\sigma^2 (d + \log(1/\delta))/n$ (dimension $d$, noise variance $\sigma^2$)	(Hanchi et al., 2024)
Minimax-regret (treatment choice)	Same supremum regret for all rules (worst-case)	(Guggenberger et al., 6 Jan 2026)

7. Practical Considerations and Open Directions

The minimax quantile paradigm is now recognized as providing a finer-grained and operationally meaningful standard for evaluating statistical procedures in the presence of heavy tails, adversarial contamination, high-dimensional noise, and safety constraints (Ma et al., 2024, Bongole et al., 7 Oct 2025). Modern advances enable its rigorous application in parametric, nonparametric, and interactive settings.

Ongoing developments include tight high-probability minimax lower bounds for local problems, quantile-performance measures for sequential decision making, and adaptivity with respect to unknown tail regularity. Open problems include full characterization of minimax quantiles in reinforcement learning, adaptive minimax quantile testing, and robust uncertainty quantification under complex data contamination regimes.