Confidence-Level Allocation (COLA)
- COLA is a framework that allocates a specified risk level (α) to balance judgment and statistical evidence, controlling miscoverage across procedures.
- In statistical decision theory, COLA integrates judgmental actions with data-driven updates to ensure optimality and tunable risk aversion.
- In conformal prediction, COLA optimizes the allocation of miscoverage over multiple nonconformity scores, reducing prediction set sizes while guaranteeing coverage.
Confidence-Level Allocation (COLA) refers to a class of decision rules and statistical frameworks in which a global or local confidence level parameter is allocated, either for the purposes of decision-making under uncertainty or for controlling statistical miscoverage across multiple procedures. COLA methodologies appear in two principal domains: (1) statistical decision theory, where the allocated level encodes statistical risk aversion when departing from a judgmental anchor, and (2) conformal prediction, where it governs the allocation of miscoverage across prediction sets induced by multiple nonconformity scores. Both classes of COLA strategies provide optimality and admissibility guarantees, with $\alpha$ having a direct interpretation as a measure of tolerated statistical error or uncertainty.
1. COLA in Statistical Decision Theory
In the context of statistical decision theory, the Confidence-Level Allocation (COLA) methodology formalizes how a decision-maker merges judgmental and statistical inputs. Consider a scalar parameter $\theta$, an estimate $\hat\theta$ from a Gaussian sample (with known $\sigma$), and a quadratic loss function governing the quality of an action $a$ (e.g., a portfolio weight). The decision-maker specifies a status-quo judgmental action $\tilde a$ and a confidence level $\alpha$. The COLA rule is as follows:
- If the data do not provide statistically significant evidence against $\tilde a$ at level $\alpha$, retain $\tilde a$.
- Otherwise, move from $\tilde a$ to the nearest endpoint of the $(1-\alpha)$ confidence interval implied by the optimality condition.
This rule admits an explicit form: for $\hat\theta$ the plug-in ("informed") optimum, the action is
$a^{*} = \hat\theta - \operatorname{sign}(\hat\theta - \tilde a)\,\min\bigl\{\lvert\hat\theta - \tilde a\rvert,\ z_{\alpha/2}\,\sigma\bigr\},$
where $z_{\alpha/2} = \Phi^{-1}(1-\alpha/2)$ and $\Phi$ is the standard normal CDF (Manganelli, 2019). For the special case $\tilde a = 0$, this reduces to soft-thresholding:
$a^{*} = \operatorname{sign}(\hat\theta)\,\max\bigl\{0,\ \lvert\hat\theta\rvert - z_{\alpha/2}\,\sigma\bigr\}.$
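As a minimal sketch, the rule above can be implemented directly. The function below assumes a scalar Gaussian setting with known standard deviation; the names (`theta_hat`, `a_judg`) and the two-sided critical value are illustrative choices, not taken verbatim from Manganelli (2019).

```python
import math
from statistics import NormalDist

def cola_action(theta_hat, sigma, a_judg, alpha):
    """COLA decision rule (sketch, scalar Gaussian case with known sigma).

    Retain the judgmental action a_judg unless the estimate theta_hat
    rejects it at level alpha; otherwise move to the nearest endpoint of
    the two-sided confidence interval, i.e. shrink theta_hat toward a_judg.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    gap = theta_hat - a_judg
    if abs(gap) <= z * sigma:                 # not significant: keep judgment
        return a_judg
    # significant: move to the CI endpoint nearest the judgmental action
    return theta_hat - math.copysign(z * sigma, gap)
```

For `a_judg = 0` this is exactly soft-thresholding of `theta_hat` at `z * sigma`; as `alpha` approaches 1 the critical value shrinks to zero and the rule returns the plug-in estimate, while as `alpha` approaches 0 it never leaves the judgmental action.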
2. Admissibility, Performance Guarantees, and Statistical Risk Aversion
COLA rules in decision theory are admissible in the sense of minimal risk: no other rule yields uniformly lower expected loss under the quadratic objective. The central guarantee is that, with probability at least $1-\alpha$, the data-driven action does not incur greater loss than $\tilde a$. This formalizes $\alpha$ as the maximum probability of performing strictly worse than the judgmental action (Manganelli, 2019).
The parameter $\alpha$ quantifies statistical risk aversion:
- As $\alpha \to 0$, the rule never abandons $\tilde a$ (maximum aversion).
- As $\alpha \to 1$, it always sets $a^{*} = \hat\theta$ (plug-in maximum likelihood, risk-neutral).
Urn-based elicitation experiments, modeled after Ellsberg, have been proposed to operationalize the choice of $\alpha$. Here, a participant's chosen "bet number" (count of adverse outcomes tolerated in repeated urn draws) translates into a value of $\alpha$ and thus codifies their statistical risk aversion.
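One simple, purely illustrative reading of this elicitation maps the tolerated adverse-outcome count directly to a frequency; the actual protocol in the literature may differ, so the function below is an assumption, not a specification.

```python
def implied_alpha(tolerated_adverse, total_draws):
    """Illustrative mapping from an urn-elicitation answer to alpha.

    Assumes alpha is read off as the tolerated frequency of adverse
    outcomes among the hypothetical repeated draws; the exact
    elicitation scheme is not reproduced here.
    """
    if not 0 <= tolerated_adverse <= total_draws:
        raise ValueError("bet number must lie in [0, total_draws]")
    return tolerated_adverse / total_draws
```

Under this reading, a participant tolerating 5 adverse outcomes in 100 draws would be assigned $\alpha = 0.05$.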
3. COLA in Conformal Prediction: Multi-Score Aggregation
In predictive inference, Confidence-Level Allocation (COLA) addresses the challenge of aggregating multiple conformal prediction sets induced by distinct nonconformity score functions $s_1, \dots, s_K$. Each score $s_k$ generates a split-conformal prediction set $\hat C_k(\cdot; \alpha_k)$ at nominal miscoverage $\alpha_k$, with the allocation vector $\vec\alpha = (\alpha_1, \dots, \alpha_K) \in \mathbb{R}^K$ satisfying $\sum_{k=1}^K \alpha_k = \alpha$.
Given $K$ candidate split-conformal sets, the COLA framework searches for the allocation of miscoverage that minimizes the expected (or empirical) size of the intersection set while preserving the overall marginal coverage guarantee via a union bound. The corresponding optimization is:
$\min_{\vec\alpha \in \Theta} \frac{1}{n} \sum_{i=1}^n \left| \bigcap_{k=1}^K \hat C_k(X_i; \alpha_k) \right|,\quad \text{s.t.}\ \alpha_k \geq 0,\ \sum_k \alpha_k = \alpha,$
where $\Theta$ is the simplex of feasible allocations.
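A minimal sketch of this search for $K = 2$ scores follows. Here `set_size` is a hypothetical user-supplied callable mapping the two conformal quantiles to an average intersection-set size, and the grid parameterization is illustrative rather than the paper's exact routine.

```python
import numpy as np

def split_quantile(scores, alpha_k):
    """Split-conformal quantile of calibration scores at miscoverage alpha_k."""
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha_k)) / n)
    return np.quantile(scores, level, method="higher")

def cola_e_allocation(cal_scores, alpha, set_size, grid=50):
    """Grid search over (alpha_1, alpha_2) with alpha_1 + alpha_2 = alpha,
    minimizing the average size of the intersection prediction set.

    cal_scores: two arrays of calibration nonconformity scores.
    set_size:   callable (q1, q2) -> average intersection-set size.
    Returns the best (size, alpha_1, alpha_2) found on the grid.
    """
    best = None
    for a1 in np.linspace(alpha / grid, alpha * (1 - 1 / grid), grid - 1):
        a2 = alpha - a1                       # stay on the simplex
        q1 = split_quantile(cal_scores[0], a1)
        q2 = split_quantile(cal_scores[1], a2)
        size = set_size(q1, q2)
        if best is None or size < best[0]:
            best = (size, a1, a2)
    return best
```

For symmetric regression intervals around a shared prediction, `set_size` could be as simple as `lambda q1, q2: 2 * min(q1, q2)`; in general it would average the intersection-set size over held-out points.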
4. COLA Algorithmic Variants: COLA-e, COLA-s, COLA-f, COLA-l
Distinct algorithmic instantiations of COLA provide performance-efficiency tradeoffs and adapt to application constraints (Xu et al., 15 Nov 2025):
- COLA-e (Empirical Allocation): Minimizes the empirical average prediction set size over the training set, achieving asymptotic marginal coverage with a coverage gap that vanishes as the sample size grows.
- COLA-s (Sample Splitting): Ensures finite-sample marginal coverage by dividing data into train/validation splits (fit the allocation on one split, deploy on the held-out split). Coverage is exact due to exchangeability; set sizes are modestly larger than COLA-e.
- COLA-f (Full Conformalization): Grants exact finite-sample coverage by re-computing allocations for every test point/label under augmented exchangeability. Computationally intensive, but minimizes conservatism relative to splitting.
- COLA-l (Local Allocation): Individualized, data-adaptive allocation based on kernel-weighted quantiles. Minimizes the local prediction set size at the test point; achieves asymptotic conditional coverage.
The underlying optimization remains piecewise-constant and combinatorial. Grid and stepwise search routines over the allocation simplex are used to identify near-optimal allocations; the computational cost grows with the grid resolution and the number of scores $K$, and can be reduced by exploiting sparsity in the allocation.
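To make the coverage accounting behind COLA-s concrete, here is a small self-contained simulation: the allocation is treated as already chosen on a first data split (fixed here for brevity), conformal quantiles are fit on a held-out split, and the empirical coverage of the intersection set is checked against the union bound. All names and the score distributions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conformal_q(scores, a):
    """Split-conformal quantile of held-out scores at miscoverage a."""
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - a)) / n)
    return np.quantile(scores, level, method="higher")

alpha, K = 0.1, 2
alloc = np.array([0.06, 0.04])  # assume this allocation was chosen on split 1

# Held-out calibration split: one score array per nonconformity function.
hold_scores = [np.abs(rng.normal(size=500)) for _ in range(K)]
qs = np.array([conformal_q(s, a) for s, a in zip(hold_scores, alloc)])

# A test point lies in the intersection set iff every score falls below its
# quantile; the union bound gives marginal coverage >= 1 - sum(alloc) = 1 - alpha.
test = np.abs(rng.normal(size=(10000, K)))
covered = np.all(test <= qs, axis=1).mean()
```

Because the quantiles are computed on data exchangeable with the test points, each per-score event misses with probability at most $\alpha_k$, so the empirical `covered` concentrates at or above $1 - \alpha$.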
5. Theoretical Guarantees and Efficiency
Theoretical performance of COLA variants is governed by the properties of empirical quantiles and the Bonferroni union bound. COLA-s and COLA-f yield exact finite-sample marginal coverage; COLA-e attains asymptotic coverage with explicit convergence rates. Efficiency (measured by set size) is bounded above by terms involving the Lipschitz constants of empirical quantile functions, with excess size over oracle allocation vanishing as sample size increases.
COLA-l's conditional validity result demonstrates that under appropriate kernel smoothness conditions, its local prediction set attains asymptotic conditional coverage at the specified level, modulo rates depending on kernel bandwidth and support overlap.
6. Empirical Benchmarks and Applications
Empirical evaluations utilize both synthetic and real-world regression datasets (e.g., UCI BlogFeedback, Concrete, and Superconductivity) to benchmark COLA against baselines such as EFCP/VFCP (single-score selection), majority vote, score/model-level aggregation, and SAT (p-value merging) (Xu et al., 15 Nov 2025). Across scenarios, COLA-e achieves the smallest average set sizes for moderate to large samples, while COLA-s and COLA-f guarantee exact coverage at a limited cost in set size. COLA-l adapts set sizes to local complexity, yielding the smallest sets where the conformity scores agree or the data density is high.
In the statistical decision-theoretic setting, applications center on mean–variance portfolio allocation, where COLA offers tunable risk guarantees relative to a judgmental "cash only" baseline. For low $\alpha$, portfolios remain in cash; for high $\alpha$, they aggressively chase mean estimates (with empirical evidence of higher drawdowns in adverse regimes) (Manganelli, 2019).
7. Practical Recommendations and Extensions
Choice among COLA variants depends on validity-efficiency-computation tradeoffs:
- For exact finite-sample validity with manageable computation, use COLA-s.
- For minimal set size with a large sample and acceptable approximate coverage, COLA-e is preferred.
- For individualized coverage demands and heterogeneity, COLA-l leverages data-adaptive kernel weighting.
- For scenarios tolerating high computation to minimize conservatism, COLA-f is applicable.
Hyperparameters (e.g., the train/calibration split proportion, and the kernel and bandwidth in COLA-l) require cross-validation or plug-in tuning. Extensions under discussion include regularized allocation to enforce structure (e.g., sparsity), end-to-end optimization of models and $\alpha$-allocation in conformal pipelines, and utility-driven allocations that trade off set size against downstream decision value (Xu et al., 15 Nov 2025).