Group-Conditional Conformal Bands

Updated 21 November 2025
  • Group-conditional conformal bands are predictive intervals constructed via conformal methods to guarantee prescribed coverage uniformly within each subpopulation.
  • They utilize methodologies such as split-conformal calibration, quantile regression, and density estimation to adapt to differing noise levels and distributional properties among groups.
  • Their application in fairness-sensitive and high-stakes domains provides robust, stratified uncertainty quantification while balancing calibration accuracy and computational efficiency.

Group-conditional conformal bands are predictive intervals or sets constructed by conformal prediction techniques to guarantee prescribed coverage $1-\alpha$ uniformly within each group of interest, rather than only overall or marginally. This approach is crucial when subpopulations differ in noise, distributional properties, or risk levels, a situation common in fairness-sensitive, scientific, and high-stakes applications. In contrast to classical conformal methods, which ensure only marginal coverage, group-conditional techniques explicitly control the probability that the true outcome falls within the predicted set for every group, where groups may be indexed by auxiliary variables, covariates, or post hoc clusters. This article synthesizes recent developments in the theory, methodology, and practical deployment of group-conditional conformal bands, based on research such as (Duchi, 28 Feb 2025, Izbicki et al., 2019, Melki et al., 2023, Bairaktari et al., 24 Feb 2025, Tassopoulou et al., 17 Nov 2025, Plassier et al., 1 Jul 2024), and (Kaur et al., 17 Jan 2025).

1. Formal Definition and Theoretical Guarantees

For a covariate space $\mathcal X$, consider a finite or infinite partition into $K$ groups:

$$\mathcal G = \{\,g_1, g_2,\dots,g_K\},\quad g_k \subseteq \mathcal X,\quad g_k\cap g_{k'}=\emptyset\ (k\neq k'),\quad \bigcup_{k=1}^K g_k=\mathcal X.$$

A prediction set function $C:\mathcal X\rightrightarrows\mathcal Y$ satisfies group-conditional $(1-\alpha)$ coverage if, for all $g\in\mathcal G$,

$$P(Y\in C(X)\mid X\in g)\;\ge\;1-\alpha$$

or equivalently,

$$P(Y\not\in C(X),\,X\in g) \le \alpha\,P(X\in g).$$

This guarantee can be formulated for groups defined by arbitrary covariate partitions, demographic information, or even overlapping/fuzzy memberships via weights $w_G(x,y)\in[0,1]$ (Bairaktari et al., 24 Feb 2025).
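The coverage condition above can be checked empirically on held-out data. The following is a minimal sketch, assuming interval-valued sets $C(x)=[\text{lower},\text{upper}]$ and hard group labels; the function name `group_coverage` and the toy data are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def group_coverage(y, lower, upper, groups):
    """Empirical estimate of P(Y in C(X) | X in g) for interval sets."""
    y, lower, upper, groups = map(np.asarray, (y, lower, upper, groups))
    covered = (lower <= y) & (y <= upper)
    # .tolist() turns NumPy scalars into plain Python keys.
    return {g: float(covered[groups == g].mean())
            for g in np.unique(groups).tolist()}

# Toy data (assumed): group "b" is three times noisier than group "a",
# so one fixed-width band covers the two groups very differently.
rng = np.random.default_rng(0)
groups = np.repeat(["a", "b"], 5000)
sigma = np.where(groups == "a", 1.0, 3.0)
y = rng.normal(0.0, sigma)
half_width = np.full(y.shape, 2.0)
cov = group_coverage(y, -half_width, half_width, groups)
print(cov)
```

In this toy setting the shared band overcovers the low-noise group and undercovers the high-noise group, which is the failure mode that group-conditional calibration targets.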

While exact finite-sample conditional coverage at the individual level is statistically infeasible without strong assumptions (Izbicki et al., 2019), group-conditional coverage is attainable and can be made arbitrarily close to exact as group sample sizes grow, with error converging as $O(1/\sqrt{n_g})$ (Duchi, 28 Feb 2025, Bairaktari et al., 24 Feb 2025). For practical finite samples with $n_g$ calibration points per group, non-asymptotic bounds of the form

$$\bigl|P(Y\in C_g(X)\mid X\in g) - (1-\alpha)\bigr| \le O\Bigl(\sqrt{\frac{\log(K/\delta)}{n_g}}\Bigr)$$

hold with high probability, at least $1-\delta$ (Duchi, 28 Feb 2025, Bairaktari et al., 24 Feb 2025).
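To get a feel for how this rate scales, the term inside the $O(\cdot)$ can be evaluated directly. This is only a sketch: the constant hidden by the $O(\cdot)$ is not given in the source, so the numbers below show relative scaling, not actual coverage error. The function name `rate_term` is an assumption.

```python
import math

def rate_term(num_groups, delta, n_g):
    """sqrt(log(K / delta) / n_g); the unspecified constant factor is omitted."""
    return math.sqrt(math.log(num_groups / delta) / n_g)

# K = 10 groups, delta = 0.05: quadrupling n_g halves the rate term.
r_500 = rate_term(10, 0.05, 500)
r_2000 = rate_term(10, 0.05, 2000)
print(r_500, r_2000)
```

The dependence on the number of groups $K$ is only logarithmic, so the per-group sample size $n_g$ dominates the behaviour of the bound.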

2. Construction Methodologies for Group-Conditional Bands

Split-Conformal Group Quantiles

One canonical approach calibrates prediction bands separately within each group. After fitting a base predictor $\mu:\mathcal X\to\mathbb R$, a held-out calibration set is partitioned by group. For each $g$, collect residual scores $S(X_i,Y_i)$ (usually $|Y_i-\mu(X_i)|$), then compute the empirical quantile $\hat q_g(\alpha)$ at level $1-\alpha$:

$$\hat q_g(\alpha) := \text{empirical } (1-\alpha)\text{-quantile of}\ \{S(X_i,Y_i)\mid X_i\in g\}.$$

The prediction band for $x\in g$ is then $C_g(x) = \{y: S(x,y)\leq \hat q_g(\alpha)\}$, which achieves coverage up to an $O(1/\sqrt{n_g})$ inflation (Duchi, 28 Feb 2025, Tassopoulou et al., 17 Nov 2025).
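The per-group calibration step can be sketched as follows. This is a minimal illustration under stated assumptions: absolute-residual scores, a hypothetical predictor $\mu(x)=0$, and the standard split-conformal rank $\lceil (n_g+1)(1-\alpha)\rceil$, a common finite-sample correction that the text above does not spell out. All function names are illustrative.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """ceil((n+1)(1-alpha))-th smallest score; +inf if the group is too small."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(scores[k - 1]) if k <= n else float("inf")

def fit_group_thresholds(scores, groups, alpha):
    """One threshold q_g per group, computed from that group's scores only."""
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    return {g: conformal_quantile(scores[groups == g], alpha)
            for g in np.unique(groups).tolist()}

def predict_band(mu_x, group, thresholds):
    """C_g(x) = [mu(x) - q_g, mu(x) + q_g] for absolute-residual scores."""
    q = thresholds[group]
    return mu_x - q, mu_x + q

# Toy check (assumed data): mu(x) = 0, noise sd 1 in group "a" and 3 in "b".
rng = np.random.default_rng(0)

def sample(n):
    groups = rng.choice(["a", "b"], size=n)
    y = rng.normal(0.0, np.where(groups == "a", 1.0, 3.0))
    return groups, y

g_cal, y_cal = sample(4000)
thresholds = fit_group_thresholds(np.abs(y_cal - 0.0), g_cal, alpha=0.1)
g_test, y_test = sample(10000)
q_test = np.array([thresholds[g] for g in g_test.tolist()])
covered = np.abs(y_test - 0.0) <= q_test
coverage = {g: float(covered[g_test == g].mean()) for g in thresholds}
print(thresholds, coverage)
```

Each group receives its own band width, so the noisier group gets a wider band instead of being undercovered by a pooled threshold.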

Quantile Regression Calibration

An alternative, especially when group structure is high-dimensional or continuous, is to regress quantile thresholds on group indicators or features. For each group $g$, estimate the $(1-\alpha)$-quantile of the residual scores $S(X_i,Y_i)$ (for example $|Y_i-\mu(X_i)|$) via a pinball loss regression at level $1-\alpha$:

$$\hat q_g(\alpha) = \arg\min_q \sum_{i:X_i\in g} \rho_{1-\alpha}\bigl(S(X_i,Y_i)-q\bigr),\quad \rho_{1-\alpha}(u)=u\,\bigl((1-\alpha)-1\{u<0\}\bigr),$$

or, for classification, encode group features and solve a quantile regression for the conformity scores (Melki et al., 2023). This yields simultaneous, adaptive, and regularized group thresholds.
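For a single group with no features, minimising the pinball loss reduces to picking an empirical quantile, which the sketch below illustrates. It is not the regression procedure of Melki et al.: it uses a brute-force search over sample points (a minimiser of the summed pinball loss always lies at one), and the function names are assumptions.

```python
import numpy as np

def pinball_loss(u, level):
    """rho_level(u) = u * (level - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (level - (u < 0))

def pinball_quantile(scores, level):
    """Minimise sum_i rho_level(s_i - q) over q restricted to the sample points."""
    scores = np.asarray(scores, dtype=float)
    candidates = np.unique(scores)
    losses = [pinball_loss(scores - q, level).sum() for q in candidates]
    return float(candidates[int(np.argmin(losses))])

# Toy scores (assumed): absolute residuals of standard normal noise.
rng = np.random.default_rng(0)
scores = np.abs(rng.normal(size=500))
q_hat = pinball_quantile(scores, level=0.9)   # level = 1 - alpha with alpha = 0.1
frac_below = float(np.mean(scores <= q_hat))
print(q_hat, frac_below)
```

With group features, the scalar $q$ would be replaced by a parametric function of those features and the same loss minimised by a quantile regression solver.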

Density Estimator-based (CD-split) Bands

Another principled method partitions the feature space via unsupervised clustering and constructs group-specific bands using conditional density estimators:

  • Fit a conditional density estimate $\hat f(y\mid x)$ on a training split, kept separate from the data used for calibration;
  • Cluster covariate space into cells via “profile-distance” metrics reflecting the shape of $\hat f$ (Izbicki et al., 2019);
  • For each cell (group), calibrate a local threshold, thus defining the group’s predictive set as a density level set.
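The steps above can be sketched in simplified form. The sketch rests on several assumptions that are not from the source: the density estimate `f_hat` is a hypothetical closed-form Gaussian, the profile-distance clustering is replaced by a plain two-cell split on $|x|$, and the threshold uses a conservative rank $\lfloor (n+1)\alpha \rfloor$. It is not the CD-split implementation of Izbicki et al.

```python
import numpy as np

def f_hat(y, x):
    """Hypothetical conditional density estimate: N(0, (0.5 + |x|)^2)."""
    s = 0.5 + np.abs(x)
    return np.exp(-0.5 * (y / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def cell_of(x):
    """Stand-in for profile-distance clustering: two cells split at |x| = 1."""
    return (np.abs(x) > 1.0).astype(int)

def calibrate_cells(x_cal, y_cal, alpha):
    """Per cell, a low quantile of the calibration densities f_hat(Y_i | X_i)."""
    dens, cells = f_hat(y_cal, x_cal), cell_of(x_cal)
    thresholds = {}
    for c in np.unique(cells).tolist():
        d = np.sort(dens[cells == c])
        k = int(np.floor((d.size + 1) * alpha))  # conservative rank
        thresholds[c] = float(d[k - 1]) if k >= 1 else 0.0
    return thresholds

def density_level_set(x, thresholds, y_grid):
    """Grid points y whose estimated density at x clears the cell threshold."""
    t = thresholds[int(cell_of(np.asarray(x)))]
    return y_grid[f_hat(y_grid, x) >= t]

# Toy data (assumed) drawn from the same model as f_hat.
rng = np.random.default_rng(0)

def sample(n):
    x = rng.uniform(-2.0, 2.0, size=n)
    return x, rng.normal(0.0, 0.5 + np.abs(x))

x_cal, y_cal = sample(4000)
thresholds = calibrate_cells(x_cal, y_cal, alpha=0.1)
x_test, y_test = sample(4000)
cells_test = cell_of(x_test)
t_test = np.array([thresholds[c] for c in cells_test.tolist()])
covered = f_hat(y_test, x_test) >= t_test
cell_coverage = {c: float(covered[cells_test == c].mean()) for c in thresholds}
band_at_zero = density_level_set(0.0, thresholds, np.linspace(-6.0, 6.0, 1201))
print(cell_coverage, band_at_zero.min(), band_at_zero.max())
```

Because the set is a density level set and not an interval around a point prediction, it can be a union of disjoint regions when the estimated density is multimodal.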

Weighted and Overlapping Groups (Kandinsky Framework)

Recent extensions, inspired by Kandinsky’s art, allow for overlapping and fractional group memberships. Each group $G$ is characterized by a weight function $w_G(x,y)$, and thresholds are estimated either pessimistically (minimum over relevant groups) or via weighted aggregation:

$$\hat\tau(x,y) = \min_{G:\,w_G(x,y)>0}\tau_G\quad \text{or}\quad \hat\tau(x,y)=\sum_G\lambda_G(x,y)\,\tau_G,$$

where $\lambda_G(x,y)\propto w_G(x,y)$ (Bairaktari et al., 24 Feb 2025). Quantile regression in a low-dimensional basis of group weights yields minimax-optimal bounds on coverage deviation for arbitrary groupings.
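The two aggregation rules can be written out for a single point $(x,y)$. This is a minimal sketch that assumes the per-group thresholds $\tau_G$ are already estimated and that $\lambda_G$ is obtained by normalising the weights to sum to one; the function names are illustrative and not from the cited paper.

```python
import numpy as np

def pessimistic_threshold(weights, taus):
    """min of tau_G over the groups with w_G(x, y) > 0."""
    weights, taus = np.asarray(weights, float), np.asarray(taus, float)
    active = weights > 0
    if not active.any():
        raise ValueError("the point has zero weight in every group")
    return float(taus[active].min())

def weighted_threshold(weights, taus):
    """sum_G lambda_G * tau_G with lambda_G = w_G / sum of all weights."""
    weights, taus = np.asarray(weights, float), np.asarray(taus, float)
    total = weights.sum()
    if total <= 0:
        raise ValueError("the point has zero weight in every group")
    return float(np.dot(weights / total, taus))

# A point with fractional membership in groups 0 and 2, none in group 1.
w = [0.75, 0.0, 0.25]
taus = [1.0, 0.2, 3.0]
t_min = pessimistic_threshold(w, taus)
t_avg = weighted_threshold(w, taus)
print(t_min, t_avg)
```

Group 1 has the smallest threshold but zero weight at this point, so it influences neither rule.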

3. Empirical Performance and Practical Tradeoffs

Group-conditional conformal bands have been empirically validated across regression, classification, and time-series biomarker prediction (Tassopoulou et al., 17 Nov 2025, Melki et al., 2023, Izbicki et al., 2019, Kaur et al., 17 Jan 2025). The following table summarizes empirical performance highlights found in the literature:

| Method/Context | Coverage Gap (per group) | Band Width |
|---|---|---|
| Population-level conformal | Often undercovers (esp. small/minority groups) | Minimal |
| Mondrian/group-conditional | $O(1/\sqrt{n_g})$ error; achieves target coverage uniformly | Slightly wider in small groups |
| Quantile-regression calibrated | Near-exact $1-\alpha$ coverage, better adaptation; sometimes smaller sets | Adaptive |
| CD-split (density-based) | Exact finite-sample in each cell; asymptotically optimal region size | Flexible |

  • In clinical biomarker prediction, group-conditional bands restored desired coverage (e.g., $90\%$) in all subgroups while maintaining practical band width (Tassopoulou et al., 17 Nov 2025).
  • In classification, quantile regression over group indicators yields set sizes that can be smaller than marginal APS while ensuring stable coverage per group, even for numerous or rare groups (Melki et al., 2023).
  • CD-split partitions yield exact finite-sample coverage per cell, with smaller regions in multimodal or skewed regimes (Izbicki et al., 2019).

For small groups (low $n_g$), width inflation from quantile estimation error can be mitigated by regularization, sharing thresholds, or merging groups (Duchi, 28 Feb 2025, Bairaktari et al., 24 Feb 2025).

4. Extensions: Approximate and Model-Based Conditional Validity

When interest extends to groups defined by continuous covariates or when model-based estimates for $P_{Y|X}$ are available, frameworks such as Probabilistic Conformal Prediction (G-CP) enable approximate group-conditional validity. If $\widehat P_{Y|X}$ is a good estimator, coverage within groups satisfies

$$\bigl|P(Y\in C(X)\mid G=g)-(1-\alpha)\bigr| \le \epsilon_{\rm TV} + O(1/\sqrt{n_g}),$$

where $\epsilon_{\rm TV}$ is the maximum total variation discrepancy between the true conditional and the model estimate (Plassier et al., 1 Jul 2024). The calibration involves sampling, score computation, and quantile thresholds per group, leading to robust coverage even under heteroskedastic or non-Gaussian data.

Furthermore, groupings can be induced adaptively via clustering and “profile” metrics or stratified by model confidence and trust scores, leading to coverage stratification across arbitrary subpopulations (Kaur et al., 17 Jan 2025).

5. Computational Aspects and Practical Considerations

Group-conditional split conformal calibration is computationally efficient: each group's threshold requires one quantile computation over its $n_g$ calibration scores, which costs $O(n_g\log n_g)$ with sorting or $O(n_g)$ with a selection algorithm. Full conformal or density-based approaches may incur additional cost for clustering, density estimation, or solving quadratic programs at test time (Duchi, 28 Feb 2025, Izbicki et al., 2019).

Key practical considerations:

  • Sample size per group: Sufficient calibration points ($n_g \gg 1/\alpha$) ensure tight coverage; small $n_g$ requires merging or smoothing (Duchi, 28 Feb 2025, Melki et al., 2023).
  • Choice of conformity score: Residuals, softmax tail sums, density estimators, and maximal deviations are all employed depending on context.
  • Regularization: Quantile regression solvers benefit from $\ell_1$ or $\ell_2$ penalties in high-dimensional group-feature settings (Melki et al., 2023, Bairaktari et al., 24 Feb 2025).
  • Extension to continuous or hierarchical groups: Encodings via feature maps, polynomial approximations, or hierarchical clustering facilitate broader applicability (Bairaktari et al., 24 Feb 2025, Izbicki et al., 2019).
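The sample-size consideration above can be handled with a simple fallback rule, sketched here under stated assumptions: groups with fewer than `min_n` calibration points reuse the pooled threshold, which is one crude form of threshold sharing. The cutoff `min_n`, the function name, and the toy data are all hypothetical. Groups that fall back to the pooled threshold no longer have their own per-group guarantee.

```python
import numpy as np

def thresholds_with_fallback(scores, groups, alpha, min_n):
    """Per-group empirical (1 - alpha) quantile; small groups share the pooled one."""
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    pooled = float(np.quantile(scores, 1 - alpha))
    out = {}
    for g in np.unique(groups).tolist():
        s = scores[groups == g]
        # Too few points for a stable quantile: reuse the pooled threshold.
        out[g] = float(np.quantile(s, 1 - alpha)) if s.size >= min_n else pooled
    return out

# Toy data (assumed): two large groups and one group with only 5 points.
rng = np.random.default_rng(0)
groups = np.array(["a"] * 1000 + ["b"] * 1000 + ["c"] * 5)
scale = np.where(groups == "b", 3.0, 1.0)
scores = np.abs(rng.normal(0.0, scale))
thresholds = thresholds_with_fallback(scores, groups, alpha=0.1, min_n=100)
pooled = float(np.quantile(scores, 1 - 0.1))
print(thresholds)
```

A softer variant would shrink each small group's quantile toward the pooled value, trading a little bias for lower variance.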

6. Application Domains and Fairness Motivations

Group-conditional conformal bands are motivated by practical needs for stratified risk control, fairness, and hidden stratification in settings such as:

In fairness-centric analyses, population-level conformal bands are often insufficient: coverage in minority or high-risk subgroups can fall far below the nominal target, leading to disparate risk and eroded trust (Tassopoulou et al., 17 Nov 2025, Bairaktari et al., 24 Feb 2025).

7. Limitations and Future Directions

Current approaches are limited by

Ongoing research investigates:

Group-conditional conformal bands are a robust advancement in predictive uncertainty quantification, particularly relevant for equitable inference and reliable decision-making in stratified populations. They integrate statistical learning, nonparametric calibration, and fairness-driven methodology, offering rigorous tools for practical deployment across scientific and engineering contexts (Duchi, 28 Feb 2025, Tassopoulou et al., 17 Nov 2025, Bairaktari et al., 24 Feb 2025, Melki et al., 2023, Izbicki et al., 2019, Plassier et al., 1 Jul 2024, Kaur et al., 17 Jan 2025).
