Supremum Integral Probability Metric
- supIPM is a distributional fairness measure that quantifies worst-case discrepancies between subgroup distributions using a supremum over IPMs.
- The DRAF algorithm provides a scalable surrogate by using a doubly regressing R² statistic and Fisher z-transformation to upper-bound the fairness gap.
- The framework offers rigorous theoretical guarantees and effective trade-offs between fairness and accuracy in high-dimensional intersectional contexts.
The supremum Integral Probability Metric (supIPM) is a distributional fairness measure and theoretical tool formalizing the worst-case discrepancy between induced distributions of an algorithm—often with respect to subgroups distinguished by combinations of sensitive attributes—using an IPM evaluated over a class of discriminators. The supIPM is defined as the largest (supremum) value taken by an IPM between distributions corresponding to subgroups or subgroup-unions (“subgroup-subsets”), providing a principled, distribution-based approach to measuring and enforcing subgroup fairness under extensive intersectionality.
1. Formulation of supIPM in Subgroup Fairness
The supIPM arises as the following fairness divergence:

$$
\mathrm{supIPM}(\mathcal{A}, \mathcal{F}) \;=\; \sup_{A \in \mathcal{A}} \mathrm{IPM}_{\mathcal{F}}\bigl(P_{A},\, P_{A^{c}}\bigr) \;=\; \sup_{A \in \mathcal{A}} \; \sup_{f \in \mathcal{F}} \Bigl| \mathbb{E}_{P_{A}}[f(\hat{Y})] - \mathbb{E}_{P_{A^{c}}}[f(\hat{Y})] \Bigr|,
$$

where:
- $\mathcal{A}$ is a family of subgroup-subsets (i.e., sets composed from unions of intersectional subgroups determined by multiple sensitive attribute values),
- $P_{A}$ is the conditional distribution of predictor outputs $\hat{Y}$ given sensitive attributes in $A$,
- $P_{A^{c}}$ is the analogous distribution for the complement $A^{c}$,
- $\mathcal{F}$ is a class of discriminator functions (e.g., neural networks, Lipschitz, or parametric sigmoid-based functions),
- The inner IPM assesses the distributional distance between each subgroup-subset and its complement, and the outer supremum takes the worst-case such gap across all elements of $\mathcal{A}$ (Lee et al., 24 Oct 2025).
This definition generalizes previous mean-based fairness notions to the distributional level and captures marginal, intersectional, and richer subgroup fairness regimes depending on the choice of $\mathcal{A}$.
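To ground the definition, the following minimal sketch estimates an empirical supIPM by brute force, assuming a small, hand-picked finite discriminator family and explicit enumeration of every subgroup-subset; the toy data, function names, and discriminator choices are illustrative assumptions, not the construction of (Lee et al., 24 Oct 2025).

```python
import itertools
from functools import partial

import numpy as np

def empirical_ipm(scores_a, scores_ac, discriminators):
    """Empirical IPM over a finite discriminator family:
    max_f |mean f(scores in A) - mean f(scores in A^c)|."""
    return max(abs(f(scores_a).mean() - f(scores_ac).mean()) for f in discriminators)

def empirical_supipm(scores, subgroup_ids, discriminators):
    """Worst-case IPM over all proper, non-empty unions of intersectional subgroups."""
    groups = np.unique(subgroup_ids)
    worst = 0.0
    for r in range(1, len(groups)):                       # proper subsets only
        for subset in itertools.combinations(groups, r):  # each subgroup-subset A
            in_a = np.isin(subgroup_ids, subset)
            worst = max(worst, empirical_ipm(scores[in_a], scores[~in_a], discriminators))
    return worst

def shifted_sigmoid(y, t):
    return 1.0 / (1.0 + np.exp(-(y - t)))

# Illustrative discriminator family: identity plus a few fixed sigmoids.
discriminators = [lambda y: y] + [partial(shifted_sigmoid, t=t) for t in (0.3, 0.5, 0.7)]

rng = np.random.default_rng(0)
scores = rng.uniform(size=200)               # stand-in predictor outputs
subgroup_ids = rng.integers(0, 4, size=200)  # four intersectional subgroups
print(empirical_supipm(scores, subgroup_ids, discriminators))
```

Even in this toy setting, the nested loop over subgroup-subsets foreshadows the scalability issue addressed next.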
2. Computational Scalability and the DRAF Algorithm
The direct implementation of supIPM can be computationally prohibitive because the number of subgroup-subsets grows exponentially as the number of sensitive attributes increases. Each evaluation of supIPM requires computing an IPM for every $A \in \mathcal{A}$, and $|\mathcal{A}|$ can far exceed the number of possible attribute combinations, making naive computation infeasible when many sensitive attributes are present or when individual subgroups are sparsely populated.
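As a rough back-of-the-envelope illustration (assuming, purely for counting, that every sensitive attribute is binary), the number of candidate subgroup-subsets grows doubly exponentially in the number of attributes:

```python
# Count intersectional subgroups and candidate subgroup-subsets for q binary
# sensitive attributes (binary attributes assumed here only for illustration).
for q in range(1, 6):
    n_subgroups = 2 ** q              # intersectional cells
    n_subsets = 2 ** n_subgroups - 2  # proper, non-empty unions of cells
    print(f"q={q}: {n_subgroups} subgroups, {n_subsets} subgroup-subsets")
# q=5 already yields 32 subgroups and over four billion candidate subsets.
```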
To address this, the Doubly Regressing Adversarial learning for Fairness (DRAF) algorithm introduces a surrogate fairness gap that is an explicit upper bound on supIPM, but can be optimized using a single adversary and weight vector:
- For each observation, define a membership vector whose entries indicate its membership in each subgroup-subset $A \in \mathcal{A}$.
- Introduce a weight vector constrained to the unit sphere (of dimension matching the membership vector) to parameterize combinations of subgroup dependencies.
- Define the "doubly regressing" R² statistic, which quantifies the joint dependence between the adversary's output and the weighted subgroup-membership signal, and apply a Fisher z-transformation to it for numerical stability; the explicit formulas are given in (Lee et al., 24 Oct 2025).
This surrogate fairness gap is provably an upper bound on supIPM (Lee et al., 24 Oct 2025, Theorem 2), enabling adversarial optimization without explicit enumeration over all subgroup-subsets.
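The exact doubly regressing statistic and its Fisher z-transformed version are defined in (Lee et al., 24 Oct 2025); the sketch below is only a schematic stand-in showing the overall shape of the computation: one adversary score, one weight vector over subgroup-subset memberships, a correlation-based statistic, and a Fisher z-transformation. The correlation-based R² used here is an assumed simplification, and the names (surrogate_gap, fisher_z, gamma) are hypothetical.

```python
import numpy as np

def fisher_z(r, eps=1e-6):
    """Fisher z-transformation of a correlation-type statistic."""
    return np.arctanh(np.clip(r, -1 + eps, 1 - eps))

def surrogate_gap(adv_scores, membership, gamma):
    """Schematic surrogate fairness gap (NOT the paper's exact statistic).

    adv_scores : (n,) adversary outputs evaluated on the predictor's scores
    membership : (n, K) binary indicators over K subgroup-subsets
    gamma      : (K,) weight vector, normalized onto the unit sphere
    """
    gamma = gamma / np.linalg.norm(gamma)
    combined = membership @ gamma                # weighted subgroup-membership signal
    r = np.corrcoef(adv_scores, combined)[0, 1]  # r**2 is the R^2 of a simple regression
    return fisher_z(abs(r))                      # Fisher z for numerical stability

# Toy usage with random placeholders.
rng = np.random.default_rng(1)
n, K = 500, 6
membership = rng.integers(0, 2, size=(n, K)).astype(float)
adv_scores = rng.normal(size=n)
gamma = rng.normal(size=K)
print(surrogate_gap(adv_scores, membership, gamma))
```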
3. Theoretical Guarantees and Properties
DRAF’s surrogate fairness gap has key theoretical properties:
- It upper-bounds the original supIPM fairness gap, ensuring that reducing the surrogate also reduces the true worst-case distributional gap.
- The approach allows optimization via a single adversary (drawn from the discriminator class $\mathcal{F}$) and a single weight vector, regardless of the size of $\mathcal{A}$.
- Theoretical results establish that, for a chosen discriminator class $\mathcal{F}$, the difference in prediction distributions across any $A \in \mathcal{A}$ is captured via a modified R² regression fit, reducing the computational burden (Lee et al., 24 Oct 2025).
If the discriminator class $\mathcal{F}$ is rich enough, supIPM characterizes distributional fairness precisely; with a simpler discriminator class, it provides guaranteed control over particular statistical disparities.
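The qualitative claim above can be anchored by two elementary monotonicity facts about IPMs; these are standard properties stated here for orientation, not results quoted from (Lee et al., 24 Oct 2025):

$$
\mathcal{F}_{1} \subseteq \mathcal{F}_{2} \;\Longrightarrow\; \sup_{f \in \mathcal{F}_{1}} \bigl| \mathbb{E}_{P}[f] - \mathbb{E}_{Q}[f] \bigr| \;\le\; \sup_{f \in \mathcal{F}_{2}} \bigl| \mathbb{E}_{P}[f] - \mathbb{E}_{Q}[f] \bigr|,
$$

$$
\mathcal{A}_{1} \subseteq \mathcal{A}_{2} \;\Longrightarrow\; \sup_{A \in \mathcal{A}_{1}} \mathrm{IPM}_{\mathcal{F}}\bigl(P_{A}, P_{A^{c}}\bigr) \;\le\; \sup_{A \in \mathcal{A}_{2}} \mathrm{IPM}_{\mathcal{F}}\bigl(P_{A}, P_{A^{c}}\bigr).
$$

Enriching either the discriminator class or the subgroup-subset family therefore only tightens the criterion, which is why simple choices of $\mathcal{F}$ control specific disparities while sufficiently rich choices approach full distributional equality.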
4. Empirical Methodology and Fairness-Accuracy Trade-offs
DRAF alternates minimization over the predictor parameters $\theta$ and maximization over the adversarial parameters (the discriminator parameters $\phi$ and the subgroup weight vector $\gamma$), optimizing an objective of the form

$$
\min_{\theta} \; \max_{\phi,\, \gamma} \;\; \mathcal{L}_{\mathrm{pred}}(\theta) \;+\; \lambda \, \widehat{\Delta}_{\mathrm{DR}}(\theta, \phi, \gamma),
$$

where $\mathcal{L}_{\mathrm{pred}}$ is the prediction loss, $\widehat{\Delta}_{\mathrm{DR}}$ is the surrogate fairness gap, and $\lambda$ is a Lagrange multiplier trading off predictive accuracy and fairness.
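A minimal training-loop sketch of this alternating scheme is given below. It assumes a PyTorch setup, a binary classification loss, and a differentiable correlation-style stand-in for the surrogate gap (mirroring the schematic from Section 2); none of this is the reference implementation of DRAF, and all hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

def surrogate_penalty(adv_out, membership, gamma):
    """Differentiable correlation-style stand-in for the surrogate fairness gap."""
    gamma = gamma / gamma.norm()
    combined = membership @ gamma
    a = adv_out.squeeze(-1) - adv_out.mean()
    b = combined - combined.mean()
    r = (a * b).mean() / (a.std() * b.std() + 1e-8)
    return torch.atanh(r.clamp(-0.999, 0.999)) ** 2   # Fisher z, squared

d_in, K, lam = 10, 6, 1.0
predictor = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
gamma = nn.Parameter(torch.randn(K))

opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(list(adversary.parameters()) + [gamma], lr=1e-3)
bce = nn.BCEWithLogitsLoss()

# Toy data: features X, labels y, subgroup-subset membership matrix M.
X = torch.randn(512, d_in)
y = torch.randint(0, 2, (512, 1)).float()
M = torch.randint(0, 2, (512, K)).float()

for step in range(200):
    # (1) Adversary ascent: maximize the surrogate gap (predictor held fixed).
    scores = torch.sigmoid(predictor(X)).detach()
    adv_loss = -surrogate_penalty(adversary(scores), M, gamma)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # (2) Predictor descent: minimize prediction loss + lambda * surrogate gap.
    logits = predictor(X)
    gap = surrogate_penalty(adversary(torch.sigmoid(logits)), M, gamma)
    loss = bce(logits, y) + lam * gap
    opt_pred.zero_grad(); loss.backward(); opt_pred.step()
```

In practice, the choice of the multiplier (lam above) governs the fairness-accuracy trade-off examined in the empirical results that follow.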
Empirical results demonstrate:
- DRAF outperforms baseline methods (such as marginal fairness constraints or group-wise regularization) on benchmark datasets when the number of sensitive attributes is large and many intersectional subgroups are poorly represented.
- Trade-off assessments indicate that DRAF achieves favorable fairness (as measured by subgroup parity, marginal parity, and distributional metrics) without significantly compromising accuracy.
- Ablation studies show the importance of including all relevant subgroup-subsets in $\mathcal{A}$; limiting constraints to marginal fairness alone may leave fairness gaps unaddressed in small, intersectional subgroups (Lee et al., 24 Oct 2025).
5. Relation to General IPMs and Other Fairness Notions
SupIPM subsumes and generalizes earlier distributional and mean fairness measures:
- For $\mathcal{F}$ restricted to the identity map (so the IPM compares means), supIPM reduces to the worst-case mean parity disparity.
- For $\mathcal{F}$ as all 1-Lipschitz functions, the IPM becomes the Wasserstein-1 distance; for the unit ball of an RKHS it becomes the maximum mean discrepancy (MMD), and sigmoid-based parametric families yield further fairness divergences (contrasted in the sketch after this list).
- SupIPM’s distribution-based formalism supports both marginal and intersectional fairness and can handle subgroups with a wide range of sample sizes.
- Its design ensures theoretical consistency across the discrete-mean and full-distribution fairness landscape.
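The sketch below contrasts three of these instantiations on toy subgroup score samples: the mean gap (identity discriminator), the Wasserstein-1 distance (1-Lipschitz class, via scipy.stats.wasserstein_distance), and a Gaussian-kernel MMD (RKHS unit ball). The samples, kernel bandwidth, and helper names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def mean_gap(p, q):
    """IPM with F = {identity}: absolute difference of means (mean parity)."""
    return abs(p.mean() - q.mean())

def gaussian_mmd(p, q, sigma=0.2):
    """IPM with F = unit ball of a Gaussian RKHS: biased MMD estimate."""
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))
    return np.sqrt(k(p, p).mean() + k(q, q).mean() - 2 * k(p, q).mean())

# Toy predictor outputs for a subgroup-subset A and its complement A^c.
rng = np.random.default_rng(2)
p = rng.beta(2, 5, size=300)   # scores given sensitive attributes in A
q = rng.beta(5, 2, size=300)   # scores given sensitive attributes in A^c

print("mean parity gap:", mean_gap(p, q))
print("Wasserstein-1  :", wasserstein_distance(p, q))  # F = 1-Lipschitz functions
print("Gaussian MMD   :", gaussian_mmd(p, q))
```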
As the discriminator class $\mathcal{F}$ grows richer, the IPM becomes a stronger separator of empirical distributions, making supIPM an effective tool for analyzing the extremal fairness achievable under adversarial training (Lee et al., 24 Oct 2025).
6. Limitations and Applicability
While DRAF and the underlying supIPM address computational issues inherent in intersectional subgroup fairness:
- The quality of approximation (tightness of the surrogate bound) depends on the richness of both the subgroup-subset family $\mathcal{A}$ and the discriminator class $\mathcal{F}$; too narrow a choice may leave fairness violations undetected in some subgroups.
- Datasets with very small or empty subgroups require careful selection of $\mathcal{A}$ to ensure empirical tractability and statistical reliability.
- Interpretation of supIPM values relies on understanding the measure’s sensitivity to both $\mathcal{A}$ and $\mathcal{F}$ (Lee et al., 24 Oct 2025).
Nonetheless, the approach is demonstrably scalable and robust across a variety of real-world fairness tasks.
7. Broader Impact and Future Directions
The supIPM framework offers a principled mechanism for enforcing and measuring subgroup and intersectional fairness with rigorous distributional guarantees. DRAF and similar algorithms provide a computational toolkit for achieving these guarantees in modern, high-dimensional, and intersectional fairness scenarios. Further research may refine the selection and approximation of extremely large families $\mathcal{A}$, as well as dynamically learn subgroup-subset families and discriminator classes tailored to the data context. Extending supIPM-based certification to other types of statistical parity and causal fairness criteria remains an active direction.
Summary Table: supIPM for Subgroup Fairness
| Component | Description | Reference Section |
|---|---|---|
| Mathematical Definition | Worst-case IPM between each subgroup-subset distribution and its complement, over all $A \in \mathcal{A}$ and discriminators in $\mathcal{F}$ | 1 |
| Computational Surrogate | DRAF’s doubly regressing R² statistic with Fisher z-transformation as upper bound on supIPM | 2, 3 |
| Fairness Guarantee | Surrogate fairness gap provably upper-bounds true supIPM gap | 3 |
| Scalability | Single adversary and weight-vector optimization for arbitrarily large $\mathcal{A}$ | 2 |
| Empirical Efficacy | Superior subgroup and marginal fairness under high intersectionality | 4 |
| Applicability | Interpolation between mean, marginal, and intersectional fairness regimes | 5 |
The supIPM thus anchors a rigorous, distributional perspective on algorithmic fairness in settings with high-dimensional, intersectional sensitive attributes, enabling scalable, theoretically justified, and empirically robust learning algorithms (Lee et al., 24 Oct 2025).