SA-DRO: Fairness via Sensitive Attribute Optimization
- Sensitive-Attribute-based DRO (SA-DRO) is a framework that ensures fairness by optimizing for worst-case perturbations in sensitive attribute distributions to mitigate subgroup bias.
- It employs both f-divergence and Wasserstein ball formulations to design ambiguity sets that counteract demographic imbalances and preserve minority subgroup performance.
- SA-DRO achieves competitive accuracy with minimal loss while offering theoretical bias reduction, finite-sample guarantees, and robustness in centralized and federated settings.
Sensitive-Attribute-based Distributionally Robust Optimization (SA-DRO) refers to a family of methods designed to ensure fairness in machine learning models by robustifying predictive performance with respect to variations or adversarial shifts in the distribution of sensitive (protected) attributes. SA-DRO aims to limit undesirable inductive biases, such as systematic performance degradation for minority subgroups, that arise when fairness constraints like demographic parity (DP) are enforced under imbalanced sensitive-attribute marginals. The framework generalizes principled robust optimization to the fairness setting by constructing ambiguity sets or cost structures that explicitly encode invariance or adversarial perturbations along sensitive attributes, and can be instantiated using both f-divergence ball approaches and causally motivated Wasserstein balls.
1. Motivation: Inductive Bias in DP-based Fairness
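Standard in-processing methods for fair supervised learning commonly enforce demographic parity by penalizing or constraining the dependence of the prediction $\hat{Y} = f_w(X)$ on a sensitive attribute $S$, for example:
$$\min_w \; \mathbb{E}[\ell(f_w(X), Y)] + \lambda\, \rho(\hat{Y}, S),$$
where $\rho$ is a dependence measure and $\lambda \ge 0$ is its weight. A popular measure is the Difference in Demographic Parity (DDP), which for a binary prediction and a binary sensitive attribute is commonly written as
$$\mathrm{DDP}(\hat{Y}, S) = \left| P(\hat{Y} = 1 \mid S = 0) - P(\hat{Y} = 1 \mid S = 1) \right|.$$
Theorem 1 from (Lei et al., 2024) establishes that, under sensitive-attribute imbalance (one subgroup holding most of the mass of the sensitive marginal $P_S$), strict DP regularization (a large penalty weight $\lambda$) forces the conditional output distributions for all subgroups to concentrate around that of the empirical majority group. Thus, in imbalanced settings, DP regularization risks "washing out" minority subgroup predictive structure, which shows up as accuracy loss for minorities.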
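As a small illustration of the DDP measure, the sketch below assumes binary predictions and a binary sensitive attribute, and takes DDP to be the absolute gap in positive-prediction rates between the two groups. The function name `ddp_binary` and the toy arrays are hypothetical and are not taken from the cited work.

```python
import numpy as np

def ddp_binary(y_pred, s):
    """Absolute gap in positive-prediction rates between groups s=0 and s=1."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    rate0 = y_pred[s == 0].mean()
    rate1 = y_pred[s == 1].mean()
    return abs(rate0 - rate1)

# Toy data (assumed): 8 majority samples (s=0) and 2 minority samples (s=1).
s = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([1, 1, 1, 1, 1, 1, 0, 0, 1, 0])

# Positive rate is 6/8 for s=0 and 1/2 for s=1, so the gap is 0.25.
ddp_value = ddp_binary(y_pred, s)
print(ddp_value)
```

With only two minority samples, flipping a single minority prediction moves the minority rate by 0.5, which hints at why a DDP penalty is dominated by the majority group's statistics when the groups are imbalanced.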
SA-DRO was introduced to alleviate this effect by ensuring fair classifiers are robust not just under the observed sensitive-attribute marginal, but under worst-case perturbations in the sensitive distribution—thereby mitigating the majority-group bias induced by demographic imbalance (Lei et al., 2024).
2. Mathematical Formulation of SA-DRO
2.1 Sensitive-Marginal f-divergence Ball
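Let the data-generating joint be
$$P_{X,Y,S} = P_S \cdot P_{X,Y \mid S}.$$
SA-DRO restricts the ambiguity set so that only the sensitive marginal $P_S$ can differ, with conditional features and outcomes fixed:
$$\mathcal{Q}_\epsilon = \left\{ Q_S \cdot P_{X,Y \mid S} \;:\; Q_S \in \mathcal{B}_\epsilon(P_S) \right\},$$
where
$$\mathcal{B}_\epsilon(P_S) = \left\{ Q_S \;:\; D_f(Q_S \,\|\, P_S) \le \epsilon \right\}$$
for an f-divergence $D_f$ such as KL or $\chi^2$.
The combined min-max fair learning problem is then:
$$\min_w \; \max_{Q \in \mathcal{Q}_\epsilon} \; \mathbb{E}_Q[\ell(f_w(X), Y)] + \lambda\, \rho_Q(f_w(X), S),$$
where $\rho_Q$ denotes the fairness regularizer evaluated under $Q$. Metrics such as DDP, mutual information, or kernel-MMD may be used for the regularizer.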
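To make the sensitive-marginal ambiguity set concrete, the following sketch reweights group-conditional empirical risks by an alternative marginal and checks whether that marginal lies inside a KL ball around the empirical one. It assumes a discrete sensitive attribute with groups labelled 0..K-1; the loss values, the candidate marginal `q`, and the radius `eps` are made-up illustration values.

```python
import numpy as np

def kl_divergence(q, p):
    """KL(q || p) for discrete distributions, with 0 * log(0) taken as 0."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def reweighted_risk(losses, s, q):
    """Risk when only the group marginal changes: group-mean losses weighted by q."""
    group_risk = np.array([losses[s == g].mean() for g in range(len(q))])
    return float(group_risk @ q)

# Per-sample losses (assumed): the minority group (s=1) is served worse.
losses = np.array([0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.9, 0.7])
s = np.array([0] * 8 + [1] * 2)

p = np.bincount(s) / len(s)      # empirical marginal, [0.8, 0.2]
q = np.array([0.6, 0.4])         # candidate shifted marginal
eps = 0.11                       # hypothetical ball radius

risk_p = reweighted_risk(losses, s, p)   # ordinary empirical risk
risk_q = reweighted_risk(losses, s, q)   # risk after shifting mass to the minority
in_ball = kl_divergence(q, p) <= eps
print(risk_p, risk_q, in_ball)
```

Because the conditionals are held fixed, the risk under any admissible marginal is just a weighted average of the per-group risks, so the inner maximization reduces to a search over the weight vector.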
2.2 Causally Fair Wasserstein SA-DRO
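A distinct instantiation introduces Wasserstein DRO over feature-outcome pairs $(V, Y)$, where $V = (S, X)$, $S$ is sensitive and $X$ non-sensitive (Ehyaei et al., 30 Sep 2025). The ambiguity set is a Wasserstein ball with a cost metric $d$, the "Causally-Fair Dissimilarity Function" (CFDF), defined so that counterfactual twins (samples differing only in sensitive attribute) have zero distance. Written in the standard transport-cost form around the empirical distribution $\hat{P}_N$ with radius $\delta$, the ball is
$$\mathcal{B}_\delta(\hat{P}_N) = \left\{ Q \;:\; W_d(Q, \hat{P}_N) \le \delta \right\}$$
with
$$W_d(Q, P) = \inf_{\pi \in \Pi(Q, P)} \mathbb{E}_{(z, z') \sim \pi}\left[ d(z, z') \right].$$
The SA-DRO optimization is:
$$\min_\theta \; \sup_{Q \in \mathcal{B}_\delta(\hat{P}_N)} \mathbb{E}_Q\left[ \ell(h_\theta(V), Y) \right].$$
Strong duality allows reducing this to a regularized empirical risk with explicit counterfactual-fairness and adversarial-robustness structure.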
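The zero-distance property for counterfactual twins can be illustrated with a toy dissimilarity function. The sketch assumes a hypothetical linear SCM of the form X = B * S + U with a binary sensitive attribute, and measures distance between the recovered exogenous terms U. This is one simple construction with the stated property, not the CFDF definition from the cited paper; the coefficient vector `B` is invented.

```python
import numpy as np

# Hypothetical causal effect of the sensitive attribute S on the features X.
B = np.array([1.5, -0.5])

def exogenous(s, x):
    """Recover the exogenous term U from (s, x) under X = B * S + U."""
    return np.asarray(x, float) - B * s

def twin(s, x, s_new):
    """Counterfactual twin: same U, sensitive attribute set to s_new."""
    return s_new, exogenous(s, x) + B * s_new

def toy_cfdf(v1, v2):
    """Distance between exogenous terms, so twins are at distance zero."""
    (s1, x1), (s2, x2) = v1, v2
    return float(np.linalg.norm(exogenous(s1, x1) - exogenous(s2, x2)))

v = (0, np.array([2.0, 1.0]))
v_twin = twin(*v, s_new=1)            # (1, [3.5, 0.5])
v_other = (0, np.array([2.0, 2.0]))   # differs in a non-sensitive direction

print(toy_cfdf(v, v_twin), toy_cfdf(v, v_other))
```

Under such a cost, moving a sample to its twin is free for the adversary, so a model that is robust inside the ball is pushed toward treating twins alike.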
3. Theoretical Analysis
3.1 Bias Reduction Intuition
In standard DP-regularized minimization, tightening DDP constraints with imbalanced $P_S$ enforces alignment to the majority conditional, propagating bias to all subgroups (Lei et al., 2024). In SA-DRO, the inner maximization over $Q_S$ redistributes mass towards groups with higher conditional risk, compelling the classifier to perform well even if the sensitive marginal shifts away from the observed empirical distribution.
3.2 Dual Form and Regularization
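For the KL ball, for example, Fenchel duality yields a tractable representation:
$$\max_{Q_S :\, D_{\mathrm{KL}}(Q_S \| P_S) \le \epsilon} \; \sum_s Q_S(s)\, R_s(w) \;=\; \inf_{\eta > 0} \; \eta\,\epsilon + \eta \log \sum_s P_S(s) \exp\!\left( R_s(w) / \eta \right),$$
where $R_s(w) = \mathbb{E}[\ell(f_w(X), Y) \mid S = s]$ is group-conditional risk. The resulting minimization is amenable to standard optimization and generalization analysis (Lei et al., 2024).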
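A numerical check of the dual view, assuming a KL ball over a discrete sensitive marginal: the worst-case marginal is an exponential tilt of the empirical one, q_s proportional to p_s * exp(R_s / eta), and the standard KL dual value eta * eps + eta * log(sum_s p_s * exp(R_s / eta)) matches the primal worst-case risk once eta is chosen so that the tilt sits on the boundary of the ball. The group risks, marginal, and radius below are invented, and the bisection assumes the radius is small enough for the constraint to be active.

```python
import numpy as np

def tilt(group_risk, p, eta):
    """Exponentially tilted marginal q_s proportional to p_s * exp(R_s / eta)."""
    q = p * np.exp((group_risk - group_risk.max()) / eta)
    return q / q.sum()

def kl(q, p):
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def kl_worst_case(group_risk, p, eps, iters=200):
    """Bisect on log(eta) until KL(q_eta || p) equals eps (KL decreases in eta)."""
    lo, hi = np.log(1e-3), np.log(1e3)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl(tilt(group_risk, p, np.exp(mid)), p) > eps:
            lo = mid   # tilt leaves the ball: a larger eta is needed
        else:
            hi = mid
    eta = np.exp(hi)
    q = tilt(group_risk, p, eta)
    primal = float(q @ group_risk)
    m = group_risk.max()   # shift for a numerically stable log-sum-exp
    dual = eta * eps + m + eta * np.log(np.sum(p * np.exp((group_risk - m) / eta)))
    return q, primal, float(dual)

group_risk = np.array([0.15, 0.8])   # assumed per-group risks
p = np.array([0.8, 0.2])             # assumed empirical marginal
q_star, primal, dual = kl_worst_case(group_risk, p, eps=0.1)
print(q_star, primal, dual)
```

The one-dimensional search over eta is what makes the inner problem cheap: the adversary's full distribution is recovered from a single scalar.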
For causally aware Wasserstein SA-DRO, under a linear SCM and for suitable loss classes, the dual admits a closed-form regularizer. With Lipschitz losses, the corresponding expression is stated in terms of the counterfactual-robust risk (Ehyaei et al., 30 Sep 2025).
3.3 Finite-Sample Guarantees
Under boundedness, Lipschitzness, and finite-diameter assumptions, the causally fair Wasserstein variant of SA-DRO admits finite-sample error bounds for the excess robust risk, with an additional penalty for estimation error in the causal metric (Ehyaei et al., 30 Sep 2025).
4. Algorithmic Framework
4.1 Projected Gradient Descent–Ascent
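The standard f-divergence ball SA-DRO (Algorithm 1 in (Lei et al., 2024)) uses alternating updates. Writing $L(w, q)$ for the regularized objective evaluated under group weights $q$, and $\eta_w$, $\eta_q$ for the step sizes:
- w-step (model parameters):
$$w_{t+1} = w_t - \eta_w \nabla_w L(w_t, q_t)$$
- q-step (group distribution):
$$q_{t+1} = \Pi_{\mathcal{B}_\epsilon}\!\left( q_t + \eta_q \nabla_q L(w_{t+1}, q_t) \right)$$
where $\Pi_{\mathcal{B}_\epsilon}$ projects onto the divergence ball, which is available in closed form for some divergences and via root finding for others.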
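The alternating updates can be sketched on a toy problem. The code below makes several simplifying assumptions that are not from the cited algorithm: the fairness regularizer is dropped (lambda = 0), the model is a single scalar slope, the ball is a chi-square ball, and the exact projection is replaced by a simplex projection followed by a straight-line pull-back toward the empirical marginal, which guarantees feasibility but is not the Euclidean projection onto the ball.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def retract_chi2(q, p, eps):
    """Pull q toward p along a straight line until chi2(q || p) <= eps."""
    chi2 = np.sum((q - p) ** 2 / p)
    return q if chi2 <= eps else p + np.sqrt(eps / chi2) * (q - p)

# Toy data (assumed): the majority follows y = 1*x, the minority y = 3*x.
s = np.array([0] * 8 + [1] * 2)
x = np.ones(10)
y = np.where(s == 0, 1.0, 3.0) * x
p = np.bincount(s) / len(s)

def group_risk_and_grad(w):
    """Per-group squared-error risk and its derivative with respect to w."""
    res = w * x - y
    risk = np.array([np.mean(res[s == g] ** 2) for g in (0, 1)])
    grad = np.array([np.mean(2 * res[s == g] * x[s == g]) for g in (0, 1)])
    return risk, grad

w, q = 0.0, p.copy()
lr_w, lr_q, eps = 0.1, 0.1, 0.1
for _ in range(200):
    _, grad = group_risk_and_grad(w)
    w = w - lr_w * float(q @ grad)        # w-step: descent on the q-weighted risk
    risk, _ = group_risk_and_grad(w)
    # q-step: ascent (the gradient in q is the risk vector), then return to the ball
    q = retract_chi2(project_simplex(q + lr_q * risk), p, eps)

print(w, q)
```

On this toy problem plain ERM would settle at the marginal-weighted slope 0.8 * 1 + 0.2 * 3 = 1.4, whereas the robust iterate moves toward the minority group's slope because the adversary shifts weight onto the group with the higher risk.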
4.2 Complexity and Practicalities
- For binary or few-valued sensitive attributes, projection overhead is negligible.
- For high-dimensional/continuous sensitive attributes, group-wise reweighting or kernelization is employed.
- The total cost per iteration depends on the choice of divergence.
For causally fair Wasserstein SA-DRO, computation relies on:
- Counterfactual risk evaluation
- Regularizer computation (closed-form or first-order approximation)
- Backpropagation through both loss and, if necessary, its derivatives (Ehyaei et al., 30 Sep 2025).
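One way to picture the counterfactual risk evaluation step is to score each sample by its worst loss over all counterfactual twins. The sketch below assumes a hypothetical linear SCM X = B * S + U with a binary sensitive attribute, a linear predictor, and squared loss. The names `twins`, `predict`, and `counterfactual_risk` and all numbers are illustrative and are not the cited paper's definitions.

```python
import numpy as np

B = np.array([1.5, -0.5])   # hypothetical effect of S on X in X = B * S + U
S_VALUES = (0, 1)

def twins(s, x):
    """All counterfactual versions of (s, x), including the factual one."""
    u = x - B * s
    return [(s_new, u + B * s_new) for s_new in S_VALUES]

def predict(theta, s, x):
    """Linear predictor: theta[0] weights S, theta[1:] weights X."""
    return theta[0] * s + theta[1:] @ x

def counterfactual_risk(theta, S, X, Y):
    """Mean over samples of the worst squared loss across counterfactual twins."""
    worst = [max((predict(theta, sn, xn) - y) ** 2 for sn, xn in twins(s, x))
             for s, x, y in zip(S, X, Y)]
    return float(np.mean(worst))

S = np.array([0, 1, 0])
X = np.array([[2.0, 1.0], [3.5, 0.5], [0.0, 0.0]])
Y = np.array([3.0, 3.0, 0.0])

theta_invariant = np.array([-1.0, 1.0, 1.0])   # cancels the effect of S through X
theta_naive = np.array([0.0, 1.0, 1.0])        # ignores S but inherits it via X

risk_invariant = counterfactual_risk(theta_invariant, S, X, Y)
risk_naive = counterfactual_risk(theta_naive, S, X, Y)
print(risk_invariant, risk_naive)
```

The naive predictor never reads the sensitive attribute directly, yet it still changes its output across twins because the features carry the attribute's causal effect. That gap is what the counterfactual term penalizes.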
5. Empirical Results and Performance
5.1 Centralized Experiments
On datasets such as COMPAS (race), Adult (gender), and CelebA (blondness), with synthetic 80%–20% subgroup imbalance:
| Method | COMPAS Acc | COMPAS DDP | Adult Acc | Adult DDP |
|---|---|---|---|---|
| ERM | 68.0% | 0.287 | 85.1% | 0.183 |
| KDE-fair | 66.8% | 0.027 | 83.2% | 0.023 |
| KDE-SA-DRO | 66.0% | 0.009 | 82.5% | 0.012 |
| RFI-fair | 66.4% | 0.021 | 80.6% | 0.019 |
| RFI-SA-DRO | 65.4% | 0.017 | 80.1% | 0.021 |
Accuracy reduction relative to ERM is moderate: 2.0 pp (COMPAS) and 2.6 pp (Adult) for KDE-SA-DRO, and 2.6 pp and 5.0 pp for RFI-SA-DRO, while DDP is driven near zero. SA-DRO further reduces subgroup disparity as measured by the negative rate (Lei et al., 2024).
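The accuracy gaps relative to ERM can be read off the table directly. The short script below only redoes that arithmetic on the reported numbers; the dictionary layout is an arbitrary choice.

```python
# Accuracy (%) from the table above: (COMPAS, Adult).
acc = {
    "ERM": (68.0, 85.1),
    "KDE-fair": (66.8, 83.2),
    "KDE-SA-DRO": (66.0, 82.5),
    "RFI-fair": (66.4, 80.6),
    "RFI-SA-DRO": (65.4, 80.1),
}

erm_compas, erm_adult = acc["ERM"]

# Accuracy drop in percentage points relative to ERM, rounded to one decimal.
drops = {
    name: (round(erm_compas - compas, 1), round(erm_adult - adult, 1))
    for name, (compas, adult) in acc.items()
    if name != "ERM"
}

for name, (d_compas, d_adult) in drops.items():
    print(f"{name}: COMPAS -{d_compas} pp, Adult -{d_adult} pp")
```

The drops range from 1.2 pp (KDE-fair on COMPAS) to 5.0 pp (RFI-SA-DRO on Adult), and in each pairing the SA-DRO variant gives up somewhat more accuracy than its non-robust counterpart.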
5.2 Distributed (Federated) Experiments
In federated settings, standard DP-federated models degrade minority client accuracy below baseline. SA-DRO recoups much of this accuracy, aligning incentives even for heavily underrepresented minority clients (Lei et al., 2024).
5.3 Individual and Causal Fairness
On Adult and COMPAS processed via a linear SCM, causally fair SA-DRO yields substantial reduction in unfair area (as measured by the Unfair Area Index) and improved counterfactual fairness compared to ERM, adversarial learning, and the ROSS procedure (Ehyaei et al., 30 Sep 2025).
6. Extensions and Limitations
- Fairness Regularizer Generality: SA-DRO accommodates any in-processing fairness penalty $\rho$, such as equalized odds or subgroup fairness, by substitution in the objective.
- Hyperparameter Sensitivity: The divergence radius $\epsilon$ or Wasserstein radius $\delta$ controls the tradeoff between robustness and pessimism: too small a radius gives little regularization, too large a radius may degrade performance (Lei et al., 2024).
- Finite-Sample Theory: Current f-divergence SA-DRO theory demonstrates qualitative bias reduction; quantitative generalization bounds remain an open area for DP-oriented SA-DRO (Lei et al., 2024).
- High-Dimensional S: For large or continuous sensitive attribute domains, kernelization or smoothing strategies are suggested for computational tractability.
- SCM Generalization: Causal SA-DRO provides explicit rates and regularizers under both known and unknown SCMs, with additional metric-estimation penalties if the CFDF is estimated (Ehyaei et al., 30 Sep 2025).
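The radius tradeoff mentioned in the list above can be seen in a two-group toy case. Assuming a chi-square ball over a binary sensitive marginal, the worst-case marginal has a closed form: shift as much mass as the ball allows from the better-served group to the worse-served one. The group risks and marginal are invented for illustration.

```python
import numpy as np

def chi2_worst_case_binary(group_risk, p, eps):
    """Worst-case risk over {q : sum_s (q_s - p_s)^2 / p_s <= eps}, two groups."""
    hi = int(np.argmax(group_risk))   # worse-served group gains mass
    lo = 1 - hi
    d_max = np.sqrt(eps / (1.0 / p[0] + 1.0 / p[1]))
    d = min(d_max, p[lo])             # cannot remove more mass than the group has
    q = p.copy()
    q[hi] += d
    q[lo] -= d
    return float(q @ group_risk), q

group_risk = np.array([0.15, 0.8])   # assumed per-group risks
p = np.array([0.8, 0.2])             # assumed empirical marginal

radii = [0.0, 0.01, 0.1, 1.0, 10.0]
robust_risks = [chi2_worst_case_binary(group_risk, p, eps)[0] for eps in radii]
for eps, r in zip(radii, robust_risks):
    print(f"eps={eps}: worst-case risk {r:.4f}")
```

At radius zero the objective is the ordinary empirical risk, and for a very large radius it saturates at the worst group's risk. That saturation is the pessimistic regime in which average performance can suffer.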
A plausible implication is that future work will refine DRO-driven fairness guarantees and further unify group, individual, and causal fairness through richer ambiguity sets.
7. Impact and Real-World Deployment
In high-stakes domains such as credit or healthcare, SA-DRO offers explicit robustness certificates tied to allowed drift in the sensitive-attribute composition, providing interpretable guarantees to auditors and regulators. The approach, whether instantiated via f-divergence or causally motivated Wasserstein balls, yields efficient, gradient-based optimization procedures and empirically achieves robustness to both distributional shifts and fairness violations (Lei et al., 2024, Ehyaei et al., 30 Sep 2025). SA-DRO is extensible to streaming, distributed, and individual-fairness-aware learning, and is competitive with state-of-the-art methods in practical evaluations.