Differentiable MaxAccGap Loss
- Differentiable MaxAccGap loss is a fairness-driven regularization technique that penalizes accuracy differences between demographic groups to promote diagnostic parity.
- It replaces non-differentiable accuracy indicators with soft surrogates, enabling optimization with stochastic gradient descent and integration into LoRA-based methods.
- Empirical evaluations in glaucoma diagnosis show substantial gap reductions (up to 70%) with minimal overall accuracy impact and high parameter efficiency.
The Differentiable MaxAccGap loss is a fairness-driven regularization technique designed to minimize diagnostic performance disparity across sensitive demographic groups by explicitly penalizing the maximum difference in model accuracy between groups in a differentiable manner. Introduced in the context of fairness-aware fine-tuning of vision-language models (VLMs) for glaucoma diagnosis, this loss enables end-to-end optimization for accuracy parity, making it directly compatible with stochastic gradient descent and modern deep learning frameworks. Its implementation is central to three Low-Rank Adaptation (LoRA)–based methods (FR-LoRA, GR-LoRA, and Hybrid-LoRA) that address fairness and data imbalance in large-scale medical AI settings while maintaining parameter efficiency (Gu et al., 3 Dec 2025).
1. Mathematical Formulation of Differentiable MaxAccGap Loss
Let $\mathcal{D} = \{(x_i, y_i, a_i)\}_{i=1}^{N}$ denote the dataset, where $y_i \in \{0,1\}$ is the binary glaucoma label and $a_i \in \mathcal{A}$ is a sensitive attribute (e.g., ethnicity). For each group $g \in \mathcal{A}$, $\mathcal{D}_g = \{i : a_i = g\}$ is the subset of samples with $a_i = g$, of size $n_g = |\mathcal{D}_g|$.
Group-wise hard accuracy:

$$\mathrm{Acc}_g = \frac{1}{n_g} \sum_{i \in \mathcal{D}_g} \mathbb{1}\!\left[\hat{y}_i = y_i\right]$$

MaxAccGap:

$$\mathrm{MaxAccGap} = \max_{g \in \mathcal{A}} \mathrm{Acc}_g \;-\; \min_{g \in \mathcal{A}} \mathrm{Acc}_g$$
Since this metric is non-differentiable, a soft (differentiable) surrogate replaces the indicator $\mathbb{1}[\hat{y}_i = y_i]$ with $p_\theta(y_i \mid x_i)$, the model's predicted probability for the true class.
Soft-accuracy for group $g$:

$$\widetilde{\mathrm{Acc}}_g = \frac{1}{n_g} \sum_{i \in \mathcal{D}_g} p_\theta(y_i \mid x_i)$$

Differentiable MaxAccGap:

$$\mathcal{L}_{\mathrm{MaxAccGap}} = \max_{g \in \mathcal{A}} \widetilde{\mathrm{Acc}}_g \;-\; \min_{g \in \mathcal{A}} \widetilde{\mathrm{Acc}}_g$$
This structure is piecewise-differentiable, with subgradients targeting the group-wise soft-accuracy extremes: gradients flow only through the groups $g^{+} = \arg\max_{g} \widetilde{\mathrm{Acc}}_g$ and $g^{-} = \arg\min_{g} \widetilde{\mathrm{Acc}}_g$ that attain the maximal and minimal group soft-accuracy, respectively.
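As a concrete reference, a minimal PyTorch sketch of the surrogate might look as follows; the function names and the masking of groups absent from a batch are illustrative choices, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def soft_maxaccgap(logits: torch.Tensor, labels: torch.Tensor,
                   groups: torch.Tensor, num_groups: int) -> torch.Tensor:
    """Differentiable MaxAccGap: max-min spread of group soft-accuracies.

    logits: (B, C) raw outputs; labels: (B,) class indices;
    groups: (B,) integer group ids in [0, num_groups).
    """
    # p_i = predicted probability of the true class (soft correctness).
    p_true = F.softmax(logits, dim=-1).gather(1, labels.unsqueeze(1)).squeeze(1)

    # Indexed (scatter-add) accumulation of per-group sums and counts.
    sums = torch.zeros(num_groups, device=logits.device).index_add_(0, groups, p_true)
    counts = torch.zeros(num_groups, device=logits.device).index_add_(
        0, groups, torch.ones_like(p_true))

    present = counts > 0                       # ignore groups absent from the batch
    soft_acc = sums[present] / counts[present]
    return soft_acc.max() - soft_acc.min()     # gradients touch only the extremes

@torch.no_grad()
def hard_maxaccgap(logits, labels, groups, num_groups):
    """Non-differentiable evaluation metric built from hard group accuracies."""
    correct = (logits.argmax(dim=-1) == labels).float()
    sums = torch.zeros(num_groups, device=logits.device).index_add_(0, groups, correct)
    counts = torch.zeros(num_groups, device=logits.device).index_add_(
        0, groups, torch.ones_like(correct))
    present = counts > 0
    acc = sums[present] / counts[present]
    return acc.max() - acc.min()
```

As noted above, the max and min in `soft_maxaccgap` make the loss piecewise-differentiable: only the best- and worst-performing groups in a batch receive gradient from the gap term.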
2. Integration with LoRA-based Fairness Methods
Three algorithmic variants leverage the Differentiable MaxAccGap for fairness-aware adaptation:
- FR-LoRA: Directly adds the soft MaxAccGap as a regularization term to the cross-entropy objective:

  $$\mathcal{L}_{\mathrm{FR}} = \mathcal{L}_{\mathrm{CE}} + \lambda \, \mathcal{L}_{\mathrm{MaxAccGap}}$$

  Here, $\lambda$ tunes the accuracy-fairness trade-off.
- GR-LoRA: Applies inverse-frequency reweighting to the group-wise cross-entropy terms, mitigating sample-size imbalance:

  $$\mathcal{L}_{\mathrm{GR}} = \sum_{g \in \mathcal{A}} w_g \, \mathcal{L}_{\mathrm{CE}}^{(g)}$$

  with $w_g \propto 1/n_g$. MaxAccGap is not an explicit loss term here but is minimized implicitly through gradient balancing.
- Hybrid-LoRA: Combines both objectives:

  $$\mathcal{L}_{\mathrm{Hybrid}} = \sum_{g \in \mathcal{A}} w_g \, \mathcal{L}_{\mathrm{CE}}^{(g)} + \lambda \, \mathcal{L}_{\mathrm{MaxAccGap}}$$
For all methods, only 0.24% of model parameters (via LoRA adapters) are fine-tuned, allowing parameter-efficient deployment in restricted clinical settings.
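The three objectives compose from shared pieces. A minimal sketch, assuming the `soft_maxaccgap` helper above and a precomputed inverse-frequency weight tensor `w` (names and structure are illustrative):

```python
import torch
import torch.nn.functional as F

def group_weighted_ce(logits, labels, groups, w):
    """Per-sample cross-entropy reweighted by inverse group frequency w[g]."""
    ce = F.cross_entropy(logits, labels, reduction="none")   # (B,)
    return (w[groups] * ce).mean()

def fairness_loss(logits, labels, groups, num_groups, w,
                  variant="hybrid", lam=1.0):
    """FR: CE + lam*gap; GR: reweighted CE only; Hybrid: both terms."""
    gap = soft_maxaccgap(logits, labels, groups, num_groups)
    if variant == "fr":
        return F.cross_entropy(logits, labels) + lam * gap
    if variant == "gr":
        return group_weighted_ce(logits, labels, groups, w)
    return group_weighted_ce(logits, labels, groups, w) + lam * gap
```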
3. Implementation and Optimization Procedures
Efficient group-wise accumulation is executed within each mini-batch (approximately 8 samples per batch), using indexed tensors that accumulate $p_\theta(y_i \mid x_i)$ per group to estimate the soft-accuracies. To prevent instability from minority-group over-sampling, the group loss weights $w_g$ are capped at a fixed maximum. Gradient accumulation across four steps is adopted so that multiple groups are represented in each update.
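A training-loop sketch of these mechanics is given below; `W_MAX` is a placeholder (the text caps $w_g$ but does not restate the value), and `fairness_loss` refers to the earlier composition sketch.

```python
import torch

W_MAX = 5.0  # placeholder cap on group loss weights

def capped_group_weights(train_groups: torch.Tensor, num_groups: int) -> torch.Tensor:
    """Inverse-frequency weights, capped to curb minority over-sampling instability."""
    counts = torch.bincount(train_groups, minlength=num_groups).float()
    w = counts.sum() / (num_groups * counts.clamp(min=1))
    return w.clamp(max=W_MAX)

def train_epoch(model, loader, optimizer, scheduler, w, num_groups,
                lam=1.0, accum_steps=4):
    """Gradient accumulation over 4 mini-batches (~8 samples each) so that
    every optimizer update sees several demographic groups."""
    optimizer.zero_grad()
    for step, (pixels, labels, groups) in enumerate(loader):
        logits = model(pixels)
        # fairness_loss: see the composition sketch in Section 2.
        loss = fairness_loss(logits, labels, groups, num_groups, w,
                             variant="hybrid", lam=lam) / accum_steps
        loss.backward()                      # accumulate gradients across steps
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            scheduler.step()                 # advances the LR (warmup) schedule
            optimizer.zero_grad()
```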
Stabilization procedures include a linear learning-rate warmup over 100 steps, LoRA adapter dropout (0.05), and last-token pooling for the VLMs. All LoRA adapters share a fixed rank $r$ and scaling factor $\alpha$.
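With the Hugging Face `peft` and `transformers` libraries, this setup might resemble the following sketch; the rank, scaling, learning rate, and target modules are placeholder assumptions, while the dropout value and 100-step warmup follow the text.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import get_linear_schedule_with_warmup

lora_cfg = LoraConfig(
    r=16,                                   # placeholder rank
    lora_alpha=32,                          # placeholder scaling
    lora_dropout=0.05,                      # adapter dropout from the text
    target_modules=["q_proj", "v_proj"],    # assumed attention projections
)
model = get_peft_model(base_model, lora_cfg)  # only adapters train (~0.24% of params)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-4)  # placeholder lr
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,                   # 100-step linear warmup from the text
    num_training_steps=total_steps)         # total_steps assumed known
```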
The primary hyperparameters and their effects are outlined here:
| λ | Effect on MaxAccGap (%) | Overall Accuracy Change |
|---|---|---|
| 0.1 | 45% reduction (3.80 → 2.10) | +0.25 pp |
| 0.5 | Over-correction (gap rises to 6.04) | +0.05 pp |
| 1.0 | 47% reduction (3.80 → 2.01) | −0.15 pp |
Of the tested values, $\lambda = 1.0$ yields the largest gap reduction and is empirically optimal; larger values induce training spikes, while lower values fail to weight minority groups adequately.
4. Empirical Evaluation and Quantitative Impact
Experiments on 10,000 fundus images for binary glaucoma diagnosis demonstrate substantial disparity reductions with minimal accuracy trade-off.
| Method | Overall Acc (%) | MaxAccGap (%) |
|---|---|---|
| Zero-Shot | 50.15 | 3.95 |
| Vanilla LoRA | 53.50 | 3.80 |
| FR-LoRA (λ = 0.5) | 53.55 | 6.04 |
| GR-LoRA | 53.15 | 1.17 |
| Hybrid-LoRA | 53.45 | 3.80 |
GR-LoRA achieves the lowest MaxAccGap (1.17%), distributing accuracy evenly across Non-Hispanic (53.14%), Hispanic (53.95%), and Unknown (52.78%) groups. The FR-LoRA ablation shows that strong regularization ($\lambda = 1.0$) can robustly reduce the MaxAccGap. Hybrid-LoRA generalizes the MaxAccGap reduction across different attributes; for example, the race attribute saw a reduction from 4.36% (Vanilla) to 1.74% (Hybrid).
5. Trade-offs, Limitations, and Generalization
Differentiable MaxAccGap optimizes for accuracy parity, which is deemed clinically relevant for diagnostic tasks. However, its current application is restricted to single-output (binary) classification. Intersectional fairness across multiple sensitive attributes is not addressed, nor are alternative notions such as equalized odds (e.g., TPR/FPR parity). Soft-accuracy also assumes the predicted probabilities are calibrated; if they are not, the gap gradient can be misdirected.
Task generalization proceeds by redefining the group-wise soft metrics: for multi-class classification, compute soft-accuracy per class and group; for regression, apply a group-wise score such as negative MSE within the same max-min regularization. The group-balancing machinery extends similarly to other attributes, including age and socio-economic status.
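Because the max-min machinery only needs a per-sample "soft correctness" score, such extensions reduce to swapping the pooled metric, as in this illustrative sketch:

```python
import torch

def group_gap(scores, groups, num_groups):
    """Max-min spread of group-mean scores for any per-sample soft metric."""
    sums = torch.zeros(num_groups).index_add_(0, groups, scores)
    counts = torch.zeros(num_groups).index_add_(0, groups, torch.ones_like(scores))
    present = counts > 0
    means = sums[present] / counts[present]
    return means.max() - means.min()

# Regression example with toy data: negative squared error keeps the
# "higher is better" semantics inside the same max-min regularizer.
preds = torch.randn(8, requires_grad=True)
targets = torch.randn(8)
groups = torch.randint(0, 3, (8,))
reg_gap = group_gap(-(preds - targets) ** 2, groups, 3)
```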
6. Significance in Fair Medical AI and Deployment Considerations
By transforming the clinically interpretable metric of accuracy parity into a gradient-friendly regularizer, the Differentiable MaxAccGap loss bridges theoretical fairness with practical, parameter-efficient adaptation of billion-parameter VLMs on medical data. The resulting FR-LoRA, GR-LoRA, and Hybrid-LoRA methods achieve up to a 70% disparity reduction at 53.15% overall accuracy while fine-tuning only 0.24% of parameters, offering a promising route to fair medical AI in resource-constrained clinical environments (Gu et al., 3 Dec 2025).