Fairness Algorithm for Skin Lesion Classification
- The paper introduces an adversarial multi-task framework that integrates a bias mitigation branch with orthogonality regularization to significantly reduce fairness gaps.
- The methodology employs advanced techniques such as adversarial learning, multi-task training, and fairness metrics (SPD, EOD, AOD) to ensure equitable diagnostic outcomes.
- Real-world evaluations demonstrate that fairness improvements are achieved without compromising diagnostic accuracy, supporting reliable clinical deployment.
A fairness algorithm for skin lesion classification is a computational methodology designed to ensure that predictive accuracy and decision outcomes of deep learning systems are equitable across groups defined by sensitive attributes such as skin tone, sex, age, or other demographic features. These algorithms address documented biases in skin lesion classifiers—where privileged groups (often light-skinned patients) have historically received higher diagnostic accuracy—by mitigating disparate treatment while preserving overall diagnostic performance. Approaches to fairness in this area leverage a variety of methods, including adversarial learning, multi-task regularization, representation disentanglement, explicit pruning, federated learning, and post-processing calibration, frequently using domain-specific fairness metrics.
1. Adversarial Multi-Task Training for Fair Representation
Adversarial multi-task training frameworks, such as the one proposed in *Estimating and Improving Fairness with Adversarial Learning* (Li et al., 2021), augment standard deep classifiers with auxiliary branches to mitigate bias and estimate fairness. The network is decomposed into a diagnosis branch and a bias mitigation branch:
- Diagnosis branch: a feature generator $G$ maps the input image $x$ to a representation $z = G(x)$, followed by a classifier $C$ that predicts the diagnosis $\hat{y} = C(z)$.
- Bias mitigation branch:
  - A bias discriminator $D$ predicts the sensitive attribute $a$ from $z$ and is trained via an adversarial cross-entropy loss: $\mathcal{L}_{D} = -\,\mathbb{E}_{(x,a)}\big[\log D_{a}(G(x))\big]$.
  - The generator is trained to confuse $D$ so that $z$ encodes minimal information about $a$: $\mathcal{L}_{\mathrm{adv}} = \mathbb{E}_{(x,a)}\big[\log D_{a}(G(x))\big]$, i.e., $G$ maximizes the discriminator's loss.
  - A critical module $F$ is co-trained to predict fairness scores such as Statistical Parity Difference (SPD) for test-time fairness assessment, with its own regression loss: $\mathcal{L}_{F} = \mathbb{E}\big[(F(z) - s)^{2}\big]$, where $s$ is the target fairness score.
- To prevent collapse between $D$ and $F$ (as they may share early layers), orthogonality regularization is introduced by minimizing $\mathcal{L}_{\mathrm{orth}} = \big\| J J^{\top} \odot (\mathbf{1} - I) \big\|_{F}^{2}$, where $J$ stacks the gradients $\nabla_{z}\mathcal{L}_{D}$ and $\nabla_{z}\mathcal{L}_{F}$ of the two branch losses with respect to $z$.
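To make the branch structure concrete, below is a minimal PyTorch sketch of the four modules. The class names, layer sizes, and the seven-class output (as in ISIC 2018) are illustrative assumptions, not the exact architecture of Li et al. (2021).

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Shared feature extractor G: maps an image x to a representation z."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.net(x)

class Classifier(nn.Module):
    """Diagnosis head C: predicts the lesion class from z."""
    def __init__(self, feat_dim: int = 128, n_classes: int = 7):
        super().__init__()
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, z):
        return self.head(z)

class BiasDiscriminator(nn.Module):
    """Adversarial head D: predicts the sensitive attribute a from z."""
    def __init__(self, feat_dim: int = 128, n_groups: int = 2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_groups))

    def forward(self, z):
        return self.head(z)

class CriticalModule(nn.Module):
    """Critical head F: regresses a fairness score (e.g., SPD) from z."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, z):
        return self.head(z)
```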
In experiments, this adversarial setup with orthogonality regularization reduces fairness gaps (measured by SPD, Equal Opportunity Difference (EOD), and Average Odds Difference (AOD)) without degrading overall accuracy. The critical module can also estimate fairness on new data lacking sensitive attribute labels, a capability that is essential for real-world, privacy-constrained deployments.
2. Fairness Metrics and Quantitative Evaluation
Fairness in skin lesion classification is quantified primarily via group fairness metrics that capture disparities in prediction performance between privileged and underprivileged groups. The three principal metrics are:
| Metric | Mathematical Definition | Fairness Interpretation |
|---|---|---|
| SPD | $\lvert P(\hat{y}=1 \mid a=0) - P(\hat{y}=1 \mid a=1) \rvert$ | Zero indicates equal positive outcome rates |
| EOD | $\lvert P(\hat{y}=1 \mid a=0, y=1) - P(\hat{y}=1 \mid a=1, y=1) \rvert$ | Zero indicates equal true positive rates |
| AOD | $\tfrac{1}{2}\big[\lvert \mathrm{FPR}_{a=0} - \mathrm{FPR}_{a=1} \rvert + \lvert \mathrm{TPR}_{a=0} - \mathrm{TPR}_{a=1} \rvert\big]$ | Zero indicates average odds equality |
Experimental evaluation typically involves training on large, diverse datasets (e.g., ISIC 2018), measuring accuracy by sensitive attribute, and reporting baseline (vanilla) versus debiased fairness-aware models. For example, adding orthogonality regularization reduced SPD from 8.3×10⁻² to 2.4×10⁻² (debiasing w.r.t. sex), with similar improvements in EOD and AOD. Importantly, fairness gains do not generally compromise diagnostic accuracy.
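For reference, all three metrics can be computed directly from model predictions. The following self-contained Python sketch uses the standard binary-classification definitions from the table above; the function and variable names (and the convention that $a=0$ denotes the unprivileged group) are our own choices.

```python
import numpy as np

def group_rates(y_true, y_pred, a, group):
    """TPR, FPR, and positive-prediction rate restricted to one group."""
    m = (a == group)
    yt, yp = y_true[m], y_pred[m]
    tpr = yp[yt == 1].mean() if (yt == 1).any() else 0.0
    fpr = yp[yt == 0].mean() if (yt == 0).any() else 0.0
    ppr = yp.mean()
    return tpr, fpr, ppr

def fairness_metrics(y_true, y_pred, a):
    """SPD, EOD, AOD between the unprivileged (a=0) and privileged (a=1) groups."""
    tpr0, fpr0, ppr0 = group_rates(y_true, y_pred, a, 0)
    tpr1, fpr1, ppr1 = group_rates(y_true, y_pred, a, 1)
    spd = abs(ppr0 - ppr1)                             # positive-outcome-rate gap
    eod = abs(tpr0 - tpr1)                             # true-positive-rate gap
    aod = 0.5 * (abs(fpr0 - fpr1) + abs(tpr0 - tpr1))  # mean FPR/TPR gap
    return spd, eod, aod

# Toy example: binary labels/predictions plus a binary sensitive attribute.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])
a      = np.array([0, 0, 0, 1, 1, 1])
print(fairness_metrics(y_true, y_pred, a))
```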
3. Orthogonality Regularization in Multi-Task Fairness
Multi-task fairness approaches that impose orthogonality between the bias detection and fairness estimation branches are critical to prevent mutual interference. Concretely, the model minimizes
$$\mathcal{L}_{\mathrm{orth}} = \big\| J J^{\top} \odot (\mathbf{1} - I) \big\|_{F}^{2},$$
where $J$ stacks the gradients (tangent vectors) of the $D$ and $F$ branch losses with respect to the shared representation $z$; the off-diagonal entries of $J J^{\top}$ are the inner products between the task gradients, so driving them to zero maintains independent, nonoverlapping learning directions for the bias and fairness tasks. This regularization addresses the manifold collapse problem seen in auxiliary task schemes and is empirically shown to enhance both fairness and the predictive reliability of the critical module's outputs.
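A hedged PyTorch sketch of this penalty is shown below; it computes the per-sample inner product between the two task gradients via `torch.autograd.grad` (with `create_graph=True` so the penalty itself can be backpropagated into $G$). The exact normalization in the original work may differ.

```python
import torch

def orthogonality_penalty(loss_d: torch.Tensor,
                          loss_f: torch.Tensor,
                          z: torch.Tensor) -> torch.Tensor:
    """Squared inner product between grad_z(L_D) and grad_z(L_F).

    For two stacked gradients, this equals the off-diagonal Frobenius
    penalty on J J^T up to a constant factor of 2. `z` must be a non-leaf
    tensor produced by the generator so it is part of the autograd graph.
    """
    g_d, = torch.autograd.grad(loss_d, z, create_graph=True, retain_graph=True)
    g_f, = torch.autograd.grad(loss_f, z, create_graph=True, retain_graph=True)
    g_d, g_f = g_d.flatten(1), g_f.flatten(1)   # (batch, dim)
    dot = (g_d * g_f).sum(dim=1)                # <grad L_D, grad L_F> per sample
    return (dot ** 2).mean()                    # zero iff the tangents are orthogonal
```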
4. General Architectural and Implementation Considerations
The adversarial fairness architecture involves several practical choices:
- Feature extraction: Use of a shared backbone for both diagnosis and auxiliary branches.
- Adversarial optimization: discriminator and generator are optimized with alternating updates or a gradient reversal strategy (see the sketch after this list).
- Critical module update: when updating the critical module $F$, freeze the generator $G$; when updating $G$, include both the adversarial and classification losses.
- Orthogonality regularization: Gradient calculations and Jacobian construction are computationally straightforward with modern deep learning frameworks.
- Resource requirements: The architecture augments a standard classifier by two auxiliary modules and extra gradient computations for orthogonality, which is tractable for modern GPUs.
- Deployment: The critical module allows for ongoing fairness estimation post-deployment, even when sensitive attributes are unavailable due to privacy or regulatory constraints.
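As referenced in the list above, one standard way to realize the adversarial min-max in a single backward pass is a gradient reversal layer (in the style of Ganin and Lempitsky). The sketch below is a generic construction, not code from the source; `lam` is a hypothetical trade-off coefficient.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; scales gradients by -lam on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam: float):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

def grad_reverse(x, lam: float = 1.0):
    return GradReverse.apply(x, lam)

# Usage: route z through the reversal before the bias discriminator, so that
# minimizing the cross-entropy trains D to predict the sensitive attribute
# while simultaneously training G to strip that information from z.
# logits_a = bias_discriminator(grad_reverse(z, lam=0.5))
# loss_adv = F.cross_entropy(logits_a, a)
```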
5. Comparative Analysis and Real-World Implications
Current fairness mitigation approaches, such as adversarial learning with regularization, offer several advantages over naive group rebalancing or post-processing:
- Strong empirical performance: Substantial reductions in fairness gaps on real clinical datasets.
- Minimal accuracy tradeoff: Diagnostic performance is maintained or improved.
- Generalizability: The framework extends to different sensitive attributes, such as sex and skin tone.
- Privacy-aware deployment: Fairness estimation is available without sensitive attribute labels at inference.
Potential limitations include the need for sufficient subgroup-labeled data to supervise adversarial training and the assumption that confusing the sensitive attribute yields fairness, which may not capture subtle data imbalances or inter-attribute dependencies.
6. Broader Methodological Context
The adversarial multi-task training with orthogonality regularization fits within a broader landscape of fairness approaches in medical AI, including but not limited to:
- Bias unlearning with dual-head architectures and gradient reversal layers (Bevan et al., 2022)
- Preprocessing techniques that suppress skin tone information (e.g., EdgeMixup) (Yuan et al., 2022)
- Post hoc pruning of parameters or channels aligned with group-specific accuracy gaps (Wu et al., 2022, Kong et al., 14 May 2024)
- Federated and distributed learning to enhance data heterogeneity and subgroup representation (Fan et al., 2021, Xu et al., 2022)
- Fair representation and disentanglement approaches (contrastive or domain-realignment strategies) (Du et al., 2022, Wang et al., 2023, Sheng et al., 18 Jul 2024)

These methods variously trade off annotation requirements, implementation complexity, resource costs, and interpretability. Adversarial multi-task methods with orthogonality regularization represent a rigorously evaluated, practical solution for bias mitigation in medical image analysis, well suited to regulatory, clinical, and privacy-sensitive environments.
7. Summary Table of Core Components
| Component | Function | Loss Term |
|---|---|---|
| Generator $G$ | Feature extraction, debiased representation | $\mathcal{L}_{\mathrm{cls}} + \lambda_{1}\,\mathcal{L}_{\mathrm{adv}}$ |
| Classifier $C$ | Diagnosis | Cross-entropy $\mathcal{L}_{\mathrm{cls}}$ |
| Bias Discriminator $D$ | Predicts sensitive attribute (adversarial) | Cross-entropy $\mathcal{L}_{D}$ |
| Critical Module $F$ | Fairness score prediction | Regression $\mathcal{L}_{F}$ |
| Orthogonality Regularizer | Enforces task independence | $\mathcal{L}_{\mathrm{orth}}$ |
All losses are jointly minimized (with appropriate adversarial updates), maintaining both fairness and performance.
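Schematically, and under the notation used above, the joint objective can be written as the following min-max problem; the weighting coefficients $\lambda_{i}$ are assumed hyperparameters rather than values from the source:

$$
\min_{G,\,C,\,F}\;\max_{D}\;\;
\mathcal{L}_{\mathrm{cls}}
\;-\;\lambda_{1}\,\mathcal{L}_{D}
\;+\;\lambda_{2}\,\mathcal{L}_{F}
\;+\;\lambda_{3}\,\mathcal{L}_{\mathrm{orth}}
$$

Here $D$ maximizes the objective (i.e., minimizes its own cross-entropy $\mathcal{L}_{D}$), while the remaining modules minimize it, so the generator is pushed to confuse the discriminator.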
This synthesis outlines the foundational elements, technical mechanisms, evaluation metrics, and implementation specifics of fairness algorithms for skin lesion classification, with a focus on adversarial, multi-task, and regularization-based methods and their empirical validation. These frameworks are central to developing equitable, robust, and clinically applicable AI diagnostic tools in dermatology.