Impact of multiple interacting confounders on correction quality

Determine how the presence of multiple interacting confounders in image-classification datasets affects the correction quality of the bias-mitigation methods evaluated in this study, under the data-scarce, highly imbalanced subgroup settings it considers. The methods are Counterfactual Knowledge Distillation (CFKD), Right-Reason ClArC (RR-ClArC), Projective ClArC (P-ClArC), Deep Feature Reweighting (DFR), and Group Distributionally Robust Optimization (Group DRO).

Background

The study systematically compares XAI-based and non-XAI-based methods for mitigating Clever Hans behavior in image classifiers across synthetic and real-world datasets. Each dataset is constructed or selected to contain a single confounding feature that is spuriously correlated with the class labels, and each exhibits severe subgroup imbalance and limited validation coverage. Performance is assessed via average group accuracy (AGA) and worst group accuracy (WGA) on held-out test sets.
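
To make the two metrics concrete, here is a minimal sketch. It assumes a group is a (class label, confounder value) pair, that AGA is the unweighted mean of the per-group accuracies, and that WGA is their minimum. The function names `group_accuracies` and `aga_wga`, the array layout, and the toy data are hypothetical and not taken from the study.

```python
import numpy as np

def group_accuracies(y_true, y_pred, confounder):
    """Accuracy per (class, confounder) group; empty groups are skipped."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    confounder = np.asarray(confounder)
    correct = y_true == y_pred
    accs = {}
    for label in np.unique(y_true):
        for flag in np.unique(confounder):
            mask = (y_true == label) & (confounder == flag)
            if mask.any():
                accs[(int(label), int(flag))] = float(correct[mask].mean())
    return accs

def aga_wga(accs):
    """Assumed definitions: AGA = mean over groups, WGA = min over groups."""
    values = list(accs.values())
    return float(np.mean(values)), float(np.min(values))

# Toy data: the confounder is present (1) in most class-0 samples and
# absent (0) in most class-1 samples. The predictions follow the
# confounder rather than the class, i.e. a Clever Hans classifier.
y_true     = [0, 0, 0, 0, 1, 1, 1, 1]
confounder = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred     = [0, 0, 0, 1, 1, 1, 1, 0]

accs = group_accuracies(y_true, y_pred, confounder)
aga, wga = aga_wga(accs)
print(aga, wga)  # 0.5 0.0
```

In this toy case the overall accuracy is 6/8 = 0.75, while the two minority groups are classified entirely wrong. This gap is what the group-wise metrics are meant to expose.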

While CFKD and RR-ClArC often outperform non-XAI baselines, the experiments are constrained to a single confounder per dataset. Real-world scenarios commonly involve multiple, potentially interacting confounders (e.g., combined artifacts, demographic attributes, and acquisition conditions). The authors explicitly note that the effect of such multi-confounder settings on the correction quality of the evaluated methods remains unresolved.
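
One reason the multi-confounder setting is not a trivial extension can be sketched by counting subgroups. Assuming binary confounders and groups defined as (class, confounder-state) combinations, the number of groups doubles with each added confounder. The helper `enumerate_groups` below is hypothetical and illustrates only this counting argument, not any procedure from the study.

```python
from itertools import product

def enumerate_groups(num_classes, num_confounders):
    """List every (class, confounder-state) group for binary confounders."""
    # Materialize the states so they can be reused for every class.
    states = list(product((0, 1), repeat=num_confounders))
    return [(c, state) for c in range(num_classes) for state in states]

# Two classes with one, two, and three binary confounders.
for k in (1, 2, 3):
    groups = enumerate_groups(num_classes=2, num_confounders=k)
    print(k, len(groups))
```

With two classes this gives 4, 8, and 16 groups for one, two, and three confounders. Under a fixed, already scarce validation set, each group therefore receives fewer samples on average, and WGA is taken over more groups.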

References

Finally, each dataset contained only one confounding factor, whereas real-world scenarios often involve multiple interacting confounders. It remains an open question how this would have affected correction quality.