Discover and Mitigate Multiple Biased Subgroups in Image Classifiers (2403.12777v2)
Abstract: Machine learning models can perform well on in-distribution data but often fail on biased subgroups that are underrepresented in the training data, hindering the robustness of models for reliable applications. Such subgroups are typically unknown due to the absence of subgroup labels. Discovering biased subgroups is key to understanding models' failure modes and further improving their robustness. Most previous work on subgroup discovery implicitly assumes that models underperform on only a single biased subgroup, which does not hold for in-the-wild data where multiple biased subgroups exist. In this work, we propose Decomposition, Interpretation, and Mitigation (DIM), a novel method that addresses the more challenging but also more practical problem of discovering multiple biased subgroups in image classifiers. Our approach decomposes image features into multiple components, each representing a subgroup. This decomposition is achieved via a bilinear dimension reduction method, Partial Least Squares (PLS), guided by supervision from the image classifier. We further interpret the semantic meaning of each subgroup component by generating natural language descriptions using vision-language foundation models. Finally, DIM mitigates multiple biased subgroups simultaneously via two complementary strategies: data-centric and model-centric. Extensive experiments on the CIFAR-100 and Breeds datasets demonstrate the effectiveness of DIM in discovering and mitigating multiple biased subgroups. Furthermore, DIM uncovers the failure modes of the classifier on Hard ImageNet, showcasing its broader applicability to understanding model bias in image classifiers. The code is available at https://github.com/ZhangAIPI/DIM.
Authors: Zeliang Zhang, Mingqian Feng, Zhiheng Li, Chenliang Xu