Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles (2311.16176v5)

Published 23 Nov 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, in which a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs) to mitigate this form of bias. We show that at particular training intervals, DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features. We leverage this crucial property to generate synthetic counterfactuals that increase model diversity via ensemble disagreement. We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals. We further empirically quantify its efficacy on several diversification objectives, and finally show improved generalization and diversification on par with prior work that relies on auxiliary data collection.
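The core mechanism the abstract describes is training ensemble members to disagree on diffusion-generated counterfactuals so that each member latches onto a different predictive cue. As a rough illustration only, not the paper's actual objective, the snippet below sketches one common form of disagreement score: the mean pairwise distance between ensemble members' predicted class distributions, which a diversification loss would maximize on the counterfactual (unlabeled) samples. The function names here are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def pairwise_disagreement(logits_per_model):
    """Mean pairwise squared-L2 distance between ensemble members'
    predicted distributions on a batch of samples.

    logits_per_model: list of arrays, each of shape (batch, classes).
    Maximizing this quantity on counterfactual samples encourages
    members to commit to different cues.
    """
    probs = [softmax(l) for l in logits_per_model]
    total, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            total += np.mean(np.sum((probs[i] - probs[j]) ** 2, axis=-1))
            pairs += 1
    return total / pairs
```

Two models producing identical logits score zero disagreement, while opposed logits score positively, which is the behavior a diversification term needs; the paper itself evaluates several such objectives, and this L2 form is only one plausible choice.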

