
Simple and Fast Group Robustness by Automatic Feature Reweighting (2306.11074v1)

Published 19 Jun 2023 in cs.LG and stat.ML

Abstract: A major challenge to out-of-distribution generalization is reliance on spurious features -- patterns that are predictive of the class label in the training data distribution, but not causally related to the target. Standard methods for reducing the reliance on spurious features typically assume that we know what the spurious feature is, which is rarely true in the real world. Methods that attempt to alleviate this limitation are complex, hard to tune, and lead to a significant computational overhead compared to standard training. In this paper, we propose Automatic Feature Reweighting (AFR), an extremely simple and fast method for updating the model to reduce the reliance on spurious features. AFR retrains the last layer of a standard ERM-trained base model with a weighted loss that emphasizes the examples where the ERM model predicts poorly, automatically upweighting the minority group without group labels. With this simple procedure, we improve upon the best reported results among competing methods trained without spurious attributes on several vision and natural language classification benchmarks, using only a fraction of their compute.
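The procedure described in the abstract can be sketched in plain NumPy: compute per-example weights from the ERM model's predicted probability on the true class (so poorly predicted, likely minority-group examples get upweighted), then retrain only the last layer on frozen features with a weighted loss. The exponential weighting form, the temperature `gamma`, and the helper names `afr_reweight` and `retrain_last_layer` are illustrative assumptions, not the paper's exact formulation; a binary logistic last layer stands in for the general classifier head.

```python
import numpy as np

def afr_reweight(p_true, gamma=2.0):
    """Per-example weights from the ERM model's probability on the
    true class. Examples the base model predicts poorly (low p_true)
    receive exponentially larger weight. `gamma` is an assumed
    temperature hyperparameter; weights are normalized to sum to 1."""
    w = np.exp(-gamma * np.asarray(p_true, dtype=float))
    return w / w.sum()

def retrain_last_layer(feats, y, weights, lr=0.1, steps=300):
    """Retrain a binary logistic last layer on frozen features with a
    weighted cross-entropy loss, via simple gradient descent.
    feats: (n, d) frozen feature matrix; y: (n,) labels in {0, 1};
    weights: (n,) per-example weights (e.g. from afr_reweight)."""
    n, d = feats.shape
    w_vec = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = feats @ w_vec + b
        p = 1.0 / (1.0 + np.exp(-logits))   # sigmoid predictions
        grad = weights * (p - y)            # weighted CE gradient per example
        w_vec -= lr * (feats.T @ grad)
        b -= lr * grad.sum()
    return w_vec, b
```

In practice the features would come from the penultimate layer of the ERM-trained network, and the ERM model's own validation predictions would supply `p_true`; no group labels are needed at any step, which is the point of the method.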
