Discovering environments with XRM
Abstract: Environment annotations are essential to the success of many out-of-distribution (OOD) generalization methods. Unfortunately, such annotations are costly to obtain and often limited by human annotators' biases. Robust generalization therefore calls for algorithms that automatically discover environments within datasets. Current proposals, which divide examples based on their training error, suffer from a fundamental problem: they introduce hyper-parameters and early-stopping criteria that require a validation set with human-annotated environments, the very information subject to discovery. In this paper, we propose Cross-Risk Minimization (XRM) to address this issue. XRM trains twin networks, each learning from one random half of the training data while imitating confident held-out mistakes made by its sibling. XRM provides a recipe for hyper-parameter tuning, does not require early stopping, and can discover environments for all training and validation data. Algorithms built on top of XRM environments achieve oracle worst-group accuracy, addressing a long-standing challenge in OOD generalization. Code available at \url{https://github.com/facebookresearch/XRM}.
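The core idea — train twin models on disjoint random halves of the data, then flag each example when the twin that *held it out* misclassifies it — can be sketched in a few lines. The following is a minimal illustration only, using plain logistic regression on a synthetic spurious-correlation dataset; the label-flipping step that XRM applies during training, and all hyper-parameter details, are omitted (the full method is in the linked repository).

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, lr=0.1, steps=500):
    """Plain logistic regression via gradient descent (stand-in for each twin)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))   # sigmoid probabilities
        g = p - y                            # gradient of the logistic loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def predict(params, X):
    w, b = params
    return (X @ w + b > 0).astype(int)

# Toy data: feature 0 is the core signal; feature 1 is a spurious
# shortcut that agrees with the label 90% of the time.
n = 1000
y = rng.integers(0, 2, n)
core = y + 0.3 * rng.standard_normal(n)
spur = np.where(rng.random(n) < 0.9, y, 1 - y) + 0.1 * rng.standard_normal(n)
X = np.stack([core, spur], axis=1)

# Twin split: each model sees only one random half of the training data.
mask_a = np.zeros(n, dtype=bool)
mask_a[rng.permutation(n)[: n // 2]] = True
twin_a = train_logreg(X[mask_a], y[mask_a])
twin_b = train_logreg(X[~mask_a], y[~mask_a])

# Cross-mistakes: an example is flagged when the twin that held it out
# gets it wrong; flagged examples form one inferred environment.
held_out_pred = np.where(mask_a, predict(twin_b, X), predict(twin_a, X))
env = (held_out_pred != y).astype(int)   # 1 = cross-mistake group
print("inferred minority-group fraction:", env.mean())
```

Because each prediction comes from a model that never trained on that example, the flags reflect genuine held-out disagreement rather than memorization, which is what lets the resulting environment labels cover all training and validation points.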