Enhancing Distributional Stability among Sub-populations (2206.02990v2)
Abstract: Enhancing the stability of machine learning algorithms under distributional shifts is at the heart of the Out-of-Distribution (OOD) Generalization problem. Derived from causal learning, recent works on invariant learning pursue strict invariance across multiple training environments. Although intuitively reasonable, learning the strict invariance property requires strong assumptions on the availability and quality of the training environments. In this work, we introduce the ``distributional stability'' notion to mitigate such limitations. It quantifies the stability of prediction mechanisms among sub-populations down to a prescribed scale. Based on this notion, we propose a learnability assumption and derive a generalization error bound under distribution shifts. Inspired by these theoretical analyses, we propose the stable risk minimization (SRM) algorithm to enhance the model's stability w.r.t. shifts in prediction mechanisms ($Y|X$-shifts). Experimental results are consistent with our intuition and validate the effectiveness of our algorithm. The code can be found at https://github.com/LJSthu/SRM.
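The ``prescribed scale'' above can be read, in the spirit of sub-population distributionally robust optimization, as controlling the worst-case risk over all sub-populations that carry at least an $\alpha$-fraction of the data: $\sup\{\mathbb{E}_Q[\ell(f(X), Y)] : \alpha Q \le P\}$. The PyTorch sketch below is only a hedged illustration of this generic objective, not the paper's actual SRM loss (see the linked repository for that); the function name `stable_risk` and the parameter `alpha` are our own. On an empirical sample, the worst sub-population of mass $\alpha$ is simply the $\alpha$-fraction of points with the largest losses, so the objective reduces to a CVaR-style top-$k$ average.

```python
import math
import torch

def stable_risk(per_sample_loss: torch.Tensor, alpha: float) -> torch.Tensor:
    """Empirical worst-case risk over sub-populations of mass >= alpha.

    Averages the largest ceil(alpha * n) per-sample losses (CVaR at
    level alpha). Illustrative sketch only -- not the SRM objective
    from the paper.
    """
    n = per_sample_loss.numel()
    k = max(1, math.ceil(alpha * n))  # size of the worst sub-population
    worst, _ = torch.topk(per_sample_loss, k)
    return worst.mean()

# Hypothetical training step (model, x, y assumed to exist):
# losses = torch.nn.functional.cross_entropy(model(x), y, reduction="none")
# loss = stable_risk(losses, alpha=0.2)  # guard the worst 20% of samples
# loss.backward()
```

Shrinking `alpha` toward 0 makes the objective more conservative (it focuses on ever-smaller worst-off sub-populations), while `alpha = 1` recovers ordinary empirical risk minimization; the paper's notion of stability ``down to a prescribed scale'' corresponds to choosing how small a sub-population one is willing to protect.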