Out-of-distribution robustness for multivariate analysis via causal regularisation (2403.01865v3)
Abstract: We propose a causality-based regularisation strategy for classical machine learning algorithms that ensures robustness against distribution shifts. Building upon the anchor regression framework, we demonstrate how incorporating a straightforward regularisation term into the loss function of classical multivariate analysis algorithms, such as (orthonormalised) partial least squares, reduced-rank regression, and multiple linear regression, enables out-of-distribution generalisation. Our framework allows users to efficiently verify whether a loss function is compatible with the regularisation strategy. We provide estimators for selected algorithms and show their consistency and efficacy on synthetic and real-world climate science problems. The empirical validation highlights the versatility of anchor regularisation, emphasising its compatibility with multivariate analysis approaches and its role in enhancing replicability while guarding against distribution shifts. The extended anchor framework advances causal inference methodologies, addressing the need for reliable out-of-distribution generalisation.
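To make the idea of "a regularisation term added to the loss" concrete, the following is a minimal sketch of anchor-regularised multiple linear regression with a multivariate response. It is an illustration, not the estimators of the paper. It assumes the standard anchor regression objective, ||(I - P_A)(Y - XB)||² + γ ||P_A(Y - XB)||², where P_A is the projection onto the column space of the anchor variables A, and it assumes centred data with no intercept. The function names (`anchor_transform`, `anchor_mlr`, `anchor_loss`) are hypothetical and chosen only for this example.

```python
import numpy as np


def anchor_transform(A, gamma):
    """Return W = I - (1 - sqrt(gamma)) * P_A.

    P_A is the orthogonal projection onto the column space of the anchors A.
    Since W r = (I - P_A) r + sqrt(gamma) * P_A r and the two parts are
    orthogonal, ||W r||^2 equals the anchor objective for a residual r.
    """
    n = A.shape[0]
    P_A = A @ np.linalg.pinv(A)
    return np.eye(n) - (1.0 - np.sqrt(gamma)) * P_A


def anchor_mlr(X, Y, A, gamma):
    """Anchor-regularised least squares: ordinary least squares on (W X, W Y).

    Assumption: X, Y and A are centred column-wise; no intercept is fitted.
    gamma = 1 recovers ordinary least squares.
    """
    W = anchor_transform(A, gamma)
    B, *_ = np.linalg.lstsq(W @ X, W @ Y, rcond=None)
    return B


def anchor_loss(X, Y, A, B, gamma):
    """Evaluate the anchor objective (averaged over samples) at coefficients B."""
    R = Y - X @ B                      # residuals, shape (n, q)
    PR = A @ (np.linalg.pinv(A) @ R)   # part of the residuals explained by A
    n = X.shape[0]
    return (np.sum((R - PR) ** 2) + gamma * np.sum(PR ** 2)) / n


if __name__ == "__main__":
    # Synthetic data (purely illustrative): anchors A shift X, and a hidden
    # confounder H affects both X and Y.
    rng = np.random.default_rng(0)
    n = 500
    A = rng.normal(size=(n, 2))
    H = rng.normal(size=(n, 1))
    X = A @ rng.normal(size=(2, 3)) + H + rng.normal(size=(n, 3))
    Y = X @ rng.normal(size=(3, 2)) + H + rng.normal(size=(n, 2))
    for gamma in (1.0, 5.0, 50.0):
        B = anchor_mlr(X, Y, A, gamma)
        print(gamma, anchor_loss(X, Y, A, B, gamma))
```

Larger values of γ penalise residual variation that the anchors can explain more heavily, which is the mechanism the anchor framework uses to trade in-distribution fit for robustness to shifts. Writing the penalty as a linear transformation of the data is a convenient design choice here because it lets the regularised problem reuse an ordinary least-squares solver.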