Progression: an extrapolation principle for regression (2410.23246v1)
Abstract: The problem of regression extrapolation, or out-of-distribution generalization, arises when predictions are required at test points outside the range of the training data. In such cases, the non-parametric guarantees for regression methods from both statistics and machine learning typically fail. Based on the theory of tail dependence, we propose a novel statistical extrapolation principle. After a suitable, data-adaptive marginal transformation, it assumes a simple relationship between predictors and the response at the boundary of the training predictor samples. This assumption holds for a wide range of models, including non-parametric regression functions with additive noise. Our semi-parametric method, progression, leverages this extrapolation principle and offers guarantees on the approximation error beyond the training data range. We demonstrate how this principle can be effectively integrated with existing approaches, such as random forests and additive models, to improve extrapolation performance on out-of-distribution samples.
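As a rough illustration of the boundary-extrapolation idea sketched in the abstract (not the paper's actual progression estimator), the following Python snippet fits a random forest inside the training range and, beyond the boundary of the training predictors, continues the fit linearly with a slope estimated from the upper tail of the training sample. The function name, the `frac` tail fraction, and the linear boundary rule are all illustrative assumptions, and the paper's data-adaptive marginal transformation is omitted here.

```python
# Minimal sketch: in-range predictions come from a standard random forest;
# out-of-range predictions are extended from the training boundary using a
# slope estimated on the upper tail of the training predictors. This is an
# assumed stand-in for the paper's boundary relationship, not its method.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Training data: x in [0, 5], nonlinear signal with additive noise.
x_train = rng.uniform(0.0, 5.0, size=500)
y_train = np.sin(x_train) + 0.5 * x_train + rng.normal(scale=0.2, size=500)

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(x_train.reshape(-1, 1), y_train)

def predict_with_boundary_extension(x_new, frac=0.1):
    """Forest predictions in-range; linear continuation from the boundary
    for points beyond it, with the slope fit on the top `frac` of the
    training predictors (an illustrative tail heuristic)."""
    x_new = np.asarray(x_new, dtype=float)
    lo, hi = x_train.min(), x_train.max()
    # Clip so the forest is only evaluated inside the training range.
    preds = forest.predict(np.clip(x_new, lo, hi).reshape(-1, 1))

    # Estimate a boundary slope from the upper tail of the training sample.
    thresh = np.quantile(x_train, 1.0 - frac)
    tail = x_train >= thresh
    slope = np.polyfit(x_train[tail], y_train[tail], deg=1)[0]

    # Extend linearly past the upper boundary of the training predictors.
    beyond = x_new > hi
    preds[beyond] = forest.predict([[hi]])[0] + slope * (x_new[beyond] - hi)
    return preds

x_test = np.linspace(0.0, 8.0, 9)  # extends past the training range [0, 5]
print(predict_with_boundary_extension(x_test))
```

In the paper itself, the relationship at the boundary is posited after a suitable marginal transformation and comes with approximation-error guarantees; the linear continuation above is only meant to convey the general shape of combining an in-range learner with a simple extrapolation rule.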