Dimension Reduction and MARS (2302.05790v2)
Abstract: The multivariate adaptive regression spline (MARS) is a popular estimation method for nonparametric multivariate regression. However, because MARS is built on marginal splines, incorporating interactions among covariates requires products of the marginal splines, which leads to an unmanageable number of basis functions when the order of interaction is high and results in low estimation efficiency. In this paper, we improve the performance of MARS by using linear combinations of the covariates that achieve sufficient dimension reduction. The special basis functions of MARS facilitate calculation of the gradients of the regression function, and the linear combinations are estimated via eigen-analysis of the outer product of the gradients. Under some technical conditions, the asymptotic theory is established for the proposed estimation method. Numerical studies, including both simulations and empirical applications, show its effectiveness in dimension reduction and its improvement over MARS and other commonly used nonparametric methods in regression estimation and prediction.
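To make the gradient and eigen-analysis step in the abstract concrete, the following is a minimal sketch and not the authors' estimator. It assumes a simplified stand-in for MARS: mirrored hinge functions at fixed quantile knots with all pairwise products, fitted by ordinary least squares, with no adaptive forward or backward selection. The function names (`hinge_features`, `design_and_gradients`, `opg_directions`) and the simulated single-index model are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def hinge_features(X, knots):
    """Hinge values H and their derivatives D, each of shape (n, p, 2K).

    For covariate j and knot t the mirrored pair is max(0, x_j - t) and
    max(0, t - x_j); the derivatives are 1{x_j > t} and -1{x_j < t}.
    """
    diff = X[:, :, None] - knots[None, :, :]  # (n, p, K)
    H = np.concatenate([np.maximum(diff, 0.0), np.maximum(-diff, 0.0)], axis=2)
    D = np.concatenate([(diff > 0).astype(float), -(diff < 0).astype(float)], axis=2)
    return H, D


def design_and_gradients(X, knots):
    """Design matrix (n, q) and gradient of each basis function (n, p, q)."""
    n, p = X.shape
    H, D = hinge_features(X, knots)
    m = H.shape[2]
    cols, grads = [np.ones(n)], [np.zeros((n, p))]  # intercept has zero gradient
    for j in range(p):  # main-effect hinge functions
        for a in range(m):
            g = np.zeros((n, p))
            g[:, j] = D[:, j, a]
            cols.append(H[:, j, a])
            grads.append(g)
    for j, k in itertools.combinations(range(p), 2):  # pairwise products
        for a in range(m):
            for b in range(m):
                g = np.zeros((n, p))
                g[:, j] = D[:, j, a] * H[:, k, b]  # product rule, x_j part
                g[:, k] = H[:, j, a] * D[:, k, b]  # product rule, x_k part
                cols.append(H[:, j, a] * H[:, k, b])
                grads.append(g)
    return np.column_stack(cols), np.stack(grads, axis=2)


def opg_directions(X, y, n_knots=3, d=1):
    """Leading d eigenvectors of the averaged outer product of fitted gradients."""
    qs = np.linspace(0, 1, n_knots + 2)[1:-1]
    knots = np.quantile(X, qs, axis=0).T  # (p, K): interior quantile knots
    B, G = design_and_gradients(X, knots)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)  # min-norm least squares
    grad = G @ coef  # (n, p): fitted gradient at each sample point
    M = grad.T @ grad / len(y)  # outer-product-of-gradients matrix
    eigval, eigvec = np.linalg.eigh(M)
    order = np.argsort(eigval)[::-1]  # sort eigenvalues in descending order
    return eigvec[:, order[:d]], eigval[order]


# Simulated single-index model (an assumption made for this illustration).
rng = np.random.default_rng(0)
n, p = 2000, 4
X = rng.uniform(-1, 1, size=(n, p))
b = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2)
y = (X @ b) ** 2 + 0.1 * rng.standard_normal(n)

B_hat, eigval = opg_directions(X, y)
cosine = abs(B_hat[:, 0] @ b)  # |cos| of the angle to the true direction
print(round(float(cosine), 3))
```

Because each hinge function is piecewise linear, its derivative is an indicator, so the fitted gradient is available in closed form from the least-squares coefficients. The leading eigenvector of the averaged outer product then serves as the estimated direction, and a lower-dimensional spline fit could be run on the projected covariates `X @ B_hat`.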