Functional Mixtures-of-Experts (2202.02249v2)
Abstract: We consider the statistical analysis of heterogeneous data for prediction in situations where the observations include functions, typically time series. We extend Mixtures-of-Experts (ME) modeling, a framework of choice for modeling heterogeneity in data for prediction with vectorial observations, to this functional data analysis context. We first present a new family of ME models, named functional ME (FME), in which the predictors are potentially noisy observations of entire functions. Furthermore, the data-generating process of the predictor and the real-valued response is governed by a hidden discrete variable representing an unknown partition. Second, by imposing sparsity on derivatives of the underlying functional parameters via Lasso-like regularizations, we provide sparse and interpretable functional representations of the FME models, called iFME. We develop dedicated expectation-maximization algorithms with Lasso-like regularization (EM-Lasso) for regularized maximum-likelihood parameter estimation to fit the models. The proposed models and algorithms are studied in simulated scenarios and in applications to two real data sets, and the results demonstrate their performance in accurately capturing complex nonlinear relationships and in clustering heterogeneous regression data.
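To make the idea of a mixture of experts with a functional predictor concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a softmax gating network and linear experts whose scores are an intercept plus the integral of X(t) times a coefficient function, with the integral approximated by the trapezoidal rule on a sampling grid. These are common choices for ME models, but the abstract does not fix them. All names (`integrate`, `gating_probabilities`, `predict_mean`, `alpha0`, `alpha_funcs`, `beta0`, `beta_funcs`) are hypothetical, and no EM-Lasso fitting is shown.

```python
import math


def integrate(f_vals, g_vals, grid):
    """Trapezoidal approximation of the integral of f(t) * g(t) over the grid."""
    total = 0.0
    for j in range(len(grid) - 1):
        left = f_vals[j] * g_vals[j]
        right = f_vals[j + 1] * g_vals[j + 1]
        total += 0.5 * (left + right) * (grid[j + 1] - grid[j])
    return total


def gating_probabilities(x_vals, grid, alpha0, alpha_funcs):
    """Softmax gate; the score of expert k is alpha0[k] + integral of X(t) * alpha_k(t).

    Assumed form, for illustration only.
    """
    scores = [a0 + integrate(x_vals, a, grid) for a0, a in zip(alpha0, alpha_funcs)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    return [w / norm for w in weights]


def predict_mean(x_vals, grid, alpha0, alpha_funcs, beta0, beta_funcs):
    """Predicted response: gate-weighted average of the expert means."""
    probs = gating_probabilities(x_vals, grid, alpha0, alpha_funcs)
    means = [b0 + integrate(x_vals, b, grid) for b0, b in zip(beta0, beta_funcs)]
    return sum(p * m for p, m in zip(probs, means))


# Toy usage with two experts and a curve sampled at 11 points on [0, 1].
grid = [i / 10 for i in range(11)]
x_vals = [math.sin(2 * math.pi * t) for t in grid]
alpha0 = [0.0, 0.5]
alpha_funcs = [[0.0] * len(grid), [t for t in grid]]
beta0 = [1.0, -1.0]
beta_funcs = [[2.0] * len(grid), [-t for t in grid]]

probs = gating_probabilities(x_vals, grid, alpha0, alpha_funcs)
y_hat = predict_mean(x_vals, grid, alpha0, alpha_funcs, beta0, beta_funcs)
```

In this sketch the hidden partition mentioned in the abstract corresponds to which expert generates the response, and `probs` plays the role of the predictor-dependent membership probabilities. A sparsity-inducing penalty on the derivatives of the coefficient functions would act on differences of neighboring entries of `alpha_funcs` and `beta_funcs` in this discretized setting.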