Invariant Subspace Decomposition
Abstract: We consider the task of predicting a response Y from a set of covariates X in settings where the conditional distribution of Y given X changes over time. For this to be feasible, assumptions on how the conditional distribution changes over time are required. Existing approaches assume, for example, that changes occur smoothly over time so that short-term prediction using only the recent past becomes feasible. To additionally exploit observations further in the past, we propose a novel invariance-based framework for linear conditionals, called Invariant Subspace Decomposition (ISD), that splits the conditional distribution into a time-invariant and a residual time-dependent component. As we show, this decomposition can be utilized both for zero-shot and time-adaptation prediction tasks, that is, settings where no training data or only a small amount of training data, respectively, is available at the time points at which we want to predict Y. We propose a practical estimation procedure, which automatically infers the decomposition using tools from approximate joint matrix diagonalization. Furthermore, we provide finite sample guarantees for the proposed estimator and demonstrate empirically that it indeed improves on approaches that do not use the additional invariant structure.
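To make the idea of a time-invariant plus residual split concrete, the following is a minimal toy sketch in Python, not the estimator proposed in the paper. It assumes a linear model with time-varying coefficients `gammas[t]`, uses population covariances that share an exact common eigenbasis (so exact joint diagonalization via one eigendecomposition stands in for approximate joint diagonalization), and treats a direction as invariant when the coefficient's coordinate along it does not change over time. All names (`U`, `covs`, `gammas`, `beta_inv`, `delta_res`) and the tolerance are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, T = 3, 5  # number of covariates, number of time points (toy sizes)

# Assumed ground truth: an orthogonal basis U shared by all covariances.
U, _ = np.linalg.qr(rng.normal(size=(p, p)))

# Population covariances of X_t: same eigenvectors, time-varying eigenvalues.
eigvals = rng.uniform(0.5, 2.0, size=(T, p))
covs = np.array([U @ np.diag(d) @ U.T for d in eigvals])

# Coefficients gamma_t: constant along the first two basis directions,
# time-varying along the third one.
a = np.array([1.0, -2.0])
b = rng.normal(size=T)
gammas = np.array([U[:, :2] @ a + U[:, 2] * b[t] for t in range(T)])

# Exact joint diagonalization (simplification): the eigenvectors of a generic
# weighted sum of commuting symmetric matrices diagonalize every summand.
weights = rng.uniform(1.0, 2.0, size=T)
_, V = np.linalg.eigh(np.tensordot(weights, covs, axes=1))

# Coordinates of each gamma_t in the recovered basis, shape (T, p).
coords = gammas @ V

# Directions along which the coordinate does not change over time.
invariant = np.ptp(coords, axis=0) < 1e-8

# Time-invariant component and time-dependent residuals.
beta_inv = V[:, invariant] @ coords[0, invariant]
delta_res = gammas - beta_inv

# Zero-shot style prediction for a new covariate vector uses only beta_inv.
x_new = rng.normal(size=p)
y_zero_shot = x_new @ beta_inv
```

In this toy setting the residuals `delta_res[t]` live in a one-dimensional subspace, so adapting to a new time point would only require estimating a single coefficient from the few available observations, which is the intuition behind the time-adaptation task described above.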