Quasi Maximum Likelihood Estimation of High-Dimensional Factor Models: A Critical Review (2303.11777v5)
Abstract: We review Quasi Maximum Likelihood estimation of factor models for high-dimensional panels of time series. We consider two cases: (1) estimation when no dynamic model for the factors is specified (Bai and Li, 2012, 2016); (2) estimation based on the Kalman smoother and the Expectation Maximization algorithm, which allows the factor dynamics to be modeled explicitly (Doz et al., 2012; Barigozzi and Luciani, 2019). Our interest is in approximate factor models, i.e., models in which the idiosyncratic components are allowed to be mildly cross-sectionally, as well as serially, correlated. Although such a setting apparently makes estimation harder, we show that factor models do not in fact suffer from the curse of dimensionality, but instead enjoy a blessing of dimensionality. In particular, given an approximate factor structure, if the cross-sectional dimension of the data, $N$, grows to infinity, we show that: (i) identification of the model is still possible; (ii) the mis-specification error due to the use of an exact factor model log-likelihood vanishes. Moreover, if we also let the sample size, $T$, grow to infinity, we can consistently estimate all parameters of the model and conduct inference. The same is true for estimation of the latent factors, which can be carried out by weighted least squares, linear projection, or Kalman filtering/smoothing. We also compare the approaches presented with Principal Component analysis and with the classical, fixed $N$, exact Maximum Likelihood approach. We conclude with a discussion of the efficiency of the considered estimators.
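As an illustration of the factor-estimation step mentioned in the abstract, the following is a minimal sketch of the weighted least-squares and linear-projection estimators of the factors. It is not the procedure of the paper: it assumes an exact factor model with a diagonal idiosyncratic covariance, it treats the loadings and idiosyncratic variances as known (in practice they would be replaced by Quasi Maximum Likelihood estimates), and the linear projection assumes the factors have identity covariance. All function and variable names are hypothetical.

```python
import numpy as np


def simulate_panel(n_series=200, n_obs=100, n_factors=2, seed=0):
    """Simulate x_t = Lambda f_t + e_t with heteroskedastic, uncorrelated e_t (an assumption)."""
    rng = np.random.default_rng(seed)
    loadings = rng.normal(size=(n_series, n_factors))      # Lambda, N x r
    factors = rng.normal(size=(n_obs, n_factors))          # f_t stacked, T x r
    idio_var = rng.uniform(0.5, 1.5, size=n_series)        # diagonal of Sigma
    noise = rng.normal(size=(n_obs, n_series)) * np.sqrt(idio_var)
    panel = factors @ loadings.T + noise                   # T x N
    return panel, loadings, factors, idio_var


def wls_factors(panel, loadings, idio_var):
    """Weighted least squares: (Lambda' Sigma^-1 Lambda)^-1 Lambda' Sigma^-1 x_t."""
    weighted = loadings / idio_var[:, None]                # Sigma^-1 Lambda
    gram = loadings.T @ weighted                           # Lambda' Sigma^-1 Lambda
    return np.linalg.solve(gram, weighted.T @ panel.T).T   # T x r


def projection_factors(panel, loadings, idio_var):
    """Linear projection with unit factor covariance: (I + Lambda' Sigma^-1 Lambda)^-1 Lambda' Sigma^-1 x_t."""
    weighted = loadings / idio_var[:, None]
    gram = loadings.T @ weighted
    identity = np.eye(gram.shape[0])
    return np.linalg.solve(identity + gram, weighted.T @ panel.T).T


if __name__ == "__main__":
    # The mean squared error of the WLS factor estimate shrinks as N grows.
    for n_series in (20, 200, 2000):
        panel, loadings, factors, idio_var = simulate_panel(n_series=n_series)
        estimate = wls_factors(panel, loadings, idio_var)
        print(n_series, np.mean((estimate - factors) ** 2))
```

The two estimators differ only by the identity matrix added to the weighted Gram matrix, so under these assumptions they coincide asymptotically as $N$ grows, which is one simple way to see the blessing of dimensionality at work.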
- Ahn, S. C. and A. R. Horenstein (2013). Eigenvalue ratio test for the number of factors. Econometrica 81, 1203–1227.
- Using principal component analysis to estimate a high dimensional factor model with high-frequency data. Journal of Econometrics 201, 384–399.
- Akaike, H. (1974). Stochastic theory of minimal realization. IEEE Transactions on Automatic Control 19(6), 667–674.
- Improved penalization for determining the number of factors in approximate static factor models. Statistics and Probability Letters 80, 1806–1813.
- Anchoring the yield curve using survey expectations. Journal of Applied Econometrics 32, 1055–1068.
- The asymptotic distributions of some estimators for a factor analysis model. Journal of Multivariate Analysis 22, 51–64.
- Anderson, B. D. O. and M. Deistler (2008). Generalized linear dynamic factor models-A structure theory. In Proceedings of the 47th IEEE Conference on Decision and Control, pp. 1980–1985.
- Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Dover Publications, Inc.
- Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis. Wiley Series in Probability and Mathematical Statistics.
- The asymptotic normal distribution of estimators in factor analysis under general conditions. The Annals of Statistics 16, 759–771.
- Statistical inference in factor analysis. In Proceedings of the third Berkeley symposium on mathematical statistics and probability, Volume 5, pp. 111–150.
- Andrews, D. W. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817–858.
- Limit theorems for distributions invariant under groups of transformations. The Annals of Statistics 50, 1960–1991.
- Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica 71, 135–171.
- Statistical analysis of factor models of high dimension. The Annals of Statistics 40, 436–465.
- Maximum likelihood estimation and inference for approximate factor models of high dimension. The Review of Economics and Statistics 98, 298–309.
- Efficient estimation of approximate factor models via penalized maximum likelihood. Journal of Econometrics 191, 1–18.
- Determining the number of factors in approximate factor models. Econometrica 70, 191–221.
- Confidence intervals for diffusion index forecasts and inference for factor augmented regressions. Econometrica 74, 1133–1150.
- Principal components estimation and identification of static factors. Journal of Econometrics 176, 18–29.
- Simpler proofs for approximate factor models of large dimensions. Technical Report arXiv:2008.00254.
- Approximate factor models with weaker loadings. Technical Report arXiv:2109.03773.
- Conditional forecasts and scenario analysis with vector autoregressions for large cross-sections. International Journal of Forecasting 31, 739–756.
- Now-casting and the real-time data flow. In Handbook of economic forecasting, Volume 2, pp. 195–237. Elsevier.
- Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. Journal of Applied Econometrics 29, 133–160.
- Barigozzi, M. (2022). On estimation and inference of large approximate dynamic factor models via the principal component analysis. Technical report. arXiv:2211.01921.v3.
- Barigozzi, M. (2023). Asymptotic equivalence of principal component and quasi maximum likelihood estimators in large approximate factor models. Technical report. arXiv:2307.09864.v2.
- Do euro area countries respond asymmetrically to the common monetary policy? Oxford Bulletin of Economics and Statistics 76, 693–714.
- Factoring in the micro: A transaction-level dynamic factor approach to the decomposition of export volatility. LEM working papers 2021/22, Institute of Economics Sant’Anna School of Advanced Studies.
- Generalized dynamic factor models and volatilities: Consistency, rates, and prediction intervals. Journal of Econometrics 216, 4–34.
- Dynamic factor models: A genealogy. In N. Ngoc Thach, N. D. Trung, D. T. Ha, and K. V. (Eds.), Partial Identification in Econometrics and Related Topics. Studies in Systems, Decision and Control. Springer.
- Large-dimensional dynamic factor models: Estimation of impulse-response functions with $I(1)$ cointegrated factors. Journal of Econometrics 221, 455–482.
- EA-MD-QD: Large Euro Area and Euro member countries datasets for macroeconomic research. Technical report. Zenodo, doi/10.5281/zenodo.11093292.
- Quasi maximum likelihood estimation and inference of large approximate dynamic factor models via the EM algorithm. Technical Report arXiv:1910.03821.v4.
- Measuring the output gap using large datasets. The Review of Economics and Statistics 105, 1500–1514.
- Identifying the independent sources of consumption variation. Journal of Applied Econometrics 31, 420–449.
- Multidimensional dynamic factor models. Technical report. arXiv:2301.12499.
- Latent variable models and factor analysis: A unified approach. John Wiley & Sons.
- Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology 28, 97–104.
- Bartlett, M. S. (1938). Methods of estimating mental factors. Nature 141(3570), 609–610.
- Measuring the effects of monetary policy: A Factor-Augmented Vector Autoregressive (FAVAR) approach. The Quarterly Journal of Economics 120, 387–422.
- Bolthausen, E. (1982). On the central limit theorem for stationary mixing random fields. The Annals of Probability 10, 1047–1050.
- Bolthausen, E. (1984). An estimate of the remainder in a combinatorial central limit theorem. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 66, 379–386.
- GLS estimation of dynamic factor models. Journal of the American Statistical Association 106, 1150–1166.
- Chamberlain, G. (1983). Funds, factors, and diversification in arbitrage pricing models. Econometrica 51, 1305–1323.
- Arbitrage, factor structure, and mean-variance analysis on large asset markets. Econometrica 51, 1281–1304.
- Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal 21, C1–C68.
- Choi, I. (2012). Efficient estimation of factor models. Econometric Theory 28, 274–308.
- Cochrane, D. and G. H. Orcutt (1949). Application of least squares regression to relationships containing auto-correlated error terms. Journal of the American Statistical Association 44, 32–61.
- The common and specific components of dynamic volatility. Journal of Econometrics 132, 231–255.
- Unspanned macroeconomic factors in the yield curve. Journal of Business and Economic Statistics 34, 472–485.
- Comparing alternative predictors based on large-panel factor models. Oxford Bulletin of Economics and Statistics 74, 306–326.
- De Jong, P. (1989). Smoothing and interpolation with the state-space model. Journal of the American Statistical Association 84, 1085–1088.
- Forecasting using a large number of predictors: Is bayesian shrinkage a valid alternative to principal components? Journal of Econometrics 146, 318–328.
- Common factors of commodity prices. Journal of Applied Econometrics. forthcoming.
- Efficient matrix approach for classical inference in state space models. Economics Letters 181, 22–27.
- Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 39, 1–38.
- A two-step estimator for large approximate dynamic factor models based on Kalman filtering. Journal of Econometrics 164, 188–205.
- A quasi maximum likelihood approach for large approximate dynamic factor models. The Review of Economics and Statistics 94(4), 1014–1024.
- Duncan, D. B. and S. D. Horn (1972). Linear dynamic recursive estimation from the viewpoint of regression analysis. Journal of the American Statistical Association 67, 815–821.
- Dunsmuir, W. (1979). A central limit theorem for parameter estimation in stationary vector time series and its application to models for a signal observed with noise. The Annals of Statistics 7, 490–506.
- Durbin, J. and S. J. Koopman (2012). Time Series Analysis by State Space Methods. Oxford University Press.
- Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75, 603–680.
- A spectral EM algorithm for dynamic factor models. Journal of Econometrics 205, 249–279.
- The dynamic effects of monetary policy: A structural factor model approach. Journal of Monetary Economics 57, 203–216.
- No news in business cycles. Economic Journal 124, 1168–1191.
- Opening the black box: Structural factor models versus structural VARs. Econometric Theory 25, 1319–1347.
- The Generalized Dynamic Factor Model: Identification and estimation. The Review of Economics and Statistics 82, 540–554.
- The Generalized Dynamic Factor Model: Representation theory. Econometric Theory 17, 1113–1141.
- Freyaldenhoven, S. (2022). Factor models with local factors – determining the number of relevant factors. Journal of Econometrics 229, 80–102.
- A note on statistical analysis of factor models of high dimension. Science China Mathematics 64, 1905–1916.
- Geweke, J. (1977). The dynamic factor analysis of economic time series. pp. 365–383. North-Holland.
- Geweke, J. F. and K. J. Singleton (1981). Latent variable models for time series: A frequency domain approach with an application to the permanent income hypothesis. Journal of Econometrics 17, 287–304.
- Parameter estimation for linear dynamical systems. Technical report, Cambridge University. mimeo.
- Money, credit, monetary policy and the business cycle in the euro area: what has changed since the crisis? International Journal of Central Banking 15, 137–173.
- Exploiting the monthly data flow in structural forecasting. Journal of Monetary Economics 84, 201–215.
- Monetary policy in real time. In M. Gertler and K. Rogoff (Eds.), NBER Macroeconomics Annual 2004. MIT Press.
- Tracking Greenspan: Systematic and nonsystematic monetary policy revisited. Discussion papers 3550, CEPR.
- Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics 55, 665–676.
- Factor models in high-dimensional time series? A time-domain approach. Stochastic Processes and their Applications 123, 2678–2695.
- Hannan, E. J. (1970). Multiple time series. John Wiley & Sons.
- The statistical theory of linear systems. SIAM.
- Harvey, A. C. (1990). Forecasting, structural time series models and the Kalman filter. Cambridge University Press.
- Estimation procedures for structural time series models. Journal of Forecasting 9, 89–108.
- Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24, 417.
- Ibragimov, I. A. (1962). Some limit theorems for stationary processes. Theory of Probability and its Applications 7, 349–382.
- Jolliffe, I. T. (2002). Principal component analysis. Springer.
- Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika 34, 183–202.
- Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347–387.
- Jungbacker, B. and S. J. Koopman (2015). Likelihood-based dynamic factor analysis for measurement and forecasting. Econometrics Journal 18, C1–C21.
- Maximum likelihood estimation for dynamic factor models with missing data. Journal of Economic Dynamics and Control 35, 1358–1368.
- Smooth dynamic factor analysis with application to the US term structure of interest rates. Journal of Applied Econometrics 29, 65–90.
- Speculation in the oil market. Journal of Applied Econometrics 30, 1099–1255.
- Kálmán, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 35–45.
- Factor GARCH-Itô models for high-frequency data with application to large volatility matrix prediction. Journal of Econometrics 208, 395–417.
- Kolmogorov, A. N. (1941). Interpolation und Extrapolation von stationären zufälligen Folgen. Izvestiya Akademii Nauk. Seriya Matematicheskaya 5, 3–14. Translated in “Selected Works of A. N. Kolmogorov”, (1992), A. N. Shiryayev (ed.), Springer.
- Koopman, S. J. and M. van der Wel (2013). Forecasting the US term structure of interest rates using a macroeconomic smooth dynamic factor model. International Journal of Forecasting 29, 676–694.
- Lawley, D. N. (1940). The estimation of factor loadings by the method of maximum likelihood. Proceedings of the Royal Society of Edinburgh 60, 64–82.
- Lawley, D. N. (1942). Further investigations in factor estimation. Proceedings of the Royal Society of Edinburgh Section A: Mathematics 61, 176–185.
- Lawley, D. N. and A. E. Maxwell (1971). Factor Analysis as a Statistical Method. Butterworths, London.
- Theory of point estimation. Springer Science & Business Media.
- Quasi maximum likelihood analysis of high dimensional constrained factor models. Journal of Econometrics 206, 574–612.
- System identification of high-dimensional linear dynamical systems with serially correlated output noise components. IEEE Transactions on Signal Processing 68, 5573–5587.
- A model for daily global stock market returns. Technical Report arXiv:2202.03638.
- Luciani, M. (2015). Monetary policy and the housing market: A structural factor analysis. Journal of Applied Econometrics 30, 199–218.
- Monetary, fiscal and oil shocks: Evidence based on mixed frequency structural FAVARs. Journal of Econometrics 193, 335–348.
- Multivariate analysis.
- A new coincident index of business cycles based on monthly and quarterly series. Journal of Applied Econometrics 18, 427–443.
- The EM algorithm and extensions, Volume 382. John Wiley & Sons.
- Meng, X.-L. and D. B. Rubin (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80, 267–278.
- Meng, X.-L. and D. B. Rubin (1994). On the global and componentwise rates of convergence of the EM algorithm. Linear Algebra and its Applications 199, 413–425.
- Modugno, M. (2013). Now-casting inflation using high frequency data. International Journal of Forecasting 29, 664–675.
- Generalized latent trait models. Psychometrika 65, 391–411.
- Newey, W. K. and K. D. West (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55, 703–708.
- Neyman, J. (1979). $C(\alpha)$ tests and their use. Sankhyā: The Indian Journal of Statistics, Series A 41, 1–21.
- Likelihood inferences for high-dimensional factor analysis of time series with applications in finance. Journal of Computational and Graphical Statistics 24(3), 866–884.
- Constructing high frequency economic indicators by imputation. The Econometrics Journal 27, C1–C30.
- Onatski, A. (2010). Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics 92, 1004–1016.
- Onatski, A. (2012). Asymptotics of the principal components estimator of large factor models with weakly influential factors. Journal of Econometrics 168, 244–258.
- Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2, 559–572.
- Statistical analysis of sparse approximate factor models. Electronic Journal of Statistics 14, 3315–3365.
- More is not always better: Kalman filtering in dynamic factor models. In S. J. Koopman and N. Shephard (Eds.), Unobserved Components and Time Series Econometrics. Oxford Scholarship Online.
- Quah, D. and T. J. Sargent (1993). A dynamic index model for large cross sections. In J. Stock and M. Watson (Eds.), Business cycles, indicators and forecasting, pp. 285–306. University of Chicago Press.
- Rauch, H. (1963). Solutions to the linear smoothing problem. IEEE Transactions on Automatic Control 8, 371–372.
- Reis, R. and M. W. Watson (2010). Relative goods’ prices, pure inflation, and the Phillips correlation. American Economic Journal: Macroeconomics 2, 128–157.
- Rubin, D. B. and D. T. Thayer (1982). EM algorithms for ML factor analysis. Psychometrika 47, 69–76.
- Factor extraction in Dynamic Factor Models: Using Kalman Filter and Principal Components in practice. Foundations and Trends in Econometrics. forthcoming.
- Ruud, P. A. (1991). Extensions of estimation methods using the EM algorithm. Journal of Econometrics 49(3), 305–341.
- Sargent, T. J. and C. A. Sims (1977). Business cycle modeling without pretending to have too much a priori economic theory. In New methods in business cycle research. Federal Reserve Bank of Minneapolis.
- A structure theory for linear dynamic errors-in-variables models. SIAM Journal on Control and Optimization 36, 2148–2175.
- Scott, J. T. (1966). Factor analysis and regression. Econometrica 34, 552–562.
- Shumway, R. H. and D. S. Stoffer (1982). An approach to time series smoothing and forecasting using the EM algorithm. Journal of Time Series Analysis 3, 253–264.
- Solari, M. E. (1969). The “maximum likelihood solution” of the problem of estimating a linear functional relationship. Journal of the Royal Statistical Society: Series B (Methodological) 31, 372–375.
- Spearman, C. (1904). General intelligence objectively determined and measured. American Journal of Psychology 15, 201–293.
- Stock, J. H. and M. W. Watson (1989). New indexes of coincident and leading economic indicators. In O. J. Blanchard and S. Fischer (Eds.), NBER Macroeconomics Annual 1989. MIT Press.
- Stock, J. H. and M. W. Watson (1991). A probability model of the coincident economic indicators. In K. Lahiri and G. Moore (Eds.), Leading Economic Indicators: New Approaches and Forecasting Records, pp. 63–90. Cambridge University Press.
- Stock, J. H. and M. W. Watson (2002a). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association 97, 1167–1179.
- Stock, J. H. and M. W. Watson (2002b). Macroeconomic forecasting using diffusion indexes. Journal of Business and Economic Statistics 20, 147–162.
- Stock, J. H. and M. W. Watson (2016). Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics. In J. B. Taylor and H. Uhlig (Eds.), Handbook of Macroeconomics, Volume 2, pp. 415–525. Elsevier.
- Stone, R. (1945). The analysis of market demand. Journal of the Royal Statistical Society 108, 286–391.
- Sundberg, R. (1974). Maximum likelihood theory for incomplete data from an exponential family. Scandinavian Journal of Statistics 1, 49–58.
- Sundberg, R. (1976). An iterative method for solution of the likelihood equations for incomplete data from exponential families. Communications in Statistics - Simulation and Computation 5, 55–64.
- Sundberg, R. (2019). Statistical modelling by exponential families. Cambridge University Press.
- Exploratory factor analysis—Parameter estimation and scores prediction with high-dimensional data. Journal of Multivariate Analysis 148, 49–59.
- Terada, Y. (2014). Strong consistency of reduced k-means clustering. Scandinavian Journal of Statistics 41, 913–931.
- Thomson, G. H. (1936). Some points of mathematical technique in the factorial analysis of ability. Journal of Educational Psychology 27, 37–54.
- Thomson, G. H. (1951). The factorial analysis of human ability. London University Press.
- Tipping, M. E. and C. M. Bishop (1999). Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61, 611–622.
- Trapani, L. (2018). A randomized sequential procedure to determine the number of factors. Journal of the American Statistical Association 113, 1341–1349.
- Inference in sparsity-induced weak factor models. Journal of Business & Economic Statistics, 1–14.
- van der Vaart, A. W. (2000). Asymptotic Statistics. Cambridge University Press.
- On the penalized maximum likelihood estimation of high-dimensional approximate factor model. Computational Statistics 34, 819–846.
- Watson, M. W. and R. F. Engle (1983). Alternative algorithms for the estimation of dynamic factor, mimic and varying coefficients regression models. Journal of Econometrics 23, 385–400.
- White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48, 817–838.
- Whittle, P. (1952). On principal components and least square methods of factor analysis. Scandinavian Actuarial Journal 1952, 223–239.
- Wiener, N. (1949). Extrapolation, Interpolation, and Smoothing of Stationary Time Series. MIT Press.
- Wu, J. C. F. (1983). On the convergence properties of the EM algorithm. The Annals of Statistics 11, 95–103.
- Young, G. (1940). Maximum likelihood estimation and factor analysis. Psychometrika 6, 49–53.
- Zadrozny, P. A. (2023). Gaussian maximum likelihood estimation of factor models with unrestricted positive definite disturbance covariance matrices. mimeo, Bureau of Labor Statistics.
- Zaffaroni, P. (2019). Factor models for conditional asset pricing. mimeo, Imperial College London.
- Zhang, D. and W. B. Wu (2021). Convergence of covariance and spectral density estimates for high-dimensional locally stationary processes. The Annals of Statistics 49(1), 233–254.