High-dimensional forecasting with known knowns and known unknowns (2401.14582v2)

Published 26 Jan 2024 in econ.EM

Abstract: Forecasts play a central role in decision making under uncertainty. After a brief review of the general issues, this paper considers ways of using high-dimensional data in forecasting. We consider selecting variables from a known active set, known knowns, using Lasso and OCMT, and approximating unobserved latent factors, known unknowns, by various means. This combines both sparse and dense approaches. We demonstrate the various issues involved in variable selection in a high-dimensional setting with an application to forecasting UK inflation at different horizons over the period 2020q1-2023q1. This application shows both the power of parsimonious models and the importance of allowing for global variables.

Summary

The paper considers two variable-selection methods, Lasso and OCMT, for forecasting with high-dimensional economic data.
It combines sparse selection among observed predictors with dense approximations of unobserved latent factors.
An application to UK inflation over 2020q1-2023q1 shows the power of parsimonious models and the importance of allowing for global variables.

High-Dimensional Forecasting with Known Knowns and Known Unknowns

This paper, by Pesaran and Smith, examines how high-dimensional data can be used in forecasting, with an application to predicting UK inflation. It draws on both sparse and dense modeling approaches, with the aim of improving forecast accuracy when many candidate predictors, domestic and global, are available.

The paper distinguishes between "known knowns" and "known unknowns" to frame the high-dimensional forecasting problem. In this context, "known knowns" refer to a pre-defined set of variables from which relevant predictors are selected through techniques like Lasso and OCMT, whereas "known unknowns" involve the challenge of approximating latent factors not directly observed but inferred using dense methods such as principal components.

A central part of the paper is its treatment of two selection methods, Lasso and OCMT. Lasso adds an L1 penalty to the least-squares objective, which shrinks some coefficients exactly to zero and so discards less informative predictors. Despite its simplicity, Lasso's performance depends heavily on the choice of the tuning (penalty) parameter and on the correlation structure of the covariates, as reflected in its reliance on the Irrepresentable Condition (IRC).
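To make the shrink-to-zero mechanism concrete, here is a minimal sketch of Lasso fitted by cyclic coordinate descent on simulated data. It is an illustration only, not the authors' implementation: the function names, the penalty values `lam`, and the data-generating process are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything in [-lam, lam] becomes zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2T)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    Assumes the columns of X are standardised and y is demeaned."""
    T, n = X.shape
    b = np.zeros(n)
    col_ss = (X ** 2).sum(axis=0) / T
    resid = y - X @ b
    for _ in range(n_iter):
        for j in range(n):
            resid += X[:, j] * b[j]            # partial residual without x_j
            rho = X[:, j] @ resid / T          # covariance of x_j with it
            b[j] = soft_threshold(rho, lam) / col_ss[j]
            resid -= X[:, j] * b[j]            # restore the full residual
    return b

# Simulated data (assumption): 50 candidates, only the first 3 matter.
rng = np.random.default_rng(0)
T, n = 100, 50
X = rng.standard_normal((T, n))
X = (X - X.mean(axis=0)) / X.std(axis=0)
beta_true = np.zeros(n)
beta_true[:3] = [1.0, -1.0, 0.5]
y = X @ beta_true + 0.5 * rng.standard_normal(T)
y = y - y.mean()

b_small = lasso_cd(X, y, lam=0.05)   # light penalty
b_large = lasso_cd(X, y, lam=0.3)    # heavy penalty
print("selected with lam=0.05:", np.flatnonzero(b_small))
print("selected with lam=0.30:", np.flatnonzero(b_large))
```

Comparing the two selected sets shows how sensitive the chosen model is to the penalty, which is the tuning-parameter dependence described above.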

Complementarily, OCMT (One Covariate at a time, Multiple Testing) takes an inferential approach to variable selection: it tests the statistical significance of each covariate individually and so does not rely on the IRC. Its critical values are adjusted for the number of candidate predictors to keep false discoveries in check when that number is large, and it is reported to outperform Lasso particularly when multicollinearity is pronounced.
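The one-at-a-time testing idea can be sketched as follows. This is a simplified, first-stage-only illustration and not the authors' code: the critical value used here, the standard normal quantile at 1 - p/(2 n^delta), is an assumption about the usual form of the multiple-testing threshold, and `p`, `delta`, the function name, and the simulated data are all chosen for the example.

```python
import numpy as np
from statistics import NormalDist

def ocmt_first_stage(X, y, p=0.05, delta=1.0):
    """Regress y on a constant and each covariate separately; keep the
    covariates whose |t-ratio| exceeds a threshold that rises with n."""
    T, n = X.shape
    # Assumed threshold: normal quantile adjusted for n candidate tests.
    crit = NormalDist().inv_cdf(1.0 - p / (2.0 * n ** delta))
    t_stats = np.empty(n)
    for j in range(n):
        Z = np.column_stack([np.ones(T), X[:, j]])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        sigma2 = resid @ resid / (T - 2)           # residual variance
        cov = sigma2 * np.linalg.inv(Z.T @ Z)      # OLS covariance matrix
        t_stats[j] = coef[1] / np.sqrt(cov[1, 1])  # t-ratio on x_j
    selected = np.flatnonzero(np.abs(t_stats) > crit)
    return selected, t_stats, crit

# Simulated data (assumption): 50 candidates, the first 2 are true signals.
rng = np.random.default_rng(1)
T, n = 200, 50
X = rng.standard_normal((T, n))
y = 1.0 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_normal(T)

selected, t_stats, crit = ocmt_first_stage(X, y)
print("critical value:", round(crit, 3))
print("selected covariates:", selected)
```

Unlike Lasso, there is no penalty to tune by cross-validation here; the only choices are the nominal size `p` and the exponent `delta` that governs how quickly the threshold rises with the number of candidates.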

The authors also explore GOCMT, an extension of OCMT, which enhances selection procedures by integrating principal components to account for underlying latent structures, thereby marrying the strengths of sparse and dense techniques.
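The dense step, approximating a latent factor by principal components, can be illustrated with a short sketch. It shows only the factor-extraction part on simulated data; how GOCMT combines the estimated components with covariate-by-covariate testing is specified in the paper and is not reproduced here. The function name and the one-factor data-generating process are assumptions.

```python
import numpy as np

def principal_components(X, k):
    """Return the first k principal components of the standardised panel X
    and the share of total variance they account for."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    factors = U[:, :k] * s[:k]                  # T x k component scores
    share = (s[:k] ** 2).sum() / (s ** 2).sum()
    return factors, share

# Simulated panel (assumption): 40 series driven by one unobserved factor.
rng = np.random.default_rng(2)
T, n = 120, 40
f = rng.standard_normal(T)                      # latent factor, not observed
loadings = rng.uniform(0.5, 1.5, n)
X = np.outer(f, loadings) + 0.5 * rng.standard_normal((T, n))

pcs, share = principal_components(X, k=1)
corr = abs(np.corrcoef(pcs[:, 0], f)[0, 1])     # sign of a PC is arbitrary
print("variance share of first PC:", round(share, 3))
print("abs. correlation with true factor:", round(corr, 3))
```

The estimated component could then enter a forecasting regression as an additional conditioning variable alongside any individually selected covariates.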

Empirically, the paper applies these methods to forecasting quarterly UK inflation at several horizons. The analysis underscores the value of including global variables, such as non-UK inflation rates, which have a substantial bearing on domestic conditions. Across methods and periods, ARX models that include both UK and foreign inflation achieve a lower root mean squared forecast error (RMSFE) than simpler autoregressive models.
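An RMSFE comparison of this kind can be sketched with recursive one-step-ahead forecasts. The example below uses simulated series, not the UK data, and the coefficients, sample sizes, and names (`x` standing in for a foreign inflation measure) are assumptions made purely for illustration.

```python
import numpy as np

def rmsfe(errors):
    """Root mean squared forecast error."""
    return float(np.sqrt(np.mean(np.square(errors))))

def recursive_forecast_errors(y, Z, start):
    """One-step-ahead errors from OLS re-estimated on an expanding window.
    Row t of Z holds predictors known before y[t] is observed."""
    errors = []
    for t in range(start, len(y)):
        coef, *_ = np.linalg.lstsq(Z[:t], y[:t], rcond=None)
        errors.append(y[t] - Z[t] @ coef)
    return np.array(errors)

# Simulated data (assumption): y depends on its own lag and on lagged x.
rng = np.random.default_rng(3)
T = 160
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.standard_normal()

target = y[1:]
const = np.ones(T - 1)
Z_ar = np.column_stack([const, y[:-1]])            # AR(1): own lag only
Z_arx = np.column_stack([const, y[:-1], x[:-1]])   # ARX: adds lagged x

start = 100                                        # first forecast origin
rmsfe_ar = rmsfe(recursive_forecast_errors(target, Z_ar, start))
rmsfe_arx = rmsfe(recursive_forecast_errors(target, Z_arx, start))
print("RMSFE AR :", round(rmsfe_ar, 3))
print("RMSFE ARX:", round(rmsfe_arx, 3))
```

Because each forecast uses only data available at the forecast origin, the comparison mimics a real-time exercise; in this simulated setup the ARX gain arises by construction, since lagged `x` is part of the data-generating process.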

The paper highlights the pitfalls of overly complex models and of the overfitting that can occur with Lasso, particularly when selection is conditioned on pre-selected variables. The results emphasize that while machine-learning techniques are powerful, their usefulness depends on the economic structure and the forecasting context in question.

The findings have practical implications for policymakers, suggesting that their decision-support tools should allow for international economic interactions and latent global factors. On the theoretical side, the work calls for careful model specification in high-dimensional settings and for approaches that balance the strengths of sparse and dense modeling techniques.

In conclusion, this paper contributes to econometrics by setting out a framework for high-dimensional forecasting in interconnected economic systems, bringing together traditional econometric methods and more recent machine-learning techniques. As computational power grows and data become more abundant, future research may focus on refining these approaches, particularly in the face of parameter instability and structural economic shifts.

Tweets

https://twitter.com/HPesaran/status/1755421123318329803

https://twitter.com/Chaay/status/1783207639994019952

https://twitter.com/HPesaran/status/1776231603125240293

https://twitter.com/eBlogs/status/1751808210921538025

https://twitter.com/CapivaraMarket/status/1751944582105796980