Sharp PCR bounds for random-design linear regression

Develop sharp finite-sample excess risk bounds for principal component regression (PCR) in random-design linear regression under subgaussian covariates and bounded noise variance, to enable rigorous instance-wise comparisons with gradient descent, stochastic gradient descent, and ridge regression.

Background

PCR has been analyzed sharply in fixed-design settings, where dominance results over ridge are known. In random design, the regime central to the source paper's comparisons, tight PCR risk bounds are lacking, which hinders direct comparisons to GD, SGD, and ridge.

The authors note that early-stopped GD is entry-wise close to PCR but emphasize that such proximity does not imply similar risks, especially in high dimensions, making a sharp PCR analysis in random design an open need.
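To make the objects being compared concrete, the following is a minimal numerical sketch, not the authors' setup. It assumes Gaussian covariates (one instance of subgaussian covariates) with a diagonal population covariance. The power-law spectrum, the choice of `w_star`, the noise level, and the tuning values `k`, `lam`, and `iters` are all hypothetical choices made for illustration. Excess risk is taken here as (w - w*)^T Σ (w - w*), without a factor of 1/2.

```python
import numpy as np


def pcr(X, y, k):
    """PCR: least squares restricted to the top-k principal directions of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Coefficients of y in the top-k left singular vectors, rescaled by the
    # singular values and mapped back to the original coordinates.
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])


def ridge(X, y, lam):
    """Ridge estimator solving (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def gd(X, y, step, iters):
    """Full-batch gradient descent on the loss ||Xw - y||^2 / (2n), from zero."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        w -= step * (X.T @ (X @ w - y)) / n
    return w


def excess_risk(w, w_star, cov_eigs):
    """(w - w*)^T Sigma (w - w*) for a diagonal population covariance Sigma."""
    diff = w - w_star
    return float(np.sum(cov_eigs * diff**2))


# Hypothetical problem instance (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, d = 200, 20
cov_eigs = 1.0 / np.arange(1, d + 1) ** 2      # assumed power-law spectrum
w_star = np.ones(d)                            # assumed true parameter
X = rng.standard_normal((n, d)) * np.sqrt(cov_eigs)
y = X @ w_star + 0.5 * rng.standard_normal(n)  # noise with bounded variance

# Step size 1 / lambda_max of the empirical covariance guarantees descent.
step = 1.0 / np.linalg.eigvalsh(X.T @ X / n).max()

w_pcr = pcr(X, y, k=5)
w_ridge = ridge(X, y, lam=1.0)
w_gd = gd(X, y, step=step, iters=50)

for name, w in [("PCR", w_pcr), ("ridge", w_ridge), ("early-stopped GD", w_gd)]:
    print(f"{name}: excess risk = {excess_risk(w, w_star, cov_eigs):.4f}")
```

A single draw like this only illustrates the estimators and the risk functional. It says nothing about sharp bounds, which is the question left open; an instance-wise comparison would need each method's tuning parameter optimized and the risk averaged over many draws.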

References

Another interesting question is how principal component regression (PCR) compares to GD, SGD, and ridge regression for random-design linear regression. While PCR is easy to analyze in the fixed design setting (Dhillon et al., 2013), its sharp bound remains unknown in the more interesting random design setting.

— Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization (2509.17251 - Wu et al., 21 Sep 2025) in Concluding remarks, paragraph “Principal component regression”
