Universality of training trajectories in high-dimensional logistic regression
Investigate whether the optimization trajectories (e.g., gradient-descent learning curves and convergence behavior) for penalized logistic regression in the proportional high-dimensional regime are universal across input data distributions, including heavy-tailed and uniform cases, under the data augmentation settings analyzed in the paper. Specifically, ascertain whether differences in required learning rates and non-convergent behavior observed for certain distributions indicate a lack of universality in training trajectories despite the proven universality of global minima.
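The comparison described above can be sketched numerically. The following is a minimal illustration, not the paper's actual experimental pipeline: it runs full-batch gradient descent on ridge-penalized logistic regression in a proportional regime ($n = 400$, $d = 200$) and records the training-loss trajectory for Gaussian versus heavy-tailed (Student-$t$, 3 degrees of freedom) input matrices. All names, dimensions, and hyperparameters (`lam`, `lr`, `steps`) are illustrative choices, not values from the paper.

```python
import numpy as np

def loss_and_grad(w, X, y, lam):
    """Penalized logistic loss: mean log(1 + exp(-y x^T w)) + (lam/2)||w||^2."""
    z = X @ w
    # Numerically stable sigmoid of y*z via tanh.
    p = 0.5 * (1.0 + np.tanh(0.5 * y * z))
    loss = np.mean(np.logaddexp(0.0, -y * z)) + 0.5 * lam * w @ w
    grad = -(X.T @ (y * (1.0 - p))) / len(y) + lam * w
    return loss, grad

def gd_trajectory(X, y, lam=0.1, lr=0.05, steps=1000):
    """Full-batch gradient descent; returns the training-loss curve."""
    w = np.zeros(X.shape[1])
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(w, X, y, lam)
        losses.append(loss)
        w -= lr * grad
    return np.array(losses)

rng = np.random.default_rng(0)
n, d = 400, 200  # proportional regime: d/n = 0.5 held fixed
w_star = rng.normal(size=d) / np.sqrt(d)

def make_data(dist):
    """Labels from a logistic model; covariates Gaussian or heavy-tailed."""
    if dist == "gaussian":
        X = rng.normal(size=(n, d))
    else:  # Student-t with 3 degrees of freedom: heavy-tailed entries
        X = rng.standard_t(df=3, size=(n, d))
    p = 0.5 * (1.0 + np.tanh(0.5 * (X @ w_star)))
    y = np.where(rng.random(n) < p, 1.0, -1.0)
    return X, y

curves = {dist: gd_trajectory(*make_data(dist))
          for dist in ("gaussian", "student_t")}
for dist, losses in curves.items():
    print(f"{dist}: initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Overlaying the two curves (e.g. with matplotlib) is one way to probe trajectory universality: if the curves differ markedly at a common learning rate, or if the heavy-tailed case requires a smaller `lr` to converge at all, that is consistent with the non-universality conjectured above even though the global minima coincide.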
References
We find that for these three setups, ${\rm LR}=0.1$ does not lead to convergence within $10^5$ steps. We conjecture that this arises due to the lack of universality of the training trajectories, as illustrated in \Cref{fig:train:trajectory} and as discussed towards the end of \Cref{sec:DA}.