
On the computational and statistical complexity of over-parameterized matrix sensing (2102.02756v1)

Published 27 Jan 2021 in cs.LG and stat.ML

Abstract: We consider solving the low-rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing. If the ground truth signal $\mathbf{X}^* \in \mathbb{R}^{d \times d}$ is of rank $r$, but we try to recover it using $\mathbf{F} \mathbf{F}^\top$ where $\mathbf{F} \in \mathbb{R}^{d \times k}$ and $k>r$, the existing statistical analysis falls short, due to a flat local curvature of the loss function around the global minima. By decomposing the factorized matrix $\mathbf{F}$ into separate column spaces to capture the effect of extra ranks, we show that $\|\mathbf{F}_t \mathbf{F}_t^\top - \mathbf{X}^*\|_{F}^2$ converges to a statistical error of $\tilde{\mathcal{O}}(k d \sigma^2/n)$ after $\tilde{\mathcal{O}}(\frac{\sigma_{r}}{\sigma}\sqrt{\frac{n}{d}})$ iterations, where $\mathbf{F}_t$ is the output of FGD after $t$ iterations, $\sigma^2$ is the variance of the observation noise, $\sigma_{r}$ is the $r$-th largest eigenvalue of $\mathbf{X}^*$, and $n$ is the number of samples. Our results, therefore, offer a comprehensive picture of the statistical and computational complexity of FGD for the over-parameterized matrix sensing problem.
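
To make the setup in the abstract concrete, the following is a minimal sketch of factorized gradient descent on a small synthetic instance. It is not the authors' code. It assumes the standard least-squares objective $f(\mathbf{F}) = \frac{1}{4n}\sum_{i=1}^{n} (\langle \mathbf{A}_i, \mathbf{F}\mathbf{F}^\top \rangle - y_i)^2$ with symmetric Gaussian sensing matrices $\mathbf{A}_i$ and observations $y_i = \langle \mathbf{A}_i, \mathbf{X}^* \rangle + \text{noise}$. The function name `fgd_demo`, the small random initialization, the step size `eta`, and all default dimensions are illustrative assumptions rather than values taken from the paper.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inner(A, B):
    """Trace inner product <A, B> = sum_ij A_ij * B_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def sq_frob_error(F, X_star):
    """Squared Frobenius error ||F F^T - X*||_F^2."""
    FFt = matmul(F, transpose(F))
    return sum((p - q) ** 2 for rp, rq in zip(FFt, X_star) for p, q in zip(rp, rq))


def fgd_demo(d=5, r=1, k=3, n=100, sigma=0.1, eta=0.1, iters=300, seed=0):
    """Run FGD with over-specified rank k > r; return the error after each step.

    All defaults are illustrative assumptions, not settings from the paper.
    """
    rng = random.Random(seed)

    # Ground truth X* = sum_j u_j u_j^T with r unit-norm random directions.
    X_star = [[0.0] * d for _ in range(d)]
    for _ in range(r):
        u = [rng.gauss(0, 1) for _ in range(d)]
        norm = sum(x * x for x in u) ** 0.5
        u = [x / norm for x in u]
        for a in range(d):
            for b in range(d):
                X_star[a][b] += u[a] * u[b]

    # Symmetric Gaussian sensing matrices and noisy linear measurements.
    As, ys = [], []
    for _ in range(n):
        G = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
        A = [[0.5 * (G[a][b] + G[b][a]) for b in range(d)] for a in range(d)]
        As.append(A)
        ys.append(inner(A, X_star) + sigma * rng.gauss(0, 1))

    # Small random initialization of the d x k factor (an assumption).
    F = [[0.1 * rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]
    errors = [sq_frob_error(F, X_star)]

    for _ in range(iters):
        M = matmul(F, transpose(F))
        # S = (1/n) * sum_i residual_i * A_i; the gradient of f is S @ F.
        S = [[0.0] * d for _ in range(d)]
        for A, y in zip(As, ys):
            res = inner(A, M) - y
            for a in range(d):
                for b in range(d):
                    S[a][b] += res * A[a][b] / n
        grad = matmul(S, F)
        F = [[F[a][c] - eta * grad[a][c] for c in range(k)] for a in range(d)]
        errors.append(sq_frob_error(F, X_star))

    return errors
```

Calling `fgd_demo()` returns the sequence of squared Frobenius errors, one per iterate, so the decay toward a noise-dependent floor can be inspected directly. Varying `k`, `n`, or `sigma` gives a rough feel for how the final error responds to over-specification, sample size, and noise level, though a toy run like this is no substitute for the paper's analysis.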

Authors (4)
  1. Jiacheng Zhuo (9 papers)
  2. Jeongyeol Kwon (20 papers)
  3. Nhat Ho (126 papers)
  4. Constantine Caramanis (91 papers)
Citations (29)
