Convergence guarantees for forward gradient descent in the linear regression model (2309.15001v2)
Abstract: Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, we theoretically analyze in this work the biologically motivated (weight-perturbed) forward gradient scheme that is based on random linear combinations of the gradient. If $d$ denotes the number of parameters and $k$ the number of samples, we prove that the mean squared error of this method converges for $k \gtrsim d^2\log(d)$ with rate $d^2\log(d)/k$. Compared to the dimension dependence $d$ for stochastic gradient descent, an additional factor $d\log(d)$ occurs.
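To make the scheme concrete, the following is a minimal NumPy sketch of weight-perturbed forward gradient descent for linear regression: each update replaces the gradient of the squared loss by its random linear combination $(\nabla L \cdot \xi)\,\xi$ along a standard Gaussian direction $\xi$, which is an unbiased estimate of the gradient. The step-size schedule and the data-generating setup below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def forward_gradient_descent(X, Y, theta0, step_sizes, rng=None):
    """Weight-perturbed forward gradient descent for linear regression.

    At each step the gradient of the squared loss 0.5*(y - x^T theta)^2
    is replaced by (grad . xi) * xi for a random Gaussian direction xi,
    an unbiased but higher-variance surrogate of the gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = theta0.astype(float).copy()
    d = theta.shape[0]
    for x, y, alpha in zip(X, Y, step_sizes):
        grad = -(y - x @ theta) * x                 # exact per-sample gradient
        xi = rng.standard_normal(d)                  # random perturbation direction
        theta = theta - alpha * (grad @ xi) * xi     # forward-gradient update
    return theta

# Hypothetical usage: d parameters, k samples, an illustrative decaying step size.
d, k = 10, 5000
rng = np.random.default_rng(0)
theta_star = rng.standard_normal(d)
X = rng.standard_normal((k, d))                      # random design
Y = X @ theta_star + 0.1 * rng.standard_normal(k)
steps = 1.0 / (d * (np.arange(k) + 100.0))           # assumed schedule, not the paper's
theta_hat = forward_gradient_descent(X, Y, np.zeros(d), steps, rng)
print(np.mean((theta_hat - theta_star) ** 2))        # empirical mean squared error
```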