Learning One-hidden-layer ReLU Networks via Gradient Descent (1806.07808v1)

Published 20 Jun 2018 in stat.ML and cs.LG

Abstract: We study the problem of learning one-hidden-layer neural networks with Rectified Linear Unit (ReLU) activation function, where the inputs are sampled from the standard Gaussian distribution and the outputs are generated by a noisy teacher network. We analyze the performance of gradient descent for training such neural networks based on empirical risk minimization, and provide algorithm-dependent guarantees. In particular, we prove that tensor initialization followed by gradient descent can converge to the ground-truth parameters at a linear rate up to some statistical error. To the best of our knowledge, this is the first work characterizing the recovery guarantee for practical learning of one-hidden-layer ReLU networks with multiple neurons. Numerical experiments verify our theoretical findings.
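
As a rough illustration of the setting the abstract describes, the sketch below generates Gaussian inputs, labels them with a noisy one-hidden-layer ReLU teacher, and runs gradient descent on the empirical squared risk. The dimensions, noise level, step size, and the perturbed initialization (a stand-in for the paper's tensor initialization) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Teacher-student sketch: Gaussian inputs, noisy one-hidden-layer ReLU
# teacher, student trained by gradient descent on the empirical squared risk.
# All hyperparameters below are assumptions for illustration only.

rng = np.random.default_rng(0)
d, k, n = 10, 5, 5000        # input dimension, hidden neurons, samples (assumed)
sigma = 0.1                  # label noise level (assumed)

W_star = rng.normal(size=(k, d))                       # ground-truth (teacher) weights
X = rng.normal(size=(n, d))                            # inputs ~ standard Gaussian
y = np.maximum(X @ W_star.T, 0.0).sum(axis=1) + sigma * rng.normal(size=n)

# Initialize the student near the teacher, standing in for the paper's
# tensor-based initialization (not reproduced here).
W = W_star + 0.1 * rng.normal(size=(k, d))

eta = 0.1                    # step size for the averaged gradient (assumed)
for t in range(301):
    H = X @ W.T                                        # pre-activations, shape (n, k)
    resid = np.maximum(H, 0.0).sum(axis=1) - y         # prediction residuals
    grad = ((H > 0.0) * resid[:, None]).T @ X / n      # gradient of empirical risk
    W -= eta * grad
    if t % 100 == 0:
        rel_err = np.linalg.norm(W - W_star) / np.linalg.norm(W_star)
        print(f"iter {t:3d}  relative parameter error {rel_err:.4f}")
```

Run as-is, the relative parameter error should decrease rapidly toward a noise-dependent floor, mirroring the linear convergence up to statistical error that the abstract states.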

Authors (4)
  1. Xiao Zhang (435 papers)
  2. Yaodong Yu (39 papers)
  3. Lingxiao Wang (74 papers)
  4. Quanquan Gu (198 papers)
Citations (134)
