Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 2: Experiments and Analysis (2006.09192v1)

Published 15 Jun 2020 in cs.LG and stat.ML

Abstract: The existence of local minima for one-hidden-layer ReLU networks has been investigated theoretically in [8]. Building on that theory, we first analyze the probability that local minima exist for 1D Gaussian data and how this probability varies across the weight space, showing that it is very low in most regions. We then design and implement a linear programming based approach to decide whether genuine local minima exist, and use it to predict whether bad local minima exist for the MNIST and CIFAR-10 datasets; we find that there are no bad differentiable local minima almost everywhere in weight space once some hidden neurons are activated by samples. These theoretical predictions are verified experimentally by showing that gradient descent is not trapped in the cells from which it starts. We also perform experiments to explore the number and size of differentiable cells in the weight space.
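The abstract's key structural idea is that a fixed ReLU activation pattern carves the weight space into a polyhedral "differentiable cell", inside which the network is smooth and questions about the cell can be posed as linear programs. The sketch below is only an illustration of that encoding, not the paper's actual criterion for genuine local minima: it checks the prerequisite question of whether a given activation pattern defines a non-empty (open) cell at all. The function name `cell_has_interior`, the margin formulation, and the use of `scipy.optimize.linprog` are our assumptions for this sketch.

```python
import numpy as np
from scipy.optimize import linprog

def cell_has_interior(X, pattern):
    """LP feasibility check: does the differentiable cell defined by
    a ReLU activation pattern have non-empty interior?

    X       : (n, d) array of input samples.
    pattern : (n, K) array of +1/-1 signs; pattern[i, k] = +1 means
              hidden neuron k is active on sample x_i inside the cell.

    The cell is {W : pattern[i, k] * (w_k . x_i) > 0 for all i, k}.
    We maximize a margin t subject to pattern[i, k] * (w_k . x_i) >= t
    and t <= 1 (to keep the LP bounded); the interior is non-empty
    iff the optimal t is strictly positive.
    """
    n, d = X.shape
    K = pattern.shape[1]
    n_vars = K * d + 1                    # W flattened, plus margin t

    # One inequality per (sample, neuron):
    #   -pattern[i, k] * (x_i . w_k) + t <= 0
    A_ub = np.zeros((n * K, n_vars))
    for k in range(K):
        for i in range(n):
            row = k * n + i
            A_ub[row, k * d:(k + 1) * d] = -pattern[i, k] * X[i]
            A_ub[row, -1] = 1.0
    b_ub = np.zeros(n * K)

    c = np.zeros(n_vars)
    c[-1] = -1.0                          # maximize t == minimize -t
    # W is free; t is capped at 1 so the LP cannot be unbounded.
    bounds = [(None, None)] * (K * d) + [(None, 1.0)]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.status == 0 and -res.fun > 1e-9
```

As a sanity check, a pattern read off from a random weight matrix, e.g. `pattern = np.sign(X @ W.T)` with no exact zeros, should report a non-empty cell (the generating `W` lies in its interior), while a self-contradictory pattern should not. The paper's LP-based test goes further, deciding whether such a cell contains a genuine local minimum; this sketch only shows the cell-as-polyhedron encoding that such a test builds on.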
