
Stochastic In-Face Frank-Wolfe Methods for Non-Convex Optimization and Sparse Neural Network Training (1906.03580v1)

Published 9 Jun 2019 in math.OC, cs.LG, and stat.ML

Abstract: The Frank-Wolfe method and its extensions are well-suited for delivering solutions with desirable structural properties, such as sparsity or low-rank structure. We introduce a new variant of the Frank-Wolfe method that combines Frank-Wolfe steps and steepest descent steps, as well as a novel modification of the "Frank-Wolfe gap" to measure convergence in the non-convex case. We further extend this method to incorporate in-face directions for preserving structured solutions as well as block coordinate steps, and we demonstrate computational guarantees in terms of the modified Frank-Wolfe gap for all of these variants. We are particularly motivated by the application of this methodology to the training of neural networks with sparse properties, and we apply our block coordinate method to the problem of $\ell_1$ regularized neural network training. We present the results of several numerical experiments on both artificial and real datasets demonstrating significant improvements of our method in training sparse neural networks.
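To make the abstract concrete, here is a minimal sketch of a generic Frank-Wolfe loop over the $\ell_1$ ball with the classical Frank-Wolfe gap as the convergence measure. This is an illustration of the base method only, not the paper's in-face or block-coordinate variants; the function names and the quadratic objective are assumptions for the example.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    # Linear minimization oracle over the l1 ball:
    # argmin_{||s||_1 <= radius} <grad, s> is a signed vertex
    # along the gradient's largest-magnitude coordinate.
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_f, x0, radius=1.0, steps=200):
    # Generic Frank-Wolfe loop. Iterates are convex combinations
    # of l1-ball vertices, which is what yields sparse solutions.
    x = x0.copy()
    gap = np.inf
    for t in range(steps):
        g = grad_f(x)
        s = lmo_l1_ball(g, radius)
        gap = g @ (x - s)          # Frank-Wolfe gap (>= 0, -> 0 at optimum)
        gamma = 2.0 / (t + 2)      # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x, gap

# Example (assumed for illustration): minimize f(x) = 0.5 * ||x - b||^2
# over the unit l1 ball, with b lying outside the ball.
b = np.array([0.9, 0.1, -0.2])
x_star, final_gap = frank_wolfe(lambda x: x - b, np.zeros(3), radius=1.0)
```

The paper modifies this gap for the non-convex setting and interleaves such steps with steepest descent and in-face directions; the sketch above only shows the classical ingredients those variants build on.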

Authors (3)
  1. Paul Grigas (23 papers)
  2. Alfonso Lobos (4 papers)
  3. Nathan Vermeersch (1 paper)
Citations (5)
