Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees (2305.03938v2)

Published 6 May 2023 in math.OC, cs.LG, and stat.ML

Abstract: In this paper, we present a comprehensive study on the convergence properties of Adam-family methods for nonsmooth optimization, especially in the training of nonsmooth neural networks. We introduce a novel framework that adopts a two-timescale updating scheme and prove its convergence properties under mild assumptions. Our proposed framework encompasses various popular Adam-family methods, providing convergence guarantees for these methods in training nonsmooth neural networks. Furthermore, we develop stochastic subgradient methods that incorporate gradient clipping techniques for training nonsmooth neural networks with heavy-tailed noise. Through our framework, we show that our proposed methods converge even when the evaluation noises are only assumed to be integrable. Extensive numerical experiments demonstrate the high efficiency and robustness of our proposed methods.
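To make the two-timescale updating scheme and the clipping idea concrete, the sketch below shows one plausible instantiation in Python: the second-moment estimate is updated with a larger step size (the fast timescale) than the momentum/iterate pair (the slow timescale), and the stochastic subgradient is norm-clipped before it enters the update. The function and parameter names (`clipped_two_timescale_adam`, `subgrad_oracle`, `eta`, `tau`, `clip_radius`) and the specific step-size schedules are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def clipped_two_timescale_adam(x0, subgrad_oracle, n_iters=1000,
                               eta=1e-3, tau=1e-2, beta=0.9,
                               eps=1e-8, clip_radius=10.0):
    """Illustrative two-timescale Adam-style iteration with subgradient clipping.

    `subgrad_oracle(x)` is assumed to return a stochastic subgradient of the
    (possibly nonsmooth) objective at x. The ratio eta/tau < 1 mimics the
    faster update of the second-moment estimator relative to the
    momentum/iterate pair.
    """
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)   # first-moment (momentum) estimate, slow timescale
    v = np.ones_like(x)    # second-moment estimate, fast timescale
    for k in range(1, n_iters + 1):
        g = subgrad_oracle(x)
        # Clip the stochastic subgradient to guard against heavy-tailed noise.
        norm_g = np.linalg.norm(g)
        if norm_g > clip_radius:
            g = g * (clip_radius / norm_g)
        # Slow timescale: momentum and iterate use step size ~ eta / sqrt(k).
        step = eta / np.sqrt(k)
        m = (1 - beta) * g + beta * m
        x = x - step * m / np.sqrt(v + eps)
        # Fast timescale: second-moment estimate uses the larger step ~ tau / sqrt(k).
        v = v + (tau / np.sqrt(k)) * (g * g - v)
    return x

# Example usage on a simple nonsmooth objective f(x) = ||x||_1 with noisy subgradients.
rng = np.random.default_rng(0)
oracle = lambda x: np.sign(x) + 0.1 * rng.standard_normal(x.shape)
x_final = clipped_two_timescale_adam(np.full(5, 3.0), oracle, n_iters=5000)
```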

Authors (4)
  1. Nachuan Xiao (20 papers)
  2. Xiaoyin Hu (10 papers)
  3. Xin Liu (821 papers)
  4. Kim-Chuan Toh (111 papers)
Citations (14)
