
An Adaptive Stochastic Nesterov Accelerated Quasi Newton Method for Training RNNs (1909.03620v1)

Published 9 Sep 2019 in cs.LG and stat.ML

Abstract: A common problem in training neural networks is the vanishing and/or exploding gradient problem, which is seen more prominently in training Recurrent Neural Networks (RNNs). Thus, several algorithms have been proposed for training RNNs. This paper proposes a novel adaptive stochastic Nesterov accelerated quasi-Newton (aSNAQ) method for training RNNs. The proposed method, aSNAQ, is an accelerated method that uses Nesterov's gradient term along with second-order curvature information. The performance of the proposed method is evaluated in TensorFlow on benchmark sequence modeling problems. The results show improved performance while maintaining a low per-iteration cost, and thus the method can be effectively used to train RNNs.
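To give a feel for the class of update the abstract describes, the sketch below combines Nesterov momentum (gradient evaluated at a look-ahead point) with an L-BFGS-style two-loop recursion that supplies second-order curvature information. This is a minimal illustration on a toy quadratic, not the paper's aSNAQ algorithm: the adaptive step-size and scaling details of aSNAQ are omitted, and all names, step sizes, and the memory length are illustrative assumptions.

```python
import numpy as np

def two_loop_recursion(grad, s_list, y_list):
    """Approximate (inverse Hessian) @ grad with the standard L-BFGS two-loop recursion."""
    q = grad.copy()
    alphas = []
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_list:
        # Scale by gamma = s'y / y'y using the most recent curvature pair
        gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
        q *= gamma
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q

def nesterov_quasi_newton_step(w, v, grad_fn, s_list, y_list, mu=0.9, lr=0.1, mem=8):
    """One Nesterov-accelerated quasi-Newton update (illustrative, not the paper's exact aSNAQ)."""
    g_ahead = grad_fn(w + mu * v)          # Nesterov: gradient at the look-ahead point
    direction = two_loop_recursion(g_ahead, s_list, y_list)
    v_new = mu * v - lr * direction        # momentum update with curvature-scaled step
    w_new = w + v_new
    # Build a curvature pair (s, y) from the look-ahead point and its gradient
    s = w_new - (w + mu * v)
    y = grad_fn(w_new) - g_ahead
    if s @ y > 1e-10:                      # keep only pairs with positive curvature
        s_list.append(s); y_list.append(y)
        if len(s_list) > mem:
            s_list.pop(0); y_list.pop(0)
    return w_new, v_new

# Toy usage: minimize the quadratic f(w) = 0.5 * w' A w
A = np.diag([1.0, 2.0, 4.0])
grad_fn = lambda w: A @ w
w, v = np.ones(3), np.zeros(3)
s_list, y_list = [], []
for _ in range(100):
    w, v = nesterov_quasi_newton_step(w, v, grad_fn, s_list, y_list)
print("final ||w|| =", np.linalg.norm(w))
```

Because the curvature pairs come from a short memory and the direction is computed by the two-loop recursion rather than a full Hessian, the per-iteration cost stays close to that of a first-order method, which is the property the abstract highlights.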

Citations (3)
