Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey (2304.11461v1)
Published 22 Apr 2023 in cs.LG, cs.CL, cs.NE, cs.SD, and eess.AS
Abstract: This is a tutorial paper on the Recurrent Neural Network (RNN), the Long Short-Term Memory network (LSTM), and their variants. We start with a dynamical system and backpropagation through time for RNN. Then, we discuss the problems of vanishing and exploding gradients in long-term dependencies. We explain the close-to-identity weight matrix, long delays, leaky units, and echo state networks for solving this problem. Then, we introduce LSTM gates and cells, the history and variants of LSTM, and Gated Recurrent Units (GRU). Finally, we introduce the bidirectional RNN, bidirectional LSTM, and the Embeddings from Language Models (ELMo) network, for processing a sequence in both directions.
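The gated cell update that lets the LSTM mitigate vanishing gradients can be sketched in a few lines of NumPy. This is a minimal illustrative implementation of the standard LSTM step, not the paper's own notation; the variable names and the packing of all four gates into one weight matrix are assumptions made here for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (standard formulation; names are illustrative).

    x:      input vector, shape (d_in,)
    h_prev: previous hidden state, shape (d_h,)
    c_prev: previous cell state, shape (d_h,)
    W:      all four gate weights stacked, shape (4*d_h, d_in + d_h)
    b:      stacked biases, shape (4*d_h,)
    """
    d_h = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0*d_h:1*d_h])        # input gate
    f = sigmoid(z[1*d_h:2*d_h])        # forget gate
    o = sigmoid(z[2*d_h:3*d_h])        # output gate
    g = np.tanh(z[3*d_h:4*d_h])        # candidate cell update
    c = f * c_prev + i * g             # additive cell-state update (near-identity path)
    h = o * np.tanh(c)                 # new hidden state
    return h, c

# Usage: run a short random sequence through the cell.
rng = np.random.default_rng(0)
d_in, d_h = 3, 4
W = rng.standard_normal((4 * d_h, d_in + d_h)) * 0.1
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(d_in), h, c, W, b)
```

The key point the abstract alludes to is visible in the line `c = f * c_prev + i * g`: the cell state is updated additively through a gate rather than repeatedly multiplied by a weight matrix, which is what keeps gradients from vanishing over long delays.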